Misrepresentation of Data via Statistics

Huff (1954) indicates that misrepresentation of data via statistics may be unintentional.  Perhaps that is true in cases of oversight, but an honest mistake does not make the resulting conclusions valid.  As a researcher or an analyst, it is appropriate to examine the data set and sampling process, and to consider flaws in the study, before reporting results.  Huff (1954) explains that “not all the statistical information that you may come upon can be tested with the sureness of chemical analysis…but you can prod the stuff with five simple questions, and by finding the answers avoid learning a remarkable lot that isn’t so” (p. 122).  These issues include bias, sample size, the use of raw figures as definitive measures, unproved assumptions, and impressively precise figures that defy common sense.

Last night I was watching Hulu and a commercial from a phone carrier addressed this issue of misrepresented statistics.  Carrier A had boasted the fastest speeds, so Carrier B fought back to clarify that Carrier A had the fastest speeds in one small Midwestern city, but that nationwide data showed Carrier B to be the fastest and most reliable.  This is a perfect example of sample size creating a purposeful misrepresentation: customers outside the area in question might have been misled into switching their service only to experience slower connection speeds.
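To make the mechanics concrete, here is a minimal Python sketch.  The carrier labels and speed figures are invented for illustration (nothing here comes from the actual commercial); the point is only how an average computed over one small market can crown a different winner than the nationwide average.

```python
# Hypothetical speed samples (Mbps) illustrating how a regional
# subsample can flip a comparison that nationwide data contradicts.

regional_speeds = {
    # One small Midwestern market: Carrier A happens to win here.
    "Carrier A": [142, 150, 138, 145],
    "Carrier B": [120, 118, 125, 122],
}

nationwide_speeds = {
    # Aggregated across many markets: Carrier B is faster overall.
    "Carrier A": [142, 150, 95, 88, 101, 92, 85, 99],
    "Carrier B": [120, 118, 130, 141, 127, 135, 129, 133],
}

def mean(xs):
    return sum(xs) / len(xs)

for label, data in [("Regional", regional_speeds),
                    ("Nationwide", nationwide_speeds)]:
    winner = max(data, key=lambda carrier: mean(data[carrier]))
    print(f"{label}: fastest is {winner} "
          f"({mean(data[winner]):.1f} Mbps average)")
```

Run as written, the regional slice reports Carrier A while the full data set reports Carrier B, which is exactly the gap the commercial exploited.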

As Huff (1954) explained, averages can also be misleading.  Consider the issue of raw scores (relating to one of Huff’s five questions) versus growth modeling.  Modeling teacher performance based solely on SOL scores (one test, on one day) might be flawed, especially considering the expected difference in performance between higher-level and lower-level students.  Most school districts in Virginia have recognized this and implemented SMART goals or other quantitative measures of assessment to provide a more holistic representation of individual teacher success.  For example, the evaluative measure for my grade-level team last year was junior performance on the Writing SOL.  If 80% or more of students passed the assessment, our goal was considered met.  But when a teacher fell short of that measure, the district considered SGA pre- and post-tests adequate proof of student growth.  This also accounted for variance in the number of students taking a test: say, a teacher with a 100-student class load versus one with only a 35-student load.
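A short Python sketch can make that two-tier rule explicit.  The threshold, the growth fallback, and all of the rosters below are hypothetical stand-ins for the district policy described above, not its actual implementation.

```python
# A sketch of a pass-rate-first, growth-fallback evaluation rule.
# All numbers are hypothetical illustrations, not district policy.

PASS_THRESHOLD = 0.80  # goal met outright at an 80% pass rate

def goal_met(passed, enrolled, pre_scores, post_scores):
    """Return (met, reason) under the two-tier rule described above."""
    pass_rate = passed / enrolled
    if pass_rate >= PASS_THRESHOLD:
        return True, f"pass rate {pass_rate:.0%} meets the 80% goal"
    # Fallback: average growth from pre-test to post-test.
    avg_growth = (sum(post - pre for pre, post
                      in zip(pre_scores, post_scores))
                  / len(pre_scores))
    if avg_growth > 0:
        return True, (f"pass rate {pass_rate:.0%} fell short, but "
                      f"students grew {avg_growth:.1f} points on average")
    return False, f"pass rate {pass_rate:.0%} and no measured growth"

# A teacher with a 100-student load vs. one with a 35-student load:
print(goal_met(70, 100, pre_scores=[52] * 100, post_scores=[68] * 100))
print(goal_met(30, 35, pre_scores=[60] * 35, post_scores=[61] * 35))
```

The first teacher misses the 80% bar but shows clear growth; the second clears the bar outright, even with a far smaller roster.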

Likewise, statistical interpretation without adequate explanation can be incredibly misleading.  Suppose you have an educator who teaches a handful of AP students and receives a 100% pass rate, and an educator who teaches a full load of standard students with an 85% pass rate.  Add to the mix an inclusion teacher with 100 students, 70 of whom pass (not statistically impressive at first compared to Teacher A, but wonderful once you learn the students receive special education services).  To get a full-picture view of this school and its educators, one must consider bias (cognitive levels), sample size, raw figures, unproved assumptions (two schools in Virginia may each have an 84% pass rate, but one may have a higher number of SPED students or other needs), and so on.
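One hedged way to put numbers on “not statistically impressive” is a confidence interval around each pass rate.  The sketch below uses hypothetical rosters mirroring the three teachers above and a standard Wilson score interval to show how little certainty a 100% rate over five students carries compared with an 85% rate over a full load.

```python
# Wilson 95% confidence intervals for pass rates of different
# sample sizes.  Rosters are hypothetical, echoing the example above.

import math

def wilson_interval(passed, n, z=1.96):
    """95% Wilson score interval for a pass rate of passed/n."""
    p = passed / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n
                                     + z**2 / (4 * n**2))
    return center - margin, center + margin

teachers = [
    ("Teacher A (AP, handful)",       5,   5),
    ("Teacher B (standard, full)",  102, 120),
    ("Teacher C (inclusion)",        70, 100),
]

for name, passed, n in teachers:
    lo, hi = wilson_interval(passed, n)
    print(f"{name}: {passed}/{n} passed ({passed/n:.0%}), "
          f"95% CI {lo:.0%}-{hi:.0%}")
```

Teacher A’s interval stretches from roughly 57% to 100% and overlaps both of the others: the sample size, not the raw figure, determines how much the rate can be trusted.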

As researchers, it is our duty to consider the factors that play into statistics before utilizing them; by doing so, we can avoid unintentional misrepresentations.

Huff, D. (1954). How to lie with statistics. New York, NY: W. W. Norton & Company.
