It is quite valuable to consider descriptive and inferential statistics when interpreting quantitative data. Descriptive statistics include concepts such as the mean, median, and variance, which provide summary views of a dataset. These are “essential for understanding quantitative and mixed-method studies” (McMillan, 2007, p. 120).
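As a minimal sketch, the summary measures above can be computed with Python's standard library; the scores here are made-up illustration data, not drawn from any real class.

```python
# Summary statistics for a hypothetical set of test scores.
from statistics import mean, median, pvariance, pstdev

scores = [72, 85, 91, 64, 78, 88, 95, 70, 82, 76]

print(mean(scores))       # average → 80.1
print(median(scores))     # middle value → 80.0
print(pvariance(scores))  # population variance
print(pstdev(scores))     # population standard deviation
```

Each function reduces the whole list to a single number, which is exactly what makes these measures compact summaries and also what makes them lossy.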

However, for an individual classroom, teacher, or even a department, these summary measures may be less useful than other ways of examining data, such as a histogram. Although a histogram is less easily translated into simple text or a specific achievement goal, it gives teachers and administrators a good summary view of overall achievement while still preserving a reasonable picture of disaggregated test scores. Some descriptive statistics might be more helpful for comparing data by question or topic rather than by student. It is certainly convenient to quantify how many standard deviations a particular student’s total achievement falls from the mean or median, but a good teacher should already sense a student’s relative performance intuitively. Descriptive statistics could also be very helpful for comparing aggregate achievement by question (or standard, or strand). This would answer such questions as: “On what areas did the class as a whole underperform?”
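A sketch of that by-question view, using hypothetical item-level data (rows are students, columns are questions, 1 = correct, 0 = incorrect):

```python
# Aggregate achievement by question rather than by student,
# to see where the class as a whole underperformed.
responses = [
    [1, 0, 1, 1],  # student A
    [1, 0, 1, 0],  # student B
    [1, 1, 0, 1],  # student C
    [0, 0, 1, 1],  # student D
]

n_students = len(responses)
pct_correct = []
for q in range(len(responses[0])):
    share = sum(row[q] for row in responses) / n_students
    pct_correct.append(share)
    print(f"Question {q + 1}: {share:.0%} correct")
```

Here question 2 (25% correct) stands out immediately, whereas the same signal would be invisible in each student's total score.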

McMillan (2007) provides support for such methods. He gives an example of a frequency distribution used to examine the critical thinking skills of a particular group. Displayed as a frequency polygon or histogram, it permits observations such as identifying particularly high or low scores.
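A simple text-based frequency distribution, in the spirit of that example, can be built in a few lines; the scores below are illustrative, not from McMillan:

```python
# A text "histogram": bin scores by tens and print a bar per bin.
# Unusually high or low scores become visible at a glance.
scores = [55, 62, 64, 68, 71, 73, 74, 76, 78, 79, 81, 83, 85, 88, 97]

counts = {}
for low in range(50, 100, 10):       # bins 50-59, 60-69, ..., 90-99
    count = sum(low <= s < low + 10 for s in scores)
    counts[low] = count
    print(f"{low}-{low + 9}: {'#' * count}")
```

The lone scores in the 50s and 90s stand out against the cluster in the 70s, which is precisely the kind of observation a mean or median would hide.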

At the district level, measures of central tendency become more useful. Data for this larger group are likely voluminous and noisy, so histograms or graphs of individual achievement become impractical. At this scale, percentile rankings and scatterplots of specific correlations come into play.

Evidence-based test content is a “representative sample of larger domain [which] demonstrates the extent to which the sample of items or questions in the instrument is representative of some appropriate universe or domain of content or tasks” (McMillan, 2007, p. 132). This includes an expert examination of the representativeness of the questions, which strengthens the validity of the evidence. Standardized instruments provide more validity than those that are “locally devised” (McMillan, 2007, p. 136), making the case for state-wide evaluations such as the SOLs.

Finally, regression analysis could be used to analyze the student data further by demographics, teacher, school funding, and a variety of other factors. Predictions from such regressions can be useful in determining what does and does not contribute to achievement. Also, if a model seems to “fit” the district well, it could help identify teacher quality (albeit controversially) or help explain achievement results in other ways. A regression also reports diagnostic statistics on itself, including R-squared (the share of the variation in the outcome that the model explains), adjusted R-squared (which penalizes R-squared for each additional predictor, so that a model stuffed with variables cannot look better than it really is), and t-statistics (which indicate whether the estimated coefficients are likely to be more than chance correlations). These are a few of the most important outputs of statistical models, and it is important to know what they mean so that we can evaluate whether a model’s explanation of results is likely to be worth considering.
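Those three diagnostics can be computed directly for a simple one-predictor regression; the data below are hypothetical, and the predictor (hours of instruction) is only an assumed example:

```python
# Simple linear regression by hand, reporting R-squared,
# adjusted R-squared, and the t-statistic on the slope.
import numpy as np

x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])       # e.g. hours of instruction
y = np.array([55.0, 62.0, 66.0, 74.0, 79.0, 90.0])  # test score

n = len(x)
k = 1  # number of predictors

slope, intercept = np.polyfit(x, y, 1)   # least-squares fit
residuals = y - (slope * x + intercept)

ss_res = np.sum(residuals ** 2)          # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
r_squared = 1 - ss_res / ss_tot
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# t-statistic for the slope: estimate divided by its standard error.
s2 = ss_res / (n - k - 1)                           # residual variance
se_slope = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))
t_slope = slope / se_slope

print(f"R² = {r_squared:.3f}, adjusted R² = {adj_r_squared:.3f}, t = {t_slope:.2f}")
```

Note that adjusted R-squared is always at or below R-squared, and the gap widens as predictors are added without improving fit; a large t-statistic on the slope suggests the relationship is unlikely to be a chance correlation.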

McMillan, J. H. (2007). *Educational research: Fundamentals for the consumer* (6th ed.). Pearson.
