Summary data and raw data

Data that summarize all observations in a category are called summary data. The summary could be the sum of the observations, the frequency of their occurrence, their mean value, and so on. This is in contrast to raw data, where each row in the worksheet represents an individual observation. Examples of summary data:

·    Mean weight of all babies born at the community hospital for each month of the past year

·    Average test scores of all children in the third grade at each of the four local elementary schools

·    Total number of production errors for each of four bottling plants

·    Quarterly revenue for your company

This type of data is generally recorded as a column of labels and a corresponding column of summary values, as in the example below:

Plant        Production Errors
Jamestown    106
Clinton      127
Albany       186
Buffalo      155

By contrast, raw data list each observation separately. Recorded as raw data, the example above might look like this:

Plant
Albany
Albany
Buffalo
Clinton
Buffalo
Jamestown
...

In this example of raw data, "Jamestown" would be listed 106 times, "Clinton" 127 times, and so on.
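
The relationship between the two formats can be sketched in a few lines of Python. This is only an illustration, not part of any particular statistics package: the function names expand_to_raw and summarize are hypothetical, and the counts are the ones from the production errors table above. Note that the expanded list groups the plants together, whereas a real worksheet of raw data would record observations in the order they occurred.

```python
from collections import Counter

# Summary form: one label per plant with its production error count
# (values taken from the example table above).
summary = {"Jamestown": 106, "Clinton": 127, "Albany": 186, "Buffalo": 155}


def expand_to_raw(summary_counts):
    """Turn {label: count} into a raw list with one entry per observation."""
    raw = []
    for label, count in summary_counts.items():
        raw.extend([label] * count)
    return raw


def summarize(raw_observations):
    """Collapse raw observations into {label: frequency}."""
    return dict(Counter(raw_observations))


raw = expand_to_raw(summary)
print(len(raw))                   # 574 rows, one per production error
print(summarize(raw) == summary)  # True
```

Going from raw data to a frequency summary is always possible, as summarize shows. The reverse works here only because the summary is a count; a summary such as a mean or a total cannot be expanded back into the individual observations.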

Depending on your situation, one of these two formats may make more sense than the other for collecting your data. Keep in mind that not all analyses accept data in summary form.