Range And Measures Of Tendency

Range

The range is defined as the difference between the maximum and minimum data values.

Example: Farmer Green has measured the height (in meters) of his calves and has the following data set, arranged from smallest to biggest:

0,7; 0,8; 1,3; 1,5; 1,5

The range of heights is calculated by deducting the lowest from the highest.

1,5 m - 0,7 m = 0,8 m
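The same calculation can be sketched in a few lines of Python. The variable names are illustrative, and the decimal commas are written as decimal points because Python requires that notation.

```python
# Calf heights in metres, as listed in the example.
heights = [0.7, 0.8, 1.3, 1.5, 1.5]

# Range = maximum value minus minimum value.
data_range = max(heights) - min(heights)

# Round to one decimal place to hide floating-point noise.
print(round(data_range, 1))
```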

Measures of Central Tendency

Measurements are spread across the range in which they were collected, but they usually cluster around a central value, loosely called the “average”. Three measures of this central tendency are the mean, the median and the mode.

The Main Characteristics of the Mode, the Median, and the Mean

| Fact no. | Mode | Median | Mean |
|---|---|---|---|
| 1 | It is the value that appears the most often. | It is the middle-most value. | Add all the values and divide the total by the number of items. |
| 2 | A distribution may have 2 or more modes. On the other hand, there can also be no mode. | Each array has one and only one median. | An array has one and only one mean. |
| 3 | It cannot be manipulated algebraically. | It cannot be manipulated algebraically. | Means may be manipulated algebraically. |
| 4 | The individual values must be known to calculate the mode. | The individual values must be known to calculate the median. | You can calculate it even if you do not know the individual values. You only need to know the total and the sample size. |
| 5 | Values must be grouped (tallied); arranging them from smallest to biggest makes this easier. | Values must be arranged from smallest to biggest. | Values need not be ordered or grouped for this calculation. |
| 6 | Tells you what score occurs the most often. | Provides a better measure of location than the mean when there are some extremely large or small observations. Median income is used as the measure of SA household income. | Very easy to calculate and the most often used when there is a list of numbers. |
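As a rough illustration of the three measures, the sketch below applies Python's standard `statistics` module to the calf heights from the range example. Reusing that data set here is an assumption made for illustration; the variable names are not part of the original lesson.

```python
import statistics

# Calf heights in metres (decimal commas written as points).
heights = [0.7, 0.8, 1.3, 1.5, 1.5]

mode = statistics.mode(heights)      # value that appears most often
median = statistics.median(heights)  # middle value of the ordered data
mean = statistics.mean(heights)      # total divided by the number of items

# Fact 4: the mean needs only the total and the sample size.
total, sample_size = sum(heights), len(heights)
mean_from_total = total / sample_size

# Fact 2: multimode returns every mode, so it copes with two or more modes.
modes_of_tied_data = statistics.multimode([1, 1, 2, 2, 3])

print(mode, median, round(mean, 2), modes_of_tied_data)
```

Note that `statistics.multimode` requires Python 3.8 or later.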

Quartiles and the 5-Number Summary

Sometimes data are not spread evenly (i.e. they are skewed), which makes the mean misleading. In this case the median is a better representation of the central tendency. To get an idea of the spread of the data, we calculate quartiles and percentiles.

Example: Consider the following list of data: 1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11

The median is the middle-most value.

The median = 6       1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11

The first quartile is the value that lies in the middle of the group of data below the median.

The first quartile = 3      1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11

You can quite clearly see that the median is the same as the second quartile.

The second quartile = 6      1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11

The third quartile is the value that lies in the middle of the group of data above the median.

The third quartile is 9        1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11

The inter-quartile range is the 3rd quartile minus the 1st quartile.

Here it is 9 - 3 = 6.

The inter-quartile range tells us that the middle 50% of the data lie between the values of 3 and 9.

We can summarise the data set in a 5-number summary:

  • Minimum value of the data set
  • 1st quartile
  • 2nd quartile = median
  • 3rd quartile
  • Maximum value of the data set

Example: Let us prepare a 5-number summary for the data set above:

| Summary value | Value |
|---|---|
| Minimum value of the data set | 1 |
| 1st quartile | 3 |
| 2nd quartile = median | 6 |
| 3rd quartile | 9 |
| Maximum value of the data set | 11 |
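The quartile method described in this section (the median of the values below the median, and the median of the values above it) can be sketched as follows. The function name `five_number_summary` is hypothetical. Many software packages use different quartile conventions, so their results may differ slightly from this sketch.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count the median itself is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

summary = five_number_summary(range(1, 12))  # the data 1; 2; ...; 11
minimum, q1, median, q3, maximum = summary
inter_quartile_range = q3 - q1

print(summary, inter_quartile_range)
```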


Cumulative Frequency Tables

You saw in the previous activity that calculations on large data sets are very cumbersome. There is a quicker way, namely using cumulative frequency tables. First, let us revise frequency tables.

Example: In a certain rural area, a survey was done to see how many dogs people had. The results are listed below. Now arrange these data in a frequency table:

3; 5; 1; 3; 7; 5; 6; 5; 9; 5; 2; 4; 4; 5; 5; 8

Step 1: Draw up a tally table, making a tally mark every time a value occurs.

Step 2: The tallies are counted and recorded in the frequency column.

Frequency table:

Below is an example of a cumulative frequency table.

The cumulative frequency is calculated by adding the current frequency to the previous cumulative frequency. The very last cumulative total will always equal the total of the frequency column.

Example: Cumulative frequency table:
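The tally and the running totals for the dog survey can be reproduced with a short Python sketch. The column layout and variable names are assumptions for illustration; only the 16 survey values come from the example.

```python
from collections import Counter
from itertools import accumulate

# Number of dogs per household, as listed in the survey example.
dogs = [3, 5, 1, 3, 7, 5, 6, 5, 9, 5, 2, 4, 4, 5, 5, 8]

# Step 1 and 2: tally each value and record the counts.
counts = Counter(dogs)
values = sorted(counts)
frequencies = [counts[v] for v in values]

# Cumulative frequency: previous cumulative total plus the current frequency.
cumulative = list(accumulate(frequencies))

print(f"{'Dogs':>5} {'Frequency':>10} {'Cumulative':>11}")
for v, f, c in zip(values, frequencies, cumulative):
    print(f"{v:>5} {f:>10} {c:>11}")
```

The last cumulative total equals the number of households surveyed, which is a quick check that no value was missed in the tally.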

To calculate the median, we use a simple formula. There is an even number of data values (16), so we know that the median lies between two values. To find out between which values it lies, we calculate (16+1)/2 = 8,5. This means that the median lies between item number 8 and item number 9. Let us look at the data in more detail, arranged from smallest to biggest:

1; 2; 3; 3; 4; 4; 5; 5; 5; 5; 5; 5; 6; 7; 8; 9

Hence:

1st quartile: the position is (16+1)/4 = 4,25, i.e. it lies between items 4 and 5. When the data are arranged from smallest to biggest, item 4 is 3 and item 5 is 4. Thus, the 1st quartile is (3+4)/2 = 3,5.

2nd quartile = median: items 8 and 9 are both 5, so the median is 5.

3rd quartile: the position is 3(16+1)/4 = 12,75, i.e. it lies between items 12 and 13. When the data are arranged from smallest to biggest, item 12 is 5 and item 13 is 6. Thus, the 3rd quartile is (5+6)/2 = 5,5.
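The position rule used here can be sketched in Python. The helper `value_at_position` is a hypothetical name. It follows this lesson's convention of averaging the two items on either side of a fractional position, which is not the interpolation rule used by every statistics package.

```python
import math

def value_at_position(ordered, position):
    """Average the items on either side of a 1-based, possibly fractional position."""
    below = ordered[math.floor(position) - 1]
    above = ordered[math.ceil(position) - 1]
    return (below + above) / 2

# Survey data arranged from smallest to biggest.
dogs = sorted([3, 5, 1, 3, 7, 5, 6, 5, 9, 5, 2, 4, 4, 5, 5, 8])
n = len(dogs)

q1 = value_at_position(dogs, (n + 1) / 4)       # position 4,25
median = value_at_position(dogs, (n + 1) / 2)   # position 8,5
q3 = value_at_position(dogs, 3 * (n + 1) / 4)   # position 12,75

print(q1, median, q3)
```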

Cumulative Frequency Graphs (Ogives)

Let us plot the ogive of the data in the table above.

Ogive for Vet visits in the past year:

Why Ogives are useful:

  • An ogive usually has the shape of a stretched-out S. The better the S shape, the more the data are centred around the middle scores.
  • Just by looking at the graph you can tell that the number of farmers interviewed was 16.
  • We can use the ogive to estimate quartiles:
    • Median: take (16+1)/2 = 8,5. Draw a horizontal line on the graph at 8,5. Draw a vertical line down from the point where your line cuts the graph’s line. Read off the x-value: it is approximately 4,5.
    • 1st quartile: take (16+1)/4 = 4,25. Draw a horizontal line on the graph at 4,25. Draw a vertical line down from the point where your line cuts the graph’s line. Read off the x-value: it is approximately 3.
    • 3rd quartile: take 3(16+1)/4 = 12,75. Draw a horizontal line on the graph at 12,75. Draw a vertical line down from the point where your line cuts the graph’s line. Read off the x-value: it is approximately 6.
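Reading a value off an ogive amounts to linear interpolation between plotted points. The sketch below assumes the ogive joins the points (value, cumulative frequency) of the dog survey with straight lines. The graph itself is not reproduced here, so that plotting convention and the helper name `read_ogive` are assumptions. The estimates it prints are close to the readings listed above, which were taken by eye and rounded.

```python
from collections import Counter
from itertools import accumulate

def read_ogive(points, target):
    """Return the x-value where the straight-line ogive reaches the target cumulative frequency."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Linear interpolation along the segment that contains the target.
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target lies outside the plotted ogive")

dogs = [3, 5, 1, 3, 7, 5, 6, 5, 9, 5, 2, 4, 4, 5, 5, 8]
counts = Counter(dogs)
values = sorted(counts)
cumulative = list(accumulate(counts[v] for v in values))
points = list(zip(values, cumulative))

n = len(dogs)
median_estimate = read_ogive(points, (n + 1) / 2)
q1_estimate = read_ogive(points, (n + 1) / 4)
q3_estimate = read_ogive(points, 3 * (n + 1) / 4)

print(round(q1_estimate, 2), round(median_estimate, 2), round(q3_estimate, 2))
```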

Variance and Standard Deviation

Variance and standard deviation measure how widely the values are scattered around the mean.

Step 1: Calculate MEAN

Example: A farmer wanted to see which plot of land was better for growing trees, plot A or plot B. He measured the height of 7 young trees on each plot.

Plot A: 364cm, 372cm, 364cm, 368cm, 370cm, 368cm, 370cm

Plot B: 304cm, 388cm, 332cm, 432cm, 400cm, 352cm, 368cm

We calculate the mean of each group:

Plot A: 2576/7 = 368cm

Plot B: 2576/7 = 368cm

The means for both plots were the same, so we cannot gain much insight that way.

Step 2: We now calculate the deviation from the mean for each data point:

Add up all the deviations for Plot A: 4 – 4 + 4 + 0 – 2 + 0 – 2 = 0

Add up all the deviations for Plot B: 64 – 20 + 36 – 64 – 32 + 16 + 0 = 0

This does not help either, because the positive and negative deviations always cancel each other out, so we need to go one step further.
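That the deviations sum to zero for both plots can be checked with a short Python sketch. The helper name `deviations_from_mean` is illustrative, and the deviations are taken as mean minus value to match the signs used in Step 2.

```python
# Tree heights in centimetres, as measured on each plot.
plot_a = [364, 372, 364, 368, 370, 368, 370]
plot_b = [304, 388, 332, 432, 400, 352, 368]

def deviations_from_mean(values):
    """Return mean minus value for every data point."""
    mean = sum(values) / len(values)
    return [mean - x for x in values]

deviations_a = deviations_from_mean(plot_a)
deviations_b = deviations_from_mean(plot_b)

# Positive and negative deviations cancel, so both sums are zero.
sum_a = sum(deviations_a)
sum_b = sum(deviations_b)
print(sum_a, sum_b)
```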

Step 3: Square the deviations and calculate their mean, which is called the variance.

We re-write the table and add a column where we square the deviations from the mean.

  1. For the first value of Plot A (364) the deviation from the mean was 4. We square this number to obtain 16. We do this for every single value.
  2. Now we add up all the squares of the deviations from the mean to give us a total. For Plot A the total is 16 + 16 + 16 + 0 + 4 + 0 + 4 = 56. For Plot B it is 4096 + 400 + 1296 + 4096 + 1024 + 256 + 0 = 11168.
  3. Now we calculate the mean of the squares of the deviations, i.e. for Plot A we take 56 and divide by 7 (there were 7 measurements in the sample) = 8. For Plot B we divide 11168/7 ≈ 1595,43.

This is called the variance. Variance is defined as the mean of the squares of the deviations from the mean.

Step 4: Determine the standard deviation.

If we now take the square root of the variance, we get the standard deviation.

The standard deviation for Plot A is 2,83 cm and for Plot B it is 39,94 cm.

The standard deviation is defined as the square root of the variance.

Let us summarise the steps:

  • Add up all the data values
  • Find the mean.
  • Calculate the deviations from the mean for each value.
  • Square the deviations from the mean.
  • Add up all the square deviations from the mean and divide by the number of values. This gives you the variance.
  • Take the square root of the variance to obtain the standard deviation.
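The steps above can be sketched in Python as follows. The function name `population_variance` is hypothetical. It divides by the number of values, as this lesson does. Some textbooks and software divide by one less than the number of values when working with a sample, which gives a slightly larger result.

```python
import math

def population_variance(values):
    """Mean of the squared deviations from the mean (dividing by the number of values)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / len(values)

# Tree heights in centimetres, as measured on each plot.
plot_a = [364, 372, 364, 368, 370, 368, 370]
plot_b = [304, 388, 332, 432, 400, 352, 368]

variance_a = population_variance(plot_a)
variance_b = population_variance(plot_b)

# Standard deviation = square root of the variance.
sd_a = math.sqrt(variance_a)
sd_b = math.sqrt(variance_b)

print(f"Plot A: variance {variance_a:.2f}, standard deviation {sd_a:.2f}")
print(f"Plot B: variance {variance_b:.2f}, standard deviation {sd_b:.2f}")
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev` for the same divide-by-n calculation.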

Let us take a closer look at the farmer’s result for his trees.

Both the variance and the standard deviation show us that there was very little variation in the trees from Plot A. The trees from Plot B, however, grew to very different heights. If the farmer needs to sell fairly uniform trees to logging companies, he would be better off planting on Plot A. He could do further analysis to see why the trees on Plot B differ so much from one another, or he could use the land for another purpose.