
Imagine that we have gathered some data. It's a set of numbers. What can we do with it? We can make a pictogram or sort the data. But is there anything we can do with the numbers themselves to find something out about them?

The most obvious thing to do is to find the smallest number, or **minimum**, and the largest number, or **maximum**. The difference between them gives us the range of the numbers.
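As a small illustration, here is one way this could look in Python. The list of heights is made up for the example, and the variable names are only suggestions.

```python
# Hypothetical data, invented for this example: heights in centimetres.
heights = [152, 148, 160, 155, 148, 171, 150]

smallest = min(heights)          # the minimum
largest = max(heights)           # the maximum
data_range = largest - smallest  # the range is the gap between them

print("minimum:", smallest)
print("maximum:", largest)
print("range:", data_range)
```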

However, this doesn't tell us anything about the numbers in between. Are most of the numbers close to the minimum, or are most close to the maximum? Is the minimum or the maximum exceptional? In some sports competitions with judges, they throw out the highest and lowest scores in case they don't match the rest of the scores!
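The judges' trick can be sketched in a few lines of Python. The function name `drop_extremes` and the scores are invented for illustration, and real competitions may have their own detailed rules.

```python
def drop_extremes(scores):
    """Return the scores in order, without the lowest and the highest one."""
    ordered = sorted(scores)
    # Slicing from index 1 to -1 skips the first (lowest) and last (highest).
    return ordered[1:-1]

# Hypothetical judges' scores, invented for this example.
judges = [9.5, 8.0, 8.5, 8.5, 3.0, 9.0]
print(drop_extremes(judges))
```

Here the unusually low score of 3.0 is thrown away, along with the top score of 9.5.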

One way to start understanding how the numbers are positioned between the minimum and maximum is to find the **median**. This is the middle number. To get it, you need to sort the data into order and count how many numbers there are. If there is an odd number of them, it's easy to find the middle one. If there is an even number, then you must find the mean of the two middle numbers.
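The steps for finding the median can be written as a short Python function. This is only a sketch, and the name `find_median` is made up for the example. Python also has a ready-made `statistics.median` which does the same job.

```python
def find_median(values):
    """Return the middle value of the data, following the steps above."""
    if not values:
        raise ValueError("need at least one number")
    ordered = sorted(values)   # sort the data into order
    count = len(ordered)       # count how many there are
    middle = count // 2
    if count % 2 == 1:
        # Odd count: there is exactly one middle number.
        return ordered[middle]
    # Even count: take the mean of the two middle numbers.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(find_median([7, 3, 9, 1, 5]))  # five numbers, the middle one is 5
print(find_median([7, 3, 9, 1]))     # four numbers, halfway between 3 and 7
```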

You can also find the middle of the first half of the numbers, or **lower quartile**, and the middle of the second half of the numbers, or **upper quartile**. Box-and-whisker plots use medians and quartiles.
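Here is a possible sketch of the quartiles in Python. It assumes one common convention: when there is an odd number of values, the median itself is left out of both halves. Other conventions exist and can give slightly different answers. The function name `quartiles` is invented for the example.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Assumption: with an odd count, the middle value belongs to neither half.
    """
    if len(values) < 2:
        raise ValueError("need at least two numbers")
    ordered = sorted(values)
    count = len(ordered)
    lower_half = ordered[: count // 2]        # numbers below the middle
    upper_half = ordered[(count + 1) // 2 :]  # numbers above the middle
    return median(lower_half), median(ordered), median(upper_half)

print(quartiles([1, 3, 5, 7, 9, 11, 13]))
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))
```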

When you collect data, you often find that some of it is the same. If you measure how tall a group of people are to the nearest centimetre, you are likely to find several people of the same height. Then there will be fewer people a centimetre lower, or higher, and so on. Obviously the number which happens more often than any other in the set is important. This is the **mode**. When you make a pictogram, the mode will stick out most.
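Finding the mode is really just counting. A minimal Python sketch, using the standard `collections.Counter`, might look like this. The function name `find_modes` and the heights are invented for the example. It returns a list because two or more numbers can tie for most common.

```python
from collections import Counter

def find_modes(values):
    """Return every value that appears the greatest number of times."""
    counts = Counter(values)        # how many times each number appears
    highest = max(counts.values())  # the biggest count (error if no data)
    return sorted(v for v, c in counts.items() if c == highest)

# Hypothetical heights in centimetres, invented for this example.
heights = [150, 152, 152, 153, 153, 153, 154, 154, 155]
print(find_modes(heights))
print(find_modes([1, 1, 2, 2, 3]))  # a tie, so there are two modes
```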

Both the median and the mode are single numbers which exist in the set (unless you have to calculate the median from two middle numbers). The median is only interested in having the same number of numbers below it and above it. It isn't interested in what those numbers are. The mode is only really interested in the numbers that are the same. It totally ignores all the numbers that appear only once. So while these are valuable, it would be useful to have a number which described the set of numbers, which used every number in the set, and which took the same interest in the value of each. This is the **mean**.

The arithmetic mean is often called the average. You calculate it by adding up all the numbers in the set and dividing by the number of numbers. So if there are **5** numbers in the set, you add them up and divide by **5**.
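In Python this is a one-line calculation. The function name `arithmetic_mean` is made up for the example, and the standard library's `statistics.mean` does the same thing.

```python
def arithmetic_mean(values):
    """Add up all the numbers and divide by how many there are."""
    if not values:
        raise ValueError("need at least one number")
    return sum(values) / len(values)

# Five numbers: add them up (30) and divide by 5.
print(arithmetic_mean([2, 4, 6, 8, 10]))
```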

The geometric mean is a different kind of mean. You multiply the numbers together (instead of adding them) and take the appropriate root. So for **5** numbers in the set, you multiply them and take the **5**th root. These aren't the only means either. But 'mean' by itself means arithmetic mean.
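A sketch of the geometric mean in Python follows. It assumes all the numbers are positive, since roots of negative products cause trouble, and it needs Python 3.8 or later for `math.prod`. The function name is invented for the example. Because computers store decimals approximately, the answer may differ from the exact value by a tiny amount.

```python
import math

def geometric_mean(values):
    """Multiply the numbers together and take the matching root."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("this sketch needs at least one number, all positive")
    product = math.prod(values)
    # Raising to the power 1/n is the same as taking the nth root.
    return product ** (1 / len(values))

# Five numbers: their product is 1024, and the 5th root of 1024 is 4.
print(geometric_mean([1, 2, 4, 8, 16]))
```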

It's easy to get median, mode and mean confused, partly because they all start with *m*. But they are also all ways of describing a number which is somewhere in the middle of the set. The median **is** the middle number, but it may be closer to the minimum than it is to the maximum (or vice versa) if most of the numbers are clustered round one end. The mode is the most common number in the set, so it could be anywhere in the set. However, many sets have a normal distribution, which means that the mode is in the middle. Since the mean is a calculation which uses every number in the set, it reflects the value of each number. But one very high or very low number may affect the mean (although not by much if there are a lot of numbers).
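To see how one unusual number affects the mean but not the median, here is a small experiment using Python's standard `statistics` module. Both lists are invented for the example. They are the same except that the last number has been changed from 7 to 70.

```python
from statistics import mean, median

# Hypothetical data, invented for this example.
usual = [4, 5, 5, 6, 7]
with_outlier = [4, 5, 5, 6, 70]  # same set, but with one very high number

print("usual:       ", "mean", mean(usual), "median", median(usual))
print("with outlier:", "mean", mean(with_outlier), "median", median(with_outlier))
```

The mean jumps from 5.4 to 18, while the median stays at 5 in both sets.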

Here is a use for medians. A box-and-whisker plot is a way of finding out about the distribution of a set of numbers. Are they spread out or clumped together? Where are the clumps?

The numbers have been listed in order.

They are marked on the number line as red squares.

© Jo Edkins 2006