Mode, median, mean
In this article we shall explore the three measures of location used to describe the centre of a set of data: the mode, the median, and the mean.
A measure of location describes the centre of a set of data with a single number. This number is also known as an average.
Mode
The mode is the value that occurs most often in the data. For each of the following sets of data, find the mode.
- [IMAGE]
- [IMAGE]
- [IMAGE]
- [IMAGE]
- Mode = 9
- 9 occurs twice, the other numbers occur just once.
- Mode = 27 and 33
Both 27 and 33 occur twice, so this data has two modes. Such data is called bimodal.
- There is no mode. All the numbers occur once.
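The three cases above (one mode, two modes, no mode) can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `modes` and the sample lists are hypothetical and are not the data sets shown in the images.

```python
from collections import Counter

def modes(data):
    """Return every value that occurs most often, or [] if no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)

# Hypothetical data sets chosen to illustrate each case.
print(modes([4, 9, 7, 9, 2]))        # a single mode
print(modes([27, 33, 41, 27, 33]))   # two modes (bimodal)
print(modes([1, 2, 3, 4]))           # no mode
```

Returning a list keeps the bimodal case and the no-mode case in one return type.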
Median
Now we shall explore the median. Below is a quick summary of how to work with the median.
- The median is the middle value when the data is put in order.
- If there are n observations, divide n by 2. When n/2 is a whole number, find the mid-point of the corresponding term and the term above. When n/2 is not a whole number, round it up and pick the corresponding term.
- [IMAGE]
- [IMAGE]
- We first need to put the numbers in order. So we have:
[IMAGE]
Median = 6
We can see the middle number easily from the list. We could also have found it by calculating its position:
[IMAGE]
There are 7 values, therefore:
[IMAGE]
Remember that when n/2 is not a whole number we round it up and pick the corresponding term. The median is therefore the fourth term when the list is ordered.
- Again we have to put the numbers in order.
[IMAGE]
There are 10 values, so n/2 = 10/2 = 5. This is a whole number, so we take the mid-point of the fifth and sixth terms.
[IMAGE]
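The positional rule used in this section (divide n by 2, then either round up or take a mid-point) can be sketched as follows. The function name `median_by_position` and the sample lists are assumptions made for illustration; they are not the data from the images.

```python
import math

def median_by_position(data):
    """Find the median using the n/2 rule described in this section."""
    if not data:
        raise ValueError("median of empty data is undefined")
    ordered = sorted(data)
    n = len(ordered)
    half = n / 2
    if half.is_integer():
        k = int(half)
        # n/2 is a whole number: mid-point of the k-th term and the term above.
        # Terms are counted from 1, list indices from 0.
        return (ordered[k - 1] + ordered[k]) / 2
    # n/2 is not a whole number: round up and pick that term.
    k = math.ceil(half)
    return ordered[k - 1]

# Hypothetical data: 7 values, so the median is the 4th ordered term.
print(median_by_position([9, 3, 6, 1, 8, 4, 7]))
# Hypothetical data: 10 values, so the median is the mid-point of the 5th and 6th terms.
print(median_by_position([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```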
Mean
Now we shall explore the mean. Below are the key points you need to remember when working with the mean.
- The mean is the sum of all the observations divided by the total number of observations.
- The symbol Σ (the Greek capital letter sigma) is used to represent the words “sum of”, and the symbol x̄ is used to represent the words “the mean of the observations” x1, x2, x3, x4.
- The mean is given by the formula
[IMAGE]
x̄ is the symbol for the mean of a sample. μ is the symbol used for the mean of a population.
[IMAGE]
Answer:
[IMAGE]
Combining means
Means for different sets of data can be combined to find a new mean. If set A of size n1 has mean x̄1 and set B of size n2 has mean x̄2, then the mean of the combined set of data A and B is:
[IMAGE]
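As a rough sketch, the combined mean weights each set's mean by its size: (n1·x̄1 + n2·x̄2) / (n1 + n2). The function name `combined_mean` and the numbers below are hypothetical, chosen only to show that the result agrees with the mean of the pooled data.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Mean of two data sets pooled together, from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Hypothetical sets A and B.
set_a = [2, 4, 6]   # n1 = 3, mean 4
set_b = [10]        # n2 = 1, mean 10

from_means = combined_mean(len(set_a), sum(set_a) / len(set_a),
                           len(set_b), sum(set_b) / len(set_b))
pooled = sum(set_a + set_b) / len(set_a + set_b)
print(from_means, pooled)  # both routes give the same value
```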
[IMAGE]
[IMAGE]
- Find the mean and median of the data.
- There was an error in the survey and the figure 48.6 should be 18.6. Write down what effect this will have on the median and mean.
-
Mean
[IMAGE]
Median
The median is the middle number when the observations have been arranged in order.
[IMAGE]
Looking at the mean and median found above, we can see that the mean is higher than all but one piece of data. It is therefore not a good measure of the centre of this data. The median is more representative.
- The median will stay the same but the mean will decrease to 17.9.
To find the new mean we subtract (48.6 – 18.6) from Σx and divide by seven again:
[IMAGE]
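The effect of correcting one extreme value can be checked with a short sketch. Only the figures 48.6 and 18.6 come from the text; the other six values are hypothetical, because the original data is shown only as an image, and the helper `replace_value` is an illustrative name.

```python
from statistics import mean, median

def replace_value(data, wrong, right):
    """Return a copy of data with the first occurrence of `wrong` replaced."""
    fixed = list(data)
    fixed[fixed.index(wrong)] = right
    return fixed

# Hypothetical survey of seven values; only 48.6 and 18.6 are from the text.
survey = [15.2, 16.8, 17.5, 18.1, 19.0, 20.3, 48.6]
fixed = replace_value(survey, 48.6, 18.6)

# The mean falls by (48.6 - 18.6) / n, while the middle value is unchanged.
shift = (48.6 - 18.6) / len(survey)
print(mean(survey) - mean(fixed), shift)
print(median(survey), median(fixed))
```

In this hypothetical data the corrected value stays on the same side of the middle term, which is why the median does not move.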
Choosing the right measure of location
In this section we shall explore how to choose the right measure of location for each situation. Below is a quick summary, with tips on when to use the mode, median or mean.
Mode
The mode can be used when data is qualitative or quantitative. The data may have a single mode or two modes (bimodal). The mode is not a very useful measure of location if all the observations occur only once.
Median
The median is used for quantitative data. It is not affected by extreme values, so it is useful when the observations contain extreme values.
Mean
The mean is used for quantitative data and takes into account all the observations in the data. It therefore reflects every value in the data. For the same reason it is affected by extreme values, so it is not very useful when the data contains them.
In statistics you’ll be required to decide the best measure of location to use in particular situations. The measure that you’ll choose will depend on what you’re trying to achieve.
- Find the mode, median and mean of all nine workers and their manager.
In each of the following situations, state which of the mean, mode and median you would use, and give a reason why.
- When asked the typical hourly rate of pay for the company.
- When trying to persuade a prospective employee to work for the company.
-
Mode = 10
Median = 10
[IMAGE]
- The value that most people get is £10. Therefore in this situation the best measure to choose is the mode or the median.
- To impress a prospective employee we would quote the highest measure of location, so we use the mean of £13.50.
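A short sketch with Python's standard `statistics` module shows how one extreme value separates the mean from the mode and median. The hourly rates below are hypothetical: the original pay table is not reproduced here, and these figures were chosen only to be consistent with the stated answers (mode 10, median 10, mean 13.5).

```python
from statistics import mean, median, mode

# Hypothetical hourly rates in pounds for nine workers and their manager.
workers = [8, 9, 10, 10, 10, 10, 10, 11, 12]
manager = 45
rates = workers + [manager]

summary = {
    "mode": mode(rates),
    "median": median(rates),
    "mean": mean(rates),
}
print(summary)

# Without the manager the mean sits with the other two measures.
print(mean(workers))
```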
Frequency distribution tables
You will be required to calculate the mean, mode, and median for data presented in a frequency distribution table. When data has been summarised as a frequency distribution we use the following formula;
[IMAGE]
[IMAGE]
Find the following
- the mode
- the median
- the mean
-
Mode = 17.5
To find the mode we find the observation with the greatest frequency.
- Adding a cumulative frequency column will be helpful in finding the median and mean.
[IMAGE]
The total number of observations is the total of the frequencies, i.e. the last value in the cumulative frequency column. We know that:
[IMAGE]
We therefore look for the first cumulative frequency that contains the 50th observation. The corresponding observation is 17.
Median = 17
- [IMAGE]
We multiply each observation by its frequency and add the results to find Σfx, and then divide by the total frequency:
[IMAGE]
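The frequency-table method can be sketched as below: the mean is Σfx divided by Σf, the mode is the observation with the greatest frequency, and the median is located through the running (cumulative) frequency. The function name and the table are hypothetical; they are not the table shown in the image.

```python
import math

def frequency_table_summary(table):
    """table: list of (observation, frequency) pairs. Returns (modes, median, mean)."""
    table = sorted(table)
    n = sum(f for _, f in table)                  # total frequency
    mean = sum(x * f for x, f in table) / n       # sum of fx divided by sum of f
    top = max(f for _, f in table)
    modes = [x for x, f in table if f == top]

    def term(k):
        # Walk the cumulative frequency until it reaches the k-th observation.
        running = 0
        for x, f in table:
            running += f
            if running >= k:
                return x

    half = n / 2
    if half.is_integer():
        k = int(half)
        median = (term(k) + term(k + 1)) / 2      # mid-point of two terms
    else:
        median = term(math.ceil(half))            # round up, pick that term
    return modes, median, mean

# Hypothetical table with a total frequency of 100.
table = [(15, 10), (16, 25), (17, 40), (18, 25)]
modes, median, mean = frequency_table_summary(table)
print(modes, median, mean)
```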
Grouped data
In this section we shall explore how to calculate the mean, modal class and median for grouped data.
When the data is presented as a grouped frequency table, specific data values are lost. This means when we work out the mean, median and the modal class we’re working out estimates rather than specific values. Below is a quick summary of what you should know when working with grouped data.
Summary
- When a sample of data is summarised as a grouped frequency distribution:
[IMAGE]
…where x is the mid-point of the group.
- We use interpolation to find the median. We first divide n by 2 and then apply interpolation to find the median observation.
- The modal class is the class with the highest frequency.
[IMAGE]
- Find the modal class
- Estimate the mean length of the pine cones.
- Estimate the median length of the pine cones.
- Modal class = 44 – 46
- To find the mean we multiply the frequency by the mid-point of each interval.
[IMAGE]
- [IMAGE]
The median therefore lies in the class 44 – 46, but we don’t know its exact value. We can estimate the median by using interpolation. Let m represent the median. We set up the diagram shown below.
[IMAGE]
a and b are the class boundaries and the bottom values c and d are the cumulative frequencies. We use the fact that the two fractions a/b and c/d are equivalent:
[IMAGE]
…therefore;
[IMAGE]
Median = 44.3
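The two estimates for grouped data can be sketched as follows. This assumes classes given by continuous lower and upper boundaries with no gaps, and linear interpolation inside the class that contains the (n/2)-th observation. The function names and the table are hypothetical and are not the pine cone data from the image.

```python
def grouped_mean(classes):
    """classes: list of (lower_boundary, upper_boundary, frequency)."""
    n = sum(f for _, _, f in classes)
    # Use each class mid-point as the representative value x.
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

def grouped_median(classes):
    """Estimate the median by linear interpolation within its class."""
    n = sum(f for _, _, f in classes)
    target = n / 2
    below = 0  # cumulative frequency before the current class
    for lo, hi, f in classes:
        if f > 0 and below + f >= target:
            # Fraction of the way through this class, scaled by the class width.
            return lo + (target - below) / f * (hi - lo)
        below += f

# Hypothetical grouped table of lengths.
table = [(40, 42, 5), (42, 44, 10), (44, 46, 20), (46, 48, 5)]
print(grouped_mean(table))
print(grouped_median(table))
```

Both results are estimates, since the individual values inside each class are not known.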
[IMAGE]
Estimate
- the mean number of correct answers
- the median number of correct answers
- State, giving a reason, whether the mean or the median better represents the number of correct answers.
- [IMAGE]
- [IMAGE]
- There is an extreme value. Mean is affected by extreme values therefore the median is a better representation of the number of correct answers.
Coding
Coding can be used to make the numbers easier to work with when the data values are large. Coding can be done in many ways, but it normally takes the form:
[IMAGE]
…where a and b are numbers to be chosen.
- Find the mean of the following lengths, x mm.
[IMAGE]
- Use each of the following codings to find the mean of the data.
- [IMAGE]
- [IMAGE]
- [IMAGE]
-
- We must use the coding to find the coded data:
[IMAGE]
Mean of coded data = 14
…therefore the mean of the original data can be found by:
[IMAGE]
Mean of original data = 140
We set the coding expression equal to the mean of the coded data and solve to find the mean of the original data.
- [IMAGE]
We have used the coding to find the coded value for each observation, for example for the first observation, 10.
[IMAGE]
Mean of coded data = 40
To find the mean of the original data we substitute the mean of the coded data for y in the coding and solve for x:
[IMAGE]
Mean of original data = 140
[IMAGE]
- [IMAGE]
Mean of coded data = 4
Mean of original data
[IMAGE]
Mean of original data = 140
Data is coded using;
[IMAGE]
The mean of the coded data is 24.2. Find the mean of the original data.
[IMAGE]
Mean of original data is 1108.
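The coding procedure can be sketched in Python under the assumption that the coding has the linear form y = (x − a) / b, so that the mean of the original data is b × (mean of coded data) + a. The coding formulas in this article appear only as images, so this form, the function names, and the lengths below are all hypothetical.

```python
def code(x, a, b):
    """Apply the assumed coding y = (x - a) / b to one observation."""
    return (x - a) / b

def decode_mean(coded_mean, a, b):
    """Undo the coding on a mean: x_bar = b * y_bar + a."""
    return b * coded_mean + a

# Hypothetical lengths in mm.
lengths = [110, 120, 140, 150, 180]
direct_mean = sum(lengths) / len(lengths)

results = {}
for a, b in [(0, 10), (100, 1), (100, 10)]:
    coded = [code(x, a, b) for x in lengths]
    coded_mean = sum(coded) / len(coded)
    # Decoding the coded mean should recover the mean of the original data.
    results[(a, b)] = (coded_mean, decode_mean(coded_mean, a, b))

print(direct_mean)
print(results)
```

Whatever values of a and b are chosen, decoding the coded mean gives the same original mean, which is why any convenient coding can be used.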