Skewness
In this article we shall explore how to compare data sets, how to decide which measures of spread and location to use when describing data, and how to interpret those measures in the context of the question.
We use a variety of measures to compare and describe data sets. When comparing data it is important to comment on at least two of the relevant pieces of information from the following list.
- A measure of location
- A measure of spread
- Skewness
Comparing data
In a certain year the values of daily sales were recorded for a shop. The following table shows the data.
Sales | Number of days |
---|---|
1 – 200 | 180 |
201 – 400 | 70 |
401 – 700 | 53 |
701 – 1000 | 17 |
1001 – 2000 | 4 |
- Draw a histogram to represent these data
- The shop owner wants to compare the daily sales. State, with a reason, whether the shop owner should use the median and the interquartile range or the mean and the standard deviation to compare the daily sales.
- To draw a histogram we will need to calculate the frequency density and class width for each group.
Calculating class width
To calculate the class width for each group we must use the class boundaries. For example, for the 201 – 400 group:
class width = 400.5 – 200.5 = 200
Calculating frequency density
To calculate the frequency density for each group we use the following formula:
Frequency density = frequency / class width
We extend the table by adding the class width and the frequency density, as shown below.
Sales | Number of days | Class width | Frequency density |
---|---|---|---|
1 – 200 | 180 | 200 | 0.9 |
201 – 400 | 70 | 200 | 0.35 |
401 – 700 | 53 | 300 | 0.18 |
701 – 1000 | 17 | 300 | 0.057 |
1001 – 2000 | 4 | 1000 | 0.004 |
We can now draw a histogram.
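[IMAGE]
- The data is positively skewed, so the shop owner should use the median and the interquartile range. These are less affected by the few days with very high sales than the mean and the standard deviation.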
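The calculations in this example can be checked with a short script. The following is a minimal sketch, not part of the original question: it recomputes the class widths and frequency densities from the table, then estimates the mean (using class midpoints) and the median (using linear interpolation within the median class) to illustrate the skew. The function names are illustrative, and the grouped estimates assume values are spread evenly within each class.

```python
# Class limits (lower, upper) and frequencies copied from the sales table.
classes = [(1, 200), (201, 400), (401, 700), (701, 1000), (1001, 2000)]
frequencies = [180, 70, 53, 17, 4]


def class_width(lower, upper):
    """Width between class boundaries: (upper + 0.5) - (lower - 0.5)."""
    return (upper + 0.5) - (lower - 0.5)


def grouped_mean(classes, frequencies):
    """Estimate the mean using the midpoint of each class's boundaries."""
    midpoints = [((lo - 0.5) + (hi + 0.5)) / 2 for lo, hi in classes]
    total = sum(f * m for f, m in zip(frequencies, midpoints))
    return total / sum(frequencies)


def grouped_median(classes, frequencies):
    """Estimate the median by linear interpolation in the median class."""
    target = sum(frequencies) / 2
    cumulative = 0
    for (lo, hi), f in zip(classes, frequencies):
        if cumulative + f >= target:
            lower_boundary = lo - 0.5
            return lower_boundary + (target - cumulative) / f * class_width(lo, hi)
        cumulative += f


widths = [class_width(lo, hi) for lo, hi in classes]
densities = [f / w for f, w in zip(frequencies, widths)]

for (lo, hi), f, w, d in zip(classes, frequencies, widths, densities):
    print(f"{lo:>4} - {hi:<4}  frequency={f:<3}  width={w:<6}  density={d:.3f}")

est_mean = grouped_mean(classes, frequencies)
est_median = grouped_median(classes, frequencies)
print(f"estimated mean   = {est_mean:.1f}")
print(f"estimated median = {est_median:.1f}")
```

Under these assumptions the estimated mean comes out well above the estimated median, which is what we expect when a few large values pull the mean upwards in positively skewed data.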
A company runs two manufacturing lines, A and B, which make rods 3 cm in diameter. Random samples are taken from each of the lines and the diameters are measured. The results are shown in the table below.
Line | Mean diameter | Standard deviation of diameter |
---|---|---|
A | 3 | 0.015 |
B | 3 | 0.05 |
- The company wishes to close one of the lines down. State, with a reason, which of the lines you would recommend closing down.
- We must think about the context of the question. A very small standard deviation is not always a good thing, but here it is.
In manufacturing, the rods should be as uniform as possible, i.e. the diameters should be the same for all rods produced. A larger standard deviation means a greater spread in the diameters of the rods, while a small standard deviation signifies very low variation. Both lines have the same mean diameter, so line B should be closed, as it has the greater standard deviation and is therefore the less reliable line.
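To see what the difference in standard deviation could mean in practice, the sketch below estimates the share of rods from each line that fall close to the 3 cm target. It rests on two assumptions that are not in the question: that diameters are normally distributed, and that a rod is acceptable if it is within a hypothetical tolerance of 0.03 cm of the target. It illustrates the reasoning only and is not part of the expected answer.

```python
from statistics import NormalDist

TARGET = 3.0      # cm, the intended diameter given in the question
TOLERANCE = 0.03  # cm, HYPOTHETICAL tolerance chosen only for illustration


def share_within_tolerance(mean, sd, target=TARGET, tol=TOLERANCE):
    """Proportion of rods with diameter in [target - tol, target + tol].

    ASSUMES diameters are normally distributed, which the question
    does not state.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(target + tol) - dist.cdf(target - tol)


# Means and standard deviations copied from the table.
share_a = share_within_tolerance(3.0, 0.015)
share_b = share_within_tolerance(3.0, 0.05)

print(f"Line A: {share_a:.1%} of rods within tolerance")
print(f"Line B: {share_b:.1%} of rods within tolerance")
```

Whatever tolerance is chosen, the line with the smaller standard deviation keeps a larger share of its rods near the target, because both lines have the same mean.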
In two examinations (maths and physics) students scored marks out of 75. The mean and standard deviation for both examinations were calculated. The results are shown in the table below.
Subject | Mean mark | Standard deviation |
---|---|---|
Maths | 65 | 18 |
Physics | 65 | 6 |
- State, with a reason, which of the papers is better for enabling an examination board to set fair grade boundaries.
- We must think about the context of the question. Examination boards decide what grade boundaries to use depending on whether the students found the papers difficult. A low standard deviation indicates that all students responded to the questions in a similar way, which makes it hard to decide whether they found the paper difficult or not.
Think about it: suppose a number of students sat an examination and all happened to get the same mark. Say all students got 10 out of 75 or, better, all students got 75 out of 75. The standard deviation of the marks would be zero, since every mark is equal to the mean. There would be no way of telling whether the students who sat the paper were all clever enough to get 75 or all weak enough to get 10, so we need a way of telling how able the students were. With high variation in the data, i.e. different marks ranging from 0 to 75, we can be sure there was a mixture of stronger and weaker students in the examination. Students getting very different marks will result in a high standard deviation.
Standard deviation is a measure of the spread of observations about the mean.
A high standard deviation indicates that the students were neither all clever nor all weak. So the examination board can be more confident when judging whether the paper was hard or easy.
Therefore maths is the better paper to make a judgement from, since its results have the higher standard deviation.
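The zero-spread case described above is easy to verify numerically. The following sketch uses made-up mark lists (they are hypothetical and not taken from the question) to show that identical marks give a standard deviation of zero whether everyone scored 10 or 75, while a mixed set of marks gives a positive value.

```python
from statistics import mean, pstdev

# HYPOTHETICAL mark lists out of 75, invented for illustration only.
all_low = [10] * 30    # every student scores 10
all_high = [75] * 30   # every student scores 75
mixed = [5, 12, 20, 28, 35, 41, 50, 58, 66, 75]  # a wide range of marks

for label, marks in [("all 10", all_low), ("all 75", all_high), ("mixed", mixed)]:
    # pstdev treats the list as the whole population of marks.
    print(f"{label:>6}: mean = {mean(marks):.1f}, standard deviation = {pstdev(marks):.2f}")
```

The first two lists have very different means but the same standard deviation of zero, so the spread alone cannot distinguish them; only the mixed list shows how marks vary between students.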