In the previous post, we learned that the t statistic can be used in a one-sample hypothesis test for means, provided that the sample comes from a normally distributed population. Therefore, we usually perform a “normality test” to check whether the sample comes from a normally distributed population.
Broadly speaking, there are two ways to find out: visual methods and formal tests. Visual methods involve drawing plots such as a histogram, a normal P-P plot, or a box and whisker plot. Formal tests are conducted by applying statistical tests such as the Shapiro-Wilk test and the Kolmogorov-Smirnov test.
Let’s see what a typical histogram and normal P-P plot will look like if the sample comes from a normally distributed population.
Figure 1
Figure 1 shows a histogram that resembles a normal curve. A comparison of the shape of the histogram above with the normal curve can be observed in Figure 2.
Figure 2
If the histogram of the sample data resembles a normal curve, we can assume that the sample comes from a normally distributed population. To make a better conclusion, we need to look at the normal P-P plot of the data.
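The comparison between a histogram and a normal curve can also be made numerically. The following is a minimal sketch, not the procedure used to produce the figures above: it assumes a simulated sample of 200 values, a hypothetical helper named histogram_vs_normal, and 8 equal-width bins. For each bin it prints the observed count next to the count expected under a normal curve fitted with the sample mean and standard deviation.

```python
import random
from statistics import NormalDist, mean, stdev

def histogram_vs_normal(sample, n_bins=8):
    """Return (left, right, observed, expected) for each equal-width bin."""
    fitted = NormalDist(mean(sample), stdev(sample))  # normal curve fitted to the sample
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / n_bins
    # Build the bin edges once so that neighbouring bins share exactly the same edge.
    edges = [lo + b * width for b in range(n_bins)] + [hi]
    rows = []
    for b in range(n_bins):
        left, right = edges[b], edges[b + 1]
        last = b == n_bins - 1
        # Bins are [left, right); the last bin also includes the maximum.
        observed = sum(1 for x in sample if left <= x < right or (last and x == right))
        expected = len(sample) * (fitted.cdf(right) - fitted.cdf(left))
        rows.append((left, right, observed, expected))
    return rows

random.seed(1)                                        # assumed seed, for repeatability only
sample = [random.gauss(50, 10) for _ in range(200)]   # simulated data, not the article's data
for left, right, observed, expected in histogram_vs_normal(sample):
    print(f"{left:6.1f} to {right:6.1f} | observed {observed:3d} | expected {expected:6.1f}")
```

If the observed and expected columns rise and fall together, the histogram has roughly the shape of the normal curve.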
Figure 3
Figure 3 shows the normal P-P plot of the data above. As can be seen in the figure, quite a lot of sample points lie on the diagonal line. When a sample comes from a normally distributed population, the points in the normal P-P plot are scattered closely around the diagonal line. In general, if many sample points lie on the diagonal line or almost touch it, this is a “good sign” that the sample comes from a normally distributed population. However, be careful: if the sample points around the diagonal line form a pattern similar to the letter S or an inverted S, it is quite likely that the population from which the sample comes (i.e., the original population) is not normally distributed.
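To make the idea of “points around the diagonal line” concrete, here is a small sketch of how the coordinates of a normal P-P plot can be computed. It rests on several assumptions: the helper names pp_points and max_gap are hypothetical, the observed cumulative probability uses the plotting position (i - 0.5)/n (statistical packages may use a slightly different formula), and both samples are simulated.

```python
import random
from statistics import NormalDist, mean, stdev

def pp_points(sample):
    """Return (observed, expected) cumulative-probability pairs for a normal P-P plot."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    points = []
    for i, x in enumerate(sorted(sample), start=1):
        observed = (i - 0.5) / n   # assumed plotting position for the i-th smallest value
        expected = fitted.cdf(x)   # cumulative probability under the fitted normal curve
        points.append((observed, expected))
    return points

def max_gap(sample):
    """Largest vertical distance between a plotted point and the diagonal line."""
    return max(abs(observed - expected) for observed, expected in pp_points(sample))

random.seed(1)                                               # assumed seed
normal_sample = [random.gauss(50, 10) for _ in range(200)]   # simulated normal data
skewed_sample = [random.expovariate(1.0) for _ in range(200)]  # simulated skewed data

print(f"largest gap, normal sample: {max_gap(normal_sample):.3f}")
print(f"largest gap, skewed sample: {max_gap(skewed_sample):.3f}")
```

On the diagonal line, the observed and expected values are equal, so a small largest gap corresponds to points that hug the diagonal. The skewed sample should show a clearly larger gap than the normal one.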
Figure 4
The diagram in Figure 4 is called a box and whisker plot for the data. The fact that the median line is at the center of the box and both whiskers have almost the same length strengthens the possibility that the original population is normally distributed. The only thing that makes it doubtful is the presence of outliers above the upper whisker and below the lower whisker. Ideally, there are no outliers in the diagram. To further confirm the conclusions we make, the examination is continued with formal tests.
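The features read from a box and whisker plot can be checked with a few numbers. The sketch below is illustrative only: the data set is made up, the helper name box_plot_summary is hypothetical, and outliers are assumed to be points more than 1.5 times the interquartile range beyond the quartiles (a common convention; software packages may compute quartiles slightly differently).

```python
from statistics import quantiles

def box_plot_summary(sample):
    """Quartiles, whisker ends, and outliers using the 1.5 * IQR convention."""
    q1, median, q3 = quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations that are still inside the fences.
    inside = [x for x in sample if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": sorted(x for x in sample if x < lower_fence or x > upper_fence),
    }

data = [2, 40, 44, 46, 48, 50, 52, 54, 56, 60, 98]  # made-up data for illustration
summary = box_plot_summary(data)
print(summary)
```

In this made-up example, the median (50) lies midway between the quartiles (45 and 55) and the whiskers (ending at 40 and 60) have equal length, but the values 2 and 98 are flagged as outliers, which mirrors the situation described for Figure 4.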
In addition to the visual method, normality tests can also be performed with formal tests such as the Kolmogorov-Smirnov test and the Shapiro-Wilk test. The following is the SPSS output showing the results of both tests.
In both tests, the null hypothesis is “the population is normally distributed”. From the table above, the Kolmogorov-Smirnov test gives a test statistic of D(200) = 0.055 with a p-value of 0.200. Because the p-value > 0.05, we do not reject the null hypothesis: the sample does not support the claim that the population is not normally distributed. The same table gives a Shapiro-Wilk test statistic of W = 0.995 with a p-value of 0.747. Because this p-value is also greater than 0.05, we again do not reject the null hypothesis. There is insufficient evidence to conclude that the original population of the sample is not normally distributed.
From the various examinations carried out above, it is quite likely that the sample comes from a normally distributed population. Moreover, with a sample size of more than 50 (here, 200), the formal tests above have enough power to flag a significant deviation from normality if the original population were not normally distributed. Even with a sample that large, neither test rejects the null hypothesis that the population is normally distributed. Therefore, both formal tests tend to support the conclusions based on the visual methods.
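The sketch below illustrates the two ingredients of this decision: how a Kolmogorov-Smirnov D statistic is computed, and how a p-value is compared with the 0.05 significance level. It is a simplified illustration, not a reproduction of the SPSS output: the helper names ks_statistic and decide are hypothetical, the sample is simulated, and the normal curve is fitted with the sample mean and standard deviation. The p-values passed to decide are the ones reported above; computing them from D or W requires statistical software.

```python
import random
from statistics import NormalDist, mean, stdev

def ks_statistic(sample):
    """Largest distance between the empirical CDF and the fitted normal CDF."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def decide(p_value, alpha=0.05):
    """Decision rule for H0: the population is normally distributed."""
    return "reject H0" if p_value <= alpha else "do not reject H0"

random.seed(1)                                        # assumed seed
sample = [random.gauss(50, 10) for _ in range(200)]   # simulated data, not the article's data
print(f"D({len(sample)}) = {ks_statistic(sample):.3f}")

print("Kolmogorov-Smirnov, p = 0.200:", decide(0.200))
print("Shapiro-Wilk, p = 0.747:", decide(0.747))
```

Both reported p-values are greater than 0.05, so the decision rule returns “do not reject H0” in each case.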
Below are some notes or guidelines for investigating whether the sample comes from a normally distributed population.
About the certainty of making a correct conclusion
There is no type of normality test that can conclude with 100% certainty that the sample comes from a normally distributed population. For example, even though the visual and the formal methods suggest the original population of the sample is normally distributed, there is still a chance of error in this conclusion.
About the selection of methods
We should not rely on only one of the visual and formal methods. Let them complement each other, and keep in mind the notes on sample size below. If the results of the visual and formal methods agree, we can be quite confident in our decision. If they do not agree, we should take the sample size into account.
Andy Field, in his book Discovering Statistics Using IBM SPSS Statistics 5th Edition (2018), even states: “If your sample size is large, don’t use the significance test of normality, don’t even worry too much about normality at all. In small samples, see if your significance test is significant, but don’t be lulled into a false sense of security if the test is not significant.” This opinion is quite extreme.
A less extreme approach is to give the two methods different weights depending on the sample size. If the sample is large, we pay more attention to the visual methods while still considering the “suggestions” of the formal tests. If the sample is small, we start with a formal test. If the test result is significant, the visual methods are optional. But if the test result is not significant, remember Field’s advice above not to “be lulled into a false sense of security”: the visual methods then become mandatory in making the final decision on whether the null hypothesis is rejected. (Remember, the null hypothesis is that the population is normally distributed.)
About sample size
The sample size should not be too large or too small. Formal tests are less able to detect deviations from the normal distribution when the sample is too small. On the other hand, when the sample is too large, these tests become very sensitive, so that they can flag a deviation from the normal distribution as significant even when the deviation is too small to matter in practice. Here are some statements regarding sample size.
1) “The Shapiro-Wilk test is more appropriate for small sample sizes (<50 samples), but it can also handle sample sizes up to 2000.” (Source: https://statistics.laerd.com/spss-tutorials/testing-for-normality-using-spss-statistics.php).
2) “We use the Shapiro-Wilk test when we have a small sample size (N < 50) and the Kolmogorov-Smirnov test when we have a large sample size (N > 50).” (Source: https://www.onlinespss.com/how-to-run-normality-test-in-spss/).
3) “The Shapiro–Wilk test is more appropriate for small sample sizes (<50 samples), although it can also be used for larger sample sizes, while the Kolmogorov–Smirnov test is used for n ≥ 50.” (Source: https://pmc.ncbi.nlm.nih.gov/articles/PMC6350423/).
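The effect of sample size on a formal test can be illustrated with a rough calculation. The sketch below makes two assumptions that are not from this article: the 5% critical value of the Kolmogorov-Smirnov statistic with estimated mean and standard deviation is approximated by 0.886/sqrt(n), a commonly cited large-sample rule of thumb, and the deviation is held fixed at D = 0.055 while only the sample size changes. The helper name approx_critical_d is hypothetical.

```python
import math

def approx_critical_d(n, coefficient=0.886):
    """Assumed large-sample approximation of the 5% critical value of D."""
    return coefficient / math.sqrt(n)

observed_d = 0.055  # the same deviation, imagined at different sample sizes
for n in (50, 200, 1000, 5000):
    critical = approx_critical_d(n)
    verdict = "significant" if observed_d > critical else "not significant"
    print(f"n = {n:4d} | approximate critical D = {critical:.4f} | {verdict}")
```

Under these assumptions, the same deviation is not significant at n = 50 or n = 200 but becomes significant at n = 1000 and n = 5000. This is why a significant result in a very large sample does not by itself mean the deviation is important, and why a non-significant result in a very small sample is weak evidence of normality.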
That concludes this introduction to the normality test. A more in-depth discussion will be presented in another article on this website.
The presentation file (.pptx) can be downloaded here.
QUESTIONS
- What is meant by the normality test in this article?
- Why, in certain cases, should a normality test be performed?
- Which tests can be used to test normality?
- Using a Normal P-P Plot, what are the characteristics of sample points that appear to come from a normally distributed population?
- What is the null hypothesis in a formal test of normality?
- What are the criteria for rejecting the null hypothesis in a formal test?
- For small sample sizes, why is it necessary to use visual methods in addition to formal methods?