In the previous post, Introduction to The Measures in Statistics, we introduced the concept of measures of location. One such measure is the mean, or more specifically the arithmetic mean. In this post we look more closely at the formulae used to find the mean. The proper formula depends on how the data at hand are presented. Throughout this post, we use the terms mean and average as shorthand for arithmetic mean.

 

Case 1: The data are not grouped

In this case, we are given n data values, namely X1, X2, X3, …, Xn. To calculate the mean, apply the following formula.

\bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}

 

Example 1

A sample of 5 customers of a post office revealed the following waiting time (in minutes): 3, 5, 10, 6, 6. Find the mean waiting time.

Answer

In this case n = 5. Let X1 = 3 minutes, X2 = 5 minutes, X3 = 10 minutes, and X4 = X5 = 6 minutes. The mean waiting time is:

\bar{X} = \frac{3 + 5 + 10 + 6 + 6}{5} \: minutes = \frac{30}{5} \: minutes = 6 \: minutes.
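The Case 1 formula can be checked with a few lines of Python. This is only a minimal sketch: the function name `arithmetic_mean` is our own choice for illustration, and the data are the waiting times from Example 1.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


# Waiting times in minutes, taken from Example 1.
waiting_times = [3, 5, 10, 6, 6]
print(arithmetic_mean(waiting_times))  # 6.0
```

The sum of the five values is 30, so the function returns 30 / 5 = 6 minutes.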

 

Case 2: n data consist of k distinct values with k < n

Suppose that we have n data with k distinct values x1, x2, …, xk, where k < n. For every i ∈ {1, 2, 3, …, k} let the frequency of occurrence of xi be fi. The average can be determined by applying the formula below.

\bar{X} = \frac{\sum_{i=1}^k f_i \cdot x_i}{n} where n = \sum_{i=1}^k f_i

This kind of average is a special case of the weighted mean, where each fi acts as the weight of xi.

 

Example 2

A sample of 20 students revealed that 8, 7, and 5 of them got the IQ scores of 90, 100, and 120, respectively. Find the average IQ score.

Answer

The data can be presented in the following table.

| IQ score (xi) | Number of students (fi) |
|---|---|
| 90 | 8 |
| 100 | 7 |
| 120 | 5 |

The average IQ score is \bar{X} = \frac{8 \cdot 90 + 7 \cdot 100 + 5 \cdot 120}{20}=101.
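The same computation can be sketched in Python. The helper name `frequency_mean` is hypothetical (it is not from any library), and the values and frequencies are those of Example 2.

```python
def frequency_mean(values, frequencies):
    """Mean of data given as distinct values x_i with frequencies f_i."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    n = sum(frequencies)  # n is the total number of observations
    if n == 0:
        raise ValueError("the total frequency must be positive")
    # Numerator of the Case 2 formula: the sum of f_i * x_i.
    weighted_sum = sum(f * x for x, f in zip(values, frequencies))
    return weighted_sum / n


# IQ scores and the number of students with each score, from Example 2.
iq_scores = [90, 100, 120]
student_counts = [8, 7, 5]
print(frequency_mean(iq_scores, student_counts))  # 101.0
```

Here the weighted sum is 720 + 700 + 600 = 2020 and n = 20, which gives 101.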

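The calculation can also be tabulated as follows.

| xi | fi | fi⋅xi |
|---|---|---|
| 90 | 8 | 720 |
| 100 | 7 | 700 |
| 120 | 5 | 600 |
| Total | 20 | 2020 |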

 

Case 3: The data are presented in a frequency distribution table

In a frequency distribution table, the data are grouped into several classes. Each class is labeled by a class interval, whose two end numbers are called class limits: the smaller is the lower class limit and the larger is the upper class limit. More specifically, each class interval has the form “LCL – UCL”, where LCL and UCL denote the lower class limit and the upper class limit, respectively. The table also shows the class frequency of each class, which is the number of observations falling in that class. A typical frequency distribution table is shown in Example 3. To calculate the mean of data presented this way, we can apply the following formula.

\bar{X} = \frac{\sum_{i=1}^k f_i \cdot M_i}{n} \qquad (*)

where n = \sum_{i=1}^k f_i and for every i ∈ {1, 2, …, k} M_i = \frac{UCL_i + LCL_i}{2}

In the formula, fi is the frequency of class i, whereas UCLi and LCLi are the upper and lower class limits of class i, respectively. The notation Mi denotes the class mark (the midpoint) of class i.
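Formula (*) can be sketched in Python as follows. The function name `grouped_mean` and the three-class table are assumptions made up purely for illustration. In particular, the frequencies below are not the data of Example 3.

```python
def grouped_mean(classes):
    """Mean of grouped data given as (LCL, UCL, frequency) triples."""
    n = sum(f for _, _, f in classes)  # total number of observations
    if n == 0:
        raise ValueError("the total frequency must be positive")
    # Each class contributes f_i * M_i, where M_i = (UCL_i + LCL_i) / 2.
    total = sum(f * (lcl + ucl) / 2 for lcl, ucl, f in classes)
    return total / n


# Hypothetical frequency distribution (illustrative values only).
hypothetical_classes = [(1, 3, 2), (4, 6, 5), (7, 9, 3)]
print(grouped_mean(hypothetical_classes))  # 5.3
```

In this made-up table the class marks are 2, 5, and 8, so the numerator is 2⋅2 + 5⋅5 + 3⋅8 = 53 and n = 10. Because every observation in a class is represented by its class mark, the result is an approximation of the mean of the underlying raw data.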

 

Example 3

The following table shows the distribution of the distance from home to work travelled by 50 employees every day. Find the average distance.

Answer

In the frequency distribution table there are 5 classes. The class interval of the first class is 1-3 and its class mark is M_1 = \frac{1+3}{2} \: km = 2 \: km. The class interval of the second class is 4-6 and its class mark is M_2 = \frac{4+6}{2} \: km = 5 \: km. Continuing this way, we have the following.

To apply (*) we can extend the table to the right, adding the column for fi⋅Mi. See the table below.

(For brevity, we have deleted the columns for UCLi and LCLi as they are no longer needed.)

Now, get the sums of all fi and fi⋅Mi. This gives:

As the table shows, n = \sum_{i=1}^5 f_i = 50 and \sum_{i=1}^5 f_i \cdot M_i = 313. By (*), we have \bar{X} = \frac{313}{50} \: km = 6.26 \: km.

So, the average distance from home to work travelled by the employees every day is 6.26 kilometers.

 


 
