Research in communication science frequently tests whether one variable influences another. For example: 1) Does the number of social media posts affect audience engagement? 2) Does the length of a message affect the listener’s ability to remember it? 3) Does the credibility of a spokesperson affect persuasiveness? To answer these questions, we can apply regression models. The simplest of these is simple linear regression.
Simple linear regression has at least two goals: testing the effect of a variable on another variable and predicting the value of one variable based on the value of another variable. The variable that is hypothesized to be influenced is called the dependent variable or the variable being explained, while the variable that is hypothesized to affect another variable is called the independent variable or explanatory variable.
In simple linear regression, it is assumed that the relationship between the dependent variable (name it Y) and the independent variable (name it X) is expressed in the population regression function (PRF) as follows: E(Y|X) = β0 + β1X. In the equation, β0 and β1 are called the intercept and the slope, respectively. E(Y|X) is the average value of Y when X has a certain value. For example, let Y represent monthly expenditure (in million IDR), X represent monthly income (in million IDR), and suppose that the population regression equation is E(Y|X) = 2 + 0.3X. If a family’s income is 10 million IDR per month, then the average expenditure of families at that income level is 5 million IDR per month. [Note: E(Y|X = 10) = 2 + (0.3 × 10) = 5].
However, not all families with an income of 10 million IDR per month spend 5 million IDR per month. For each family i (whose income is Xi), the amount of expenditure is Yi = 2 + 0.3Xi + ui. In the equation, ui is called the stochastic disturbance or error term. This value measures the deviation of the actual value of Y from its average at that level of X. In this example, if a family’s income is 10 million IDR per month and their expenditure is 7 million IDR per month, then for this family, u = 2 million IDR per month. For another family with the same income of 10 million IDR per month but an expenditure of 4 million IDR per month, u = –1 million IDR per month. Thus, the value of u differs from family to family at the same income level. In simple linear regression, it is assumed that ui is normally distributed with a mean of zero.
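The arithmetic of the income and expenditure example can be sketched in a few lines of Python. This is only an illustration of the numbers given above. The function names are chosen here for readability and are not part of any library.

```python
# Population regression function from the example: E(Y|X) = 2 + 0.3X
# X = monthly income, Y = monthly expenditure (both in million IDR).
BETA0 = 2.0  # intercept
BETA1 = 0.3  # slope


def expected_expenditure(income):
    """Average expenditure E(Y|X) of families with the given income."""
    return BETA0 + BETA1 * income


def disturbance(income, actual_expenditure):
    """Disturbance u: actual expenditure minus its average E(Y|X)."""
    return actual_expenditure - expected_expenditure(income)


mean_at_10 = expected_expenditure(10)  # average spending at income 10
u_family_a = disturbance(10, 7)        # family that spends 7
u_family_b = disturbance(10, 4)        # family that spends 4
print(mean_at_10, u_family_a, u_family_b)
```

Both families have the same income, so the same average applies to each. Only the disturbance differs between them.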
Testing the influence of one variable on another is useful in practice. If the influence is significant, we can change the value of the dependent variable by changing the value of the explanatory variable. Suppose, for example, that the length of a press release significantly affects the amount of media coverage. We can then increase media coverage by adjusting the length of the press release.
TEST OF SIGNIFICANCE OF INFLUENCE
In this test, there are two important equations involved, namely the population regression function (PRF) and the sample regression function (SRF) in stochastic form.
PRF in stochastic form: $Y_i = \beta_0 + \beta_1 X_i + u_i$
SRF in stochastic form: $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$
It is impossible for us to determine the actual values of the beta coefficients in the PRF. These values can only be estimated using the beta coefficients in the SRF, which are computed from the sample data. With the help of software, such as SPSS, we will get the values of $\hat{\beta}_0$ and $\hat{\beta}_1$.
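To make the estimation step less of a black box, here is a minimal sketch of how the SRF coefficients can be computed by hand. It assumes the ordinary least squares (OLS) method, which is the usual default in statistical software. The data are hypothetical and serve only as an illustration, and the function name `ols_estimates` is made up for this sketch.

```python
def ols_estimates(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of X.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data for illustration only (not taken from the text).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0_hat, b1_hat = ols_estimates(x, y)
print(f"SRF: Y = {b0_hat:.2f} + {b1_hat:.2f} X")
```

The slope is the sum of cross-deviations divided by the sum of squared deviations of X. The intercept then follows from the requirement that the fitted line pass through the point of means.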
The following are the steps for hypothesis testing to check for influence.
H0: β1 = 0 [X does not affect Y.]
H1: β1 ≠ 0 [X affects Y.]
Significance level: α = 0.05
Test statistic: $t = \hat{\beta}_1 / \mathrm{se}(\hat{\beta}_1)$ with degrees of freedom ν = n – 2, where $\mathrm{se}(\hat{\beta}_1)$ is the standard error of $\hat{\beta}_1$.
H0 rejection criterion: Reject H0 if the p-value < α.
Calculate/obtain the t-value from the samples and its p-value, then make a decision based on the criterion above.
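The testing steps above can be sketched in plain Python as follows. This is a sketch under stated assumptions, not a replacement for statistical software: the data are hypothetical, the estimates use ordinary least squares, and the p-value is obtained by numerically integrating the t density with Simpson's rule instead of calling a statistics library. All function names are chosen for this illustration.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| > |t_stat|), integrating the density with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3  # P(0 < T < |t_stat|)
    return max(0.0, 1.0 - 2.0 * area)


def slope_t_test(x, y):
    """Return slope estimate, its standard error, t, df and two-sided p."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    df = n - 2
    # Residual variance: sum of squared residuals divided by n - 2.
    ssr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / df / s_xx)
    t_stat = b1 / se_b1
    return b1, se_b1, t_stat, df, two_sided_p_value(t_stat, df)


# Hypothetical data for illustration only (not taken from the text).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
alpha = 0.05
b1, se_b1, t_stat, df, p_value = slope_t_test(x, y)
reject_h0 = p_value < alpha  # rejection criterion: p-value < alpha
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

For these made-up numbers the p-value is larger than 0.05, so H0 would not be rejected. In practice the same quantities are read directly from the coefficient table produced by the software.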
EXAMPLE
A marketing manager is considering whether an additional budget is required for promotional activities to generate more sales. To decide, he collected data on sales and promotion costs (in million IDR) in 8 marketing areas for the product and obtained the following results.
At a significance level of 0.05, do promotional activities affect sales?
Answer
Let X = promotional expenditures and Y = sales.
PRF: $Y_i = \beta_0 + \beta_1 X_i + u_i$
SRF: $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$
H0: β1 = 0 [Promotional activities do not affect sales.]
H1: β1 ≠ 0 [Promotional activities affect sales.]
Significance level: α = 0.05
Test statistic: $t = \hat{\beta}_1 / \mathrm{se}(\hat{\beta}_1)$ with degrees of freedom ν = n – 2 = 8 – 2 = 6
Criterion for rejecting H0: Reject H0 if the p-value < 0.05.
SPSS Output:
From the table above, the following are obtained.
As a result, the sample regression function is
The p-value for promotional expenditures is 0.040 < 0.05.
As p-value < α, reject H0.
Conclusion: Promotional activities have a significant effect on sales.
PROBLEM
What is the relationship between the amount spent per week on recreation and the number of family members? Do larger families spend more money on recreation? A sample of 10 families in the Chicago area showed the following figures for the number of family members and the amount spent on recreation per week.
Can we conclude that the number of family members has a “codirectional” effect on the amount spent on recreation? Note: “codirectional” here means that more family members result in more spending on recreation.
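A “codirectional” question corresponds to a one-sided alternative, H1: β1 > 0, whereas regression output usually reports a two-sided p-value for each coefficient. The sketch below shows one common way to convert between the two. It assumes the reported p-value is two-sided and comes from a t-test, whose distribution is symmetric. The function name and the numbers in the usage lines are hypothetical and are not the answer to the problem.

```python
def one_sided_p_value(two_sided_p, slope_estimate):
    """p-value for H1: beta1 > 0, derived from a two-sided t-test p-value."""
    if slope_estimate > 0:
        # The estimate points in the hypothesized direction: halve the p-value.
        return two_sided_p / 2
    # The estimate points the other way: the evidence for H1 is weak.
    return 1 - two_sided_p / 2


# Hypothetical numbers for illustration only.
p_positive_slope = one_sided_p_value(0.08, 1.5)
p_negative_slope = one_sided_p_value(0.08, -1.5)
print(p_positive_slope, p_negative_slope)
```

The decision rule stays the same: reject H0 if the one-sided p-value is less than α.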