Regression Analysis Results
The shape of the distribution of examination scores indicates that:
- Most students scored in the interval 90-99. Only 25% of the students scored less than 70.
- The distribution of examination scores is left-skewed (negatively skewed), as the left tail of the distribution is more elongated than its right tail.
- For this distribution, the mode is greater than the median, and the mean has the smallest value (mean < median < mode).
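The ordering of mean, median and mode in a left-skewed distribution can be illustrated with a minimal sketch. The scores below are hypothetical values chosen only to mimic a long left tail; they are not the examination data analysed in this report.

```python
from statistics import mean, median, mode

# Hypothetical scores (NOT the report's data): long left tail, peak in the 90s.
scores = [45, 60, 72, 80, 85, 90, 92, 95, 95, 95]

avg = mean(scores)     # pulled down by the few low scores in the left tail
mid = median(scores)   # middle of the ordered scores
peak = mode(scores)    # most frequent score

# A left-skewed distribution typically shows mean < median < mode.
left_skew_ordering = avg < mid < peak
print(avg, mid, peak, left_skew_ordering)
```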
Ans 2.
Table 2: Simple linear regression model of ‘Unit Price’ and ‘Supply’
Ans 2. a)
The total number of observations in the sample is 41 (= 40 + 1), that is, the total degrees of freedom (40) plus one.
Ans 2. b)
The level of significance of the regression model is assumed to be 5%. The outcome of the simple linear regression model shows that the p-value of ‘Unit Price (X)’ (0.175) is greater than the level of significance.
Therefore, at the 5% level of significance, the linear association between the independent and dependent variables is not statistically significant. This regression model provides no strong evidence of an association between the independent and dependent variables.
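The decision rule used here (reject the null hypothesis only when the p-value is below the significance level) can be sketched as follows. The helper name `is_significant` is an illustrative assumption, not part of any statistics package.

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the null hypothesis is rejected at level alpha."""
    return p_value < alpha


p_unit_price = 0.175  # p-value of 'Unit Price (X)' reported in this answer

# 0.175 is not below 0.05, so the slope is not significant at the 5% level.
print(is_significant(p_unit_price))
```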
Ans 2. c)
The statistic ‘R-square’ is also known as the ‘coefficient of determination’. In this model, R² is found to be 0.048. It indicates that the predictor variable ‘Unit Price (X)’ can explain only 4.8% of the variability of ‘Supply (Y)’.
Ans 2. d)
Pearson’s coefficient of correlation (r) is 0.21908. A value of r in the range 0.0 to 0.3 indicates a weak but positive correlation, so a weak positive link is observed between ‘Unit price’ and ‘Supply’. Therefore, these two variables are not significantly linked.
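In a simple linear regression with a single predictor, the coefficient of determination equals the square of Pearson's r, so the two reported statistics can be cross-checked. A minimal sketch using the values reported in these answers:

```python
r = 0.21908          # Pearson's correlation coefficient from Ans 2. d)
r_squared = r ** 2   # should agree with R-square from Ans 2. c)

# Rounded to three decimals, this matches the reported R-square of 0.048.
print(round(r_squared, 3))
```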
Ans 2. e)
The estimated linear regression equation is as follows:
‘Supply (Y)’ = 54.076 + 0.029* ‘Unit price (X)’ …………… (1)
Given that the unit price is $50,000, Supply = 54.076 + (0.029*50000) = 54.076 + 1450 = 1504.076.
Hence, the estimated supply is 1504.076 units.
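The substitution into equation (1) can be reproduced with a short sketch. The function name `predict_supply` is an illustrative assumption; the coefficients are those reported in equation (1).

```python
INTERCEPT = 54.076  # estimated intercept from equation (1)
SLOPE = 0.029       # estimated coefficient of 'Unit price (X)'


def predict_supply(unit_price: float) -> float:
    """Estimated supply for a given unit price, using equation (1)."""
    return INTERCEPT + SLOPE * unit_price


estimate = predict_supply(50_000)
print(round(estimate, 3))  # approximately 1504.076 units
```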
Ans 3.
Table 3: One-way ANOVA table of four Programs
The hypotheses, tested at the 5% level of significance, are as follows:
Null hypothesis (H0): The average scores of all four programs are equal to each other.
Alternative hypothesis (HA): The average scores of the four programs are not all equal; that is, at least one program has an average score that differs from the others.
It is observed that the mean productivity level of workers in Program C is greater than that of the other three programs. The p-value of the F-statistic (6.14) is 0.0056, which is less than the assumed significance level (0.05). Therefore, the null hypothesis can be rejected at the 5% level of significance. The mean productivity levels of the four programs are not all equal to each other. We can recommend Program C as the more acceptable choice because of its higher effectiveness compared with the other three programs (A, B and D).
Ans 4.
Table 4: Multiple regression model of ‘Sales’ on ‘Price’ and ‘Advertising’
The estimated equation of the multiple regression model is:
‘Sales’ (y) = 3.597615086 + 41.32002219 * ‘Price’ (x1) + 0.013241819 * ‘Advertising’ (x2) ………. (2)
Ans 4. b)
The level of significance of the F-test in the multiple regression model is assumed to be 0.1. The regression model treats ‘Weekly sales of products’ (y) as the response variable and ‘Unit price of the competitor’s product’ (x1) and ‘Advertising expenditures’ (x2) as the predictor variables. The calculated p-value of the multiple regression model is 0.052643614, which is less than 0.1. Therefore, the null hypothesis can be rejected at the 10% level of significance, and the multiple regression model is found to be statistically significant at that level.
Ans 4. c)
The calculated p-values of the independent variables are, respectively:
- ‘Price’ (0.036289) and
- ‘Advertising’ (0.969694).
The calculated p-value of the predictor ‘Price’ is less than 0.05, while that of ‘Advertising’ is greater than 0.05. Therefore, ‘Price’ is the significant variable and ‘Advertising’ is the insignificant variable.
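The per-predictor comparison against the 0.05 threshold can be sketched as below. The ‘Price’ p-value is taken here as 0.036289, which is an assumption about the reported figure; the variable names are illustrative.

```python
ALPHA = 0.05  # threshold used for the individual predictors

# Reported p-values; the 'Price' value of 0.036289 is an assumed reading.
p_values = {"Price": 0.036289, "Advertising": 0.969694}

significant = [name for name, p in p_values.items() if p < ALPHA]
insignificant = [name for name, p in p_values.items() if p >= ALPHA]

print("significant:", significant)
print("insignificant:", insignificant)
```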
Table 5: Simple linear regression model of ‘Sales’ and ‘Price’
Ans 4. d)
‘Advertising’ is found to be an insignificant factor; hence, it is omitted from the multiple regression model.
In the reduced regression model, ‘Sales’ is the response variable and ‘Price’ is the predictor variable. The estimated equation of the new simple linear regression model is:
‘Sales’ (Y) = 3.58178844 + 41.6030534 * ‘Price’ (X1) …………………………. (3)
Ans 4. e)
The value of the slope (β) of the simple linear regression model is found to be 41.6030534. It shows a positive association between the predictor variable ‘Price’ and the response variable ‘Sales’. According to the model, ‘Sales’ increases by about 41.6 units when ‘Price’ increases by 1 unit. Equally, ‘Sales’ decreases by about 41.6 units when ‘Price’ decreases by 1 unit.
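The interpretation of the slope can be checked numerically with a minimal sketch based on equation (3). The function name `predict_sales` and the base price of 10.0 are hypothetical, chosen only for illustration; any base price gives the same change.

```python
INTERCEPT = 3.58178844  # estimated intercept from equation (3)
SLOPE = 41.6030534      # estimated coefficient of 'Price'


def predict_sales(price: float) -> float:
    """Estimated sales for a given price, using equation (3)."""
    return INTERCEPT + SLOPE * price


base_price = 10.0  # hypothetical price, not taken from the data

# Raising the price by one unit changes predicted sales by the slope.
change = predict_sales(base_price + 1) - predict_sales(base_price)
print(round(change, 1))
```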