Testing for Differences in GPA Scores between Male and Female Students
The median GPA scores for the two genders were almost the same. For both distributions the median lay close to the middle of the spread, indicating that both were roughly symmetric and approximately normal. The interquartile range (IQR) of GPA scores for males was slightly larger than that for females, indicating that the middle 50% of males varied more in GPA than the middle 50% of females. The lower 25% of males obtained lower scores than the lower 25% of females.
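The comparison of medians and spreads above can be sketched in a few lines of Python. This is only an illustration: the two score lists are invented and are not the study data, and the function names are hypothetical. The "inclusive" quartile method is one of several conventions, so other software may give slightly different quartiles.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile method."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def iqr(values):
    """Interquartile range: the spread of the middle 50% of the values."""
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1


# Hypothetical GPA scores, invented for illustration only (NOT the study data).
male_gpa = [2.0, 3.0, 4.0, 4.5, 5.0, 6.0, 7.0]
female_gpa = [3.0, 4.0, 4.5, 4.5, 5.0, 5.5, 6.0]

for label, scores in (("male", male_gpa), ("female", female_gpa)):
    low, q1, median, q3, high = five_number_summary(scores)
    # A median roughly midway between Q1 and Q3 suggests a symmetric middle half.
    print(label, "median:", median, "IQR:", q3 - q1,
          "Q1 to median:", median - q1, "median to Q3:", q3 - median)
```

In this toy data the medians coincide while the male IQR is larger, which is the same qualitative pattern described in the text.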
- The difference in average GPA scores between male and female students was compared using an independent-samples t-test at the 5% level of significance. The null hypothesis assumed no difference in average GPA scores between male and female students. The average GPA score for females (M = 4.75, SD = 1.18) was greater than that for males (M = 4.52, SD = 1.40). The claim was tested with a one-tailed t-test, which found no statistically significant difference between the average GPA scores (t = 1.294, p = 0.099). Hence, the null hypothesis could not be rejected at the 5% level, and the apparent claim that the average GPA score for females is greater than that for males was not supported.
- The claim that students with higher socio-economic status (SES) tend to have stronger academic achievement was tested at the 5% level of significance. The null hypothesis was that the average GPA scores of the postgraduate (PG_SES) and undergraduate (UG_SES) groups were equal. The null hypothesis could not be rejected at the 5% level, as the average GPA of PG_SES (M = 5.10) was not significantly different (t = 0.94, p = 0.176) from the average GPA of UG_SES (M = 4.89). A one-tailed (right-tail) test was conducted at the 5% level, and the two groups were found to have similar GPA scores.
- The claim that GPA scores of students with undergraduate parents are higher (M = 4.89) than those of students whose parents have secondary or lower qualifications (M = 4.08) was tested at the 5% level of significance. The null hypothesis assumed no difference in GPA scores between students with undergraduate-level and secondary-level parents. The null hypothesis was rejected (t = -4.291, p < 0.05) at the 5% level, indicating that GPA scores of students with undergraduate parents are significantly higher than those of students whose parents have secondary or lower qualifications.
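The tests above can be illustrated from summary statistics alone. The sketch below rests on several assumptions. It uses Welch's form of the t statistic, which equals the pooled form when the groups are the same size. The group sizes are not reported in this section, so n = 100 per group is a placeholder. The p-value uses a standard normal approximation to the t distribution, which is reasonable only for large samples. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for (mean1 - mean2) without assuming equal variances."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def right_tail_p_normal(t_stat):
    """Right-tail p-value using a normal approximation (large samples only)."""
    return 1.0 - NormalDist().cdf(t_stat)


# Means and SDs are the reported female and male values.
# ASSUMPTION: n = 100 per group is a placeholder; the real sizes are not given here.
t_stat = welch_t(4.75, 1.18, 100, 4.52, 1.40, 100)

# Approximate right-tail p-value for the reported statistic t = 1.294.
p_reported_t = right_tail_p_normal(1.294)

print("t with placeholder group sizes:", round(t_stat, 3))
print("approximate one-tailed p for t = 1.294:", round(p_reported_t, 3))
print("reject H0 at 5%?", p_reported_t < 0.05)
```

The approximate p-value lies close to the reported p = 0.099 and stays above 0.05, consistent with the decision not to reject the null hypothesis.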
Task 2
- The correlation matrix is provided in Table 1. GPA score was positively associated with all the independent variables. The linear association between GPA and each of the other four quantitative independent variables was at least moderate (r >= 0.3).
Table 1: Correlation Matrix
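A correlation matrix of this kind can be computed as sketched below. The values of Table 1 are not reproduced here. The three columns contain invented numbers that only borrow the variable names from the report, and the helper names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical data, invented for illustration only (NOT the study data).
data = {
    "GPA": [4.0, 4.5, 5.0, 5.5, 6.0],
    "HS_MATH": [60, 70, 65, 80, 90],
    "ATAR": [70, 72, 80, 85, 88],
}

matrix = correlation_matrix(data)
for row_name, row in matrix.items():
    print(row_name, {col: round(r, 2) for col, r in row.items()})
```

Each off-diagonal entry can then be compared with the r >= 0.3 threshold used in the text.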
- To find the significant predictors of GPA scores, an ordinary least squares (OLS) regression model was constructed, as shown below.
Table 2: Regression Model with All the Independent Variables
- HS_SCI was not a significant predictor (t = 0.852, p = 0.395) of GPA scores at 5% level of significance.
- HS_ENG was also not a significant predictor (t = 1.069, p = 0.286) of GPA scores at 5% level of significance.
- HS_MATH was a significant predictor (t = 3.054, p < 0.05) of GPA scores at 5% level of significance.
- ATAR was also not a significant predictor (t = -0.265, p = 0.791) of GPA scores at 5% level of significance.
Stepwise regression was conducted and the outputs have been presented in the following tables.
Step 1: The slope coefficient of HS_SCI showed a statistically significant positive linear relation with GPA scores (t = 5.46, p < 0.05) at the 5% level of significance in a right-tailed test.
Step 2: The slope coefficient of HS_SCI showed a statistically significant positive linear relation with GPA scores (t = 3.29, p < 0.05) at the 5% level of significance in a right-tailed test. The slope coefficient of HS_ENG was marginally significant, with a positive linear relation with GPA scores (t = 2.051, p < 0.05) at the 5% level of significance in a right-tailed test.
Step 3: The slope coefficient of HS_SCI did not show a statistically significant relation with GPA scores (t = 1.05, p = 0.295) at the 5% level of significance in a two-tailed test. The slope coefficient of HS_ENG was also not statistically significant (t = 1.31, p = 0.192) at the 5% level in a two-tailed test. HS_MATH was the only significant predictor in the model (t = 4.74, p < 0.05) at the 5% level; a right-tailed test was used to test its positive linear relation.
Step 4: The slope coefficient of HS_SCI did not show a statistically significant relation with GPA scores (t = 1.053, p = 0.293) at the 5% level of significance in a two-tailed test. The slope coefficient of HS_ENG was also not statistically significant (t = 1.205, p = 0.229) at the 5% level in a two-tailed test. HS_MATH was the only significant predictor in the model (t = 4.44, p < 0.05) at the 5% level; a right-tailed test was used to test its positive linear relation. Parents' undergraduate education had no significant impact on GPA scores (t = 1.13, p = 0.13) at the 5% level of significance; a right-tailed test was used to check the positive slope coefficient of PARENT EDU.
Step 5: The slope coefficient of HS_SCI did not show a statistically significant relation with GPA scores (t = 1.131, p = 0.259) at the 5% level of significance in a two-tailed test. The slope coefficient of HS_ENG was also not statistically significant (t = 0.949, p = 0.344) at the 5% level in a two-tailed test. HS_MATH was the only significant predictor in the model (t = 4.41, p < 0.05) at the 5% level; a right-tailed test was used to test its positive linear relation. Parents' undergraduate education had no significant impact on GPA scores (t = 1.11, p = 0.266) at the 5% level of significance; a right-tailed test was used to check the positive slope coefficient of PARENT EDU. Gender (female as reference) had no significant impact on GPA scores (t = 0.46, p = 0.646) at the 5% level in a two-tailed test.
Step 6: The slope coefficient of HS_SCI did not show a statistically significant relation with GPA scores (t = 0.915, p = 0.361) at the 5% level of significance in a two-tailed test. The slope coefficient of HS_ENG was also not statistically significant (t = 0.841, p = 0.401) at the 5% level in a two-tailed test. HS_MATH was the only significant predictor in the model (t = 2.88, p < 0.05) at the 5% level; a right-tailed test was used to test its positive linear relation.
Parents' undergraduate education had no significant impact on GPA scores (t = 1.12, p = 0.264) at the 5% level of significance; a right-tailed test was used to check the positive slope coefficient of PARENT EDU. Gender (female as reference) had no significant impact on GPA scores (t = 0.44, p = 0.641) at the 5% level in a two-tailed test. ATAR had a negative slope coefficient and no significant linear impact on GPA scores (t = -0.264, p = 0.396) at the 5% level in a left-tailed t-test.
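The steps above mix one-tailed and two-tailed tests, so it helps to see how the two p-values relate. Regression software usually reports two-tailed p-values. For a symmetric distribution such as t, the one-tailed value is half the two-tailed value when the sign of t agrees with the hypothesised direction, and one minus that half otherwise. The helper below is a hypothetical sketch of this rule. It is applied to the ATAR row reported for the full model (t = -0.265, p = 0.791).

```python
def one_tailed_p(t_stat, p_two_tailed, direction):
    """Convert a two-tailed p-value to a one-tailed p-value.

    direction: "right" tests for a positive slope, "left" for a negative slope.
    Valid for symmetric test distributions such as Student's t.
    """
    if direction not in ("right", "left"):
        raise ValueError("direction must be 'right' or 'left'")
    sign_agrees = t_stat > 0 if direction == "right" else t_stat < 0
    half = p_two_tailed / 2.0
    return half if sign_agrees else 1.0 - half


# ATAR in the full model: t = -0.265, two-tailed p = 0.791 (values from the report).
atar_left = one_tailed_p(-0.265, 0.791, "left")
atar_right = one_tailed_p(-0.265, 0.791, "right")

print("left-tailed p:", round(atar_left, 4))
print("right-tailed p:", round(atar_right, 4))
```

The left-tailed value is about 0.396, which matches the p-value quoted for ATAR in Step 6.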
- There is no contradiction in GPA and ATAR having a positive correlation but a negative regression slope. The slope coefficient was negative in a multiple regression model that contained five other independent variables. ATAR has a high positive correlation with HS_SCI, HS_ENG, and HS_MATH. Consequently, once the effect of these correlated predictors is accounted for, the partial relation between ATAR and GPA becomes negative in the present model. Including ATAR in the regression model increases the coefficient of determination from 0.2206 (step 5 model) to 0.2209, a relative increase of only about 0.14% in the explained variation of GPA scores. Hence, the inclusion improves the overall fit only marginally, and the change appears to be statistically insignificant.
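The sign reversal described above can be reproduced with a small constructed example. The numbers below are invented and are not the study data. The outcome is built as `gpa = hs_math - 0.3 * atar`, with `atar` tracking `hs_math` closely. Only the variable names are borrowed from the report, and the helper functions are hypothetical. The slopes come from the standard two-predictor least-squares solution on mean-centred data.

```python
import math


def _centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    cx, cy = _centered(x), _centered(y)
    return _dot(cx, cy) / math.sqrt(_dot(cx, cx) * _dot(cy, cy))


def two_predictor_slopes(x1, x2, y):
    """OLS slopes (b1, b2) for y = b0 + b1*x1 + b2*x2, via centred normal equations."""
    c1, c2, cy = _centered(x1), _centered(x2), _centered(y)
    s11, s22, s12 = _dot(c1, c1), _dot(c2, c2), _dot(c1, c2)
    s1y, s2y = _dot(c1, cy), _dot(c2, cy)
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Constructed toy data (NOT the study data): atar is highly correlated with hs_math.
hs_math = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
atar = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]
gpa = [m - 0.3 * a for m, a in zip(hs_math, atar)]

r_gpa_atar = pearson_r(gpa, atar)
b_math, b_atar = two_predictor_slopes(hs_math, atar, gpa)

print("corr(GPA, ATAR):", round(r_gpa_atar, 3))   # positive
print("slope of ATAR given HS_MATH:", round(b_atar, 3))  # negative
```

Taken alone, `atar` rises with `gpa` because it rises with `hs_math`. Once `hs_math` is held fixed, the remaining association is negative. This is the same mechanism the text attributes to the correlated predictors.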
Task 3
Summary Report
The GPA scores of the students were found to be affected by parental educational background (used as the students' socio-economic status). However, no significant difference in average GPA scores was noted between students whose parents had postgraduate education and those whose parents had undergraduate education. Parents' secondary or lower education, in contrast, was associated with lower GPA scores, and the comparison with GPA scores of students with undergraduate parents yielded a statistically significant difference (t = -4.29, p < 0.05). Gender was not a significant factor for GPA scores (t = 1.294, p = 0.197), and average GPA scores were almost the same for the two genders.
GPA scores were positively correlated with the independent variables. HS_SCI was a statistically significant predictor of GPA scores in a simple regression (t = 5.46, p < 0.05), and remained a significant predictor (t = 3.29, p < 0.05) when HS_ENG was added as a second predictor in a multiple regression model. Including HS_MATH made HS_SCI insignificant (t = 1.05, p = 0.295), probably because of the strong statistical effect of mathematics on GPA scores. This suggests that students with higher marks in mathematics were strongly oriented towards science courses. Hence, the science mark was a good predictor on its own but was suppressed by the impact of mathematics scores.
The Australian Tertiary Admission Rank (ATAR) had a significant positive correlation with GPA scores and a high positive correlation with the scores in the other three subjects. In the multiple regression model, however, ATAR was not significant because of the strong presence of mathematics. Hence, including ATAR was an option but not essential.
The best regression model is the step 3 model, with HS_SCI, HS_ENG, and HS_MATH as predictors. This model explains 21.5% of the variation in GPA; the adjusted R-square, which accounts for the number of independent variables, is only slightly lower at 20.5% (a relative change of about 5%). ATAR, PARENT EDU, and gender were excluded from the model because they have an insignificant impact on the GPA scores of the students, especially in the presence of mathematics as a predictor.
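The link between R-square and adjusted R-square can be sketched with the usual formula, adjusted R-square = 1 - (1 - R-square)(n - 1)/(n - k - 1), where n is the sample size and k the number of predictors. The sample size is not reported in this section. The value n = 240 below is a placeholder chosen for illustration, not a figure from the study, and the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-square: penalises R-square for the number of predictors."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


r2_step3 = 0.215        # reported R-square of the step 3 model
k_step3 = 3             # HS_SCI, HS_ENG, HS_MATH
n_placeholder = 240     # ASSUMPTION: the sample size is not given in this section

adj_r2 = adjusted_r_squared(r2_step3, n_placeholder, k_step3)
relative_drop = (r2_step3 - adj_r2) / r2_step3

print("adjusted R-square:", round(adj_r2, 3))
print("relative drop from R-square:", round(relative_drop, 3))
```

With few predictors relative to the sample size, the penalty is small. This is why the adjusted value stays close to the unadjusted 21.5%.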