Research questions and variables
- Histogram of weekly minutes spent on moderate to vigorous physical activity (MVPA), by the overweight status of the student.
The distributions for both overweight status categories appear to be skewed to the right: most students recorded relatively few weekly MVPA minutes, and only a few recorded high values.
- Statistical test to be used for the above question.
The most suitable test is the independent-samples t-test, because the 'non-overweight' and 'overweight' groups are independent of each other. The group variances will be checked to decide whether the equal-variance or the unequal-variance form of the test applies. In addition, the source dataset was randomly sampled from the population, which helps to limit bias.
- The independent t-test will be used to determine whether overweight and non-overweight students differ in the weekly minutes they spend on moderate to vigorous physical activity.
Step 1
Before conducting an independent t-test:
- The equality of the two group variances has to be checked. A test of the ratio of the two variances is used, with the null hypothesis that the variances are equal.
Results
The ratio of variances was estimated as 0.9767, with a confidence interval of [0.6889, 1.3668] and a p-value of 0.8825.
Decision rule
The p-value is much higher than the significance level of 0.05, so we fail to reject the null hypothesis of equal variances. Since there is no evidence that the variances differ, we use the form of the independent t-test that assumes equal variances (the pooled-variance test).
Step 2
Null hypothesis: There is no difference between the mean weekly minutes spent by non-overweight and overweight students on moderate to vigorous physical activity.
Alternative hypothesis: There is a difference between the mean weekly time (minutes) spent by overweight and non-overweight students on MVPA.
The significance level (alpha) to be used in the test will be 0.05.
Step 3
Since the two samples are independent, we will use the t distribution, which can also be approximated using the normal distribution because the sample sizes are greater than 100.
Step 4
Results
The t statistic is 0.5655 with 278 degrees of freedom and a p-value of 0.5722. The 95% confidence interval for the difference is [-0.484, 0.875].
Step 5
The p-value is greater than 0.05, so we fail to reject the null hypothesis. There is no significant difference between the time spent by overweight and non-overweight students on MVPA.
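The calculation behind Steps 3 and 4 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the software used for the reported results; the function names `pooled_t` and `normal_two_sided_p` are assumptions introduced here. The p-value uses the normal approximation mentioned in Step 3, so it only roughly matches the exact t-based value.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def pooled_t(x, y):
    """Return (t, df) for the equal-variance independent t-test."""
    n1, n2 = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, n1 + n2 - 2


def normal_two_sided_p(t):
    """Two-sided p-value from the normal approximation (large samples only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Applying the approximation to the reported statistic gives roughly 0.572,
# close to the reported t-based p-value of 0.5722.
p_approx = normal_two_sided_p(0.5655)
```

With 171 and 109 students in the two groups, the degrees of freedom returned by `pooled_t` would be 171 + 109 - 2 = 278, which matches the reported value.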
Collecting data from the same individuals or units in order to check for a difference makes the data paired. Therefore, the best statistical approach to test whether there is a difference between the weights of university students at enrolment and at graduation is the paired t-test. In this case, we will use a one-sided paired t-test because the alternative hypothesis is that Australian University students are heavier at graduation than at enrolment. Checking assumptions such as correlation and variances will be critical to conducting the test (Mickey, Dunn and Clark, 2010).
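A minimal sketch of the paired t statistic is given below, assuming the weights are available as two lists in the same student order. The function name `paired_t` and the weights are hypothetical and serve only as illustration; the one-sided p-value would then be read from a t distribution with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df) for a paired t-test on after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # The test works on the per-student differences, not on the two groups.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical weights in kg for three students (illustration only).
enrolment = [60.0, 70.0, 80.0]
graduation = [62.0, 71.0, 83.0]
t_stat, df = paired_t(enrolment, graduation)
```

A large positive `t_stat` would support the one-sided alternative that students are heavier at graduation.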
- A 95% confidence interval for the difference in average GPA between the overweight categories.
At 95% confidence level, the confidence interval is [0.1198872, 0.5636473] for the difference in average GPA for the two overweight groups.
- Interpretation of the confidence interval
The above confidence interval indicates that, with 95% confidence, the true difference in mean GPA between the two overweight categories lies between 0.1199 and 0.5636. In other words, if the same study were repeated many times using different samples from the population, about 95% of the intervals constructed in this way would contain the true difference in GPA between the two groups.
- Using the confidence interval, is the test statistically significant?
Yes. Because the confidence interval does not include zero, the difference is statistically significant at the 5% level. Therefore, we conclude that the difference in GPA between the two groups is not equal to zero.
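The reasoning from interval to significance can be written as a small check. The helper `summarize_ci` is a name assumed here for illustration, and treating the midpoint as the point estimate assumes a symmetric interval, which holds for a t-based interval.

```python
def summarize_ci(lower, upper):
    """Return (midpoint, half-width, whether the interval excludes zero)."""
    estimate = (lower + upper) / 2
    margin = (upper - lower) / 2
    excludes_zero = lower > 0 or upper < 0
    return estimate, margin, excludes_zero


# GPA interval reported above: excludes zero, so significant at the 5% level.
estimate, margin, significant = summarize_ci(0.1198872, 0.5636473)
```

Applied to the MVPA interval [-0.484, 0.875], the same check reports that zero is included, which agrees with the non-significant t-test.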
The minimum sample size (n) for comparing two sample means is calculated using the formula shown below:
- β = 0.2
- α = 0.05
- σ = 0.9
- difference = 0.5
- power = 1-β
A sample of 20 students is required to detect a mean GPA difference of 0.5 with a study power of 80% for the Australian University students (Mickey, Dunn and Clark, 2010).
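The formula itself is not reproduced in this text, so the sketch below uses a common normal-approximation form as an assumption: n = k (z_alpha + z_beta)^2 σ^2 / difference^2, with k = 2 per group for a two-sample comparison and k = 1 for a single sample. The function name `sample_size` and its flags are introduced here for illustration. The answer depends strongly on which form is chosen, so the variant used should be stated alongside the result.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma, diff, alpha=0.05, power=0.80,
                two_sided=True, two_sample=True):
    """Normal-approximation sample size (per group when two_sample=True)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)  # power = 1 - beta
    k = 2 if two_sample else 1
    return k * (z_alpha + z_beta) ** 2 * sigma ** 2 / diff ** 2


# Two-sided, two-sample form with the inputs listed above: about 50.9 per group.
n_two_sample = sample_size(0.9, 0.5)
# One-sided, one-sample form with the same inputs: about 20.03.
n_one_sample = sample_size(0.9, 0.5, two_sided=False, two_sample=False)
```

Under these assumptions, the one-sided single-sample variant is the one that comes closest to the figure of 20 quoted above, while the two-sided two-sample variant gives `ceil(n_two_sample)` = 51 students per group.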
- Relationship between overweight status and sex, shown in a contingency table (row percentages in parentheses).

| Overweight status / Sex | Male | Female | Total |
|---|---|---|---|
| Not overweight | 93 (54.39%) | 78 (45.61%) | 171 |
| Overweight | 35 (32.11%) | 74 (67.89%) | 109 |

- Does there seem to be any evidence of association in the data?
Yes. More female students than male students are overweight (74 versus 35). Also, among the students who are not overweight, approximately 54% are male, showing that fewer females than males are not overweight.
- Are the chi-square requirements met?
The requirements of a chi-square test have been met because the data used in the analysis are categorical and presented in a contingency table, and every expected cell count is well above 5.
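A commonly cited additional requirement is that each expected cell count be at least 5. The sketch below computes the expected counts under independence from the observed table above; the helper name `expected_counts` is an assumption introduced for illustration.

```python
observed = [[93, 78],   # not overweight: male, female
            [35, 74]]   # overweight: male, female


def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
# The smallest expected count is about 49.83 (overweight males), far above 5.
smallest = min(min(row) for row in expected)
```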
- Concern: Does the proportion of overweight Australian University students differ by sex in the population?
Step 1
The key assumptions:
- The chi-square test assumes that the observations are independent of one another; otherwise the results are invalid.
- When the sample size is small and one or more cells of the contingency table have an expected count below 5, Fisher's exact test is recommended instead to determine the association.
Step 2
Hypothesis
Null hypothesis: The proportion of overweight students does not differ by sex.
Alternative hypothesis: The proportion of overweight students differs by sex.
Step 3
The sampling distribution to be used in this test is the chi-square distribution (χ²) with 1 degree of freedom, because the data form a 2×2 contingency table: (2 - 1)(2 - 1) = 1.
Step 4
At one degree of freedom, the chi-square value is 12.428, with a p-value of 0.0004229.
Step 5
The chi-square value is 12.428 with 1 degree of freedom and a p-value of 0.0004229. The p-value is less than 0.001, showing that the test is highly significant. Therefore, the null hypothesis is rejected, and we conclude that the proportion of overweight students differs significantly by sex (Curtis and Youngquist, 2013).
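The statistic can be recomputed from the contingency table with the standard library alone. The sketch below assumes Yates' continuity correction, because that is the variant that reproduces 12.428 from these counts; this is an inference from the numbers, not something the text states. The function name `yates_chi_square` is introduced here for illustration.

```python
from math import erfc, sqrt


def yates_chi_square(table):
    """Chi-square with Yates' continuity correction for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            chi2 += (abs(obs - exp) - 0.5) ** 2 / exp
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = yates_chi_square([[93, 78], [35, 74]])
```

Without the correction, the same table gives a statistic of about 13.31, so the choice of variant affects the reported value but not the conclusion.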
- Margin of error = 0.1
- Confidence level = 95%
- Z value at 0.95 = 1.64
- P = 0.4
$n = \dfrac{Z_{0.95}^{2} \, p(1-p)}{d^{2}}$, where $d$ is the margin of error (Mickey, Dunn and Clark, 2010)
$= \dfrac{1.64^{2} \times 0.4 \times 0.6}{0.1^{2}}$
$= 64.55$
$\approx 65$ students (rounded up).
The minimum number of students needed to detect a significant difference between the proportions of overweight university students by sex would therefore be 65 at the 95% confidence level.
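The arithmetic above can be checked with a short sketch. The function name `proportion_sample_size` is an assumption introduced for illustration. The text uses z = 1.64, which corresponds to a one-sided 95% quantile; a two-sided 95% interval would conventionally use about 1.96, shown here for comparison.

```python
from math import ceil


def proportion_sample_size(z, p, margin):
    """n = z^2 * p * (1 - p) / margin^2, before rounding up."""
    return z ** 2 * p * (1 - p) / margin ** 2


# Values used in the text: z = 1.64, p = 0.4, margin of error = 0.1.
n_text = ceil(proportion_sample_size(1.64, 0.4, 0.1))       # 65
# Same formula with the two-sided 95% value z = 1.96.
n_two_sided = ceil(proportion_sample_size(1.96, 0.4, 0.1))  # 93
```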
References
Curtis, K. and Youngquist, S. (2013). Part 21: Categorical Analysis: Pearson Chi-Square Test. Air Medical Journal, 32(4), pp.179-180.
Mickey, R., Dunn, O. and Clark, V. (2010). Applied statistics. 1st ed. Hoboken, N.J.: Wiley-Interscience.