Descriptive Statistics Analysis
An Australian business school intends to understand the differences in performance among students studying “Statistics for Business”. The key parameters identified for the study are the gender of the students and their international student status. The business school also intends to use the findings of the study to identify ways to improve student performance in the subject.
Sample data comprising the marks of 60 current students have been provided. Both descriptive and inferential statistical techniques have been applied to these data. The objective of the descriptive statistics is to summarise the key trends in marks across gender and international student status. However, descriptive statistics do not support any conclusion about the population. As a result, inferential techniques such as hypothesis testing have been used to draw meaningful conclusions about the population from the sample data.
The objective of the given section is to analyse the data provided using suitable statistical tools in order to answer the various questions raised.
Class 1 marks – The average or mean mark for this class stands at 70.87. The median mark is 66.85, which implies that 50% of the students have marks lower than or equal to 66.85. Further, no two students have the same mark, so the mode is not defined. The mean exceeds the median and the marks distribution is asymmetric, with a positive skew (a tail on the right side), as is apparent from the histogram. The dispersion in the marks is low to medium, judging by the standard deviation and range (Flick, 2015).
Class 2 marks – The average or mean mark for this class stands at 53.66. The median mark is 51.52, which implies that 50% of the students have marks lower than or equal to 51.52. Further, no two students have the same mark, so the mode is not defined. The mean exceeds the median and the marks distribution is asymmetric, with a positive skew (a tail on the right side), as is apparent from the histogram. The dispersion in the marks is low to medium, judging by the standard deviation and range (Hair et al., 2015).
Class 3 marks – The average or mean mark for this class stands at 43.80. The median mark is 44.18, which implies that 50% of the students have marks lower than or equal to 44.18. Further, no two students have the same mark, so the mode is not defined. The mean is slightly below the median and the marks distribution is asymmetric, with a negative skew (a tail on the left side), as is apparent from the histogram. The dispersion in the marks is quite low, judging by the standard deviation and range (Hastie, Tibshirani and Friedman, 2011).
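The measures quoted above for each class (mean, median, mode, standard deviation, range and skew) can also be reproduced outside Excel. The following is a minimal Python sketch only: the original 60 observations are not reproduced in this report, so the list of marks is hypothetical, and the function name describe is an illustrative choice. The skew is computed with the adjusted sample formula, which is the one Excel's SKEW function uses.

```python
import statistics
from collections import Counter


def describe(marks):
    """Summarise one list of marks with the measures discussed above."""
    n = len(marks)
    mean = statistics.mean(marks)
    sd = statistics.stdev(marks)  # sample standard deviation (n - 1 divisor)

    # Report a mode only if some mark occurs more than once.
    value, count = Counter(marks).most_common(1)[0]
    mode = value if count > 1 else None

    # Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / sd)^3)
    skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in marks)

    return {
        "mean": mean,
        "median": statistics.median(marks),
        "mode": mode,
        "std_dev": sd,
        "range": max(marks) - min(marks),
        "skew": skew,
    }


# Hypothetical marks, NOT the data used in this report.
sample_marks = [52.0, 58.5, 61.0, 66.0, 70.5, 98.0]
summary = describe(sample_marks)
for name, value in summary.items():
    print(name, value)
```

In this illustrative sample the mean lies above the median and the skew is positive, which is the same pattern described for classes 1 and 2; a mean below the median with a negative skew would correspond to the pattern described for class 3.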
[Figure: Class 1 Marks]
Marks of domestic students – The descriptive statistics show that the average mark scored by a domestic student is 55.58, which is marginally higher than the median mark of 54.37. Since no two students have scored the same mark, the mode does not exist. The marks distribution is not symmetric, as it has a positive skew of 0.78, which produces a rightward tail, as is apparent from the histogram. The dispersion of the marks is medium, judging by the range and the standard deviation relative to the mean (Hillier, 2016).
Marks of international students – The descriptive statistics show that the average mark scored by an international student is 56.65, which is noticeably higher than the median mark of 51.52. Since no two students have scored the same mark, the mode does not exist. The marks distribution is not symmetric, as it has a positive skew of 0.87, which produces a rightward tail, as is apparent from the histogram. The dispersion of the marks is on the higher side, judging by the range, the standard deviation and the variance (Hastie, Tibshirani and Friedman, 2011).
The objective here is to represent graphically the marks of class 1 students only, split into two categories, namely international and domestic students. The requisite graphs are shown below.
The requisite descriptive statistics for the marks of domestic and international students in class 1 are as highlighted below
Domestic Students Class 1 Marks – The average marks scored by domestic students in class 1 amount to 68.76. The median marks for these students are slightly lower at 66.20. The non-coincidence of the above two values implies that the marks distribution is not symmetric. This is further supported by the fact that positive skew is present in the data which has resulted in a rightward tail. Also, the dispersion in the marks is medium to high considering the range of marks from 52 to 98. Besides, the variance and standard deviation values also support this conclusion (Eriksson and Kovalainen, 2015).
International Students Class 1 Marks – The average marks scored by international students in class 1 amount to 72.28. The median marks for these students are slightly lower at 71.15. The non-coincidence of the above two values implies that the marks distribution is not symmetric. This is further supported by the fact that positive skew is present in the data which has resulted in a rightward tail. Also, the dispersion in the marks is medium to high considering the range of marks from 51 to 98. Besides, the variance and standard deviation values also support this conclusion (Flick, 2015).
The objective is to analyse, using an appropriate statistical tool, whether there is any difference in the marks of male and female students. Since the conclusion has to be drawn for the population based on sample data, hypothesis testing needs to be performed using the provided sample (Hillier, 2016).
[Figure: Class 2 Marks]
The requisite hypotheses are as highlighted below.
Null Hypothesis (H0): µM = µF i.e. the average marks of male and female students do not show any statistically significant difference
Alternative Hypothesis (HA): µM ≠ µF i.e. the average marks of male and female students do show a statistically significant difference
The relevant test statistic for the given hypothesis test is t, since the population standard deviation is not known for either gender. Also, considering the alternative hypothesis, it is apparent that the test is two tailed. Besides, the two samples are independent; hence a two sample independent t test has been deployed (Lieberman et al., 2013). The output of the test obtained from Excel is highlighted below.
From the above test output, it is apparent that the relevant two tail p value is 0.004. Assuming a level of significance of 5%, the p value is less than the level of significance. As a result, the available evidence is sufficient to reject the null hypothesis in favour of the alternative hypothesis. Thus, it may be concluded that, based on the sample data, the difference between the marks of male and female students is statistically significant. Hence, the average marks of the two genders are different (Hair et al., 2015).
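The decision rule applied above (compute t, obtain the two tail p value, compare it with 5%) can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions, not the Excel procedure itself: the report does not say whether the equal variance or unequal variance form of the test was run, so the pooled (equal variance) form is assumed here; the marks are hypothetical; and the function names pooled_t_test, t_pdf and two_tailed_p are invented for this example. The p value is obtained by numerically integrating the t density with Simpson's rule.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two tail p value: 1 - 2 * area under the density between 0 and |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def pooled_t_test(a, b):
    """Two sample independent t test, ASSUMING equal population variances."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / std_error
    return t, df, two_tailed_p(t, df)


# Hypothetical marks, NOT the data used in this report.
male = [48, 52, 55, 60, 45]
female = [62, 58, 70, 66, 64]

t_stat, df, p_value = pooled_t_test(male, female)
alpha = 0.05
reject_null = p_value < alpha
print(t_stat, df, p_value, reject_null)
```

With real data, a statistics library such as SciPy (scipy.stats.ttest_ind) would normally be preferred over hand-rolled integration; the sketch is only meant to make the steps behind the Excel output visible.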
The objective of the given statistical test is to analyse whether a significant difference exists between the marks of students belonging to different classes. It is noteworthy that, for the data collected, there are three classes which ought to be compared simultaneously. A t test cannot be used here since it is appropriate only when two samples need to be compared. Hence, the appropriate test for the given objective is ANOVA or Analysis of Variance. Also, since class is the only factor being considered, a single factor ANOVA ought to be conducted (Fehr and Grossman, 2013).
The relevant hypotheses are as stated below.
Null Hypothesis (H0): µclass1 = µclass2 = µclass3 i.e. the average marks across the three classes do not show any significant difference.
Alternative Hypothesis (HA): At least one of the classes has average marks that are not equal to those of another class.
The single factor ANOVA analysis has been performed through Excel and the relevant output has been pasted below.
From the above ANOVA output, it is apparent that the p value is reported as 0.00, that is, effectively zero at the reported precision. Assuming a level of significance of 5%, the p value is less than the level of significance. As a result, the available evidence is sufficient to reject the null hypothesis in favour of the alternative hypothesis. Thus, it may be concluded that, based on the sample data, the average marks across the three classes are not the same and there is at least one class whose average marks are significantly different from those of another (Eriksson and Kovalainen, 2015).
The objective of this statistical test is to determine whether the performance of domestic students and international students differs or not. A hypothesis test is conducted in order to ascertain whether there is any statistically significant difference in the marks of these two groups of students.
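The F statistic behind a single factor ANOVA compares the variation between class means with the variation within classes. The following is a minimal Python sketch of that calculation, assuming three small hypothetical groups of marks rather than the report's data; the function name one_way_anova is illustrative. It returns the F statistic and its degrees of freedom only; converting F to a p value requires the F distribution, which Excel (or a statistics library) supplies.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual marks around their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical marks, NOT the data used in this report.
class_1 = [70, 72, 68]
class_2 = [54, 52, 56]
class_3 = [44, 42, 46]

f_stat, df_between, df_within = one_way_anova([class_1, class_2, class_3])
print(f_stat, df_between, df_within)
```

For these illustrative groups the F statistic is far above the 5% critical value for (2, 6) degrees of freedom (about 5.14), so the null hypothesis of equal class means would be rejected, mirroring the direction of the conclusion drawn from the Excel output.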
[Figure: Class 3 Marks]
The relevant hypotheses are as stated below.
Null Hypothesis (H0): µDomestic = µInternational i.e. the average marks of domestic students and international students do not exhibit any statistically significant difference.
Alternative Hypothesis (HA): µDomestic ≠ µInternational i.e. the average marks of domestic students and international students do exhibit a statistically significant difference.
The relevant test statistic for the given hypothesis test would be t since the population standard deviation is not known for either the domestic or international student marks. Also, considering the alternative hypothesis, it is apparent that the given test would be two tailed. Besides, the two samples are independent; hence a two sample independent t test would be deployed. The output of the test obtained from Excel is highlighted below (Harmon, 2011).
From the above test output, it is apparent that the relevant two tail p value is 0.799. Assuming a level of significance of 5%, the p value is greater than the level of significance. As a result, the available evidence is insufficient to reject the null hypothesis. Thus, it may be concluded that, based on the sample data, the difference between the marks of domestic and international students is not statistically significant. Hence, there is no evidence that the average marks of the two groups (i.e. domestic and international students) differ (Koch, 2013).
The objective of this statistical test is to determine whether the performance of domestic students and international students belonging to class 1 differs or not. A hypothesis test is conducted in order to ascertain whether there is any statistically significant difference in the marks of these two groups of students studying in class 1.
The relevant hypotheses are as stated below.
Null Hypothesis (H0): µDomestic(1) = µInternational(1) i.e. the average marks of domestic students and international students in class 1 do not exhibit any statistically significant difference.
Alternative Hypothesis (HA): µDomestic(1) ≠ µInternational(1) i.e. the average marks of domestic students and international students in class 1 do exhibit a statistically significant difference.
The relevant test statistic for the given hypothesis test is t, since the population standard deviation is not known for either the domestic or the international student class 1 marks. Also, considering the alternative hypothesis, it is apparent that the test is two tailed. Besides, the two samples are independent; hence a two sample independent t test has been deployed. The output of the test obtained from Excel is highlighted below (Lieberman et al., 2013).
[Figure: Marks of Domestic Students]
From the above test output, it is apparent that the relevant two tail p value is 0.636. Assuming a level of significance of 5%, the p value is greater than the level of significance. As a result, the available evidence is insufficient to reject the null hypothesis. Thus, it may be concluded that, based on the sample data, the difference between the class 1 marks of domestic and international students is not statistically significant. Hence, there is no evidence that the average class 1 marks of the two groups (i.e. domestic and international students) differ (Hillier, 2016).
Conclusion
From the above analyses, a host of conclusions can be drawn regarding the marks scored by students in “Statistics for Business”. Firstly, there is a decreasing trend in the average marks as one moves from class 1 to class 3. Also, none of the marks distributions is symmetric, since skew is present in each. Besides, the average marks scored by international students are greater than the corresponding average for domestic students. This is also true for class 1 marks, where the international students on average tend to outscore their domestic counterparts. Further, with regard to the difference in performance across gender, it may be concluded that the difference is statistically significant. The females tend to outperform their male counterparts in the given sample, and this difference is significant in statistical terms.
Also, based on the sample data and the relevant inferential technique in the form of hypothesis testing, it may be concluded that the average marks across the different classes are not the same and show variation. Additionally, the difference found in the average marks of international and domestic students is not statistically significant, and hence their respective performance can be expected to be comparable. This result also holds when, instead of the marks of all the classes, only the class 1 marks are taken into consideration.
Based on the descriptive statistics and the hypothesis tests, it is apparent that the average marks do not remain the same across classes and in fact decline from class 1 to class 3. This variation in performance suggests that there may be a significant difference across classes in either student quality or faculty quality. Also, since the average marks of international and domestic students do not show any statistically significant difference, there ought to be no preference for domestic or international students based on their expected performance in the given classes. As a result, it may be recommended that international student status should not be an area of focus for the business school, considering that its influence on students’ marks in the statistics subject is not significant.
However, the same cannot be said about gender, since there is a significant difference in the average marks scored by males and females. More research would be needed to identify the contributory reasons in this regard. In order to improve performance, it might be beneficial to promote more academic interaction between male and female students, which may lead to more uniform performance in the subject.
References
Eriksson, P. and Kovalainen, A. (2015) Quantitative methods in business research. 3rd ed. London: Sage Publications.
Fehr, F. H. and Grossman, G. (2013) An introduction to sets, probability and hypothesis testing. 3rd ed. Ohio: Heath.
Flick, U. (2015) Introducing research methodology: A beginner’s guide to doing a research project. 4th ed. New York: Sage Publications.
Hair, J. F., Wolfinbarger, M., Money, A. H., Samouel, P., and Page, M. J. (2015) Essentials of business research methods. 2nd ed. New York: Routledge.
Harmon, M. (2011) Hypothesis Testing in Excel – The Excel Statistical Master. 7th ed. Florida: Mark Harmon.
Hastie, T., Tibshirani, R. and Friedman, J. (2011) The Elements of Statistical Learning. 4th ed. New York: Springer Publications.
Hillier, F. (2016) Introduction to Operations Research. 6th ed. New York: McGraw Hill Publications.
Koch, K.R. (2013) Parameter Estimation and Hypothesis Testing in Linear Models. 2nd ed. London: Springer Science & Business Media.
Lieberman, F. J., Nag, B., Hiller, F.S. and Basu, P. (2013) Introduction To Operations Research. 5th ed. New Delhi: Tata McGraw Hill Publishers.