Descriptive Statistical Analysis of Male and Female Heights in Different Countries
The primary objective of this study was to investigate whether there is a relationship between the average heights of males and females across countries. It was hypothesized that a greater average height of the female population in a country would be associated with a greater average height of the male population.
For this purpose, average male and female heights from 199 countries were collected. The data contained the average heights of males and females, measured in both centimeters and feet, for all 199 countries. The data were first analyzed through descriptive statistics. The relationship between the heights of the male and female populations was then examined through a scatter plot and Pearson’s product-moment correlation. Further, a regression analysis was performed to investigate whether the average height of the male population of a country could be predicted from the average height of its female population. The aim was to develop a least squares regression line that could serve as a mathematical model for predicting the average male height in a country from its average female height.
First, to understand the characteristics of the data, a descriptive statistical approach was used to analyze all four variables. The analysis focused on the measures of center and the measures of spread for each variable. The measures of center were the mean, median and mode; the measures of spread were the range, sample variance, standard deviation and coefficient of variation (Sarka, 2021). The results are reported in Table 1.
Table 1: Descriptive Statistics
| Statistic | Male Height in Cm | Female Height in Cm | Male Height in Ft | Female Height in Ft |
|---|---|---|---|---|
| Mean | 173.09 | 160.94 | 5.68 | 5.28 |
| Coefficient of variation | 0.03 | 0.03 | 0.03 | 0.03 |
| Median | 173.53 | 160.62 | 5.69 | 5.27 |
| Mode | 170.67 | 164.73 | 5.6 | 5.29 |
| Standard Deviation | 4.95 | 4.08 | 0.16 | 0.13 |
| Sample Variance | 24.50 | 16.62 | 0.03 | 0.02 |
| Range | 23.65 | 19.45 | 0.78 | 0.64 |
The average male height across the countries in the sample was 173.09 cm. The average of a data set is simply its arithmetic mean (George & Mallery, 2018). The median male height was 173.53 cm. The median is the observation that falls in the middle when all observations are arranged in ascending or descending order (Kaur, Stoltzfus & Yellapu, 2018). This indicated that the average male height in 50 % of the countries was 173.53 cm or greater. The mode was 170.67 cm. The mode is the observation that occurs most often in a data set (Conner & Johnson, 2017). Thus, 170.67 cm was the most frequently occurring average male height among the countries. The sample variance of male height was 24.50 cm². The standard deviation of male height about the observed mean was 4.95 cm. Standard deviation is a measure of the dispersion of the observations about their mean. The range of male height was 23.65 cm. The range is simply the difference between the maximum and minimum observations (Haden, 2019). The coefficient of variation was 0.029.
In measurements of feet, the average male height was 5.68 ft. The sample variance was 0.03 ft². The standard deviation about the mean male height was 0.16 ft. The median height was 5.69 ft. The modal male height in the sample was 5.6 ft, which indicated that 5.6 ft was the most frequently occurring average male height among the countries. The range in male height was 0.78 ft. The coefficient of variation was 0.029. The coefficient of variation is the ratio of the standard deviation to the mean (Pélabon et al., 2020).
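The measures of center and spread defined above can be computed in a few lines of Python. The following is a minimal sketch using the standard-library `statistics` module; the five height values are hypothetical and are not taken from the study's data.

```python
import statistics

# Hypothetical average male heights (cm) for five countries.
# These values are illustrative only; they are NOT the study's data.
heights_cm = [170.0, 172.0, 172.0, 175.0, 181.0]

mean = statistics.mean(heights_cm)          # arithmetic mean
median = statistics.median(heights_cm)      # middle value of the sorted data
mode = statistics.mode(heights_cm)          # most frequently occurring value
variance = statistics.variance(heights_cm)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(heights_cm)      # sample standard deviation
data_range = max(heights_cm) - min(heights_cm)  # maximum minus minimum
coeff_var = std_dev / mean                  # coefficient of variation (unitless)

print(f"mean={mean:.2f} median={median:.2f} mode={mode:.2f}")
print(f"variance={variance:.2f} sd={std_dev:.2f} range={data_range:.2f}")
print(f"coefficient of variation={coeff_var:.3f}")
```

Because the coefficient of variation divides the standard deviation by the mean, it carries no unit, which is why it can be compared between the centimeter and feet columns.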
The average height of the female population across the countries in the sample was 160.94 cm. The sample variance was 16.62 cm². The standard deviation of female height about the observed mean was 4.08 cm. The median female height was 160.62 cm, which indicated that the average female height in 50 % of the countries was 160.62 cm or greater. The modal female height was 164.73 cm, which indicated that 164.73 cm was the most frequently occurring average female height among the countries. The coefficient of variation for female height in the sample was 0.025.
In measurements of feet, the average female height was 5.28 ft. The sample variance was 0.02 ft². The standard deviation about the mean female height was 0.13 ft. The median height was 5.27 ft. The modal female height in the sample was 5.29 ft, which indicated that 5.29 ft was the most frequently occurring average female height among the countries. The range in female height was 0.64 ft. The coefficient of variation was 0.025, the same as for the measurements in centimeters, because the coefficient of variation does not depend on the unit of measurement.
It was observed that the average height of the male population in the sample was greater than the average height of the female population.
Analysis of Relationship between Heights of Male and Female Population
The Pearson product-moment correlation coefficient was computed to investigate the relationship between the heights of males and females. This method helps to quantify the strength of the relationship (Hazra & Gogtay, 2016). The direction of the relationship can also be inferred through this method (Schober, Boer & Schwarte, 2018).
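The formula for computing Pearson’s correlation coefficient can be given as follows:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \, \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ are the paired observations of the two variables, $\bar{x}$ and $\bar{y}$ are their means, and $n$ is the number of pairs.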
The coefficient of determination (r²) is obtained by squaring the correlation coefficient (Chicco, Warrens & Jurman, 2021).
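As an illustration of how the correlation coefficient and the coefficient of determination are obtained, the sketch below computes both from the deviations about the means. The function name `pearson_r` and the five pairs of heights are assumptions made for this example; they are not the study's data.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical country-level average heights (cm); illustrative only.
female_cm = [155.0, 158.0, 160.0, 163.0, 166.0]
male_cm = [166.0, 171.0, 172.0, 176.0, 180.0]

r = pearson_r(female_cm, male_cm)
r_squared = r ** 2  # coefficient of determination

print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

A value of r close to +1 indicates a strong positive linear association, and r² gives the share of the variation in one variable that is accounted for by the other.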
The correlation coefficients for all variables in the data with respect to each other are presented in the table below.
Table 2: Correlation Coefficients
| | Male Height in Cm | Female Height in Cm | Male Height in Ft | Female Height in Ft |
|---|---|---|---|---|
| Male Height in Cm | 1 | | | |
| Female Height in Cm | 0.929 | 1 | | |
| Male Height in Ft | 1 | 0.928 | 1 | |
| Female Height in Ft | 0.928 | 1 | 0.927 | 1 |
There was a strong positive relationship between male height and female height measured in centimeters. The correlation coefficient between male and female height was 0.929. A correlation coefficient between 0.7 and 1 is considered to represent a strong correlation (Akoglu, 2018). The coefficient of determination was computed to be 0.8626. Thus, 86.26 % of the variation in male height could be explained by female height. A similar observation was made for the association between male and female height measured in feet. Height in feet is a linear transformation of height in centimeters.
It is known that 1 cm = 0.0328084 ft approximately.
The correlation coefficient between male and female height in measurements of feet was 0.927, indicating a strong positive association between the two.
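Because feet are obtained from centimeters by multiplying by a positive constant, the correlation coefficient should not change with the unit. The sketch below checks this on hypothetical data (not the study's data) and also shows that rounding the converted values to two decimals, as tabulated heights in feet often are, can shift the coefficient slightly. Whether rounding explains the difference between 0.929 and 0.927 reported here is an assumption, not something the source confirms.

```python
import math

CM_TO_FT = 0.0328084  # conversion factor quoted in the text

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical country-level average heights (cm); illustrative only.
female_cm = [155.0, 158.0, 160.0, 163.0, 166.0]
male_cm = [166.0, 171.0, 172.0, 176.0, 180.0]

# Exact conversion to feet (a linear transformation of the cm values)
female_ft = [v * CM_TO_FT for v in female_cm]
male_ft = [v * CM_TO_FT for v in male_cm]

r_cm = pearson_r(female_cm, male_cm)
r_ft = pearson_r(female_ft, male_ft)

# Converted values rounded to two decimals, as they might appear in a table
r_ft_rounded = pearson_r([round(v, 2) for v in female_ft],
                         [round(v, 2) for v in male_ft])

print(f"r (cm) = {r_cm:.6f}")
print(f"r (ft, exact conversion) = {r_ft:.6f}")
print(f"r (ft, rounded to 2 decimals) = {r_ft_rounded:.6f}")
```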
Figure 1: Scatter plot showing the relationship between male height and female height measured in cm
Regression Analysis to Develop Least Squares Regression Line
| Regression statistic | Value |
|---|---|
| Multiple R | 0.93 |
| R Square | 0.86 |
| Adjusted R Square | 0.86 |
| Standard Error | 1.84 |
| Observations | 199 |

ANOVA
| | df | SS | MS | F |
|---|---|---|---|---|
| Regression | 1 | 4184.84 | 4184.84 | 1237.24 |
| Residual | 197 | 666.33 | 3.38 | |
| Total | 198 | 4851.17 | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | -8.42 | 5.16 | -1.63 | 0.10 |
| Female Height in Cm | 1.13 | 0.03 | 35.17 | 0.00 |

The overall regression was statistically significant, F(1, 197) = 1237.24, p < .05. This indicated that the coefficient associated with the independent variable was different from 0. The least squares regression line has the general form
$$y = a + bx$$
where $a$ is the intercept and $b$ is the slope.
The fitted equation was,
Male height = -8.42 + 1.13 (Female height)
In this case, y was the male height, the response variable, and x was the female height, the predictor. The above least squares regression equation can be used to predict the average male height of a country from its average female height (Sarstedt & Mooi, 2019).
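To make the use of the equation concrete, the sketch below does two things: it fits a least squares line to hypothetical data (not the study's data) using the usual formulas for the slope and the intercept, and it applies the reported equation to a given female height. The function names `fit_line` and `predict_male_height` are introduced only for this illustration.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                      # b = Sxy / Sxx
    intercept = mean_y - slope * mean_x    # a = mean(y) - b * mean(x)
    return intercept, slope

def predict_male_height(female_cm, intercept=-8.42, slope=1.13):
    """Apply the reported equation: male = -8.42 + 1.13 * female (cm)."""
    return intercept + slope * female_cm

# Hypothetical country-level average heights (cm); illustrative only.
female_cm = [155.0, 158.0, 160.0, 163.0, 166.0]
male_cm = [166.0, 171.0, 172.0, 176.0, 180.0]

intercept, slope = fit_line(female_cm, male_cm)
print(f"fitted on hypothetical data: male = {intercept:.2f} + {slope:.2f} * female")

# Prediction from the reported equation at the sample mean female height
predicted = predict_male_height(160.94)
print(f"predicted male height at 160.94 cm: {predicted:.2f} cm")
```

At the sample mean female height of 160.94 cm, the reported equation gives about 173.44 cm, close to the observed mean male height of 173.09 cm. A least squares line passes through the point of means, so the small gap is consistent with the coefficients having been rounded to two decimals.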
From the equation, for every 1 cm increase in average female height, the average male height in a country could be expected to increase by 1.13 cm. The p-value associated with the coefficient of female height was less than 0.05 (the level of significance), which indicated that female height was a statistically significant predictor of male height (Carroll & Ruppert, 2017).
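The figures in the regression output are linked by simple arithmetic, which can be checked directly. The sketch below recomputes the mean squares, the F statistic, R Square and the standard error from the reported sums of squares and degrees of freedom. It also uses the fact that, in a regression with a single predictor, the F statistic equals the square of the t statistic of the slope. The variable names are chosen for this example.

```python
import math

# Values taken from the reported ANOVA table
ss_regression = 4184.84
ss_residual = 666.33
df_regression = 1
df_residual = 197
t_slope = 35.17  # reported t statistic for Female Height in Cm

ms_regression = ss_regression / df_regression   # mean square = SS / df
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual            # F = MS regression / MS residual
r_squared = ss_regression / (ss_regression + ss_residual)  # SS regression / SS total
std_error = math.sqrt(ms_residual)              # standard error of the regression

print(f"MS residual = {ms_residual:.2f}")
print(f"F = {f_stat:.2f}")
print(f"R Square = {r_squared:.4f}")
print(f"Standard Error = {std_error:.2f}")
print(f"t squared = {t_slope ** 2:.2f}")
```

The recomputed values agree with the reported output to within rounding, and the squared t statistic is close to the F statistic, as expected for a single-predictor model.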
Conclusion
This report discussed the analytic findings from the male and female height data across 199 countries. On average, the height of the male population was found to be greater than that of the female population in the sample. A correlation analysis was performed to investigate the association between the heights of males and females. It was found that male height and female height across the 199 countries were strongly and positively associated with one another. A scatter plot revealed that as the height of females increased, the height of males also increased. This indicated that there was a strong linear relationship between them. A regression analysis was performed to investigate whether male height could be predicted from female height. The regression was statistically significant. Through the regression analysis, a least squares linear regression equation was obtained. Computations revealed that for every 1 cm increase in average female height, the average male height in a country could be expected to increase by 1.13 cm. The analysis supported the hypothesis that a greater height of the female population in a country would be associated with a greater height of the male population.
References
Akoglu, H. (2018). User’s guide to correlation coefficients. Turkish Journal of Emergency Medicine, 18(3), 91-93.
Carroll, R. J., & Ruppert, D. (2017). Transformation and Weighting in Regression. Chapman and Hall/CRC.
Chicco, D., Warrens, M. J., & Jurman, G. (2021). The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Computer Science, 7, e623.
Conner, B., & Johnson, E. (2017). Descriptive statistics. American Nurse Today, 12(11), 52-55.
George, D., & Mallery, P. (2018). Descriptive statistics. In IBM SPSS Statistics 25 Step by Step (pp. 126-134). Routledge.
Haden, P. (2019). Descriptive statistics. The Cambridge Handbook of Computing Education Research, 102-131.
Hazra, A., & Gogtay, N. (2016). Biostatistics series module 6: correlation and linear regression. Indian Journal of Dermatology, 61(6), 593.
Kaur, P., Stoltzfus, J., & Yellapu, V. (2018). Descriptive statistics. International Journal of Academic Medicine, 4(1), 60.
Pélabon, C., Hilde, C. H., Einum, S., & Gamelon, M. (2020). On the use of the coefficient of variation to quantify and compare trait variation. Evolution Letters, 4(3), 180-188.
Sarka, D. (2021). Descriptive statistics. In Advanced Analytics with Transact-SQL (pp. 3-29). Apress, Berkeley, CA.
Sarstedt, M., & Mooi, E. (2019). Regression analysis. In A Concise Guide to Market Research (pp. 209-256). Springer, Berlin, Heidelberg.
Schober, P., Boer, C., & Schwarte, L. A. (2018). Correlation coefficients: appropriate use and interpretation. Anesthesia & Analgesia, 126(5), 1763-1768.