Analysis Of Unemployment, GNI, And Life Expectancy In New Zealand

1. Data Extraction from World Bank

This study examines the health and development conditions of New Zealand. A subset of a dataset collected from the World Bank is used, chosen to keep the analysis simple. New Zealand was selected because it is a moderately populated country, and the health conditions of such a country were of interest. The factors considered are total unemployment, gross national income (GNI) and life expectancy at birth. Unemployment is generally regarded as a factor that directly affects the gross national income of a country, and it may also affect life expectancy at birth. The relationships between these variables are examined in this study.


2. Data Setup

The data were extracted from the World Bank in comma-delimited (.csv) format. As the whole analysis was performed in RStudio, the data were imported into R from the .csv file. The data contain 26 attributes for the countries of East Asia and the Pacific over the 15 years from 2001 to 2015. Only the series needed for the analysis were extracted. The year 2015 was dropped because much of its information was missing. Table 2.1 shows the R code for the data extraction.

Several libraries were used in R to run the analysis:

  • dplyr: used for filtering the data
  • ggplot2: used to plot the data
  • reshape2: used to reshape the data
  • cluster: loaded for the clustering analysis (the kmeans() function used later is part of R's built-in stats package)

Table 2.1: R Code for the Libraries Used and the Extraction of the Data


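# Libraries necessary for analysis
library(dplyr)     # filter()
library(ggplot2)   # plotting
library(reshape2)  # melt() and dcast()
library(cluster)

# Extracting the data; ".." marks missing values in the World Bank export
data <- read.csv(file.choose(), sep = ",", header = TRUE, na.strings = "..")

# Keep New Zealand only, then drop the identifier columns and the sparse 2015 column
# (ï..Series.Name is how R names the first column when the file starts with a byte-order mark)
data <- filter(data, Country.Code == "NZL")
data <- subset(data, select = -c(Country.Name, Country.Code, ï..Series.Name, X2015..YR2015.))

# Drop series that have missing values
data <- na.omit(data)

# Reshape so that each year is a row and each series code is a column
data <- melt(data, id.vars = "Series.Code")
data <- dcast(data, variable ~ Series.Code, value.var = "value")

# Attach last, so that the series codes can be used directly as variable names
attach(data)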
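The reshaping step is easier to follow on a small example. The sketch below uses base R only and a made-up data frame (the name `toy` and all values are hypothetical, not World Bank figures). It turns a table with one row per series and one column per year into a table with one row per year and one column per series, which is the layout the later summary and regression code assumes.

```r
# Hypothetical World Bank style extract: one row per series, one column per year.
# All values are made up for illustration.
toy <- data.frame(
  Series.Code = c("NY.GNP.PCAP.CD", "SL.UEM.TOTL.ZS"),
  X2001 = c(100, 5),
  X2002 = c(110, 6),
  X2003 = c(120, 4),
  stringsAsFactors = FALSE
)

# Transpose the year columns so that each series becomes a column
wide <- as.data.frame(t(toy[, -1]))
names(wide) <- toy$Series.Code

# Recover the year from the row names ("X2001" becomes 2001)
wide$year <- as.integer(sub("^X", "", rownames(wide)))
rownames(wide) <- NULL

print(wide)
```

With the data in this shape, a call such as `sd(wide$SL.UEM.TOTL.ZS)` works on one series at a time.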

 3. Exploratory Data Analysis

Descriptive analysis is conducted first for each of the chosen variables, starting with the gross national income (per capita) of New Zealand. The standard deviation of the GNI is 8741.171, which is about 32 percent of the mean of 27500 and therefore quite high. This indicates that the yearly GNI values are not close to the average and are widely scattered. The distribution of the income is shown with the help of a boxplot in figure 3.1. Negative skewness is observed in the figure, which is consistent with the mean (27500) lying below the median (28370).

Table 3.1: R Code to Obtain Summary Statistics for Gross National Income

# summary statistics
summary(NY.GNP.PCAP.CD)
sd(NY.GNP.PCAP.CD)
var(NY.GNP.PCAP.CD)

# boxplot
boxplot(NY.GNP.PCAP.CD, main = "Boxplot for Gross National Income", xlab = "Gross National Income", col = 5)


Table 3.2: Results of the Summary Statistics for Gross National Income

Min.: 13800  1st Qu.: 22470  Median: 28370  Mean: 27500  3rd Qu.: 31610  Max.: 41670  St. dev.: 8741.171  Variance: 76408069

Figure 3.1: Boxplot showing the distribution of Gross National Income

The second variable analysed is the total unemployment of New Zealand, measured as a percentage of the total labor force. Its standard deviation is 1.136, which is small. This indicates that the yearly unemployment values stay close to the average and are not widely scattered. The distribution of the total unemployment is shown with the help of a histogram in figure 3.2, which appears roughly symmetric.

Table 3.3: R Code to Obtain Summary Statistics for Total Unemployment

# summary statistics
summary(SL.UEM.TOTL.ZS)
sd(SL.UEM.TOTL.ZS)
var(SL.UEM.TOTL.ZS)

# histogram (hist() takes the vector directly and has no data argument)
hist(SL.UEM.TOTL.ZS, main = "Histogram for Total Unemployment", xlab = "Total Unemployment", col = 5)

Table 3.4: Results of the Summary Statistics for Total Unemployment

Min.: 3.700  1st Qu.: 4.050  Median: 5.350  Mean: 5.207  3rd Qu.: 6.175  Max.: 6.900  St. dev.: 1.136435  Variance: 1.291483

 

The third variable analysed is life expectancy at birth in New Zealand. Its standard deviation is 0.9044, which is small. This indicates that the yearly life expectancy values stay close to the average and are not widely scattered. The distribution of life expectancy at birth is shown with the help of a boxplot in figure 3.3, which appears roughly symmetric.

Table 3.5: R Code to Obtain Summary Statistics for Life Expectancy at Birth

# summary statistics
summary(SP.DYN.LE00.IN)
sd(SP.DYN.LE00.IN)
var(SP.DYN.LE00.IN)

# boxplot
boxplot(SP.DYN.LE00.IN, main = "Boxplot for Total Life Expectancy at Birth", xlab = "Total Life Expectancy at Birth", col = 5)

Table 3.6: Results of the Summary Statistics for Life Expectancy at Birth

Min.: 78.69  1st Qu.: 79.62  Median: 80.25  Mean: 80.21  3rd Qu.: 80.85  Max.: 81.41  St. dev.: 0.9043788  Variance: 0.8179011
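Because the three variables are measured in different units, their standard deviations are hard to compare directly. One option, which is not part of the original analysis, is the coefficient of variation (standard deviation divided by the mean). The sketch below computes it from the means and standard deviations reported in Tables 3.2, 3.4 and 3.6; the vector names are chosen here for illustration.

```r
# Means and standard deviations as reported in Tables 3.2, 3.4 and 3.6
means <- c(GNI = 27500, Unemployment = 5.207, LifeExpectancy = 80.21)
sds   <- c(GNI = 8741.171, Unemployment = 1.136435, LifeExpectancy = 0.9043788)

# Coefficient of variation, expressed as a percentage of the mean
cv <- 100 * sds / means

print(round(cv, 1))
```

On these figures the relative spread is roughly 32 percent for GNI, 22 percent for unemployment and 1 percent for life expectancy, so life expectancy is by far the most stable of the three series.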

The second part of this analysis examines the relationships between pairs of variables. Three variables are involved in this study, and they are considered two at a time. First, the relationship between GNI and unemployment is examined with a correlation analysis. The correlation coefficient is 0.421, which indicates a moderately positive relationship between the two variables. The scatterplot in figure 3.4 illustrates the relationship diagrammatically.

Table 3.7: R Code for Scatterplot and Correlation between GNI and Unemployment

# scatterplot
plot(SL.UEM.TOTL.ZS ~ NY.GNP.PCAP.CD, data = data, main = "Scatterplot of Gross National Income and Total Unemployment", xlab = "Gross National Income (Per Capita)", ylab = "Total Unemployment (% of total labor force)", col = 2, pch = 19)

# correlation coefficient
cor(SL.UEM.TOTL.ZS, NY.GNP.PCAP.CD)

Next, the relationship between GNI and life expectancy at birth is examined, again with a correlation analysis. The correlation coefficient is 0.985, which indicates a strongly positive relationship between the two variables. The scatterplot in figure 3.5 illustrates the relationship diagrammatically.


Table 3.8: R Code for Scatterplot and Correlation between GNI and Life Expectancy at Birth

# scatterplot (life expectancy on the y axis, GNI on the x axis, matching the axis labels)
plot(SP.DYN.LE00.IN ~ NY.GNP.PCAP.CD, data = data, main = "Scatterplot of Gross National Income and Total Life Expectancy at Birth", xlab = "Gross National Income (Per Capita)", ylab = "Total Life Expectancy at Birth (years)", col = 2, pch = 19)

# correlation coefficient
cor(NY.GNP.PCAP.CD, SP.DYN.LE00.IN)
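By default, cor() returns Pearson's correlation coefficient. The sketch below, on made-up vectors `x` and `y` (hypothetical values, not the World Bank series), computes the coefficient by hand, checks it against cor(), and confirms that its square equals the R-squared of a simple linear regression on the same data.

```r
# Made-up data for illustration only
x <- c(10, 12, 15, 19, 24, 30)
y <- c(2.1, 2.0, 2.6, 2.9, 3.1, 3.8)

# Pearson's r: covariation of x and y scaled by their individual variation
dx <- x - mean(x)
dy <- y - mean(y)
r_by_hand <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Built-in equivalent
r_builtin <- cor(x, y)

# In simple linear regression, R-squared is the square of r
r_squared <- summary(lm(y ~ x))$r.squared

print(c(by_hand = r_by_hand, builtin = r_builtin, r_squared = r_squared))
```

The same identity links the figures reported in this study: 0.421 squared is about 0.177 and 0.985 squared is about 0.970, which agree with the R-squared values of 0.1774 and 0.9697 in the regression output.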

4. Advanced Analysis

After the exploratory analysis, an advanced analysis is conducted on the variables, consisting of a clustering analysis and a regression analysis. The analysis so far has shown a very strong relationship between GNI and life expectancy and a moderate relationship between GNI and unemployment.

4.1 Cluster Analysis

The relationship obtained above is the main reason for running the clustering analysis. The two most strongly related variables, GNI and life expectancy, were chosen for k-means clustering. In k-means, the rows of a data frame are grouped into clusters according to their closeness to the cluster means (Guha and Mishra 2016).

Table 4.1 provides the R code used to conduct the k-means clustering analysis (Celebi, Kingravi and Vela 2013). The analysis is represented diagrammatically in figure 4.1.

Table 4.1: R Code for Clustering Analysis

# Data extraction for k-means clustering
# (read into data2 so that the New Zealand data frame "data" is not overwritten)
data2 <- read.csv(file.choose(), sep = ",", header = TRUE, na.strings = "..")
data3 <- filter(data2, Series.Code %in% c("NY.GNP.PCAP.CD", "SP.DYN.LE00.IN"))
# Drop the text column as well, so that only year columns are melted
data3 <- subset(data3, select = -c(ï..Series.Name, X2015..YR2015.))
data3 <- melt(data3, id.vars = c("Series.Code", "Country.Name", "Country.Code"))
# Average each series over the years, giving one row per country
data4 <- dcast(data3, formula = Country.Code ~ Series.Code, fun.aggregate = mean, value.var = "value")
data4 <- na.omit(data4)
View(data4)

# Clustering
grpdata <- kmeans(data4[, c("NY.GNP.PCAP.CD", "SP.DYN.LE00.IN")], centers = 3, nstart = 10)
grpdata
o <- order(grpdata$cluster)
data.frame(data4$Country.Code[o], grpdata$cluster[o])

# plotting data
plot(data4$NY.GNP.PCAP.CD, data4$SP.DYN.LE00.IN, type = "n", xlim = c(0, 50000), main = "k-means Clustering", xlab = "Gross National Income", ylab = "Life Expectancy at Birth")
text(x = data4$NY.GNP.PCAP.CD, y = data4$SP.DYN.LE00.IN, labels = data4$Country.Code, col = grpdata$cluster + 1)
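The clustering step can be tried out on a small made-up data set. In the sketch below, the data frame `toy`, its country labels and all values are hypothetical. Standardizing the columns with scale() is an assumption added here and is not part of the code in Table 4.1; without it, the Euclidean distance used by k-means is dominated by GNI, whose values are several orders of magnitude larger than those of life expectancy.

```r
# Made-up country-level averages, for illustration only
toy <- data.frame(
  country = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  gni = c(1000, 1200, 1100, 20000, 21000, 20500, 45000, 46000, 45500),
  le  = c(60, 61, 62, 72, 73, 71, 80, 81, 82)
)

# Put both variables on a comparable scale (mean 0, standard deviation 1)
scaled <- scale(toy[, c("gni", "le")])

# k-means with 3 clusters; nstart = 10 tries 10 random starts and keeps the best
set.seed(42)
fit <- kmeans(scaled, centers = 3, nstart = 10)

# Cluster membership per country
print(data.frame(country = toy$country, cluster = fit$cluster))
```

The cluster numbers themselves are arbitrary labels; only the grouping of the countries is meaningful.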

4.2 Regression Analysis

To establish the nature and strength of the relationships obtained in the correlation analysis, a regression analysis was performed. In this analysis, the value of the dependent variable is predicted from the independent variable (Montgomery, Peck and Vining 2015). The relationship is denoted by the following formula:

y = β0 + β1x + ε

Here, x and y are the independent and the dependent variables respectively, and ε is the error term. The intercept β0 represents the value of y when x is zero, and the slope β1 indicates the increase or decrease in y for a one-unit increase in x (Kabacoff 2015).
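For a single predictor, the least-squares estimates have a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the fitted line pass through the point of means. The sketch below checks this against lm() on made-up vectors `x` and `y` (hypothetical values).

```r
# Made-up data for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.2, 2.9, 4.1, 4.8, 6.1, 6.9)

# Closed-form least-squares estimates for y = b0 + b1 * x + error
b1 <- cov(x, y) / var(x)
b0 <- mean(y) - b1 * mean(x)

# The same estimates from lm()
fit <- lm(y ~ x)

print(c(b0 = b0, b1 = b1))
print(coef(fit))
```

Because the intercept is defined as mean(y) minus b1 times mean(x), the fitted line always predicts the mean of y at the mean of x.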

The regression between GNI and unemployment is established first, with unemployment as the dependent variable. The results show that 17.74 percent of the variability in unemployment can be explained by GNI (R-squared = 0.1774), and that the slope is not statistically significant at the 5 percent level (p = 0.1336). The relationship is expressed with the following equation:

Total Unemployment = 3.701 + (0.00005477 * GNI) + Error

# Regression of unemployment on GNI
Reg1 <- lm(formula = SL.UEM.TOTL.ZS ~ NY.GNP.PCAP.CD, data = data)
summary(Reg1)

# plotting the points with the fitted line
plot1 <- ggplot(data, aes(x = NY.GNP.PCAP.CD, y = SL.UEM.TOTL.ZS)) +
  geom_point(shape = 1) +
  scale_x_continuous(name = "Gross National Income") +
  scale_y_continuous(name = "Total Unemployment") +
  geom_smooth(method = "lm") +
  theme_bw() +
  ggtitle("Regression of Total Unemployment on Gross National Income")
plot1

Residuals:

    Min      1Q  Median      3Q     Max

-1.5421 -1.0199  0.2389  0.9129  1.1836

Coefficients:

                Estimate Std. Error t value Pr(>|t|)  

(Intercept)    3.701e+00  9.790e-01   3.780  0.00262 **

NY.GNP.PCAP.CD 5.477e-05  3.404e-05   1.609  0.13360  

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.073 on 12 degrees of freedom

Multiple R-squared:  0.1774,  Adjusted R-squared:  0.1089

F-statistic: 2.589 on 1 and 12 DF,  p-value: 0.1336
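The figures in this output can be cross-checked with a little arithmetic. The sketch below uses only values reported above and in Table 3.2 (the variable names are chosen here for illustration). It evaluates the fitted line at the mean GNI and recomputes the t value of the slope as the estimate divided by its standard error.

```r
# Coefficients and standard error as reported in the regression output
b0    <- 3.701
b1    <- 5.477e-05
se_b1 <- 3.404e-05

# Mean GNI as reported in Table 3.2
mean_gni <- 27500

# Predicted unemployment at the mean GNI
pred_at_mean <- b0 + b1 * mean_gni

# t value of the slope: estimate divided by its standard error
t_value <- b1 / se_b1

print(c(pred_at_mean = pred_at_mean, t_value = t_value))
```

The prediction at the mean GNI comes out at about 5.207, the mean unemployment in Table 3.4, as expected for a least-squares line. The recomputed t value of about 1.609 matches the output.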

The regression between life expectancy at birth and GNI is established next, with life expectancy as the dependent variable. The results show that 96.97 percent of the variability in life expectancy can be explained by GNI (R-squared = 0.9697). The relationship is expressed with the following equation:

Life Expectancy at Birth = 77.41 + (0.0001019 * GNI) + Error

# Regression of life expectancy at birth on GNI
# (the predictor is NY.GNP.PCAP.CD, as in the regression output)
Reg2 <- lm(formula = SP.DYN.LE00.IN ~ NY.GNP.PCAP.CD, data = data)
summary(Reg2)

# plotting the points with the fitted line
plot8 <- ggplot(data, aes(x = NY.GNP.PCAP.CD, y = SP.DYN.LE00.IN)) +
  geom_point(shape = 1) +
  scale_x_continuous(name = "Gross National Income") +
  scale_y_continuous(name = "Total Life Expectancy at Birth") +
  geom_smooth(method = "lm") +
  theme_bw() +
  ggtitle("Regression of Life Expectancy at Birth on Gross National Income")
plot8

Residuals:

     Min       1Q   Median       3Q      Max

-0.24688 -0.10782 -0.02588  0.02552  0.27936

Coefficients:

                Estimate Std. Error t value Pr(>|t|)

(Intercept)    7.741e+01  1.496e-01  517.33  < 2e-16 ***

NY.GNP.PCAP.CD 1.019e-04  5.202e-06   19.58 1.78e-10 ***

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.164 on 12 degrees of freedom

Multiple R-squared:  0.9697,  Adjusted R-squared:  0.9671

F-statistic: 383.5 on 1 and 12 DF,  p-value: 1.783e-10
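As a rough illustration of what this fitted line implies, the sketch below evaluates it at the minimum, mean and maximum GNI reported in Table 3.2, using the coefficients from the output above (the variable names are chosen here for illustration). It also checks that, with a single predictor, the F statistic is approximately the square of the slope's t value.

```r
# Coefficients as reported in the regression output
b0 <- 77.41
b1 <- 1.019e-04

# Minimum, mean and maximum GNI as reported in Table 3.2
gni <- c(min = 13800, mean = 27500, max = 41670)

# Predicted life expectancy at birth (years) at each GNI level
pred <- b0 + b1 * gni
print(round(pred, 2))

# With one predictor, F equals t squared (up to rounding of the reported t)
t_value <- 19.58
print(t_value^2)
```

The predictions stay close to the observed range of 78.69 to 81.41 years in Table 3.6, and the squared t value of about 383.4 agrees with the reported F statistic of 383.5 up to rounding.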

 5. Conclusion

The analysis shows an extremely strong positive relationship between GNI and life expectancy at birth. This suggests that maintaining a high GNI may help to protect infants and keep the population healthy. The relationship between GNI and total unemployment was not statistically significant (p = 0.1336).

6. Reflections

I faced many problems in handling the large dataset, which had a lot of missing values. A good deal of extraction had to be done to obtain a subset that was fit for the analysis.

References

Celebi, M.E., Kingravi, H.A. and Vela, P.A., 2013. A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Systems with Applications, 40(1), pp.200-210.

Guha, S. and Mishra, N., 2016. Clustering data streams. In Data Stream Management (pp. 169-187). Springer Berlin Heidelberg.

Kabacoff, R., 2015. R in action: data analysis and graphics with R. Manning Publications Co.

Montgomery, D.C., Peck, E.A. and Vining, G.G., 2015. Introduction to linear regression analysis. John Wiley & Sons.
