Research Questions
Current research has found that Sydney is one of the world’s most overvalued cities for real estate (Grybauskas & Pilinkienė, 2018). Engsted, Hviid & Pedersen (2016) stated that this phenomenon makes it almost impossible for a millennial to ever own a house in Sydney. According to Duke & Robertson (2018), property prices in Sydney are overvalued by 50 percent relative to average earnings. The property boom is attributed to demand, supply and the cost of money (Jones, Cowe & Trevillion, 2018). Moreover, foreign investment has been a key driver of high property prices in the city (Rogers, Lee & Yan, 2015).
To answer these questions, five variables were chosen: market price, Sydney price index, annual percentage change, total number of square meters, and age of the house (years).
Data were collected from ABS 6416.0, Residential Property Price Indexes: Three Capital Cities. The data cover 2002-03 to 2016-17, a total of 15 years. Of the five variables, the dependent variable is the market price. The independent variables for the chosen statistical model are the Sydney price index, annual percentage change, total number of square meters, and age of houses (years).
To determine the relationship between the dependent and the independent variables, a regression model was chosen, since regression is a statistical tool for establishing the relationship between variables (Kutner, Nachtsheim & Neter, 2004). A regression model was most appropriate since it handles scenarios with one dependent variable and more than one independent variable (Carroll, 2017; Harrell, 2015).
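As a rough illustration of the approach described above, the sketch below fits a regression with one dependent variable and several independent variables by ordinary least squares. The function name fit_ols and the small data set are hypothetical and are not the ABS data; only the Python standard library is used, and the independent variables are assumed not to be perfectly collinear.

```python
def fit_ols(X, y):
    """Return [intercept, b1, ..., bk] that minimise the sum of squared errors.

    X is a list of rows of independent variables, y the dependent variable.
    Assumes the columns of X are not perfectly collinear.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend a column of ones for the intercept
    p = len(rows[0])

    # Normal equations: (A^T A) b = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]

    # Gaussian elimination with partial pivoting on the augmented matrix
    aug = [ata[i] + [aty[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution
    b = [0.0] * p
    for i in reversed(range(p)):
        s = sum(aug[i][j] * b[j] for j in range(i + 1, p))
        b[i] = (aug[i][p] - s) / aug[i][i]
    return b


# Hypothetical data generated from y = 2 + 3*x1 - 1*x2 (no noise)
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y = [2 + 3 * x1 - 1 * x2 for x1, x2 in X]
coefficients = fit_ols(X, y)
print([round(b, 4) for b in coefficients])
```

Because the example data contain no noise, the fitted coefficients should recover the generating values (intercept 2, slopes 3 and -1) up to floating-point error.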
To answer the research questions, the following hypotheses were developed:
Hypothesis 1
Null hypothesis (H0): There is no relationship between the Sydney price index and the Sydney market property price.
Alternate hypothesis (HA): There is a relationship between the Sydney price index and the Sydney market property price.
Hypothesis 2
Null hypothesis (H0): There is no relationship between the annual percentage change in the Sydney price index and the Sydney market property price.
Alternate hypothesis (HA): There is a relationship between the annual percentage change in the Sydney price index and the Sydney market property price.
Hypothesis 3
Null hypothesis (H0): There is no relationship between the total number of square meters and the Sydney market property price.
Data Collection and Methodology
Alternate hypothesis (HA): There is a relationship between the total number of square meters and the Sydney market property price.
Hypothesis 4
Null hypothesis (H0): There is no relationship between the age of houses in years and the Sydney market property price.
Alternate hypothesis (HA): There is a relationship between the age of houses in years and the Sydney market property price.
It should be noted that hypotheses are vital since they link theory to research through operationally defined variables in testable form (Tartakovsky, Nikiforov & Basseville, 2014). According to Silverman (2015), hypotheses allow researchers to determine the correctness of a theory through research.
Scatter plots were developed to show how one variable is affected by another (Hastie, 2017). The results are as shown:
Sydney price index vs. Sydney market property price
Figure 1: Sydney Price Index vs. Sydney property prices Scatterplot
Figure 1 shows a positive correlation between Sydney property prices and the Sydney price index, since the trend line shows an uphill pattern (Park, Kim & Elmqvist, 2016). Thus, it can be concluded that property prices increase as the Sydney price index increases. The relationship between the Sydney price index and Sydney property prices is also relatively strong. Additionally, there were no significant outliers in the data.
Annual Percentage change vs. Sydney market property price
Figure 2: Sydney Annual Percentage change Vs. Sydney property prices Scatterplot
From figure 2 above, it can be seen that there was a positive correlation between the Sydney annual percentage change and Sydney property prices, since the trend line shows an uphill pattern. Thus, it can be concluded that property prices increase with an increase in the annual percentage change in Sydney. However, the relationship is weak: a weak relationship is indicated when the trend line is almost flat (Wiley & Pace, 2015). It could also be observed that there were outliers in the data.
Sydney total number of square meters vs. Sydney property price
Figure 3: Sydney total number of square meters Vs. Sydney property prices Scatterplot
From figure 3 above, it can be seen that there was a positive correlation between Sydney’s total number of square meters and Sydney property prices, since the trend line shows an uphill pattern. Thus, it can be concluded that property prices increase with an increase in the total number of square meters in Sydney. As in figure 2, the relationship is relatively weak. It could also be observed that there were outliers in the data.
Results
Age of houses (years) in Sydney vs. Sydney property prices
Figure 4: age of houses (years) in Sydney vs. Sydney property prices Scatterplot
From figure 4 above, it can be seen that there was a negative correlation between the age of houses (years) in Sydney and Sydney property prices, since the trend line shows a downhill pattern (Fan & Hauser, 2017). Thus, it can be concluded that property prices decrease with an increase in the age of houses in Sydney. Moreover, the relationship is relatively weak. It could also be observed that there were no significant outliers in the data.
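The direction and strength read off the scatter plots above can also be quantified. The following is a minimal sketch, assuming two equal-length lists of paired observations; the function names and the sample values are hypothetical and are not taken from the report's data set. The sign of the trend line slope gives the direction (uphill or downhill), and the Pearson correlation coefficient gives the strength.

```python
from math import sqrt


def trend_slope(xs, ys):
    """Slope of the least squares trend line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation coefficient, between -1 and 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_trend(xs, ys):
    """Label the trend line as 'uphill', 'downhill' or 'flat'."""
    slope = trend_slope(xs, ys)
    if slope > 0:
        return "uphill"
    if slope < 0:
        return "downhill"
    return "flat"


# Hypothetical observations, for illustration only
house_age = [5, 10, 20, 30, 40]
price = [900, 870, 860, 800, 790]
print(describe_trend(house_age, price), round(pearson_r(house_age, price), 2))
```

A coefficient close to 1 or -1 corresponds to a strong relationship, while a value close to 0 corresponds to an almost flat trend line.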
The following tables present the full model of the regression analysis:
Table 1: Regression Statistics
Table 2: ANOVA
Table 3: Regression model
- The least squares regression equation
From table 3 above, the least squares regression equation can be obtained as shown below:
Sydney_property_price = 548.98 + 1.96*Sydney_price_index – 5.62*Annual_%_change + 0.52*total_number_of_square_meters – 2.49*age_of_house + ε
From the least squares regression equation, it can be seen that the value of the constant is 548.98. With all other factors held constant, a one-unit increase in the Sydney price index or in the total number of square meters increases property prices in Sydney by 1.96 and 0.52 units respectively. On the other hand, a one-unit increase in the annual percentage change or in the age of a house decreases property prices by 5.62 and 2.49 units respectively.
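The fitted equation can be expressed as a small function. This is a sketch only: the coefficients are the rounded values quoted above, the function and parameter names are hypothetical, and the input values in the usage example are made up for illustration.

```python
def predict_price(price_index, annual_pct_change, square_meters, age_years):
    """Predicted Sydney property price (thousands of dollars) from the
    rounded coefficients of the least squares regression equation."""
    return (
        548.98
        + 1.96 * price_index
        - 5.62 * annual_pct_change
        + 0.52 * square_meters
        - 2.49 * age_years
    )


# Effect of a one-unit increase in each variable, others held constant
base = predict_price(100, 5, 400, 20)  # hypothetical input values
print(round(predict_price(101, 5, 400, 20) - base, 2))  # price index effect
print(round(predict_price(100, 5, 400, 21) - base, 2))  # age of house effect
```

Differencing two predictions that vary in a single input recovers that variable's coefficient, which is what "all other factors held constant" means in practice.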
- Interpretation
From the least squares regression equation, a number of observations can be made. Firstly, the constant (intercept) is 548.98. Thus, when all independent variables equal zero, the predicted market price of a house in Sydney is about 549 thousand dollars. The intercept is statistically significant since its p-value is less than the alpha value of 0.05.
The coefficient of the Sydney price index is 1.96. Thus, keeping all other factors constant, a unit increase in the Sydney price index increases the market price of property in Sydney by 1.96 units. The coefficient is statistically significant since its p-value is less than the alpha value of 0.05. Therefore, we reject the hypothesis of no relationship and conclude that the Sydney price index influences property prices in Sydney.
The coefficient of the annual percentage change is -5.62. Thus, keeping all other factors constant, a unit increase in the annual percentage change decreases the price of property in Sydney by 5.62 units. However, the coefficient is not statistically significant since its p-value is greater than 0.05. Therefore, we fail to reject the hypothesis of no relationship: there is insufficient evidence that the annual percentage change influences property prices in Sydney.
Regression Analysis
The coefficient of the total number of square meters is 0.52. Keeping all other factors constant, a unit increase in the total number of square meters increases the property price by 0.52 units. Like the annual percentage change coefficient, the total number of square meters coefficient is not statistically significant since its p-value is greater than the alpha value of 0.05. Therefore, we fail to reject the hypothesis of no relationship: there is insufficient evidence that land size in total number of square meters influences property prices in Sydney.
Finally, the coefficient of the age of houses is -2.49. Thus, keeping all other factors constant, a unit increase in the age of a house decreases the property price by 2.49 units. Like the Sydney price index coefficient, the age of house coefficient is statistically significant since its p-value is equal to or less than the alpha value of 0.05. Therefore, we reject the hypothesis of no relationship and conclude that the age of houses influences property prices in Sydney.
- Coefficient of determination
The coefficient of determination (R squared) of the regression model is 0.79. The coefficient of determination is used to determine the extent to which the dependent variable can be predicted (Zhang, 2017). Thus, the R squared of 0.79 suggests that 79% of the variance in the property prices (Y) can be predicted from the independent variables.
The adjusted coefficient of determination can also be observed in table 1. It adjusts the coefficient of determination for the number of independent variables in the model (Finch, Bolin & Kelley, 2016). The adjusted coefficient of determination of the regression model is 0.71. Thus, after adjusting for the number of predictors, about 71% of the variability in property prices is explained by the variables in the model, while 29% is attributable to factors which are not in the model.
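The link between the two figures can be checked with the standard adjustment formula, adjusted R squared = 1 - (1 - R squared)(n - 1)/(n - k - 1). The sketch below assumes n = 15 observations (one per financial year from 2002-03 to 2016-17) and k = 4 independent variables; the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjust R squared for the number of predictors in the model."""
    n, k = n_observations, n_predictors
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Assumed sample size of 15 annual observations and 4 independent variables
print(round(adjusted_r_squared(0.79, 15, 4), 2))
```

Under these assumptions the formula gives roughly 0.71, which agrees with the value reported in table 1.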
- 95% confidence intervals
The 95% confidence intervals of the intercept and the four variables are shown in table 3 above. The 95% confidence interval of the regression model intercept is between 368.21 and 729.75. Thus, we are 95% confident that the true intercept, the expected market price when all independent variables equal zero, lies between 368.21 and 729.75 thousand dollars.
The 95% confidence interval of the Sydney price index coefficient is between 0.66 and 3.26. Therefore, we are 95% confident that the true coefficient of the Sydney price index lies between 0.66 and 3.26. Similarly, the 95% confidence interval of the annual percentage change coefficient is between -12.84 and 1.6. Therefore, we are 95% confident that the true coefficient of the annual percentage change lies between -12.84 and 1.6.
Conclusion
The 95% confidence interval of the total number of square meters coefficient is between -0.2 and 1.24. Therefore, we are 95% confident that the true coefficient of the total number of square meters lies between -0.2 and 1.24. Likewise, the 95% confidence interval of the age of the house coefficient is between -5.01 and 0.03. Therefore, we are 95% confident that the true coefficient of the age of the house lies between -5.01 and 0.03.
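A confidence interval for a regression coefficient can also be read as a significance check: a coefficient differs from zero at the 5% level only if its 95% interval excludes zero. The sketch below applies that rule to the interval bounds quoted above; the dictionary keys and function names are hypothetical labels, and the bounds are the rounded values from the text.

```python
# 95% confidence interval bounds as quoted in the text (rounded)
INTERVALS = {
    "intercept": (368.21, 729.75),
    "sydney_price_index": (0.66, 3.26),
    "annual_pct_change": (-12.84, 1.6),
    "total_square_meters": (-0.2, 1.24),
    "age_of_house": (-5.01, 0.03),
}


def excludes_zero(lower, upper):
    """True if the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


def midpoint(lower, upper):
    """Centre of the interval, which equals the coefficient estimate."""
    return (lower + upper) / 2


for name, (lower, upper) in INTERVALS.items():
    print(name, round(midpoint(lower, upper), 2), excludes_zero(lower, upper))
```

The midpoints reproduce the coefficients of the regression equation. With these rounded bounds, only the intercept and the Sydney price index intervals exclude zero; the age of house interval includes zero by a narrow margin, so its significance at the 5% level is borderline.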
Re-estimated model: Relationship between the market price and the land size in total number of square meters
The results of the re-estimated regression model are as shown below:
Table 4: Regression statistics
Table 5: ANOVA
Table 6: Regression Model
- Comparing the original and the re-estimated model
The regression of the market price on the land size in total number of square meters has a coefficient of determination of 0.1. Thus, 10% of the variance in market prices can be predicted from the land size. Based on the adjusted coefficient of determination (adjusted R squared = 0.03), only about 3% of the variability in market prices is explained by the model after adjusting for the number of predictors, while 97% is attributable to factors which are not in the model. Thus, the re-estimated model is a poor fit compared with the original model.
The coefficient of the total number of square meters is 0.56. Thus, a unit increase in the land size increases the market price by 0.56 units, so land size has a positive effect on the market price when considered on its own. In the original model, which includes the other factors, the coefficient is slightly smaller (0.52) but still positive. However, land size remains statistically insignificant even when the other factors are not considered (its p-value is greater than 0.05).
- Predicting the market price
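The predicted market price of a house (in thousands of dollars) with a building area of 400 square meters is:
Market price = 659.14 + 0.56 * 400
Market price ≈ 883.58
Thus, the predicted market price is about $883,580. Note that the rounded coefficients shown here give 659.14 + 224 = 883.14; the reported figure of 883.58 presumably reflects the unrounded slope coefficient in table 6.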
References:
Carroll, R.J., 2017. Transformation and weighting in regression. Routledge.
Duke, J. and Robertson, J., 2018. Sydney property prices tipped to fall 10 per cent in 2018. The Sydney Morning Herald, 2 January.
Engsted, T., Hviid, S.J. and Pedersen, T.Q., 2016. Explosive bubbles in house prices? Evidence from the OECD countries. Journal of International Financial Markets, Institutions and Money, 40, pp.14-25.
Fan, C. and Hauser, H., 2017. User-study based optimization of fast and accurate Mahalanobis brushing in scatterplots.
Finch, W.H., Bolin, J.E. and Kelley, K., 2016. Multilevel modeling using R. CRC Press.
Grybauskas, A. and Pilinkienė, V., 2018. Real Estate Market Stability: Evaluation of the Metropolitan Areas Using Factor Analysis. Engineering Economics, 29(2), pp.158-167.
Harrell, F.E., 2015. Cox proportional hazards regression model. In Regression modeling strategies (pp. 475-519). Springer, Cham.
Hastie, T.J., 2017. Generalized additive models. In Statistical models in S (pp. 249-307). Routledge.
Jones, C., Cowe, S. and Trevillion, E., 2018. Property Boom and Banking Bust: The Role of Commercial Lending in the Bankruptcy of Banks. John Wiley & Sons.
Kutner, M.H., Nachtsheim, C. and Neter, J., 2004. Applied linear regression models. McGraw-Hill/Irwin.
Park, D., Kim, S. and Elmqvist, N., 2016. Gatherplots: Extended scatterplots for categorical data. Technical Report HCIL-2016-10.
Rogers, D., Lee, C.L. and Yan, D., 2015. The politics of foreign investment in Australian housing: Chinese investors, translocal sales agents and local resistance. Housing Studies, 30(5), pp.730-748.
Tartakovsky, A., Nikiforov, I. and Basseville, M., 2014. Sequential analysis: Hypothesis testing and changepoint detection. Chapman and Hall/CRC.
Wiley, J.F. and Pace, L.A., 2015. Multiple regression. In Beginning R (pp. 139-161). Apress, Berkeley, CA.
Zhang, D., 2017. A coefficient of determination for generalized linear models. The American Statistician, 71(4), pp.310-316.