The Research Design
Of the three research designs (qualitative, quantitative and mixed-method), this study used a quantitative design. Numerical data were therefore used in the analysis to provide insights that helped achieve the research objective (Bloomfield and Fisher 27-30). The research also employed deductive reasoning, whereby general ideas were narrowed down to specific ideas about the research objectives in order to draw conclusions (Alam et al. 175). The quantitative design was appropriate because, through analysis in Excel, the numerical data provided insights that are easy to interpret and that show how the UK stock market affects the UK real estate (house price) index.
The major sources of data for most research are primary and secondary sources. Primary sources involve collecting data for the first time and therefore provide first-hand information to the researcher (Proenca et al. 256-263). Examples include interviews, experiments, observations, focus groups and surveys (Proenca et al. 256-263). Secondary sources provide ready-made data: the researcher uses data that have already been collected (Johnston 619-626). Examples include websites, journals, newspapers, published books and government records (Johnston 619-626). This paper used secondary data from FRED (Federal Reserve Economic Data), the database maintained by the Federal Reserve Bank of St. Louis. Secondary data were appropriate because they saved the time and cost of data collection.
Our largest data set was the real estate index (REI), which this study used from January 1990 to January 2016. The data source is the U.K. central bank. The UKE-20 index represents the performance of the United Kingdom stock exchange and tracks the twenty most liquid shares listed on it. The UKE-20 index and the macroeconomic index factor were collected from the data stream. We also used historical data on credit policies for collateral and on the lending interest rates of banks and other institutions over the same period.
The data set consisted of five variables: year, real estate index, stock market index, wealth index and credit index. The year covered the period from 1990 to 2016; the real estate index was the UK house price index; the stock market index was the total share price index for all UK stocks; the wealth index was the UK Consumer Price Index (CPI); and the credit index was the corporate borrowing rate on loans from UK banks.
Data preprocessing covered the steps performed on the data set to make it ready and suitable for analysis. For instance, it entails data cleaning to remove missing observations, NA entries and redundant records (Alasadi and Bhaya 4102-4107). The data set in this research ranged from 1990 to 2016, a total of 27 annual observations. The series did not have equal lengths because the bank credit index was quarterly, the share price index was monthly, the real estate (house price) index was annual and the wealth index was monthly. Annual averages of the monthly and quarterly entries were computed so that all series had the same length for analysis. The pivot function in Excel was used to convert the quarterly and monthly series into annual series.
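The study performed the annual averaging with an Excel pivot. The same step can be sketched in Python as follows; this is only an illustration, and the function name `to_annual_mean` and all sample values are hypothetical rather than taken from the study's data.

```python
from collections import defaultdict
from statistics import mean

def to_annual_mean(observations):
    """Average (year, value) observations within each calendar year.

    `observations` is an iterable of (year, value) pairs, one pair per
    month or per quarter. Missing values (None) are skipped.
    """
    by_year = defaultdict(list)
    for year, value in observations:
        if value is not None:
            by_year[year].append(value)
    return {year: mean(values) for year, values in sorted(by_year.items())}

# Hypothetical figures, not the study's data.
monthly = [(1990, 10.0)] * 12 + [(1991, 20.0)] * 12   # twelve readings per year
quarterly = [(1990, 8.0), (1990, 9.0), (1990, 10.0), (1990, None), (1991, 6.0)]

annual_monthly = to_annual_mean(monthly)
annual_quarterly = to_annual_mean(quarterly)
print(annual_monthly)
print(annual_quarterly)
```

Grouping by year before averaging brings monthly and quarterly series to one value per year, so they line up with the annual house price index.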
Data Collection
Before the predictive analytics, which involved running the models, descriptive statistics were computed. The study produced a descriptive summary of the collected data sets in order to gain an overall understanding of them (Kaur et al. 60). Descriptive measures include the mean, mode, median, standard deviation, variance and skewness; in this case study, we considered the mean and the standard deviation. In addition, we performed exploratory analysis to explore and represent our data sets visually (Kaur et al. 60). Line graphs were used to determine the trends of the real estate index, stock market index, wealth index and credit index (Kaur et al. 60). Scatter plots were used to check whether outliers were present in the data sets.
Although correlation analysis is not itself predictive, it is an important step before the predictive analytics. Correlation measures the degree of association between two variables (Schober et al. 1763-1768). It also helps in selecting the variables to include in the model (Schober et al. 1763-1768). If two independent variables are strongly correlated, only one of them is included in the model to avoid multicollinearity (Daoud 012009). Correlation analysis was performed to determine the relationships among the real estate index, stock market index, wealth index and credit index.
For predictive analytics, we used both a simple linear regression model and a multiple linear regression model to analyze and understand the data. Regression analysis is a set of statistical methods for estimating the relationship between a dependent variable and one or more independent variables (Sarstedt and Mooi 209-256). It can be used to assess the strength of the relationship between variables and to model the future relationship between them (Finance, Corporate 2021). The basic linear regression model is:
$$y = a + bx + \varepsilon$$
where
$y$ = real estate prices
$a$ = intercept
$x$ = stock prices
$b$ = coefficient of $x$
$\varepsilon$ = error term
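The simple model above is usually estimated by ordinary least squares. The study relied on Excel's regression output; the following is only a minimal sketch of the closed-form estimates, with a hypothetical function name `fit_simple_ols` and made-up index values that are not the study's data.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Return (intercept a, slope b) minimising the squared errors of y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # slope: covariation of x and y over variation of x
    a = y_bar - b * x_bar    # intercept: the fitted line passes through the means
    return a, b

# Hypothetical index values, not the study's data: y is exactly 2 + 0.5 * x.
stock = [40.0, 60.0, 80.0, 100.0]
real_estate = [22.0, 32.0, 42.0, 52.0]

a, b = fit_simple_ols(stock, real_estate)
print(f"a = {a:.2f}, b = {b:.2f}")
```

Because the sample values lie exactly on a line, the sketch recovers the intercept and slope used to generate them; real data would leave a non-zero error term.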
The research also used a multiple linear regression model to analyze further factors of the economy:
$$y = a + b x_1 + c x_2 + d x_3 + \varepsilon$$
where
$y$ = real estate prices
$a$ = intercept
$x_1$ = stock prices, $x_2$ = lending/collateral rates and $x_3$ = interest rates
$b$ = coefficient of $x_1$, $c$ = coefficient of $x_2$ and $d$ = coefficient of $x_3$
$\varepsilon$ = error term
To check the accuracy and reliability of the regression model, the research used the Mean Absolute Percentage Error (MAPE). Because MAPE is an average of absolute values, it cannot be negative; the decision criterion was that a MAPE value approaching zero indicates an accurate model (Myttenaere et al. 38-48).
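$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$
where $A_t$ = actual value, $F_t$ = forecasted value and $n$ = number of forecasts.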
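As an illustration of the MAPE definition, the following sketch computes it for a handful of values. The function name `mape` and the sample numbers are hypothetical and are not the study's data.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical values, not the study's data.
actual = [100.0, 200.0, 50.0]
forecast = [90.0, 220.0, 50.0]
print(f"MAPE = {mape(actual, forecast):.2f}%")
```

Each error is scaled by its own actual value, so the measure is unit-free, but it is undefined when any actual value is zero.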
A major challenge was finding the correct websites and sources for the data sets. Well-known financial and economic data websites such as Yahoo Finance and Kaggle did not have the data sets of interest, and collecting data from the UK central bank website was not possible. To solve the problem, we evaluated alternative terms for the real estate index, wealth effect index, stock market index and credit index. We came to understand that the real estate index is the same as the house price index, that the wealth effect can be measured by the Consumer Price Index (CPI), that the stock market index can be measured by share prices for all stocks, and that the credit index can be measured by the corporate borrowing rate on loans from UK banks. With this understanding, it became easy to find the data sets on the FRED website of the Federal Reserve Bank of St. Louis.
Overview of the Data Set
The research did not involve human subjects, so few of the ethical considerations stipulated for IRB approval applied; it relied on secondary data. Ethical considerations still had to be observed, however: the collected data sets were used only for the purposes of this research and not for any ill or malicious purpose.
| Variable | M | SD |
|---|---|---|
| Real Estate Index | 64.71 | 28.73 |
| Stock Market Index | 74.30 | 21.04 |
| Wealth Index | 78.79 | 13.39 |
| Credit Index | 6.68 | 3.36 |

Table 1: Descriptive summary statistics
Table 1 reports the real estate index (M = 64.71, SD = 28.73), stock market index (M = 74.30, SD = 21.04), wealth index (M = 78.79, SD = 13.39) and credit index (M = 6.68, SD = 3.36). The standard deviation of the real estate index is large relative to its mean, which indicates high fluctuation in UK house prices. This volatility is a barrier to investors because it makes future house prices highly uncertain, hence the need for a model that helps predict the real estate index. The standard deviation of the stock market index is also large relative to its mean, again indicating a high rate of fluctuation in the UK stock market. This is expected, because stock prices are volatile in most countries. In the modern technological world in particular, stock prices are affected by tweets from famous and wealthy CEOs such as Elon Musk and Jeff Bezos, as well as by current trends and pandemics such as coronavirus. Stock markets are therefore rarely stable, and they too require modelling to support prediction. The standard deviation of the wealth index is not large relative to its mean, an indication that the UK CPI is not volatile and does not fluctuate much. This points to a good, stable economy and healthy consumer spending, and it suggests that economists can predict UK consumer spending with relative ease. Lastly, the standard deviation of the credit index is large relative to its mean, which shows a high rate of fluctuation in the UK credit index. The rate of borrowing in the UK is not predictable, and this might be due to bank interest rates.
High bank interest rates discourage investors, while low rates encourage them. The volatility of the credit index indicates that UK bank interest rates are highly volatile, which directly affects investment.
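One way to make "large relative to the mean" concrete is the coefficient of variation, the standard deviation divided by the mean. The study does not report this measure; the sketch below is an added illustration that uses only the values in Table 1, and the helper name `coefficient_of_variation` is hypothetical.

```python
# Means and standard deviations as reported in Table 1.
table1 = {
    "Real Estate Index": (64.71, 28.73),
    "Stock Market Index": (74.30, 21.04),
    "Wealth Index": (78.79, 13.39),
    "Credit Index": (6.68, 3.36),
}

def coefficient_of_variation(mean_value, sd):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean_value

cv = {name: coefficient_of_variation(m, sd) for name, (m, sd) in table1.items()}
for name, value in sorted(cv.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} CV = {value:.2f}")
```

Under this measure the wealth index has the smallest relative spread and the credit index the largest, which is in line with the discussion of Table 1.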
The study used line graphs and scatter plots. Line graphs were used to determine the trend, while scatter plots were used to check for outliers in the data set. Outliers are abnormal or extreme values that deviate from the rest of the values in the data.
Fig 1a: Real estate index line graph
Fig 1b: Real estate index scatter plot
The real estate index shows an increasing trend from 1990 to 2006, which is economically healthy for investors. From 2007 to 2009, however, the index declines, and this is attributed to the 2007-2008 economic depression or financial crisis (Reddy et al. 257-281). During this crisis, the housing or real estate industry was so severely affected that the Federal Reserve Bank had to step in and issue economic stimulus packages (Reddy et al. 257-281). The line graph then shows an increasing trend from 2009 onward. The scatter plot in fig 1b indicates that there are no outliers in our data set.
Data Preprocessing
Fig 2a: stock market index line graph
Fig 2b: Stock market index scatter plot
We see an increasing trend in stock prices from 1990 to 1999, after which stock market prices begin to fluctuate significantly. This agrees with the summary statistics, which show that stock market prices are highly volatile. Again, we see a decline in stock market prices between 2007 and 2008, which is due to the 2007-2008 economic depression (Reddy et al. 257-281). The scatter plot in fig 2b indicates that there are no outliers in our data set.
Fig 3a: Wealth Index Line graph
Fig 3b: Wealth Index Scatter plot
The wealth index plot shows a consistent upward trend, which indicates that consumer spending in the UK has been increasing since 1990. Consumer spending in the UK was not even affected by the 2007-2008 economic crisis (Reddy et al. 257-281). Generally, this is a sign of a good economy, and the line plot agrees with the summary statistics in showing that consumer spending in the UK is not volatile. The scatter plot in fig 3b indicates that there are no outliers in our data set.
Fig 4a: Credit Index line graph
Fig 4b: Credit Index scatter plot
The credit index shows a decreasing trend. This implies a decreasing corporate borrowing rate on loans, which is an indication of a good economy because lower borrowing rates encourage investors to borrow more and invest.
Correlation analysis was used to determine the relationships among the variables. The correlation coefficient ranges from -1 to 1, so two variables can be positively or negatively related. A coefficient approaching 1 or -1 shows a strong positive or strong negative relationship respectively (Schober et al. 1763-1768). Correlation analysis was also used to detect multicollinearity among our independent variables. The decision criterion was that two independent variables would result in multicollinearity if their correlation coefficient was r > 0.7 or r < -0.7, that is, |r| > 0.7. In modelling, strong correlation between predictor variables is undesirable because it induces multicollinearity, which distorts the results (Daoud 012009). Strong correlation between the predictor variables and the response variable, on the other hand, is desirable. The predictor variables in this case study were the stock market index, credit index and wealth index, while the dependent variable was the real estate index.
| | Real Estate Index | Stock Market Index | Credit Index | Wealth Index |
|---|---|---|---|---|
| Real Estate Index | 1 | | | |
| Stock Market Index | 0.715208103 | 1 | | |
| Credit Index | -0.770129946 | -0.71687699 | 1 | |
| Wealth Index | 0.924027905 | 0.794870319 | -0.891894104 | 1 |

Table 2: Correlation Matrix
From table 2, the correlation between the real estate index and the stock market index is r = 0.72, between the real estate index and the credit index r = -0.77, between the real estate index and the wealth index r = 0.92, between the stock market index and the credit index r = -0.72, between the stock market index and the wealth index r = 0.79, and between the credit index and the wealth index r = -0.89. The real estate index is negatively related to the credit index, so a decrease in the credit index is associated with an increase in the real estate index. The real estate index is positively related to the stock market index and the wealth index, so an increase in one is associated with an increase in the other. All of the correlation coefficients exceed 0.7 in absolute value, which shows strong correlation between the variables. Among the independent variables, this causes problems in our model because multicollinearity distorts the results. However, the study proceeded with the multiple regression despite the strong correlation among the predictor variables.
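The multicollinearity criterion can be applied mechanically to the entries of Table 2. The following sketch does so; the function name `strongly_correlated_predictor_pairs` is hypothetical, while the coefficients and the 0.7 cut-off are those stated in the text.

```python
# Pairwise correlations as reported in Table 2.
correlations = {
    ("Real Estate Index", "Stock Market Index"): 0.715208103,
    ("Real Estate Index", "Credit Index"): -0.770129946,
    ("Real Estate Index", "Wealth Index"): 0.924027905,
    ("Stock Market Index", "Credit Index"): -0.71687699,
    ("Stock Market Index", "Wealth Index"): 0.794870319,
    ("Credit Index", "Wealth Index"): -0.891894104,
}
PREDICTORS = {"Stock Market Index", "Credit Index", "Wealth Index"}
THRESHOLD = 0.7  # the study's cut-off for a strong correlation

def strongly_correlated_predictor_pairs(corr, predictors, threshold):
    """Return predictor pairs whose absolute correlation exceeds the threshold."""
    return [
        pair for pair, r in corr.items()
        if set(pair) <= predictors and abs(r) > threshold
    ]

flagged = strongly_correlated_predictor_pairs(correlations, PREDICTORS, THRESHOLD)
for first, second in flagged:
    print(f"{first} vs {second}: r = {correlations[(first, second)]:+.2f}")
```

Taking the absolute value handles the positive and negative cases in one comparison; all three predictor pairs are flagged.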
Descriptive Analytics
Regression is an analysis used to determine the effect of one or more independent variables on the dependent variable. In this case study, the real estate index was the dependent variable and the stock market index was the predictor variable.
Fig 5: Simple Linear Regression
The R-Squared value was 51.15%, indicating that the model explained 51.15% of the variability in the response variable, which is a moderately good fit (Sarstedt and Mooi 209-256). An R-Squared value approaching zero indicates a poor model, while a value approaching 1 indicates a strong model. For the F statistic, F(1, 25) = 26.18 with p<0.01, indicating that the model fitted the data set well overall (Sarstedt and Mooi 209-256). The intercept was -7.83, implying that if the stock market index is zero, the predicted real estate index is -7.83. The coefficient of the stock market index was 0.98 with p<0.01. The positive coefficient indicates that an increase of 1 in the stock market index is associated with an increase of 0.98 in the real estate index. This result is similar to the findings of Abul (2019 30-42), who found a positive relationship between the stock market index and the real estate index in Kuwait. The p<0.01 indicates that the stock market index is significant in explaining or predicting the real estate index, which agrees with the findings of Echaust & Malgorzata (2020 4-17). The UK stock market index significantly predicts house prices, so investors and relevant stakeholders can use it to predict and make decisions about the real estate index. This agrees with hypothesis 1, so we fail to reject the null hypothesis and conclude that the real estate index is strongly affected by the stock market index. Our simple linear regression model is:
y= -7.83 + 0.98x.
where y= real estate index and x= stock market index.
This model can be used by real estate investors and property owners to predict future house prices in UK.
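Applying the fitted line is a single multiplication and addition. The sketch below uses the reported intercept and coefficient; the function name and the two input values are hypothetical and chosen only for illustration.

```python
INTERCEPT = -7.83   # reported intercept
SLOPE = 0.98        # reported coefficient of the stock market index

def predict_real_estate_index(stock_market_index):
    """Apply the reported fitted line y = -7.83 + 0.98x."""
    return INTERCEPT + SLOPE * stock_market_index

# Hypothetical input values chosen for illustration.
for x in (80.0, 100.0):
    print(f"x = {x:6.1f}  ->  y = {predict_real_estate_index(x):.2f}")
```

Predictions are most trustworthy for stock market index values inside the range observed between 1990 and 2016; the negative intercept shows that the line is not meaningful near zero.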
The response variable was real estate index and the predictor variables were stock market index, wealth index and credit index. The objective was to determine if wealth and credit index will improve our regression model as well as determining if the indices were significant at predicting real estate index.
Fig 6: Multiple linear regression
The R-Squared value was 86.89%, which indicates a strong model because it explained 86.89% of the variability in the response variable (Sarstedt and Mooi 209-256). This clearly indicates that including the wealth and credit indices improved the prediction model. For the F statistic, F(3, 23) = 50.80 with p<0.01, which shows that the model fitted the data set well overall (Sarstedt and Mooi 209-256).
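For a least-squares fit with an intercept, the overall F statistic can be recovered from R-Squared as F = (R² / k) / ((1 - R²) / (n - k - 1)), where k is the number of predictors and n the number of observations. The sketch below applies this to the two reported R-Squared values. It assumes n = 27, one observation per year from 1990 to 2016, and the function name `f_statistic` is hypothetical.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic of an OLS fit with an intercept, computed from R-squared."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    f_value = (r_squared / df_model) / ((1.0 - r_squared) / df_resid)
    return f_value, df_model, df_resid

N_OBS = 27  # assumption: one observation per year, 1990 to 2016 inclusive

for label, r2, k in (("simple", 0.5115, 1), ("multiple", 0.8689, 3)):
    f_value, df1, df2 = f_statistic(r2, N_OBS, k)
    print(f"{label:<8} F({df1}, {df2}) = {f_value:.2f}")
```

Under this assumption the computed values come out close to the reported 26.18 and 50.80, with small differences attributable to rounding of R-Squared.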
The intercept was -147.45, indicating that with all the independent variables at zero, the predicted real estate index is -147.45 (Sarstedt and Mooi 209-256). The coefficient of the stock market index was -0.06 with p>0.01. The negative coefficient showed that an increase of 1 in the stock market index is associated with a decrease of 0.06 in the real estate index. This disagrees with the findings of Su et al (2019), who found that the two markets have a unidirectional relationship. Moreover, the p>0.01 showed that the stock market index was not significant in explaining or predicting the real estate index, which disagreed with the findings of Echaust & Malgorzata (2020). In light of this result, we reject the null hypothesis and accept the alternative hypothesis that the stock market index weakly affects the real estate index. These findings disagree with the simple linear regression findings, which might be attributed to multicollinearity among the independent variables.
Correlation Analysis
The credit index coefficient was 2.24 with p>0.01. This implies that an increase of 1 in the credit index is associated with an increase of 2.24 in the real estate index. The p>0.01 indicated that the credit index was not significant in explaining or predicting the real estate index. Therefore, we reject the null hypothesis, accept the alternative hypothesis and conclude that the credit index weakly affects the real estate index. The wealth index coefficient was 2.56 with p<0.01. This implies that an increase of 1 in the wealth index is associated with an increase of 2.56 in the real estate index. The p<0.01 indicated that the wealth index was significant in explaining or predicting the real estate index. Therefore, we fail to reject the null hypothesis and conclude that wealth markets strongly affect real estate markets. Our multiple linear regression model is shown below:
y= -147.45 - 0.06x1 + 2.24x2 + 2.56x3
where y= real estate index, x1= stock market index, x2 = credit index and x3 = wealth index.
It is evident that the strong correlation between the independent variables induced a multicollinearity problem, in which strongly correlated variables explain the same variation. This distorted the results, as evidenced by the multiple linear regression above. To avoid this problem, each independent variable should be included in a model on its own in order to determine its true effect on the real estate index. The study therefore performed simple linear regressions for the credit index and the wealth index, and the results are summarized in the table below:
Dependent variable = real estate index

| Regression Output | Wealth Index | Credit Index |
|---|---|---|
| R-Squared | 85.38% | 59.31% |
| F-Statistic | 146.03, p<0.01 | 36.44, p<0.01 |
| Coefficient | 1.98 | -6.58 |
| P-value | <0.01 | <0.01 |
| Intercept | -91.43 | 108.64 |

Table 3: Summary regression results for wealth and credit index
From table 3, p<0.01 for both the wealth index and the credit index, which shows that both significantly affect, predict or explain real estate markets. Therefore, we fail to reject the null hypotheses in hypotheses 2 and 3 and conclude that credit markets strongly affect real estate markets and that wealth markets strongly affect real estate markets. The Excel outputs are in the appendix section.
Our model of interest is the simple linear regression model used to determine the relationship between real estate markets and stock markets. A diagnostic check was therefore performed on the model y= -7.83 + 0.98x to determine its accuracy, using MAPE. Stock market indices from 1990 to 2007 were used as the training data set, while those from 2008 to 2016 were used as the test data set.
| Year | Actual | Forecast |
|---|---|---|
| 2007 | 97.13 | 87.33938 |
| 2008 | 92.76 | 71.94598 |
| 2009 | 84.52 | 60.01915 |
| 2010 | 89.35 | 73.42452 |
| 2011 | 88.05 | 76.73908 |
| 2012 | 88.4 | 77.48666 |
| 2013 | 90.68 | 88.32389 |
| 2014 | 97.96 | 91.47891 |
| 2015 | 103.79 | 90.17 |
| 2016 | 111.36 | 81.19615 |

Table 4: Actual vs Predicted real estate index
| Statistic | Value |
|---|---|
| N | 10 |
| MAD | 14.58762827 |
| MSE | 2127.988987 |
| RMSE | 46.13013101 |
| MAPE | 15.3945617% |

Table 5: Diagnostic statistics
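The diagnostics can be recomputed from the ten rows of Table 4. The sketch below uses the conventional definitions, with MSE as the mean of the squared errors and RMSE as its square root; which formulas the Excel workbook used is not stated, so these definitions are an assumption.

```python
from math import sqrt

# (actual, forecast) pairs as listed in Table 4, 2007 to 2016.
table4 = [
    (97.13, 87.33938), (92.76, 71.94598), (84.52, 60.01915), (89.35, 73.42452),
    (88.05, 76.73908), (88.40, 77.48666), (90.68, 88.32389), (97.96, 91.47891),
    (103.79, 90.17), (111.36, 81.19615),
]

errors = [actual - forecast for actual, forecast in table4]
n = len(errors)

mad = sum(abs(e) for e in errors) / n    # mean absolute deviation
mse = sum(e * e for e in errors) / n     # mean of the squared errors
rmse = sqrt(mse)                         # root mean squared error
mape = 100.0 * sum(abs(e) / a for e, (a, _) in zip(errors, table4)) / n

print(f"N = {n}, MAD = {mad:.2f}, MSE = {mse:.2f}, RMSE = {rmse:.2f}, MAPE = {mape:.2f}%")
```

With these definitions, MAD and MAPE agree with Table 5, while MSE and RMSE come out at roughly 277 and 16.7. The reported MSE of 2127.99 instead equals the square of the summed absolute errors divided by N, so the MSE and RMSE rows may have been calculated with a different formula and are worth rechecking against the workbook.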
From table 5, the MAPE value is 15.39%, which the study treats as approaching zero. We therefore conclude that our simple linear regression model is accurate and reliable in predicting real estate indices or house prices in the UK.
Conclusion and Recommendations
Data analytics is important in the modern business environment because it provides insights that enable managerial decision makers to make well-informed decisions. The aim of this study was to determine the relationship between the real estate index and the stock market index in the UK. A further aim was to determine whether the credit index and the wealth index also predict the real estate index in the UK. The descriptive statistics indicated that the real estate index, stock market index and credit index were volatile and fluctuated significantly, while the wealth index was less volatile. The scatter plots indicated that there were no outliers in our data sets. The line graphs agreed with the summary statistics on volatility: for instance, the stock market, credit and real estate line graphs showed no clear pattern, from which we deduced that they were volatile. The wealth index line graph indicated an upward trend in consumer spending in the UK. The line graphs also showed a decline in the real estate index, stock market index and credit index between 2007 and 2009, which is due to the economic depression of 2007-2008 (Reddy et al. 257-281). The correlation analysis showed a strong relationship between the independent variables and the dependent variable, as well as among the independent variables. The strong relationship among the independent variables indicated multicollinearity, which significantly affected the results of our multiple linear regression model. To detect multicollinearity, the study used the criterion that two variables with a correlation coefficient greater than 0.7 or less than -0.7 were strongly related and would therefore introduce multicollinearity into the model.
Predictive Analytics
The simple linear regression results showed that the stock market index is positively related to the real estate index, implying that an increase in stock market prices is associated with an increase in real estate (house) prices. Stock markets were significant in explaining or predicting the real estate markets. The simple linear model was moderately good because the stock market index explained about half (51.15%) of the variability in the real estate index. The results of our multiple linear regression were not reliable because of multicollinearity. In the multiple linear regression model, the stock market index was negatively related to the real estate index and was not significant in predicting or explaining the real estate markets. The multiple linear regression results also indicated that credit markets were positively related to the real estate markets but were not significant in explaining them. The wealth markets were also positively related to real estate markets, and they were significant in explaining them. The multiple linear regression showed a strong improvement in fit: the R-Squared value rose to 86.89%, an indication that adding the credit and wealth indices had a significant impact on the overall model. To determine the true impact of credit and wealth markets on real estate markets, the research performed a simple linear regression for each. The outputs showed that both the credit index and the wealth index are significant in predicting or explaining the real estate markets. The results also indicated that credit markets are negatively related to real estate markets, while wealth markets are positively related to real estate markets. The MAPE value from the diagnostic check indicated that our simple linear regression model was accurate and reliable in its predictions.
Therefore, from this research, the stock market index has a positive and significant relationship with the real estate index in the UK. The real estate markets are also strongly affected by wealth markets (consumer spending) and by credit markets (bank interest rates on loans and borrowings).
- I recommend that policymakers, investors, firms, customers and other relevant stakeholders consider using the stock market index to predict the real estate index, because stock markets are positively and significantly related to real estate markets. I also recommend the simple linear regression model of this study, because the MAPE diagnostics indicated that the model is accurate and reliable.
- The real estate index is volatile, which will discourage real estate investors and property owners because they cannot be certain about future real estate markets. I therefore recommend that research be conducted to determine the factors affecting real estate markets and the measures that can be put in place to deal with this situation.
- The multiple linear regression results and model should not be used because they are misleading: the high correlation among stock markets, credit markets and wealth markets introduces multicollinearity into the model. The simple linear regression models should be used instead.
- Further studies are needed to determine why the stock markets, credit markets and wealth markets (consumer spending) are highly correlated.
I enjoyed the entire research experience because it was interactive and, more importantly, because the study sought to solve a real-world problem in the real estate industry. Achieving the objective was not easy, as various challenges arose. The first challenge concerned internal validity: understanding and finding the data sets was not an easy task. The variables of the research were the UK real estate index, UK stock market index, UK credit index and UK wealth index. Data sets for these variables are not directly available, so one has to understand what each variable means and which data set truly represents it. For instance, in this study the UK real estate index simply means the UK house price index, so I searched for the house price index and easily found it on the FRED website of the Federal Reserve Bank of St. Louis. The same applied to the stock market index, which simply meant UK share prices. The wealth index meant consumer spending in the UK, which is effectively represented by the Consumer Price Index of all goods in the UK. Lastly, the credit index was interpreted as bank loan rates. Once the variables were clearly understood, it became easy to carry out the research and collect the data sets. Identifying the correct data set is a significant part of the research that requires much care and focus.
Another problem concerned the data collection sources. Most of the data sets were not available on Yahoo Finance, Macrotrends or Kaggle, even though these are among the best-known websites providing free economic and financial data for student research. Other websites required registration and payment first, with no assurance that the data sets of interest would be found. However, after careful navigation of various websites, I was able to find all of the data sets on the FRED website of the Federal Reserve Bank of St. Louis. The website contains historical economic and financial data for various countries. Interpreting the variables and overcoming the challenge of obtaining the data sets improved my research skills, thinking capacity and problem-solving skills.
Another internal validity problem was the data set itself. One data set was yearly, another was quarterly and two were monthly. The frequency of all the data sets had to be uniform (yearly), so the raw data could not be used as it was. After thorough research, I found that the pivot function in Excel was appropriate for converting the quarterly and monthly data sets into annual data sets. This challenge helped me improve my data analytics and data pre-processing skills. I also learnt that before performing any data analytics, the data set must be clean and well prepared in order to attain accurate and reliable results.
Another challenge concerned the external validity of the research. The outputs of the simple and multiple linear regressions differed. In the simple linear regression, the stock market index was positively and significantly related to the real estate index, but in the multiple regression the relationship was negative and insignificant. This made it difficult to apply the results to real-world scenarios and to choose the best model. It would pose a challenge to investors or anyone interested in using the results of the research, because they would not be sure which model to use. The choice of the best and most reliable model was resolved by considering multicollinearity. Because the multiple regression model used the stock market index, credit index and wealth index as independent variables, and these variables were strongly correlated, the best model to use was the simple linear model, which had no multicollinearity effects. This challenge again improved my problem-solving and decision-making skills.
On completing the research, I learnt how to present a business report in such a way that even a non-statistical reader can understand what the research was about. I also learnt how to report statistical insights from data analytics such as descriptive summary analysis, exploratory analysis, correlation analysis and regression analysis. I also learnt the trends of the real estate index, stock market index, credit index and wealth index and the relationships between these variables. Overall, the challenges, the data analysis and the interpretation of the research results improved my intellectual skills and strengthened my problem-solving skills.