Task 2.1: Conducting an exploratory data analysis
The fundamental assumptions of an organizational culture develop through teamwork and the internal integration of people, and they spread as new team members absorb the ethos of the institution. Two principal perspectives define organizational culture: the interpretive view, which treats leadership quality as a metaphor for the culture, and the functionalist view, which emphasizes the collegial stability of the institution.
Culture can be described as a collection of values and beliefs that familiarize members with the richness and flexibility of the organization. A coherent analysis of organizational culture covers four domains: group culture, developmental culture, rational culture, and hierarchical culture (Gregory et al., 2009). Hierarchical culture, combined with experienced leadership, exerts an effective psychological influence within the wider cultural system and moves the office culture toward corporate alignment.
Almost every industry is data driven, and decision-making today depends on assumptions supported by exploratory and confirmatory analysis of the available information. The principled extraction of knowledge on which a decision is based depends directly on the availability and scrutiny of data (Provost and Fawcett, 2013). The principles of data science rely on precise information about the quality of the entity under analysis. Categorizing the constraints associated with the data is important for understanding the properties of the attributes and for obtaining error-free results. A competitive advantage gained through data-driven decision making (DDDM), supported by a well-defined organizational culture, shapes the outlook of employees. Institutional assessment is informed by both qualitative and quantitative analysis of the acquired data, which leads to sound decisions for business profitability.
Task 1.2
ANS: Organizational decisions that rely on data-based analytics facilitate overall growth, and numerous factors encourage management to adopt DDDM. Persistent and constructive planning, together with a supportive organizational culture, motivates the organization to process and analyse information as part of its regular business procedures (Robbins and Judge, 2014). This vision and approach support the sustainability of the organization. A detailed understanding of organizational culture is essential for advancing information technology through innovative data analysis. Culture also governs administrative procedures that have a specific impact on technological innovation. The culture of an institution or organization is a multidimensional variable that contributes to a wide body of knowledge, and it establishes the link between IT and the development of the organization.
Among the several interpretations of culture, a shared awareness of the continuous development of the organization creates enthusiasm for implementing DDDM. Without a proper culture, members take surface-level conditions for granted and neglect even visible threats. The difficulty of using large volumes of regulatory data is one of the most basic challenges of data-driven management, and it is reflected in the decisions of the leadership. Staff make use of costly and not easily accessible electronic data to safeguard the wellbeing of everyone in the organization. People in the organization voice their views on what appears achievable, what is likely to happen, and how well something will work, and plans are then made accordingly. For these critical decisions in particular, the community meets the high expectations of people in the organization, as its ability and track record show (Iivari and Iivari, 2011).
Task 2.2: Building a Linear Regression model to predict house values
The core of the organization spreads values and beliefs about what should be accomplished. Extracting insight from business data is an essential aspect of decision-making, yet few organizations have been able to seize this advantage. Most of the struggling organizations not only lack the technical skills required to use the data but have also failed to build an organizational culture that contributes to growth. Regardless of the number of data scientists in the organization, effective use of data requires an entrepreneurial corporate culture that governs how individuals access and work with it. Such a culture instills the belief that every business decision should rest on data. Employees can handle data openly, data analysis is part of everyone's responsibilities, and access to the data is available to every worker. Unfortunately, analysis is routinely used to support the current status and everyday decisions of the organization rather than to steer it toward progress. The problem of implementing an analytics culture therefore lies less in the technical domain of an organization than in its human-resource domain, especially among the people who work for it. Engagement with a new approach to collaboration should be treated as a top priority for employees (Bradley, Pridmore, and Byrd, 2006).
An Exploratory and Regression Approach for
- The Exploratory Data Analysis (EDA) was carried out on the house price data using the RapidMiner data mining software. The Retrieve operator was used first, and the Select Attributes operator was then connected to it to choose the attributes for the EDA process. The Retrieve operator loaded the data file and made it available for exploration, while Select Attributes produced the user-defined output in RapidMiner.
Final EDA process of House Prices
Table 2.1: Exploratory Data Analysis of the Attributes
| Statistic | Latitude | Longitude | Total Rooms | Total Bedrooms | Population | Households | Median Income | Median House Value | Housing Median Age |
|---|---|---|---|---|---|---|---|---|---|
| Mean | 35.63 | -119.57 | 2635.76 | 537.87 | 1425.48 | 499.54 | 3.87 | 206855.82 | 28.64 |
| Median | 34.26 | -118.49 | 2127.00 | 435.00 | 1166.00 | 409.00 | 3.53 | 179700.00 | 29.00 |
| Mode | 34.06 | -118.31 | 1527.00 | 280.00 | 891.00 | 306.00 | 3.13 | 500001.00 | 52.00 |
| Standard Deviation | 2.14 | 2.00 | 2181.62 | 421.39 | 1132.46 | 382.33 | 1.90 | 115395.62 | 12.59 |
| Kurtosis | -1.12 | -1.33 | 32.63 | 21.99 | 73.55 | 22.06 | 4.95 | 0.33 | -0.80 |
| Skewness | 0.47 | -0.30 | 4.15 | 3.46 | 4.94 | 3.41 | 1.65 | 0.98 | 0.06 |
| Range | 9.41 | 10.04 | 39318.00 | 6444.00 | 35679.00 | 6081.00 | 14.50 | 485002.00 | 51.00 |
| Minimum | 32.54 | -124 | 2 | 1.00 | 3.00 | 1.00 | 0 | 14999 | 1 |
| Maximum | 41.95 | -114.31 | 39320.00 | 6445.00 | 35682.00 | 6082.00 | 15.00 | 500001.00 | 52.00 |
| Missing | 0 | 0 | 0 | 207 | 0 | 0 | 0 | 0 | 0 |
The exploratory analysis excluded the nominal attribute Ocean Proximity; the Select Attributes filter was used for this purpose.
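The statistics reported in Table 2.1 can be reproduced outside RapidMiner. The following is a minimal sketch using only the Python standard library; the `describe` helper and the sample values are illustrative assumptions, not the actual data or the procedure used in this report. Skewness and kurtosis are computed here from population moments, so the figures may differ slightly from tools that apply a sample-size correction.

```python
import statistics


def describe(values):
    """Summarize one numeric attribute; None marks a missing value."""
    present = [v for v in values if v is not None]
    n = len(present)
    mean = statistics.fmean(present)
    # Population central moments, used for the two shape measures.
    m2 = sum((v - mean) ** 2 for v in present) / n
    m3 = sum((v - mean) ** 3 for v in present) / n
    m4 = sum((v - mean) ** 4 for v in present) / n
    return {
        "mean": mean,
        "median": statistics.median(present),
        "mode": statistics.mode(present),
        "std_dev": statistics.stdev(present),  # sample standard deviation
        "kurtosis": m4 / m2 ** 2 - 3.0,        # excess kurtosis (normal = 0)
        "skewness": m3 / m2 ** 1.5,
        "range": max(present) - min(present),
        "minimum": min(present),
        "maximum": max(present),
        "missing": len(values) - n,
    }


# Made-up values standing in for the Total Bedrooms attribute.
total_bedrooms = [280.0, 435.0, None, 280.0, 1100.0]
summary = describe(total_bedrooms)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A positive skewness together with a mean above the median, as in this small sample, is the same right-skew pattern that Table 2.1 shows for Total Rooms, Population, and Median House Value.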
House prices in North California were assessed against ocean proximity and other related attributes, especially the income of residents and the age of the properties. The exploratory investigation revealed that the distribution of house prices was skewed to the right: the mean of the median house prices (206,855.82) was greater than their median (179,700.00). The mode of the house prices was an outlier value; this exceptionally high price (mode = $500,001) occurred on the islands near North California and indicated a price premium for exclusive homes at exclusive locations. The mean income being greater than the median income reflected the skewness of the income distribution, in which a major segment of the population earned less than the average income. The average age of the properties was 28.64 years with a bell-shaped distribution, although some outliers were visible as well (Figure 14).
To assess the impact of the attributes on house prices, an exploratory graphical investigation was conducted together with a confirmatory correlation test. The average population for each level of ocean proximity is shown in Figure 2, where the higher population density of inland areas is visible. Ocean proximity was a significant price predictor, as Figure 3 revealed that real estate prices were high on the islands and along the coastline. Holly, Pesaran, and Yamagata (2010) established the positive impact of individual income on house prices in Southern California, and an analogous association was observed in Figure 4. The confirmatory inspection used correlation coefficients between the attributes and house prices, and the results are provided in Table 2.1.1. Five significant predictors of house prices were identified: ocean proximity, median income, house age, total rooms, and number of households (McMillen and Redfearn, 2010). The nominal levels of ocean proximity were cross-tabulated with a Chi-Square goodness-of-fit test, and the weights of the attributes are presented in Table 2.1.2.
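The confirmatory step described above rests on correlation coefficients between each attribute and the house price. As a hedged illustration, the sketch below computes a Pearson correlation coefficient in plain Python. The report does not state which coefficient RapidMiner produced, so Pearson is an assumption, and the income and price values are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only (not taken from the data set).
median_income = [1.5, 2.8, 3.5, 4.9, 8.3]
median_house_value = [82000.0, 131000.0, 179000.0, 262000.0, 452000.0]

r = pearson_r(median_income, median_house_value)
print(f"r = {r:.3f}")
```

A coefficient close to +1 indicates that the two attributes rise together, which is the kind of relationship the text reports between income and house prices.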
Understanding Organizational Culture
The RapidMiner linear regression model used five predictor attributes for the median real estate prices. The regression process consisted of six operators. The Replace Missing Values filter was applied, and ocean proximity was converted from a nominal to a numerical scale (Figure 5). The Set Role operator was used to select Median House Price as the attribute to be estimated.
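The two preprocessing steps mentioned here, replacing missing values and converting ocean proximity to a numerical scale, can be sketched as follows. This is an illustration under stated assumptions: the report does not say which replacement rule or which numeric codes were used, so mean imputation and the `OCEAN_PROXIMITY_CODES` mapping (including the level names) are hypothetical.

```python
import statistics

# Hypothetical level names and integer codes; the report does not list
# the mapping that was actually applied in RapidMiner.
OCEAN_PROXIMITY_CODES = {
    "<1H OCEAN": 0,
    "INLAND": 1,
    "NEAR OCEAN": 2,
    "NEAR BAY": 3,
    "ISLAND": 4,
}


def impute_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    present = [v for v in values if v is not None]
    fill = statistics.fmean(present)
    return [fill if v is None else v for v in values]


def encode_nominal(levels, codes):
    """Map each nominal level to its numeric code."""
    return [codes[level] for level in levels]


# Made-up rows for illustration.
total_bedrooms = impute_with_mean([2.0, None, 4.0])
proximity = encode_nominal(["INLAND", "ISLAND", "NEAR BAY"], OCEAN_PROXIMITY_CODES)
print(total_bedrooms, proximity)
```

Coding a nominal attribute as a single integer column makes the regression treat the levels as ordered and equally spaced, so the sign and size of the resulting coefficient depend on the codes chosen.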
Table 2.2: Final Linear Regression Model Results for Task 2.2
| Attribute | Coefficient | Std. Error | t-stat | p-Value |
|---|---|---|---|---|
| Ocean Proximity | -8307.02 | 666.78 | -12.46 | 0.00 |
| Housing Median Age | 1752.40 | 47.68 | 36.75 | 0.00 |
| Total Rooms | -16.99 | 0.73 | -23.17 | 0.00 |
| Households | 123.07 | 4.05 | 30.39 | 0.00 |
| Median Income | 46233.91 | 331.33 | 139.54 | 0.00 |
| Intercept | -26823.72 | 2718.19 | -9.87 | 0.00 |
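- The linear equation of the regression model, using the coefficients in Table 2.2, was constructed as:
Median House Value = -26823.72 - 8307.02 (Ocean Proximity) + 1752.40 (Housing Median Age) - 16.99 (Total Rooms) + 123.07 (Households) + 46233.91 (Median Income)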
Selim (2009) interpreted a hedonic regression model for house prices in Turkey, in which location was a significant factor. Ocean Proximity in the current study was also a significant factor (t = -12.46, p < 0.05). Real estate on the islands and along the coastline was more expensive, probably because of the unpolluted environment. Fuerst and McAllister (2009) provided evidence that the quality and strength of a building, as determined by its age, affect the price of the property. In the model, Median Age was a significant positive factor (t = 36.75, p < 0.05) for real estate prices. In accordance with Gan and Hill (2009), Median Income was the most significant factor (t = 139.54, p < 0.05) in determining home prices. The number of rooms is an essential feature of a house, but expensive properties on the islands and beaches were designed with fewer rooms and substantial open space, so Total Rooms was a significant but negative factor (t = -23.17, p < 0.05). The attribute Households was an important estimator (t = 30.39, p < 0.05), and the relation was in line with previous work by Piazzesi and Schneider (2009). The intercept was significant (t = -9.87, p < 0.05) but negative, which pointed to a fall in home prices in the absence of the five predictors. House prices in exclusive locations such as island and beach areas are consistently high, and in the current work the strong effect of a scenic view was evident from the regression model of price estimation (Le Goix and Vesselinov, 2013).
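The t-statistics quoted above are each coefficient divided by its standard error, and the fitted model is a weighted sum of the five predictors plus the intercept. The sketch below recomputes both from the values in Table 2.2. The `predict` helper and its input are illustrative assumptions; in particular, the numeric coding of Ocean Proximity is not given in the report, so no realistic prediction is attempted.

```python
# Coefficient and standard error for each term, copied from Table 2.2.
MODEL = {
    "Ocean Proximity": (-8307.02, 666.78),
    "Housing Median Age": (1752.40, 47.68),
    "Total Rooms": (-16.99, 0.73),
    "Households": (123.07, 4.05),
    "Median Income": (46233.91, 331.33),
}
INTERCEPT = (-26823.72, 2718.19)

# t-statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in MODEL.items()}
t_stats["Intercept"] = INTERCEPT[0] / INTERCEPT[1]


def predict(features):
    """Evaluate the linear equation; attributes not supplied count as zero."""
    return INTERCEPT[0] + sum(
        MODEL[name][0] * value for name, value in features.items()
    )


for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")

# Hypothetical input: one unit of median income, all other attributes zero.
example = predict({"Median Income": 1.0})
```

The recomputed value for Total Rooms comes out near -23.27 rather than the tabulated -23.17, which is consistent with the standard error having been rounded to two decimals in Table 2.2.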
(1) Text table and graph views were created in the Tableau software environment. Two text tables were created to examine the average and count of the attributes for the different levels of ocean proximity. The house price data set was connected to the application, and the nominal attribute Ocean Proximity was placed against all other factors. The average and count measures of the attributes were then studied for each level of ocean proximity.
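The text tables described above amount to a group-by: one average and one count per level of ocean proximity. A minimal standard-library sketch of that aggregation is given below. The `average_and_count` helper, the field names, and the rows are made up for illustration and do not come from the data set.

```python
from collections import defaultdict


def average_and_count(rows, group_key, measure):
    """Return {level: (average, count)} for one measure, skipping missing values."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        value = row[measure]
        if value is None:
            continue
        totals[row[group_key]] += value
        counts[row[group_key]] += 1
    return {level: (totals[level] / counts[level], counts[level]) for level in counts}


# Made-up rows for illustration only.
rows = [
    {"ocean_proximity": "INLAND", "median_house_value": 90000.0},
    {"ocean_proximity": "INLAND", "median_house_value": 110000.0},
    {"ocean_proximity": "ISLAND", "median_house_value": 400000.0},
    {"ocean_proximity": "NEAR BAY", "median_house_value": None},
]

table = average_and_count(rows, "ocean_proximity", "median_house_value")
for level, (avg, count) in table.items():
    print(f"{level}: average = {avg:.2f}, count = {count}")
```

Reporting the count next to the average matters for the interpretation: a level with very few properties, such as the islands, can show a high average that rests on only a handful of records.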
The text tables indicated why house prices were elevated near the coastline and on the islands: the small number of exclusive properties was identified as the primary reason. Prices were lowest in inland localities because of their ordinary features and greater availability. Although houses on the islands and near the bay were old, average prices were higher in those localities, whereas relatively new houses in inland areas were cheaper. A smaller population and a serene ambiance were probably the reasons for the high cost of living. The number of houses and the average population were higher in inland areas, and these were the two primary reasons for affordable living. Total rooms and bedrooms were comparatively higher in inland areas and areas distant from the bay, which is a familiar feature of house architecture.
Geo-Maps were created in Tableau version 10.5 to examine the variation of house prices in North California, and in particular their relation to ocean proximity. Latitude and longitude were converted to dimensions with a geographic role. In the absence of detailed addresses, longitude and latitude were placed on the columns and rows of the Tableau geo map feature. Tableau identified the location, and the coastline of the State of California, USA was visible.
Two Geo-Maps were constructed to analyze the effect of ocean proximity and of income level on house prices, respectively. The correlation of the attributes was evident from the geographical representation of the plotted details, as can be seen in Figure 10 and Figure 11. High prices were observed in the bay areas and on the islands, followed by areas in the vicinity of the ocean. The distribution of income and its relation to high purchasing power was also apparent from the Geo-Map.
References
Bradley, R.V., Pridmore, J.L. and Byrd, T.A., 2006. Information systems success in the context of different corporate cultural types: an empirical investigation. Journal of Management Information Systems, 23(2), pp.267-294.
Fuerst, F. and McAllister, P., 2009. An investigation of the effect of eco-labeling on office occupancy rates. Journal of Sustainable Real Estate, 1(1), pp.49-64.
Gan, Q. and Hill, R.J., 2009. Measuring housing affordability: Looking beyond the median. Journal of Housing Economics, 18(2), pp.115-125.
Gregory, B.T., Harris, S.G., Armenakis, A.A. and Shook, C.L., 2009. Organizational culture and effectiveness: A study of values, attitudes, and organizational outcomes. Journal of Business Research, 62(7), pp.673-679.
Holly, S., Pesaran, M.H. and Yamagata, T., 2010. A spatio-temporal model of house prices in the USA. Journal of Econometrics, 158(1), pp.160-173.
Iivari, J. and Iivari, N., 2011. The relationship between organizational culture and the deployment of agile methods. Information and Software Technology, 53(5), pp.509-520.
Le Goix, R. and Vesselinov, E., 2013. Gated communities and house prices: Suburban change in southern California, 1980–2008. International Journal of Urban and Regional Research, 37(6), pp.2129-2151.
Leidner, D.E. and Kayworth, T., 2006. A review of culture in information systems research: Toward a theory of information technology culture conflict. MIS Quarterly, 30(2), pp.357-399.
McMillen, D.P. and Redfearn, C.L., 2010. Estimation and hypothesis testing for nonparametric hedonic house price functions. Journal of Regional Science, 50(3), pp.712-733.
Piazzesi, M. and Schneider, M., 2009. Momentum traders in the housing market: survey evidence and a search model. American Economic Review, 99(2), pp.406-411.
Provost, F. and Fawcett, T., 2013. Data science and its relationship to big data and data-driven decision making. Big Data, 1(1), pp.51-59.
Robbins, S.P. and Judge, T., 2014. Essentials of organizational behavior. Pearson.
Selim, H., 2009. Determinants of house prices in Turkey: Hedonic regression versus artificial neural network. Expert Systems with Applications, 36(2), pp.2843-2852.