Part I
Task 1 was completed in Excel. The Excel file is attached together with this Word document.
Task 2
Use Excel to produce a Frequency Column Chart and a Relative Frequency Pie-Chart for your sample to show the number and proportion, respectively, of each building type. Use these graphical summaries to answer the following questions:
- How many properties in your sample consist of brick buildings?
32 properties consist of brick buildings.
- Which building type occurs most frequently in your sample?
Brick veneer occurs most frequently.
- What proportion of properties in your sample consists of weatherboard buildings?
38% of the properties consist of weatherboard buildings.
Use Excel to sort your sample “Sold Price” data and paste into your MS Word assignment document.
Answer
Sorted sold prices ($000s), in ascending order down each column
Sold Price | | |
112 | 387.5 | 590 | 805
132 | 400 | 615 | 831
245 | 410 | 621 | 845
265 | 424 | 627.5 | 865
300 | 440 | 670 | 880
305 | 441 | 675 | 882
319 | 445 | 711 | 900
348 | 501 | 718 | 1424
350 | 505 | 736 | 1800
351 | 552 | 745 | 2650
361 | 555 | 755 |
363 | 570 | 800 |
Use the percentile location formula, and the three associated rules (Slide 11 of Week 2 Seminar, Session 1), to determine:
- The 70th percentile (1 mark)
Answer
Therefore the 70th percentile is located at rank 33, which is 736.00.
- The first and third quartiles.
Answer
Therefore the 25th percentile (first quartile) is located at rank 12, which is 363.00.
Therefore the 75th percentile (third quartile) is located at rank 35, which is 755.00.
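The ranks quoted above can be checked with a short Python sketch. The seminar's location formula and its three rules are not reproduced in this document, so the sketch assumes the common form L = (n + 1) × P / 100, with a whole-number location used directly, a halfway location averaged, and any other location rounded to the nearest rank. These assumptions reproduce the ranks reported here, but they may not be the exact rules on the slide.

```python
# Sorted "Sold Price" sample in $000s (46 values from the table above).
prices = [
    112, 132, 245, 265, 300, 305, 319, 348, 350, 351, 361, 363,
    387.5, 400, 410, 424, 440, 441, 445, 501, 505, 552, 555, 570,
    590, 615, 621, 627.5, 670, 675, 711, 718, 736, 745, 755, 800,
    805, 831, 845, 865, 880, 882, 900, 1424, 1800, 2650,
]


def percentile_location(n, p):
    """ASSUMED location formula: L = (n + 1) * p / 100."""
    return (n + 1) * p / 100


def percentile_value(sorted_data, p):
    """Look up the p-th percentile using the ASSUMED three rules."""
    loc = percentile_location(len(sorted_data), p)
    whole = int(loc)
    frac = loc - whole
    if frac == 0:
        # Whole-number location: take the value at that rank.
        return sorted_data[whole - 1]
    if frac == 0.5:
        # Halfway location: average the two neighbouring values.
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    # Otherwise round the location to the nearest rank.
    return sorted_data[round(loc) - 1]


p70 = percentile_value(prices, 70)   # location 32.9 -> rank 33
q1 = percentile_value(prices, 25)    # location 11.75 -> rank 12
q3 = percentile_value(prices, 75)    # location 35.25 -> rank 35
print(p70, q1, q3)
```

Under these assumptions the locations are 32.9, 11.75 and 35.25, which round to ranks 33, 12 and 35.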
- Briefly explain what the 70th percentile that you have determined informs you about your sample “Sold Price” data.
Answer
The 70th percentile means that 70% of the properties in the sample were sold at $736,000 or below, and that 30% of the properties were sold above $736,000.
- Determine the Inter-Quartile Range of your sample “Sold Price” data and provide a brief explanation of what information this statistic provides about your sample data.
Answer
Thus the inter-quartile range is Q3 – Q1 = 755 – 363 = 392, i.e. $392,000. This value is the difference between the 75th percentile and the 25th percentile, so the middle 50% of the sold prices span a range of $392,000.
- Use Excel to produce a Descriptive Statistics table for your sample “Sold Price” data and paste into your MS Word assignment document.
Statistic | Sold Price
Mean | 635.3696
Standard Error | 63.63855
Median | 562.5
Mode | #N/A
Standard Deviation | 431.6176
Sample Variance | 186293.8
Kurtosis | 10.67923
Skewness | 2.792241
Range | 2538
Minimum | 112
Maximum | 2650
Sum | 29227
Count | 46
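As a cross-check on the Excel output, the main entries of the table can be recomputed with Python's standard `statistics` module. This is only a sketch: it assumes the 46 sorted prices listed earlier are the full sample, and it covers the count, sum, mean, median, range, sample standard deviation and standard error rather than every row of the table.

```python
import statistics

# "Sold Price" sample in $000s (46 values, as listed in the sorted table).
prices = [
    112, 132, 245, 265, 300, 305, 319, 348, 350, 351, 361, 363,
    387.5, 400, 410, 424, 440, 441, 445, 501, 505, 552, 555, 570,
    590, 615, 621, 627.5, 670, 675, 711, 718, 736, 745, 755, 800,
    805, 831, 845, 865, 880, 882, 900, 1424, 1800, 2650,
]

n = len(prices)
total = sum(prices)
mean = statistics.mean(prices)
median = statistics.median(prices)
value_range = max(prices) - min(prices)
sd = statistics.stdev(prices)      # sample standard deviation (n - 1 divisor)
se = sd / n ** 0.5                 # standard error of the mean

print(f"Count {n}, Sum {total}, Mean {mean:.4f}, Median {median}")
print(f"Range {value_range}, SD {sd:.4f}, SE {se:.5f}")
```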
Use results from Task 3 to determine manually for this data the upper and lower inner fence limits:
IFUL = Q3 + 1.5 x IQR and IFLL = Q1 – 1.5 x IQR
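The fence limits can be worked through numerically as in the sketch below. It assumes the standard upper fence IFUL = Q3 + 1.5 × IQR alongside the IFLL formula, and it takes Q1 = 363 and Q3 = 755 from the quartile answers given earlier. The variable names are illustrative only.

```python
# Quartiles of the "Sold Price" sample in $000s, from the earlier answers.
q1 = 363
q3 = 755
iqr = q3 - q1                      # 755 - 363 = 392

# Inner fence limits (upper fence formula is the standard one, assumed here).
iful = q3 + 1.5 * iqr              # 755 + 588 = 1343
ifll = q1 - 1.5 * iqr              # 363 - 588 = -225

# Sorted sample, used to flag values outside the inner fences.
prices = [
    112, 132, 245, 265, 300, 305, 319, 348, 350, 351, 361, 363,
    387.5, 400, 410, 424, 440, 441, 445, 501, 505, 552, 555, 570,
    590, 615, 621, 627.5, 670, 675, 711, 718, 736, 745, 755, 800,
    805, 831, 845, 865, 880, 882, 900, 1424, 1800, 2650,
]
outliers = [x for x in prices if x < ifll or x > iful]

print(f"IQR = {iqr}, IFLL = {ifll}, IFUL = {iful}")
print("Values outside the inner fences:", outliers)
```

Because the lower fence is negative, no sold price can fall below it; only the three largest prices lie outside the fences.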
- Based on the limits calculated in (b), choose from the numerical summary measures provided in the Descriptive Statistics table, and/or measures calculated previously in Task 3:
(i) an appropriate measure of central tendency, and,
The median is the appropriate measure of central tendency.
(ii) an appropriate measure of dispersion for your sample “Sold Price” data.
Answer
The inter-quartile range is the most appropriate measure of dispersion.
Provide a brief explanation of the reasoning behind your choice in both cases.
Answer
The two measures were chosen because the dataset contains outliers (1424, 1800 and 2650 all exceed the upper inner fence of Q3 + 1.5 x IQR = 755 + 1.5 x 392 = 1343), and the median and inter-quartile range are the measures least affected by outliers.
Remember to show all working! Failure to do so will result in the loss of marks.
- From the Descriptive Statistics table obtained in Task 5, identify three pieces of evidence that indicate whether your sample “Sold Price” data has been obtained from a normally distributed population or not. What is your conclusion? Note: Make sure only one piece of evidence relates to the shape of the sample data.
Answer
The three pieces of evidence that indicate whether the dataset is normally distributed or not are:
- Skewness
- Kurtosis
- Standard deviation
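The skewness value is 2.792241. This value is greater than 1, implying that the distribution is highly positively skewed, whereas a normal distribution is symmetric with skewness close to 0. The kurtosis value is 10.67923. Excel reports excess kurtosis, which is approximately 0 for a normal distribution, so this large positive value implies a leptokurtic distribution. Taken together, this evidence suggests that the sample was not obtained from a normally distributed population.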
- Regardless of your conclusion in (a) above, assume the “Sold Price” population data is normally distributed. Applying the Standard Normal tables, calculate how many “Sold Price” observations in your sample you would expect to lie within 1.5 standard deviations of the mean (i.e. between z = –1.5 and z = +1.5).
Answer
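Probability (z < -1.5) = 0.0668
Probability (z < 1.5) = 0.9332
So the proportion of observations that lie within 1.5 SD is:
0.9332 - 0.0668 = 0.8664 = 86.64%
With 46 observations in the sample, about 0.8664 x 46 ≈ 40 observations would be expected to lie within 1.5 standard deviations of the mean.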
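The table lookup can be checked without Standard Normal tables by evaluating the standard normal CDF through the error function in Python's `math` module. This is a sketch for verification only; the helper name `phi` is illustrative, and the sample size of 46 is taken from the Descriptive Statistics table.

```python
import math


def phi(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


n = 46                                   # sample size from the table
p_within = phi(1.5) - phi(-1.5)          # P(-1.5 < Z < 1.5)
expected_count = p_within * n

print(f"P(Z < -1.5) = {phi(-1.5):.4f}")
print(f"P(Z <  1.5) = {phi(1.5):.4f}")
print(f"Proportion within 1.5 SD = {p_within:.4f}")
print(f"Expected observations = {expected_count:.2f}")
```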
- Use the mean and standard deviation from the Descriptive Statistics table of Task 5 to calculate the bounds for a spread of 1.5 standard deviations from the mean. Using the “Sold Price” sample data, manually count the number of observations that fall within the bounds. State whether this count matches, approximately, your answer to (b) and hence whether this result confirms (or not) your conclusion in (a).
Answer
We need to count the number of observations between 635.3696 – 1.5 x 431.6176 = -12.0568 and 635.3696 + 1.5 x 431.6176 = 1282.796. There are 43 observations in this range, representing 93.48% of the sample. This does not exactly match the 86.64% expected in (b), although it is fairly close.
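The manual count can be reproduced with the sketch below. It uses the mean and standard deviation exactly as reported in the Descriptive Statistics table and assumes the 46 sorted prices listed earlier are the full sample.

```python
# Summary statistics in $000s, copied from the Descriptive Statistics table.
mean = 635.3696
sd = 431.6176

# Bounds for a spread of 1.5 standard deviations around the mean.
lower = mean - 1.5 * sd
upper = mean + 1.5 * sd

prices = [
    112, 132, 245, 265, 300, 305, 319, 348, 350, 351, 361, 363,
    387.5, 400, 410, 424, 440, 441, 445, 501, 505, 552, 555, 570,
    590, 615, 621, 627.5, 670, 675, 711, 718, 736, 745, 755, 800,
    805, 831, 845, 865, 880, 882, 900, 1424, 1800, 2650,
]

# Count the observations that fall inside the bounds.
count_within = sum(1 for x in prices if lower <= x <= upper)
share_within = count_within / len(prices)

print(f"Bounds: {lower:.4f} to {upper:.3f}")
print(f"{count_within} of {len(prices)} observations ({share_within:.2%})")
```

Only the three largest prices (1424, 1800 and 2650) fall outside the upper bound, which is why the count is 43.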
- Use Excel to produce a Descriptive Statistics table for the “Sold Price” variable in your sample suitable for constructing an interval estimate of the population mean “Sold Price”. Hence determine:
Answer
Statistic | Sold Price
Mean | 635.3696
Standard Error | 63.63855
Median | 562.5
Mode | #N/A
Standard Deviation | 431.6176
Sample Variance | 186293.8
Kurtosis | 10.67923
Skewness | 2.792241
Range | 2538
Minimum | 112
Maximum | 2650
Sum | 29227
Count | 46
A point estimate of the mean “Sold Price” of the population of properties.
Answer
The point estimate of the mean is 635.3696 ($000s), i.e. approximately $635,370.
- A 90% confidence interval estimate of the mean “Sold Price” of the population of properties.
Answer
Upper limit: 635.3696 + 1.645 x 63.63855 = 740.0550
Lower limit: 635.3696 – 1.645 x 63.63855 = 530.6842
Thus the 90% confidence interval is [530.6842, 740.0550] ($000s).
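The limits can be reproduced from the table values with the sketch below. It assumes a critical z value of 1.645 for 90% confidence, which is the value that matches the reported interval. A t critical value with 45 degrees of freedom would give a slightly wider interval.

```python
# Values in $000s, copied from the Descriptive Statistics table.
mean = 635.3696
standard_error = 63.63855

# ASSUMED critical value: two-sided 90% confidence using the z distribution.
z_critical = 1.645

margin = z_critical * standard_error
lower_limit = mean - margin
upper_limit = mean + margin

print(f"Margin of error = {margin:.4f}")
print(f"90% CI = [{lower_limit:.4f}, {upper_limit:.4f}]")
```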
- Make a brief verbal statement explaining the meaning of the confidence interval estimate obtained in (ii) in the context of the variable in this task.
Answer
We are 90% confident that the true population mean sold price of the properties is between $530,684.20 and $740,055.00.
- If the population mean “Sold Price” is actually 650 ($000s), would you consider the interval estimate obtained in (a), to be satisfactory? Explain why or why not.
Answer
I would consider the interval estimate obtained in (a) to be satisfactory, since the population mean of 650 ($000s) lies inside the 90% confidence interval.
- Use Excel to produce a Descriptive Statistics table for the brick veneer properties in your sample suitable for constructing an interval estimate of the population proportion of brick veneer properties. Hence determine:
Answer
Statistic | Brick veneer
Mean | 0.38
Standard Error | 0.069341
Median | 0
Mode | 0
Standard Deviation | 0.490314
Sample Variance | 0.240408
Kurtosis | -1.81429
Skewness | 0.509877
Range | 1
Minimum | 0
Maximum | 1
Sum | 19
Count | 50
Confidence Level(99.0%) | 0.18583
A point estimate of the proportion of brick veneer properties in the population.
Answer
The point estimate of the proportion of brick veneer properties in the population is 38% (0.38)
- A 99% confidence interval estimate of the proportion of brick veneer properties in the population.
Answer
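The 99% confidence interval estimate of the proportion of brick veneer properties in the population is 0.38 ± 0.18583, i.e. [0.1942, 0.5658], where 0.18583 is the Confidence Level(99.0%) value from the Descriptive Statistics table.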
Using the following formula:
(sample statistic) ± (critical z or t) x (standard error of the sample statistic)
Use the rule of thumb for good normal approximation (Slide 3 of Week 7 Session 2) for proportion, then the Empirical Rule (Slide 8 of Week 5 Session 1) for a Normal distribution to determine a 95% confidence interval estimate of the proportion of brick veneer properties in the population.
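A possible working for this part is sketched below. The slides are not reproduced in this document, so two assumptions are made: the rule of thumb is taken to be n × p ≥ 5 and n × (1 − p) ≥ 5, and the Empirical Rule is taken to mean that about 95% of a Normal distribution lies within 2 standard errors of the centre. The 99% margin is copied from the Descriptive Statistics table for comparison.

```python
import math

# Brick veneer counts from the Descriptive Statistics table.
n = 50
successes = 19
p_hat = successes / n                          # 0.38

# ASSUMED rule of thumb for a good normal approximation.
normal_ok = n * p_hat >= 5 and n * (1 - p_hat) >= 5

# Standard error of a sample proportion.
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Empirical Rule (assumed): roughly 95% within 2 standard errors.
lower95 = p_hat - 2 * se
upper95 = p_hat + 2 * se

# 99% margin of error as reported by Excel (Confidence Level(99.0%)).
margin99 = 0.18583
width95 = upper95 - lower95
width99 = 2 * margin99

print(f"Normal approximation reasonable: {normal_ok}")
print(f"95% CI = [{lower95:.4f}, {upper95:.4f}], width {width95:.4f}")
print(f"99% CI width = {width99:.4f}")
```

The standard error here uses the proportion formula with divisor n, so it differs slightly from the 0.069341 in the Excel table, which is based on the sample standard deviation with divisor n − 1.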
- Compare, in terms of the precision, the interval manually calculated in (b) with the interval obtained from the Descriptive Statistics table in (a). Explain why the direction of the change in precision is expected.
Answer
As can be seen, the 99% confidence interval is wider than the 95% confidence interval calculated in (b). This is expected: a higher confidence level requires a larger critical value, which widens the interval, so the 99% interval is less precise than the 95% interval.