Stem and Leaf Plot and Frequency Distribution
The stem and leaf plot representing the retail turnovers for Tasmania and the Australian Capital Territory, in millions of dollars, from January 2016 to September 2017 is given in the following table:
Representation of Turnovers of Tasmania and Australian Capital Territory by Stem and Leaf Plot
| Leaf Values for Tasmania | Stem Values | Leaf Values for Australian Capital Territory |
| | 45 | 6.1, 8.8, 9.6, 9.5, 9.6 |
| | 46 | 0.8, 3.3, 6.4, 9.2 |
| | 47 | 1.2, 2.1, 2.4, 2.9, 4.2, 6.1, 7.8, 9, 9.3, 9.1, 8.6, 7.7 |
| | 48 | |
| 3.5, 5.3, 7, 8.4, 9.6 | 49 | |
| 0.8, 2, 3.6, 5.5, 7, 8, 8.4, 9.1 | 50 | |
| 0.5, 2.7, 5.2, 7.3, 8.6, 9, 8.9, 8.4 | 51 | |
To construct a stem and leaf display, each number is divided into two parts. The digits with the higher place values form the stem and the digits with the lower place values form the leaves. In this case, the first two digits of each monthly turnover are taken as the stem and the remaining digits as the leaf; for example, a turnover of 456.1 has stem 45 and leaf 6.1. As per the instruction in the question, the leaves for Tasmania are given on the left side and the leaves for the Australian Capital Territory on the right side.
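The splitting rule described above can be sketched in Python. This is only an illustrative sketch: the function names are hypothetical, the sample contains a few of the Australian Capital Territory values read from the table, and the leaf is rounded to one decimal place on the assumption that the turnovers are recorded to one decimal.

```python
from collections import defaultdict


def split_stem_leaf(value):
    """Split a turnover such as 456.1 into stem 45 and leaf 6.1."""
    stem, leaf = divmod(value, 10)
    # Rounding removes floating point noise; assumes one-decimal data.
    return int(stem), round(leaf, 1)


def stem_and_leaf(values):
    """Group the leaves of all values under their stems."""
    plot = defaultdict(list)
    for value in values:
        stem, leaf = split_stem_leaf(value)
        plot[stem].append(leaf)
    return dict(plot)


# A few Australian Capital Territory turnovers taken from the table.
act_sample = [456.1, 458.8, 459.6, 459.5, 459.6, 460.8, 463.3]
plot = stem_and_leaf(act_sample)
for stem in sorted(plot):
    print(stem, "|", ", ".join(str(leaf) for leaf in plot[stem]))
```

For a back-to-back display, the same function can be applied to each region separately and the two leaf lists printed on either side of the shared stem.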
A histogram showing the relative frequency of the retail turnovers for Tasmania and a frequency polygon showing the relative frequency of the retail turnovers for the Australian Capital Territory are shown in the following graph.
To construct a relative frequency histogram, the monthly turnovers for Tasmania were first grouped into classes, with the first class taken as 400 – 450 as per the instructions in the question. The second class was taken as 450 – 500 and the third class as 500 – 550. All the turnover values were covered by these three classes, so only three classes with a class width of 50 were constructed. The number of turnover values falling in each class was taken as the frequency of that class.
To obtain the relative frequency, the frequency of each class was divided by the total frequency (21 monthly values).
| Turnover ($ millions) | Frequency for Tasmania | Relative Frequency for Tasmania | Frequency for ACT | Relative Frequency for ACT |
| 400 – 450 | 0 | (0 / 21) = 0 | 0 | (0 / 21) = 0 |
| 450 – 500 | 5 | (5 / 21) = 0.24 | 21 | (21 / 21) = 1 |
| 500 – 550 | 16 | (16 / 21) = 0.76 | 0 | (0 / 21) = 0 |
| Grand Total | 21 | (21 / 21) = 1 | 21 | (21 / 21) = 1 |
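The grouping and the relative frequencies for Tasmania can be checked with a short Python sketch. It assumes that the Tasmanian leaves in the stem and leaf table belong to stems 49, 50 and 51, and that each class includes its lower limit and excludes its upper limit; the function name is hypothetical.

```python
# Tasmanian turnovers read back from the stem and leaf table
# (assumption: stem * 10 + leaf, with stems 49, 50 and 51).
tasmania = [
    493.5, 495.3, 497.0, 498.4, 499.6,
    500.8, 502.0, 503.6, 505.5, 507.0, 508.0, 508.4, 509.1,
    510.5, 512.7, 515.2, 517.3, 518.6, 519.0, 518.9, 518.4,
]
classes = [(400, 450), (450, 500), (500, 550)]


def class_frequencies(values, class_limits):
    """Count the values in each class (lower limit inclusive, upper exclusive)."""
    return [sum(1 for v in values if low <= v < high) for low, high in class_limits]


frequencies = class_frequencies(tasmania, classes)
total = sum(frequencies)
relative = [f / total for f in frequencies]

for (low, high), f, r in zip(classes, frequencies, relative):
    print(f"{low} - {high}: frequency {f}, relative frequency {r:.2f}")
```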
To construct the histogram, the relative frequency data for Tasmania and the Australian Capital Territory were selected in Excel and, from the Insert tab, a column chart was selected. Then, by right-clicking on the bars obtained, Format Data Series was selected and the gap width was set to zero. This gave the histogram.
To construct the frequency polygon for the Australian Capital Territory, the bars representing the Australian Capital Territory were selected and, by right-clicking on a bar, the Change Series Chart Type option was selected. A line chart was chosen for the Australian Capital Territory series. This gave the resulting graph, with a histogram for Tasmania and a frequency polygon for the Australian Capital Territory.
The grouped bar chart comparing the retail turnovers of Tasmania and the Australian Capital Territory is given in the following graph. It can be seen clearly from the graph that the retail turnover is above 450 million dollars in both regions, as the relative frequency of the 400 – 450 million dollar class is zero for both. Every monthly retail turnover of the Australian Capital Territory is between 450 and 500 million dollars, whereas most of the monthly turnovers for Tasmania (relative frequency 0.76) are between 500 and 550 million dollars. Only a small proportion of the Tasmanian turnovers (relative frequency 0.24) are between 450 and 500 million dollars.
To construct the grouped bar chart, the relative frequencies of the turnovers for both regions were selected and, from the Insert tab in Excel, the clustered column chart was selected. This gave the resulting grouped bar chart.
Mean, Variance and Standard Deviation
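The formulae for mean, variance and standard deviation are given as:
(1.1) Mean: \( \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \)
(1.2) Variance: \( s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \)
(1.3) Standard deviation: \( s = \sqrt{s^2} \)
Then, using the two datasets S&P/ASX 50 and S&P/ASX 200, formulae (1.1), (1.2) and (1.3) are used to calculate the mean, variance and standard deviation, where \( n \) is the length of the dataset and \( x_i \) represents the \( i \)-th data point, where \( i = 1, 2, \ldots, n \). The divisor \( n - 1 \) gives the sample variance, which is the quantity on which the Excel function STDEV is based.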
These computations can, however, be carried out in Excel using the function AVERAGE(array) for the mean and STDEV(array) for the standard deviation, with the range of the dataset as the input array.
Therefore, the mean of S&P/ASX 50 is $5,742.29 with a standard deviation of $83.22, and the mean of S&P/ASX 200 is $5,821.73 with a standard deviation of $93.48.
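As a rough cross-check of the Excel functions, the mean, sample variance and sample standard deviation can be written out in Python. The data below are hypothetical (the index values are not reproduced here), and the sample divisor n - 1 is assumed, as in Excel's STDEV.

```python
import math
import statistics


def mean(xs):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sample variance with divisor n - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def sample_std(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


# Hypothetical data, used only to illustrate the formulae.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print("mean:", mean(sample))
print("variance:", sample_variance(sample))
print("standard deviation:", sample_std(sample))

# The standard library gives the same sample statistics.
print("statistics.stdev:", statistics.stdev(sample))
```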
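Minimum, Maximum, Quartiles and Median
The formulae for minimum, maximum, median, quartile 1 and quartile 3 are given below, where \( x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)} \) denote the data points sorted in ascending order:
(2.1) Minimum = \( x_{(1)} \)
(2.2) Maximum = \( x_{(n)} \)
(2.3) Median = \( x_{((n+1)/2)} \) if \( n \) is odd, and \( \tfrac{1}{2}\left( x_{(n/2)} + x_{(n/2+1)} \right) \) if \( n \) is even
(2.4) Q1 = the 25th percentile of the sorted data
(2.5) Q3 = the 75th percentile of the sorted data
Applying formulae (2.1) to (2.5) to the datasets S&P/ASX 50 and S&P/ASX 200 gives the required values.
The MIN(array), MAX(array), MEDIAN(array) and QUARTILE(array, quart) functions in Excel can compute the required values.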
Therefore, for S&P/ASX 50,
Minimum = $5,584.02
Maximum = $5,825.99
Median = $5,786.52
First Quartile (Q1) = $5,669.22
Third Quartile (Q3) = $5,810.18
Again, for S&P/ASX 200,
Minimum = $5,651.77
Maximum = $5,919.08
Median = $5,868.19
First Quartile (Q1) = $5,738.40
Third Quartile (Q3) = $5,901.77
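The five summary values can also be obtained outside Excel. The sketch below uses Python's statistics.quantiles with the "inclusive" method, which is understood to use the same kind of linear interpolation as Excel's QUARTILE function; that equivalence is an assumption to be verified, the function name is hypothetical, and the data are hypothetical as well.

```python
import statistics


def five_number_summary(values):
    """Return the minimum, Q1, median, Q3 and maximum of a dataset."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical data, used only to illustrate the calculation.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(sample)
print(summary)
```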
Box and Whisker Plot
To construct the box and whisker plot in Excel, first compute the minimum, maximum, median, and first and third quartile values. These values are then used to build a modified stacked bar chart that looks like a box and whisker plot.
Construct the stacked chart from the three values Quartile 1, Median – Quartile 1 and Quartile 3 – Median, which leads to a stacked chart with three sections. Next, add error bars: a negative error bar of length Quartile 1 – Minimum on the lowest section, and a positive error bar of length Maximum – Quartile 3 on the topmost section. Finally, set the fill of the lowest section to no fill, so that only the box from Quartile 1 to Quartile 3 and the two whiskers remain visible.
These steps were followed to construct the box and whisker plots for the datasets S&P/ASX 50 and S&P/ASX 200.
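The heights of the stacked sections and the lengths of the error bars follow directly from the five summary values. The sketch below computes them for S&P/ASX 50 using the values reported earlier; the function and key names are hypothetical and only illustrate the arithmetic behind the chart.

```python
def box_plot_series(minimum, q1, median, q3, maximum):
    """Return the stacked-chart sections and error-bar lengths."""
    return {
        "base": q1,                    # bottom section of the stack
        "lower_box": median - q1,      # middle section
        "upper_box": q3 - median,      # top section
        "lower_whisker": q1 - minimum, # negative error bar on the bottom section
        "upper_whisker": maximum - q3, # positive error bar on the top section
    }


# Summary values reported in the text for S&P/ASX 50.
asx50 = box_plot_series(5584.02, 5669.22, 5786.52, 5810.18, 5825.99)
for name, length in asx50.items():
    print(f"{name}: {length:.2f}")
```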
It can be seen that the means of S&P/ASX 50 and S&P/ASX 200 are very close, but the variation of S&P/ASX 200 is higher than that of S&P/ASX 50. For both S&P/ASX 50 and S&P/ASX 200, the share prices were below the average mostly in the month of October.
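In terms of the statistical theory of probability, the probability of an event \( E \) is defined as
\[ P(E) = \frac{\text{number of outcomes favourable to } E}{\text{total number of possible outcomes}} \]
Therefore, the probability that a randomly selected household lives in a separate house dwelling is obtained as
\[ P(\text{separate house}) = \frac{\text{number of households living in a separate house}}{\text{total number of households}} \]
Now, the number of households living in a separate house = 2324546
Total number of households = (2709429 + 2324546 + 190411 + 187375) = 5411761
Therefore, P(separate house) = 2324546 / 5411761 ≈ 0.4295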
The number of households making a monthly mortgage payment of $800 – $999 = 146303
Total number of occupied private dwellings = 2709429
The probability that a randomly selected household makes a monthly payment of $800 – $999 is given as
P($800 – $999) = 146303 / 2709429 ≈ 0.0540
Households making a monthly mortgage repayment of $1800 – $2399 = 548587 + 464033 + 40573 + 42946 = 1096139
Households making a monthly mortgage repayment of $2400 – $2999 = 303184 + 256405 + 22677 + 23635 = 605901
Since a household cannot fall in both ranges, the total number of households making a mortgage payment of either $1800 – $2399 or $2400 – $2999 = (1096139 + 605901) = 1702040
Total number of households = (2709429 + 2324546 + 190411 + 187375) = 5411761
Hence, the required probability that a randomly selected household makes a monthly mortgage repayment of $1,800 – $2,399 or $2,400 – $2,999 = 1702040 / 5411761 ≈ 0.3145
Number of households making a monthly mortgage repayment of $300 – $449 in a flat, unit or apartment dwelling structure = 3220
Total number of households = (2709429 + 2324546 + 190411 + 187375) = 5411761
Therefore, the probability that a randomly selected household makes a monthly mortgage repayment of $300 – $449 in a flat, unit or apartment dwelling structure = 3220 / 5411761 ≈ 0.000595
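Each of these probabilities is a ratio of a count to a total, so the arithmetic can be sketched in a few lines of Python. The counts and totals are the ones stated in the text; the function and variable names are hypothetical.

```python
def probability(favourable, total):
    """Classical probability: favourable outcomes divided by all outcomes."""
    return favourable / total


TOTAL_HOUSEHOLDS = 5411761            # total stated in the text
OCCUPIED_PRIVATE_DWELLINGS = 2709429  # denominator used for the $800 - $999 case

p_separate_house = probability(2324546, TOTAL_HOUSEHOLDS)
p_800_999 = probability(146303, OCCUPIED_PRIVATE_DWELLINGS)
# The two repayment ranges cannot overlap, so their counts are added.
p_1800_2999 = probability(1096139 + 605901, TOTAL_HOUSEHOLDS)
p_flat_300_449 = probability(3220, TOTAL_HOUSEHOLDS)

print(f"separate house: {p_separate_house:.4f}")
print(f"$800 - $999: {p_800_999:.4f}")
print(f"$1800 - $2999: {p_1800_2999:.4f}")
print(f"flat, $300 - $449: {p_flat_300_449:.6f}")
```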
In this problem, X denotes the number of properties whose price has increased. A property whose price has increased is considered a success, and a property whose price has decreased or remained the same is considered a failure.
Since each property has only two possible outcomes (success or failure), and assuming the properties are independent with the same probability of success, X is considered to follow a binomial distribution.
Here, the probability of success is p = 0.13 and the probability of failure is q = 1 – p = (1 – 0.13) = 0.87.
The number of properties sampled is taken as the number of trials, n = 10 in this case.
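The probability mass function (pmf) of a binomial distribution is given by the following formula:
\[ P[X = x] = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, \ldots, n \]
Using this formula, the probabilities of X = 2 and X = 3 have been calculated as follows.
\[ P[X = 2] = \binom{10}{2} (0.13)^{2} (0.87)^{8} \approx 0.25 \]
\[ P[X = 3] = \binom{10}{3} (0.13)^{3} (0.87)^{7} \approx 0.10 \]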
The event that exactly two or three of the properties have increased in price means that either exactly two properties have increased or exactly three have. Since these two events are mutually exclusive, the required probability is obtained by adding the probability that exactly two properties increased and the probability that exactly three properties increased.
Thus, the required probability = (0.25 + 0.10) = 0.35
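The two binomial probabilities and their sum can be verified with a minimal Python sketch that uses n = 10 and p = 0.13 from the text; the function name binomial_pmf is hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P[X = x] for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.13
p_two = binomial_pmf(2, n, p)
p_three = binomial_pmf(3, n, p)
p_two_or_three = p_two + p_three

print(f"P[X = 2] = {p_two:.4f}")
print(f"P[X = 3] = {p_three:.4f}")
print(f"P[X = 2 or X = 3] = {p_two_or_three:.4f}")
```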
It is given that a customer service officer attends to 10 customers per hour on average.
The service time, or rather the inter-arrival time, follows an exponential distribution. The parameter of the distribution is therefore \( \lambda = 10 \) per hour, and its probability density function is \( f(x) = \lambda e^{-\lambda x} \) for \( x \ge 0 \).
Let X be the random variable denoting the inter-arrival time. Since the unit of time is hours, 5 minutes corresponds to X = 5/60 hours, and the required value is the probability density of the exponential (10) distribution at X = 5/60:
\[ f(5/60) = 10\, e^{-10 \times 5/60} \approx 4.35 \]
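The density value can be evaluated with the standard library, as in the sketch below. The rate of 10 per hour and the time of 5/60 hours come from the text; the cumulative probability P(X ≤ 5/60) is added only for comparison and is an extra not requested in the text. The variable names are hypothetical.

```python
import math

rate = 10.0  # customers per hour (parameter of the exponential distribution)
x = 5 / 60   # five minutes expressed in hours

# Probability density f(x) = rate * exp(-rate * x).
density = rate * math.exp(-rate * x)

# For comparison only: cumulative probability P(X <= x) = 1 - exp(-rate * x).
cumulative = 1 - math.exp(-rate * x)

print(f"density at 5 minutes: {density:.4f}")
print(f"P(X <= 5 minutes): {cumulative:.4f}")
```

Because the exponential distribution is continuous, the density is not itself a probability and can exceed 1, as it does here.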
(i) It is given that the average expense for a family of three follows a normal distribution with a mean of $27 and a standard deviation of $2.
Let X be the random variable denoting the expense. Then X ~ N(27, 2²).
The probability that the expense of the family lies between $25 and $35 is P(25 < X < 35).
Now, P(25 < X < 35) = P(X < 35) – P(X < 25)
\[ = P\!\left( \frac{X - 27}{2} < \frac{35 - 27}{2} \right) - P\!\left( \frac{X - 27}{2} < \frac{25 - 27}{2} \right) = \Phi(4) - \Phi(-1) \approx 0.8413, \]
where \( \Phi \) denotes the cumulative distribution function of N(0, 1).
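(ii) The upper 5% of the expenses are the X values that are greater than the 95th percentile.
Note that \( Z = (X - 27)/2 \) follows N(0, 1).
Then, \( \Phi(Z) = 0.95 \), where \( \Phi \) is the cumulative distribution function of N(0, 1), implies that Z = 1.645.
Therefore, X = 2Z + 27 = 30.29 ($)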
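Both parts can be checked with Python's statistics.NormalDist, as sketched below. The mean of 27 and standard deviation of 2 are taken from the text; the variable names are hypothetical.

```python
from statistics import NormalDist

# Expense distribution: mean 27, standard deviation 2 (from the text).
expense = NormalDist(mu=27, sigma=2)

# (i) P(25 < X < 35) as a difference of two cumulative probabilities.
p_between = expense.cdf(35) - expense.cdf(25)

# (ii) 95th percentile: expenses above this value form the upper 5%.
x_95 = expense.inv_cdf(0.95)

print(f"P(25 < X < 35) = {p_between:.4f}")
print(f"95th percentile = {x_95:.2f}")
```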
The number of calls in the sample is n = 150, which is quite large, so the binomial distribution is cumbersome to use directly. Hence, the normal approximation to the binomial distribution is used to find the probability that, in the next 150 calls, 90 or more will be for one agent.
The probability that any one agent receives a call (p) = 0.5
The mean of the distribution (μ) = (n*p) = (150*0.5) = 75
The probability that any one agent does not receive the call (q) = (1 – p) = (1 – 0.5) = 0.5
The variance of the distribution is obtained as (n*p*q) = (150*0.5*0.5) = 37.5
The standard deviation (σ) = √37.5 = 6.12
The probability that in the next 150 calls 90 or more will be for one agent is given as
\[ P(X \ge 90) = P\!\left( Z \ge \frac{90 - 75}{6.12} \right) = P(Z \ge 2.45) \]
Now, using the standard normal distribution, the required probability is determined as
\[ P(Z \ge 2.45) = 1 - \Phi(2.45) \approx 0.0071, \]
where \( \Phi \) denotes the cumulative distribution function of N(0, 1).
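The normal approximation can be compared with the exact binomial tail in a short Python sketch. The text does not state whether a continuity correction was applied, so both variants are shown as assumptions alongside the exact sum; the variable names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 150, 0.5
mu = n * p                       # mean of the binomial distribution
sigma = sqrt(n * p * (1 - p))    # standard deviation of the binomial distribution
approx = NormalDist(mu, sigma)

# Normal approximation without a continuity correction: P(X >= 90).
p_plain = 1 - approx.cdf(90)

# Normal approximation with a continuity correction: P(X >= 89.5).
p_corrected = 1 - approx.cdf(89.5)

# Exact binomial tail; with p = 0.5 every outcome has probability 1 / 2**n.
p_exact = sum(comb(n, k) for k in range(90, n + 1)) / 2**n

print(f"sigma = {sigma:.2f}")
print(f"normal approximation: {p_plain:.4f}")
print(f"with continuity correction: {p_corrected:.4f}")
print(f"exact binomial: {p_exact:.4f}")
```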
60 percent of video game purchasers are men. Therefore, the proportion of male video game purchasers is (60/100) = 0.6
The size of the random sample (n) = 15.
Let X be the number of men purchasing video games.
X follows a binomial distribution with n = 15 and p = 0.6
Hence, the probability that exactly 11 of the purchasers are men is obtained from the following distribution:
\[ P[X = x] = \binom{15}{x} (0.6)^{x} (0.4)^{15-x} \]
For X = 11, the required probability is
\[ P[X = 11] = \binom{15}{11} (0.6)^{11} (0.4)^{4} \approx 0.1268 \]
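As a quick check, the whole distribution of X can be tabulated in Python and the entry for X = 11 read off. The values n = 15 and p = 0.6 are from the text; the variable names are hypothetical.

```python
from math import comb

n, p = 15, 0.6

# Probability of each possible number of male purchasers, 0 to 15.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

p_eleven = pmf[11]
print(f"P[X = 11] = {p_eleven:.4f}")
print(f"sum of all probabilities = {sum(pmf):.4f}")
```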
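To test the difference between the means of two groups, the two-sample t test is an appropriate method.
The critical value of the t statistic can be obtained in Excel using T.INV.2T(α, degrees of freedom) for a two-tailed test or T.INV(1 – α, degrees of freedom) for a right-tailed test, where α is the significance level. The function T.DIST.RT(x, degrees of freedom) returns the right-tail probability (p-value) of a calculated statistic x rather than a critical value. If the calculated value of the test statistic is greater than the critical value of t, the null hypothesis is rejected; otherwise, it is not rejected.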
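The test statistic itself can be sketched in Python. The text does not say which form of the two-sample t test is intended, so the pooled (equal variance) version is assumed here, and the two groups are hypothetical data. The critical value would still be looked up separately, for example in Excel.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical groups, used only to illustrate the calculation.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_stat, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {df}")
```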