Solution To Questions 1 And 3 On Statistics

Frequency distribution problem involving class width and histogram

Solution to Question 1

Here class width is $50.

The data set consists of the following 50 order amounts: {136, 281, 226, 123, 178, 445, 231, 389, 196, 175, 211, 162, 212, 241, 182, 290, 434, 167, 246, 338, 194, 242, 368, 258, 323, 196, 183, 209, 198, 212, 277, 348, 173, 409, 264, 237, 490, 222, 472, 248, 231, 154, 166, 214, 311, 141, 159, 362, 189, 260}.

Min = minimum observation of the data = 123

Max = maximum observation of the data = 490

Range = Max – Min = 490 – 123 = 367

Number of classes = Range / width = 367 / 50 = 7.34 ≈ 8 (rounded up so that every observation is covered)

So there are 8 classes, as follows:

123-173

173-223

223-273

273-323

323-373

373-423

423-473

473-523

Frequency distribution:

The frequency of a class is the number of observations in that class, i.e. the number of observations greater than or equal to the lower limit of the class and less than its upper limit.

Class Interval	Frequency
123-173	8
173-223	16
223-273	11
273-323	4
323-373	5
373-423	2
423-473	3
473-523	1
Total	50
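
The class construction and counting above can be reproduced with a short script. This is only a sketch in Python (the original solution does not say which tool was used), and the variable names are illustrative.

```python
import math

# Order amounts (in dollars) copied from the data set in the solution.
orders = [
    136, 281, 226, 123, 178, 445, 231, 389, 196, 175,
    211, 162, 212, 241, 182, 290, 434, 167, 246, 338,
    194, 242, 368, 258, 323, 196, 183, 209, 198, 212,
    277, 348, 173, 409, 264, 237, 490, 222, 472, 248,
    231, 154, 166, 214, 311, 141, 159, 362, 189, 260,
]

width = 50
low, high = min(orders), max(orders)
data_range = high - low                     # 490 - 123 = 367
n_classes = math.ceil(data_range / width)   # 7.34 rounded up to 8

# Each class is [lower, upper): lower limit included, upper limit excluded.
classes = [(low + i * width, low + (i + 1) * width) for i in range(n_classes)]
frequencies = [sum(lo <= x < hi for x in orders) for lo, hi in classes]

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo}-{hi}\t{f}")
print(f"Total\t{sum(frequencies)}")
```

Treating the upper limit as exclusive matters for boundary values such as 173 and 323, which are counted in the class that starts at them.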

Relative Frequency distribution:

The relative frequency of a class is the number of observations in that class divided by the total frequency. Here the total frequency is 50.

Class Interval	Relative Frequency
123-173	0.16
173-223	0.32
223-273	0.22
273-323	0.08
323-373	0.1
373-423	0.04
423-473	0.06
473-523	0.02
Total	1

Percent Frequency distribution:

The percent frequency of a class is its relative frequency expressed as a percentage.

Percent frequency = Relative frequency × 100

Class Interval	Percent Frequency
123-173	16
173-223	32
223-273	22
273-323	8
323-373	10
373-423	4
423-473	6
473-523	2
Total	100
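
As a quick check, the relative and percent frequencies follow directly from the class frequencies. The sketch below assumes Python and hard-codes the counts from the frequency table; the dictionary names are illustrative.

```python
# Class frequencies taken from the frequency table in the solution.
frequencies = {
    "123-173": 8, "173-223": 16, "223-273": 11, "273-323": 4,
    "323-373": 5, "373-423": 2, "423-473": 3, "473-523": 1,
}
total = sum(frequencies.values())  # total frequency

# Relative frequency = class frequency / total frequency.
relative = {c: f / total for c, f in frequencies.items()}
# Percent frequency = relative frequency x 100.
percent = {c: 100 * r for c, r in relative.items()}

for c in frequencies:
    print(f"{c}\t{relative[c]:.2f}\t{percent[c]:.0f}")
```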

Figure: Percent Frequency Histogram

From the above histogram, one can say that the distribution of the dollar amounts of the furniture orders she recently received is positively skewed.

The mean is not equally suitable for every shape of distribution. In a positively skewed distribution the long right tail pulls the mean above the median, so the median is usually the more representative measure of location. For the given sample data the mean exceeds the median, which is consistent with positive skew:

Mean	251.46
Median	228.5
Mode	231
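
These summary values can be checked with Python's standard `statistics` module. This is a sketch rather than the method used in the original solution. Note that `multimode` lists every value tied for the highest count, and in this data set three values (196, 212 and 231) each occur twice, so the reported mode of 231 is one of several.

```python
import statistics

# Order amounts (in dollars) copied from the data set in the solution.
orders = [
    136, 281, 226, 123, 178, 445, 231, 389, 196, 175,
    211, 162, 212, 241, 182, 290, 434, 167, 246, 338,
    194, 242, 368, 258, 323, 196, 183, 209, 198, 212,
    277, 348, 173, 409, 264, 237, 490, 222, 472, 248,
    231, 154, 166, 214, 311, 141, 159, 362, 189, 260,
]

mean = statistics.mean(orders)        # arithmetic mean
median = statistics.median(orders)    # average of the 25th and 26th sorted values
modes = statistics.multimode(orders)  # all values tied for the highest count

print(f"Mean = {mean:.2f}, Median = {median}, Modes = {sorted(modes)}")
```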

Solution to Question 2

(a) Here Y is demand and X is unit price. Linear regression analysis tests for a linear relationship between Y and X: the null hypothesis is that Y and X are not related, and the alternative hypothesis is that Y and X are related. The ANOVA portion of the regression computer output is given as

ANOVA
Source	df	SS
Regression	1	5048.818
Residual	46	3132.661
Total	47	8181.479

where SS denotes the sum of squares and df the degrees of freedom.

First we need to complete the ANOVA table by calculating the mean squares and the F statistic for testing the null hypothesis.


Given:

Sum of squares for regression, SSR = 5048.818

Sum of squares for error, SSE = 3132.661

Degrees of freedom for regression = 1

Degrees of freedom for error = 46

Degrees of freedom for total = 47

The mean square for regression (MSR) is the sum of squares for regression divided by its degrees of freedom:

MSR = SSR / d.f. = 5048.818 / 1 = 5048.818

The mean square for error (MSE) is the sum of squares for error divided by its degrees of freedom:

MSE = SSE / d.f. = 3132.661 / 46 = 68.101

The F statistic is the ratio of the mean square for regression to the mean square for error:

F statistic = MSR / MSE = 5048.818 / 68.101 = 74.137

The complete ANOVA table for the regression analysis is

ANOVA
Source	df	SS	MSS	F statistic
Regression	1	5048.818	5048.818	74.137
Residual	46	3132.661	68.101
Total	47	8181.479
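
The arithmetic that fills in the table can be sketched as a small helper. It assumes Python, and the function name `complete_anova` is made up for illustration; it simply divides each sum of squares by its degrees of freedom and takes the ratio.

```python
def complete_anova(ss_model, df_model, ss_error, df_error):
    """Return (model mean square, error mean square, F) from SS and df."""
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return ms_model, ms_error, ms_model / ms_error

# Values from the regression output in the solution.
msr, mse, f_stat = complete_anova(5048.818, 1, 3132.661, 46)
print(f"MSR = {msr:.3f}, MSE = {mse:.3f}, F = {f_stat:.3f}")
```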

Now we compare this F statistic with the tabulated value of the F distribution, which we call F tabulated.

Decision Criteria:

If F statistic > F tabulated, we reject the null hypothesis; otherwise we do not reject it.

Under the null hypothesis, the F statistic follows an F distribution with 1 numerator and 46 denominator degrees of freedom. F tabulated is the critical value of the F distribution with (1, 46) degrees of freedom at α = 0.05.

F tabulated = F_{(0.05, 1, 46)} = 4.052

Since F statistic = 74.137 > F tabulated = 4.052, we reject the null hypothesis. So we can say that, at α = 0.05, the demand (Y) and unit price (X) are related.

(b) The coefficient of determination is the proportion of the total variation that is explained by the model. It is denoted by R² and given by

Coefficient of determination = sum of squares for regression / total sum of squares

= 5048.818 / 8181.479

= 0.6171

i.e. 61.71% of the variation in demand (Y) is explained by the regressor, unit price (X).

(c)

Correlation coefficient:

The coefficient of determination is the square of the correlation coefficient, so the magnitude of the correlation coefficient is the square root of the coefficient of determination.

Its sign is taken from the coefficient of X, which is -2.137, so X and Y are negatively related.

correlation coefficient = – √R²

correlation coefficient = – √0.6171 = – 0.7856

Relationship between demand and unit price:

The negative coefficient of X shows that the relationship between demand and unit price is negative, and the correlation coefficient of -0.7856 indicates that this linear relationship is fairly strong.

This means that demand tends to decrease as unit price increases, and to increase as unit price decreases.
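
Parts (b) and (c) come down to two lines of arithmetic, sketched here in Python with illustrative variable names. The sums of squares and the slope are the values quoted in the solution. The sign of the correlation is copied from the slope, which is valid for simple linear regression with one predictor.

```python
import math

ssr, sst = 5048.818, 8181.479   # regression and total sums of squares
slope = -2.137                  # coefficient of X quoted in the solution

r_squared = ssr / sst
# In simple linear regression, r has the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(f"R^2 = {r_squared:.4f}, r = {r:.4f}")
```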

Solution to Question 3

In a completely randomized design, we test whether the means of all treatments are the same. The null hypothesis is that there is no significant difference between the treatment means, and the alternative hypothesis is that there is a significant difference between at least two of them. To test this hypothesis we need to complete the following ANOVA table.

Source of Variation	Sum of Squares	Degrees of Freedom	Mean Square	F
Between Treatments	390.58
Within Treatments (Error)	158.4
Total	548.98	23

Given that there are three treatments,

degrees of freedom between treatments = number of treatments – 1 = 3 – 1 = 2

degrees of freedom for error (within treatments) = degrees of freedom for total – degrees of freedom between treatments = 23 – 2 = 21

The mean square between treatments is the sum of squares between treatments divided by its degrees of freedom:

Mean square between treatments = 390.58 / 2 = 195.290

The mean square within treatments (error) is the sum of squares within treatments divided by its degrees of freedom:

Mean square within treatments (error) = 158.4 / 21 = 7.543

The F value is the ratio of the mean square between treatments to the mean square within treatments (error):

F value = 195.290 / 7.543

= 25.891 (computed from the unrounded mean squares)

Completed ANOVA:

Source of Variation	Sum of Squares	Degrees of Freedom	Mean Square	F
Between Treatments	390.58	2	195.290	25.891
Within Treatments (Error)	158.4	21	7.543
Total	548.98	23
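
The same table can be rebuilt from the three given quantities (number of treatments, total degrees of freedom and the two sums of squares). The following Python sketch uses illustrative variable names and is only meant as a check of the arithmetic.

```python
k = 3                      # number of treatments (given)
df_total = 23              # total degrees of freedom, i.e. 24 observations
sstr, sse = 390.58, 158.4  # between-treatment and within-treatment SS

df_treat = k - 1                # 2
df_error = df_total - df_treat  # 21

mstr = sstr / df_treat   # mean square between treatments
mse = sse / df_error     # mean square within treatments (error)
f_value = mstr / mse

print(f"df = ({df_treat}, {df_error}), MSTR = {mstr:.3f}, "
      f"MSE = {mse:.3f}, F = {f_value:.3f}")
```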

Now we compare this F value with the tabulated value of the F distribution, which we call F tabulated.

Decision Criteria:

If F value > F tabulated, we reject the null hypothesis; otherwise we do not reject it.

Under the null hypothesis, the F statistic follows an F distribution with 2 numerator and 21 denominator degrees of freedom. F tabulated is the critical value of the F distribution with (2, 21) degrees of freedom at α = 0.05.

F tabulated = F_{(0.05, 2, 21)} = 3.4668

Since F value = 25.891 > F tabulated = 3.4668, we reject the null hypothesis,

i.e. there is a significant difference between the means of the three treatment populations.

Solution to Question 4

Here

Y = number of mobile phones sold per day

X1 = price (in $1000)

X2 = number of advertising spots

(a)

The estimated regression equation relating Y to X1 and X2 is

Y = intercept + (slope of X1) × X1 + (slope of X2) × X2

i.e. Y = 0.8051 + 0.4977 × X1 + 0.4733 × X2

(b)

To test whether there is a significant relationship between Y and the regressors X1 and X2, we need to complete the given ANOVA table. The null hypothesis is that X1 and X2 are not significantly related to Y, and the alternative hypothesis is that X1 and X2 are significantly related to Y.

Given that 7 observations were taken and there are two regressor variables, X1 and X2,

Degrees of freedom for regression = 2

Degrees of freedom for total = 7 – 1 = 6

Degrees of freedom for residual = Degrees of freedom for total – Degrees of freedom for regression

= 6 – 2 = 4

Sum of squares for regression = 40.7

Sum of squares for error = 1.016

A mean square is a sum of squares divided by its degrees of freedom, so

Mean square for regression = 40.7 / 2 = 20.35

Mean square for error = 1.016 / 4 = 0.254

F value = Mean square for regression / Mean square for error

= 20.35 / 0.254 = 80.118

The complete ANOVA is given as

Source of Variation	Sum of Squares	Degrees of Freedom	Mean Square	F
Regression	40.7	2	20.350	80.118
Residual	1.016	4	0.254
Total	41.716	6

Now we compare this F value with the tabulated value of the F distribution, which we call F tabulated.

Decision Criteria:

If F value > F tabulated, we reject the null hypothesis; otherwise we do not reject it.

Under the null hypothesis, the F statistic follows an F distribution with 2 numerator and 4 denominator degrees of freedom. F tabulated is the critical value of the F distribution with (2, 4) degrees of freedom at α = 0.05.

F tabulated = F_{(0.05, 2, 4)} = 6.944

Since F value = 80.118 > F tabulated, we reject the null hypothesis,

i.e. there is a significant relationship between the independent variables (X1 and X2), taken together, and the dependent variable (Y).
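
When the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has a simple closed form, P(F > x) = (1 + 2x/d)^(−d/2) for d denominator degrees of freedom, so the critical value can be checked without tables. The Python sketch below uses that identity; the function name is illustrative, and the shortcut does not apply to other numerator degrees of freedom (a statistics library or table would be needed there).

```python
def f_critical_df1_2(alpha, df2):
    """Upper-alpha critical value of F(2, df2).

    Solves (1 + 2x/df2) ** (-df2 / 2) = alpha for x.
    Valid only when the numerator degrees of freedom are exactly 2.
    """
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

f_tab_2_4 = f_critical_df1_2(0.05, 4)    # multiple regression, (2, 4) df
f_tab_2_21 = f_critical_df1_2(0.05, 21)  # Question 3, (2, 21) df

print(f"F(0.05; 2, 4)  = {f_tab_2_4:.4f}")
print(f"F(0.05; 2, 21) = {f_tab_2_21:.4f}")
```

This gives about 6.944 for (2, 4) degrees of freedom and about 3.4668 for (2, 21). Both are far below the computed F values, so the rejection decisions stand.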

(c)

i) We test H₀: β₁ = 0 vs H₁: β₁ ≠ 0, where β₁ is the coefficient of X1.

The t-statistic for testing this hypothesis is

t = b₁ / SE(b₁) = 1.078

Under H₀ the t statistic follows a t distribution with 4 degrees of freedom. At α = 0.05 the two-sided critical value of this distribution is

Critical t value = 2.776

Decision criteria:

Reject H₀ if |t| > 2.776

Here |1.078| < 2.776

So we fail to reject H₀.

ii) We test H₀: β₂ = 0 vs H₁: β₂ ≠ 0, where β₂ is the coefficient of X2.

The t-statistic for testing this hypothesis is

t = b₂ / SE(b₂) = 12.33

Under H₀ the t statistic follows a t distribution with 4 degrees of freedom. At α = 0.05 the two-sided critical value of this distribution is

Critical t value = 2.776

Decision criteria:

Reject H₀ if |t| > 2.776

Here |12.33| > 2.776

So we reject H₀.

(d)

Interpretation of the slope of X2:

If X1 is held fixed (that is, the price is fixed), then for each one-unit increase in X2 (number of advertising spots), Y (number of mobile phones sold per day) is estimated to increase by 0.4733.

(e)

If X1 = 20 and X2 = 10, then

Y = 0.8051 + 0.4977 × 20 + 0.4733 × 10 = 15.4921 ≈ 15

If the company charges $20,000 for each phone and uses 10 advertising spots, then approximately 15 mobile phones are expected to sell per day.
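
The prediction is a direct evaluation of the estimated regression equation. A minimal Python sketch follows; the function name `predict_sales` is illustrative and the coefficients are those quoted in part (a).

```python
def predict_sales(price_thousands, ad_spots):
    """Estimated phones sold per day from the fitted equation in part (a)."""
    return 0.8051 + 0.4977 * price_thousands + 0.4733 * ad_spots

y_hat = predict_sales(20, 10)  # X1 = 20 (i.e. $20,000), X2 = 10 spots
print(f"Predicted sales per day: {y_hat:.4f} (about {round(y_hat)})")
```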

