Frequency distribution and its types
Question 1:
Here X: furniture order (to the nearest $).
Min. Obs. = Min{X1, X2, …, X50} = 123
Max. Obs. = Max{X1, X2, …, X50} = 490
So, Range = Max. Obs. - Min. Obs. = 490 - 123 = 367
As the class width is 50,
Number of classes = Range / class width = 367 / 50 = 7.34
So we round 7.34 up and use 8 classes.
We choose the classes so that the first class includes the minimum observation and the last class includes the maximum observation.
We construct the 8 classes as follows:
120-170
170-220
220-270
270-320
320-370
370-420
420-470
470-520
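As a quick check, this class construction can be reproduced in a few lines of Python (a minimal sketch; the starting lower limit of 120 is the one chosen above):

```python
import math

# Observed extremes and chosen class width (from the computation above)
min_obs, max_obs, width = 123, 490, 50

data_range = max_obs - min_obs              # 367
n_classes = math.ceil(data_range / width)   # 7.34 rounds up to 8

# Start just below the minimum so the first class contains it
lower = 120
classes = [(lower + i * width, lower + (i + 1) * width) for i in range(n_classes)]
print(classes)  # [(120, 170), (170, 220), ..., (470, 520)]
```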
Frequency distribution:
The frequency of a particular data point is the number of times that data point occurs in the dataset.
The frequency of a class is the number of data points falling in that class. The following table shows the frequency distribution of the given dataset for the classes above:

| Classes | Frequency |
|---|---|
| 120-170 | 8 |
| 170-220 | 15 |
| 220-270 | 12 |
| 270-320 | 4 |
| 320-370 | 5 |
| 370-420 | 2 |
| 420-470 | 2 |
| 470-520 | 2 |
| Total | 50 |
Relative Frequency distribution:
Relative frequency = Class Frequency / Total number of observations
| Classes | Relative Frequency |
|---|---|
| 120-170 | 0.16 |
| 170-220 | 0.30 |
| 220-270 | 0.24 |
| 270-320 | 0.08 |
| 320-370 | 0.10 |
| 370-420 | 0.04 |
| 420-470 | 0.04 |
| 470-520 | 0.04 |
| Total | 1.00 |
Percent frequency = Relative frequency × 100
| Classes | Percent Frequency |
|---|---|
| 120-170 | 16 |
| 170-220 | 30 |
| 220-270 | 24 |
| 270-320 | 8 |
| 320-370 | 10 |
| 370-420 | 4 |
| 420-470 | 4 |
| 470-520 | 4 |
| Total | 100 |
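All three tables can be generated together once the 50 raw orders are available. The sketch below assumes they are passed in as `orders` (a hypothetical name; the raw data is not reproduced here):

```python
import numpy as np

def frequency_tables(orders):
    """Print frequency, relative, and percent frequencies for the classes above."""
    bins = np.arange(120, 521, 50)              # class limits 120, 170, ..., 520
    freq, _ = np.histogram(orders, bins=bins)   # class frequencies
    rel_freq = freq / freq.sum()                # relative frequencies
    pct_freq = 100 * rel_freq                   # percent frequencies
    for lo, hi, f, r, p in zip(bins[:-1], bins[1:], freq, rel_freq, pct_freq):
        print(f"{lo}-{hi}: {f:2d}  {r:.2f}  {p:.0f}%")
```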
From the percent frequency histogram we can observe positive skewness in the data.
The mean is a common measure of location, though with positive skew it is pulled toward the larger observations. For our data, Mean = 251.46.
Question 2:
Simple linear regression is regression with only one predictor variable. Here the response variable is demand and the predictor (independent) variable is unit price. We denote:
Y: Demand & X: Unit Price
H0: The response variable and the predictor variable are not related.
Vs
H1: The response variable and the predictor variable are related.
To test this hypothesis, we construct the ANOVA for simple linear regression:
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
| Regression | SSReg | 1 | MSReg = SSReg / 1 | F-Cal = MSReg / MSRes |
| Residual | SSRes | n - 2 | MSRes = SSRes / (n - 2) | |
| Total | TSS | n - 1 | | |
Given: n - 1 = 47, i.e. n = 48.
SSReg = 5048.818, SSRes = 3132.661, TSS = 8181.479
MSReg = SSReg / 1 = 5048.818
MSRes = SSRes / (n - 2) = 3132.661 / 46 = 68.10
F-Cal = MSReg / MSRes = 5048.818 / 68.10 = 74.14
The completed ANOVA for the simple linear regression is:
| Source of Variation | Degrees of Freedom | SS | MSS | F-Value |
|---|---|---|---|---|
| Regression | 1 | 5048.818 | 5048.818 | 74.14 |
| Residual | 46 | 3132.661 | 68.10 | |
| Total | 47 | 8181.479 | | |
Decision Criteria to decide whether to reject null hypothesis (H0) or not:
If F-Cal > the critical value of F, we reject the null hypothesis; otherwise we fail to reject it.
Under the null hypothesis, F-Cal follows an F distribution with (1, 46) degrees of freedom,
where 1 is for numerator degrees of freedom and 46 is denominator degrees of freedom.
The table value of the F distribution with (1, 46) degrees of freedom at α = 0.05 is F(0.05, 1, 46) = 4.052.
Comparing F-Cal with the table value, we observe that F-Cal = 74.14 > F(0.05, 1, 46) = 4.052, so we reject the null hypothesis at α = 0.05; i.e., we have strong evidence that demand and unit price are related to each other.
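The critical value and the decision can be verified with scipy (a minimal sketch using `scipy.stats.f`; the sums of squares are the ones given above):

```python
from scipy import stats

ms_reg = 5048.818 / 1                      # MSReg
ms_res = 3132.661 / 46                     # MSRes, about 68.10
f_cal = ms_reg / ms_res                    # about 74.14

f_crit = stats.f.ppf(0.95, dfn=1, dfd=46)  # about 4.05
print(f_cal > f_crit)                      # True -> reject H0
```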
Developing the estimated regression equation:
When we observe a strong relationship between the response variable and the predictor variable, we fit the regression model. Part of the total variation in the response variable is explained by the predictor variable; when the variation explained by the predictor variable is larger, we say that the model fit is good.
The coefficient of determination is the proportion of the total variation that is explained by the predictor variable. It is denoted by R² and given by
Coefficient of determination = SSReg / TSS
where TSS is the total sum of squares.
Alternatively, it is computed as
Coefficient of determination = 1 – SSRes / TSS
Coefficient of determination (R²) = 1 - 3132.661 / 8181.479 = 0.6171
It means that 61.71% variation in response (Demand) is explained by the predictor variable (unit price).
Correlation coefficient:
The correlation coefficient is denoted by r. We know that the coefficient of determination R² is the square of the correlation coefficient r, so we calculate r as the square root of R². Since the slope of X is negative, the correlation between X and Y is negative.
So,
r = -√R²
r = -√0.6171 = -0.7856
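A quick numerical check (the negative sign is taken from the sign of the estimated slope):

```python
import math

r2 = 1 - 3132.661 / 8181.479   # coefficient of determination, about 0.6171
r = -math.sqrt(r2)             # negative because the slope is negative
print(round(r, 4))             # -0.7856
```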
Relationship between demand and unit price:
The correlation coefficient is -0.7856, which suggests that demand and unit price are strongly negatively related.
We can say that when unit price decreases, demand increases, and when unit price increases, demand decreases.
Question 3:
Here we have three treatments.
Let μ1 be the population mean of the first treatment, μ2 the population mean of the second treatment, and μ3 the population mean of the third treatment.
Our null and alternative hypotheses are
H0: μ1 = μ2 = μ3
Vs
H1: At least one μi is different, i = 1, 2, 3.
To test this null hypothesis against the alternative, we need to complete the ANOVA.
With 3 treatments and n total observations, the ANOVA for testing the above hypothesis is:
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
| Between Treatments | SST | 2 | MST = SST / 2 | F-Cal = MST / MSE |
| Within Treatments (Error) | SSE | n - 3 | MSE = SSE / (n - 3) | |
| Total | TSS | n - 1 | | |
Given: n - 1 = 23, i.e. n = 24, so the sample size for each treatment is 24 / 3 = 8.
SST = 390.58
SSE = 158.40
TSS = 548.98.
Degrees of freedom for error = n - 3 = 24 - 3 = 21
MST = SST / 2 = 390.58 / 2 = 195.29
MSE = SSE / 21 = 158.40 / 21 = 7.543
F-Cal = MST / MSE = 195.29 / 7.543 = 25.89
Performing the ANOVA:
Using the values calculated above, we complete the ANOVA table:
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
| Between Treatments | 390.58 | 2 | 195.29 | 25.89 |
| Within Treatments (Error) | 158.40 | 21 | 7.543 | |
| Total | 548.98 | 23 | | |
Decision Criteria to decide whether to reject null hypothesis (H0) or not:
If F-Cal > the critical value of F, we reject the null hypothesis; otherwise we fail to reject it.
Under the null hypothesis, F-Cal follows an F distribution with (2, 21) degrees of freedom,
where 2 is for numerator degrees of freedom and 21 is denominator degrees of freedom.
The table value of the F distribution with (2, 21) degrees of freedom at α = 0.05 is F(0.05, 2, 21) = 3.47.
Comparing F-Cal with the table value, we observe that F-Cal = 25.89 > F(0.05, 2, 21) = 3.47, so we reject the null hypothesis at α = 0.05; i.e., the treatment means are not all the same. At least one treatment mean is different.
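The same check in scipy (the raw treatment data is not given, so we work from the sums of squares reported above):

```python
from scipy import stats

mst = 390.58 / 2                           # mean square for treatments
mse = 158.40 / 21                          # mean square for error
f_cal = mst / mse                          # about 25.89

f_crit = stats.f.ppf(0.95, dfn=2, dfd=21)  # about 3.47
print(f_cal > f_crit)                      # True -> reject H0
```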
Question 4
Response variable:
Y: number of mobile phones sold per day
Predictor variables:
X1: price (in $1000s)
X2: number of advertising spots
(a)
From the given output we can write the multiple regression equation when Y is regressed on price (X1) and number of advertising spots (X2):
Intercept = 0.8051, Slope of X1 = 0.4977 and Slope of X2 = 0.4733
So, the estimated regression equation is
Ŷ = 0.8051 + 0.4977 X1 + 0.4733 X2
Here we are interested in testing the following null and alternative hypotheses:
Null hypothesis: There is no significant relationship between the number of mobile phones sold per day and the independent variables price (in $1000s) and number of advertising spots.
Vs
Alternative hypothesis: There is a significant relationship between the number of mobile phones sold per day and the independent variables price (in $1000s) and number of advertising spots.
Our first step is the construction of the complete ANOVA.
We have two independent variables and 7 observations, so:
ANOVA
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
| Regression | SSReg | 2 | MSReg = SSReg / 2 | F-Cal = MSReg / MSRes |
| Residual | SSRes | 4 | MSRes = SSRes / 4 | |
| Total | TSS | 6 | | |
Given: SSReg = 40.700, SSRes = 1.016, TSS = 41.716
MSReg = SSReg / 2 = 40.700 / 2 = 20.350
MSRes = SSRes / 4 = 1.016 / 4 = 0.254
F-Cal = MSReg / MSRes = 20.350 / 0.254 = 80.12
The complete ANOVA, with the remaining values computed, is given below:
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
| Regression | 40.700 | 2 | 20.350 | 80.12 |
| Residual | 1.016 | 4 | 0.254 | |
| Total | 41.716 | 6 | | |
Decision Criteria to decide whether to reject null hypothesis (H0) or not:
If F-Cal > the critical value of F, we reject the null hypothesis; otherwise we fail to reject it. Under the null hypothesis, F-Cal follows an F distribution with (2, 4) degrees of freedom,
where 2 is for numerator degrees of freedom and 4 is denominator degrees of freedom.
The table value of the F distribution with (2, 4) degrees of freedom at α = 0.05 is F(0.05, 2, 4) = 6.94.
Comparing F-Cal with the table value, we observe that F-Cal = 80.12 > F(0.05, 2, 4) = 6.94, so we reject the null hypothesis at α = 0.05.
i.e., there is a significant relationship between the number of mobile phones sold per day and the independent variables (price in $1000s, number of advertising spots).
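The overall F test can be reproduced numerically (a sketch; `stats.f.ppf` also confirms the table value of about 6.94):

```python
from scipy import stats

ms_reg = 40.700 / 2                       # MSReg = 20.350
ms_res = 1.016 / 4                        # MSRes = 0.254
f_cal = ms_reg / ms_res                   # about 80.12

f_crit = stats.f.ppf(0.95, dfn=2, dfd=4)  # about 6.94
print(f_cal > f_crit)                     # True -> reject H0
```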
- i) Here we test
H0: β1 = 0 against H1: β1 ≠ 0.
The Cal-t for testing this null hypothesis is Cal-t = b1 / SE(b1) = 1.078.
Under the null hypothesis, Cal-t follows a t distribution with 4 degrees of freedom.
The critical value of t at α = 0.05 is
Critical t value = t(0.025, 4) = 2.78.
Decision criteria:
Reject H0 if |Cal-t| > 2.78.
Now |1.078| < 2.78, so we fail to reject H0; i.e., β1 is not significantly different from zero.
- ii) Here we test H0: β2 = 0 vs H1: β2 ≠ 0.
The Cal-t for testing this hypothesis is Cal-t = b2 / SE(b2) = 12.22997.
Under the null hypothesis, Cal-t follows a t distribution with 4 degrees of freedom.
The critical value of t at α = 0.05 is
Critical t value = t(0.025, 4) = 2.78.
Decision criteria:
Reject H0 if |Cal-t| > 2.78.
So |12.22997| > 2.78, and we reject the null hypothesis; i.e., β2 is significantly different from zero.
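Both t tests can be checked against the exact critical value (a sketch using `scipy.stats.t`; the t statistics are the ones quoted above):

```python
from scipy import stats

t_crit = stats.t.ppf(1 - 0.05 / 2, df=4)   # two-sided critical value, about 2.776
for name, t_cal in [("b1", 1.078), ("b2", 12.22997)]:
    verdict = "reject H0" if abs(t_cal) > t_crit else "fail to reject H0"
    print(name, verdict)                    # b1: fail to reject; b2: reject
```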
Interpretation of the slope of X2:
If price is held fixed (i.e. X1 is fixed), then for each one-unit increase in the number of advertising spots (X2), the expected number of mobile phones sold per day (Y) increases by 0.4733.
Given: X1 = 20 and X2 = 10
Then
Ŷ = 0.8051 + 0.4977 × X1 + 0.4733 × X2 = 0.8051 + 0.4977 × 20 + 0.4733 × 10 = 15.492 ≈ 15
Approximately 15 mobile phones are expected to be sold per day if the company charges $20,000 and uses 10 advertising spots.
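The point prediction is a direct plug-in (a minimal check of the arithmetic above):

```python
# Coefficients from the regression output above
b0, b1, b2 = 0.8051, 0.4977, 0.4733
x1, x2 = 20, 10                  # price = $20,000 (in $1000s), 10 ad spots

y_hat = b0 + b1 * x1 + b2 * x2
print(round(y_hat, 3))           # 15.492 -> about 15 phones per day
```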