Frequency distribution of the dataset
Question 1 Solution:
Here the variable under study is the furniture order amount (in $).
To construct the frequency distribution, we need to find the minimum and maximum observations in the data. The class width is given as 50.
Minimum = 123
Maximum = 490
So, Range of the data = Maximum − Minimum = 490 − 123 = 367
Number of classes = (Range of the data) / (class width) = 367 / 50 = 7.34
Since 7.34 is not a whole number, we round up and take 8 classes, so that the minimum of the data falls in the first class and the maximum in the last class.
So the 8 classes are as follows:
120 – 170
170 – 220
220 – 270
270 – 320
320 – 370
370 – 420
420 – 470
470 – 520
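As a quick sketch, the class-construction arithmetic above can be reproduced in a few lines of Python; the lower limit of 120 is the convenient round starting value chosen above.

```python
import math

# A minimal sketch of the class-construction arithmetic above.
# The class width is given; the minimum and maximum come from the data.
data_min, data_max = 123, 490
class_width = 50

data_range = data_max - data_min                  # 367
n_classes = math.ceil(data_range / class_width)   # ceil(7.34) = 8

# Build the class boundaries, starting from 120, the convenient
# round lower limit chosen in the solution above.
lower = 120
classes = [(lower + i * class_width, lower + (i + 1) * class_width)
           for i in range(n_classes)]
print(classes)  # [(120, 170), (170, 220), ..., (470, 520)]
```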
The frequency of a class is the count of observations in that class, i.e., the number of data points greater than or equal to the lower boundary of the class and less than its upper boundary.
Class Interval | Frequency
120 – 170 | 8
170 – 220 | 15
220 – 270 | 12
270 – 320 | 4
320 – 370 | 5
370 – 420 | 2
420 – 470 | 2
470 – 520 | 2
Total | 50
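A minimal sketch of the tally step, assuming the 50 raw order amounts (not reproduced in this write-up) are available; the file name orders.txt is hypothetical.

```python
import numpy as np

# Sketch of the tally step. The 50 raw order amounts are not
# reproduced in this write-up; "orders.txt" is a hypothetical file
# holding one order amount per line.
orders = np.loadtxt("orders.txt")

edges = np.arange(120, 521, 50)             # 120, 170, ..., 520
freq, _ = np.histogram(orders, bins=edges)  # each bin is [lower, upper),
                                            # except the last, which is closed

for lo, hi, f in zip(edges[:-1], edges[1:], freq):
    print(f"{lo} - {hi}: {f}")
print("Total:", freq.sum())                 # should be 50
```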
The relative frequency of a class is the ratio of that class's frequency to the total frequency. Here the total frequency, i.e., the total number of observations, is 50.
Class Interval | Relative Frequency
120 – 170 | 0.16
170 – 220 | 0.30
220 – 270 | 0.24
270 – 320 | 0.08
320 – 370 | 0.10
370 – 420 | 0.04
420 – 470 | 0.04
470 – 520 | 0.04
Total | 1.00
Percent frequency of any particular class = Relative frequency of that class × 100
Class Interval | Percent Frequency
120 – 170 | 16
170 – 220 | 30
220 – 270 | 24
270 – 320 | 8
320 – 370 | 10
370 – 420 | 4
420 – 470 | 4
470 – 520 | 4
Total | 100
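The conversion to relative and percent frequencies, and the percent frequency histogram itself, can be sketched as follows, using the class frequencies from the table above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Relative and percent frequencies from the class frequencies
# tabulated above, plus the percent frequency histogram.
labels = ["120-170", "170-220", "220-270", "270-320",
          "320-370", "370-420", "420-470", "470-520"]
freq = np.array([8, 15, 12, 4, 5, 2, 2, 2])

rel = freq / freq.sum()   # 0.16, 0.30, 0.24, 0.08, 0.10, 0.04, 0.04, 0.04
pct = 100 * rel           # 16, 30, 24, 8, 10, 4, 4, 4

plt.bar(labels, pct, width=1.0, edgecolor="black")
plt.xlabel("Furniture order ($)")
plt.ylabel("Percent frequency")
plt.title("Percent Frequency Histogram")
plt.show()
```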
Figure: Percent Frequency Histogram
One can see that there is positive skewness in the data.
Regardless of the shape of the distribution, the mean is a good measure of center whenever it exists.
For the given observations,
Mean = 251.46, Median = 228.5 and Mode = 231
Note that the mean exceeds the median, which is consistent with the positive skew.
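A sketch of how these three measures would be computed from the raw data, reusing the hypothetical orders.txt file from the earlier sketch:

```python
import numpy as np
from statistics import mean, median, mode

# Measures of center, computed from the same hypothetical
# "orders.txt" file as above.
orders = np.loadtxt("orders.txt").tolist()
print(mean(orders), median(orders), mode(orders))
# quoted values: mean = 251.46, median = 228.5, mode = 231
```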
Question 2 Solution:
This is a simple linear regression problem.
The response variable is demand and the predictor (independent) variable is unit price:
Y: Demand and X: Unit Price
(a)
Here we test the null hypothesis that Y and X are not related against the alternative hypothesis that Y and X are related.
To test this hypothesis, we need to complete the given incomplete ANOVA table, where
SS = sum of squares and
df = degrees of freedom.
Given values in the ANOVA table:
sum of squares for regression = SSReg = 5048.818
sum of squares for error = SSError = 3132.661
df for regression = 1
df for error = 46
df for total = 47.
Now,
MSReg = mean sum of squares for regression = SSReg / df = 5048.818 / 1 = 5048.818
MSError = mean sum of squares for error = SSError / df = 3132.661 / 46 = 68.101
F-Value = MSReg / MSError = 5048.818 / 68.101 = 74.137
Completed ANOVA of the regression analysis is
ANOVA
Sources of Variation | df | SS | MSS | F-Value
Regression | 1 | 5048.818 | 5048.8180 | 74.1369
Residual | 46 | 3132.661 | 68.1013 |
Total | 47 | 8181.479 | |
Decision criteria for rejecting or not rejecting the null hypothesis:
If the F-Value > the critical value of F, we reject the null hypothesis; otherwise we do not have enough evidence to reject it. To carry out the test, we assume that the null hypothesis is true.
Under the null hypothesis, the F-Value follows an F distribution with (1, 46) degrees of freedom,
where 1 is the numerator df and 46 is the denominator df.
The critical value of the F distribution with (1, 46) degrees of freedom at α = 0.05 is F(0.05, 1, 46) = 4.0517.
Comparing the two values, F-Value = 74.1369 > F(0.05, 1, 46) = 4.0517, so we reject the null hypothesis and conclude that there is strong evidence that demand (Y) and unit price (X) are related.
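As a sketch, the completion of the table and the test can be reproduced with scipy; stats.f.ppf gives the critical value and stats.f.sf the p-value:

```python
from scipy import stats

# Completing the regression ANOVA table and running the F-test.
ss_reg, ss_err = 5048.818, 3132.661
df_reg, df_err = 1, 46

ms_reg = ss_reg / df_reg                    # 5048.818
ms_err = ss_err / df_err                    # 68.101
f_value = ms_reg / ms_err                   # 74.137

f_crit = stats.f.ppf(0.95, df_reg, df_err)  # 4.0517 at alpha = 0.05
p_value = stats.f.sf(f_value, df_reg, df_err)

print(f_value, f_crit, p_value)
print("reject H0" if f_value > f_crit else "fail to reject H0")
```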
(b)
R² / Coefficient of Determination:
When we fit the regression model to the data, the variation in the response variable is split between the part explained by the predictor (regressor) variable and the error. The coefficient of determination is the proportion of the total variation explained by the predictor variable. It is usually denoted by R² and is given by
Coefficient of determination = SSReg / SSTotal
where SSTotal is total sum of square.
Coefficient of determination (R²) = 5048.818 / 8181.479 = 0.617103
This means the predictor variable (unit price) explains about 61.7103% of the variation in the response (demand).
(c)
Correlation coefficient:
We know that R² is the square of the correlation coefficient, so we can compute the correlation coefficient by taking the square root of the coefficient of determination. As the slope of X is negative, the correlation between X and Y is negative.
correlation coefficient = −√R² = −√0.617103 = −0.785559
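In code, a minimal sketch:

```python
import math

# R^2 and the correlation coefficient from the sums of squares.
ss_reg, ss_total = 5048.818, 8181.479

r_squared = ss_reg / ss_total  # 0.617103
r = -math.sqrt(r_squared)      # negative, because the slope of X is negative

print(r_squared, r)            # 0.617103 -0.785559
```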
Relationship between response variable demand and predictor variable unit price:
Since the slope of X is negative, the relationship between the response variable demand and the predictor variable unit price is negative. The computed correlation coefficient of −0.785559 suggests a very strong negative relationship between them.
Demand and unit price are inversely related: when unit price increases, demand decreases, and when unit price decreases, demand increases.
Question 3 Solution:
Here we are interested in testing whether all the treatments have the same mean.
H0: All the treatment means are the same.
H1: At least one of the treatment means is different.
For testing the null hypothesis, we first have to complete the ANOVA table.
If we have k treatments and n total observations, then for testing the null hypothesis the ANOVA table is:
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F
Between Treatments | SST | k − 1 | MST | Cal F
Error | SSE | n − k | MSE |
Total | TSS | n − 1 | |
where SST is the sum of squares due to treatments, SSE is the sum of squares due to error, TSS is the total sum of squares, MST is the mean sum of squares due to treatments, and MSE is the mean sum of squares due to error.
Given, k = 3, n-1 = 23, SST = 390.58, SSE = 158.40, TSS = 548.98.
Treatment degrees of freedom = k − 1 = 3 − 1 = 2
Total degrees of freedom = n − 1 = 23, i.e., n = 24
Degrees of freedom for error = n − k = 24 − 3 = 21
MST = SST / d.f. of treatment = 390.58 / 2 = 195.290
MSE = SSE / d.f of error = 158.4 / 21 = 7.543
Cal F = MST / MSE = 195.290 / 7.543 = 25.891
Completed ANOVA:
Variation Source | SS | DF | MSS | Cal F
Between Treatments | 390.58 | 2 | 195.290 | 25.8907
Within Treatments (Error) | 158.40 | 21 | 7.5429 |
Total | 548.98 | 23 | |
Now we compare this Cal F value with the tabulated F value.
Decision Criteria:
If Cal F > tabulated F, we reject the null hypothesis; otherwise we do not have enough evidence to reject it.
Under the null hypothesis, Cal F follows an F distribution with (2, 21) degrees of freedom. At α = 0.05, the tabulated F = F(0.05, 2, 21) = 3.467.
Now Cal F = 25.8907 > tabulated F = 3.467, so we reject the null hypothesis, i.e., not all the treatment means are the same.
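The same arithmetic and critical value, as a scipy sketch:

```python
from scipy import stats

# Completing the one-way ANOVA table and running the F-test.
k, n = 3, 24
sst, sse = 390.58, 158.40

mst = sst / (k - 1)                      # 195.290
mse = sse / (n - k)                      # 7.5429
f_cal = mst / mse                        # 25.8907

f_tab = stats.f.ppf(0.95, k - 1, n - k)  # 3.467 at alpha = 0.05
print(f_cal, f_tab)
print("reject H0" if f_cal > f_tab else "fail to reject H0")
```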
Question 4 Solution:
This is a multiple linear regression problem with two predictor variables. The response variable is the number of mobile phones sold per day (Y); price in $1000s (X1) and the number of advertising spots (X2) are the two predictor variables.
(a)
Regression equation when Y is regressed on X1 and X2:
Ŷ = b0 + b1 X1 + b2 X2, i.e.,
Ŷ = 0.8051 + 0.4977 X1 + 0.4733 X2
(b)
H0: There is no significant relationship between the dependent variable and the independent variables
vs
H1: There is a significant relationship between the dependent variable and the independent variables.
To test this claim we need to complete the given ANOVA table. When there are p independent variables and n observations, the ANOVA table is:
ANOVA
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F
Regression | SSReg | p | MSReg = SSReg / p | F Cal = MSReg / MSE
Residual | SSE | n − 1 − p | MSE = SSE / (n − 1 − p) |
Total | TSS | n − 1 | |
The completed ANOVA table is:
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F
Regression | 40.7 | 2 | 20.350 | 80.1181
Residual | 1.016 | 4 | 0.254 |
Total | 41.716 | 6 | |
Decision Criteria:
If F Cal > F Critical, we reject the null hypothesis; otherwise we do not.
F Critical = F(0.05, 2, 4) = 6.9443
So we reject H0, as F Cal = 80.1181 > F Critical = 6.9443,
i.e., there is a significant relationship between the dependent variable and the independent variables.
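A sketch of this overall F-test with scipy (note that scipy.stats.f.ppf(0.95, 2, 4) returns 6.9443, the critical value used above):

```python
from scipy import stats

# The overall F-test for the multiple regression.
ss_reg, ss_err = 40.7, 1.016
p, n = 2, 7                               # 2 predictors, 7 observations

ms_reg = ss_reg / p                       # 20.350
mse = ss_err / (n - 1 - p)                # 0.254
f_cal = ms_reg / mse                      # 80.1181

f_crit = stats.f.ppf(0.95, p, n - 1 - p)  # 6.9443 at alpha = 0.05
print(f_cal, f_crit)
print("reject H0" if f_cal > f_crit else "fail to reject H0")
```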
(c)
i) Here we test
H0: β1 = 0 against H1: β1 ≠ 0.
The t statistic (t-cal) for testing this null hypothesis is the estimated coefficient divided by its standard error; its value here is 1.078.
Under H0, t-cal follows a t distribution with 4 degrees of freedom.
At α = 0.05, the two-sided critical value of this distribution is
Critical t value = 2.7764
Decision criteria:
Reject H0 if |t-cal| > 2.7764.
Since |1.078| < 2.7764, we do not have enough evidence to reject H0, i.e., β1 (price) is not significant.
ii) Here we test H0: β2 = 0 against H1: β2 ≠ 0.
The t statistic (t-cal) for testing this hypothesis is again the estimated coefficient divided by its standard error; its value here is 12.33.
Under H0, t-cal follows a t distribution with 4 degrees of freedom.
At α = 0.05, the two-sided critical value of this distribution is
Critical t value = 2.7764
Decision criteria:
Reject H0 if |t-cal| > 2.7764.
Since |12.33| > 2.7764, we reject H0, i.e., β2 (number of advertising spots) is significant.
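The decision step for both tests, sketched with scipy (the t statistics are taken from the solution above; the coefficient standard errors needed to recompute them are not reproduced here):

```python
from scipy import stats

# Decision step for both coefficient t-tests. The t statistics
# (1.078 for beta1, 12.33 for beta2) are taken from the solution.
t_crit = stats.t.ppf(1 - 0.05 / 2, df=4)  # 2.7764, two-sided at alpha = 0.05

for name, t_cal in [("beta1 (price)", 1.078), ("beta2 (ad spots)", 12.33)]:
    decision = "reject H0" if abs(t_cal) > t_crit else "fail to reject H0"
    print(name, t_cal, decision)
```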
(d)
Interpretation of Slope of X2:
If price (X1) is held fixed, then for each one-unit increase in the number of advertising spots (X2), the number of mobile phones sold per day (Y) increases by 0.4733 units.
(e)
If the price is $20,000, i.e., X1 = 20, and the number of advertising spots is X2 = 10,
then
Ŷ = 0.8051 + 0.4977 × 20 + 0.4733 × 10 = 15.492 ≈ 15
If the company charges $20,000 for each phone and uses 10 advertising spots, then approximately 15 mobile phones are expected to be sold in a particular day.
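The arithmetic as a short sketch:

```python
# Point prediction from the fitted equation.
b0, b1, b2 = 0.8051, 0.4977, 0.4733
x1, x2 = 20, 10            # price in $1000s, number of advertising spots

y_hat = b0 + b1 * x1 + b2 * x2
print(round(y_hat, 3))     # 15.492, i.e. about 15 phones per day
```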