Order Quantity
A random sample of 60 records was drawn from the 2002 records in the data set. The “Random Variable Generation” option of the “Data Analysis” ToolPak was used to draw the sample.
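The sampling step can also be sketched outside the spreadsheet. The snippet below is a minimal illustration, assuming simple random sampling without replacement over row numbers 1 to 2002; the function name `draw_sample` and the seed are hypothetical, and the spreadsheet tool may use a different procedure.

```python
import random

def draw_sample(population_size, sample_size, seed=None):
    """Return sorted 1-based row numbers drawn without replacement.

    Assumption: every row is equally likely to be chosen.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# 60 rows out of 2002, as described in the text (seed chosen arbitrarily)
rows = draw_sample(2002, 60, seed=1)
print(len(rows))
```

Fixing the seed is a design choice that lets the same 60 rows be recovered later when the analysis is checked.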
Statistic | Order Quantity
Mean | 30.116667
Median | 32
Mode | 39
Standard Deviation | 13.09171256
Variance | 171.3929379
Minimum | 2
Maximum | 50
Range | 48
Coefficient of Variation | 43.46999%
1st quartile | 19
3rd quartile | 40.25
Inter quartile range | 21.25
(Oja 1983)
The mean Order Quantity is 30.1167, with a standard deviation of 13.0917 and a variance of 171.3929. The maximum order quantity is 50 and the minimum is 2, giving a range of 48. The coefficient of variation is 43.47%. The location parameters show that the 1st quartile, median (2nd quartile) and 3rd quartile of Order Quantity are 19, 32 and 40.25. The interquartile range is 21.25. The most frequent Order Quantity (the mode) is 39.
Statistic | Sales
Mean | $1,808.93
Median | $579.59
Mode | #N/A
Standard Deviation | $3,340.82
Variance | $11,161,055.62
Minimum | $30.10
Maximum | $18,056.68
Range | $18,026.58
Coefficient of Variation | 184.68425%
1st quartile | $185.93
3rd quartile | $1,711.22
Inter quartile range | $1,525.29
The mean Sales of Hardware and Gardens supplies is $1,808.93, with a standard deviation of $3,340.82 and a variance of $11,161,055.62. The maximum sales amount is $18,056.68 and the minimum is $30.10, giving a range of $18,026.58. The coefficient of variation is 184.68%. The location parameters show that the 1st quartile, median (2nd quartile) and 3rd quartile of Sales are $185.93, $579.59 and $1,711.22. The interquartile range is $1,525.29. The Sales data have no mode, because no sales amount occurs more than once.
Statistic | Shipping cost
Mean | $13.69
Median | $5.48
Mode | $24.49
Standard Deviation | $20.99
Variance | $440.51
Minimum | $0.49
Maximum | $99.00
Range | $98.51
Coefficient of Variation | 153.27066%
1st quartile | $1.99
3rd quartile | $13.41
Inter quartile range | $11.42
The mean Shipping cost of Hardware and Gardens supplies is $13.69, with a standard deviation of $20.99 and a variance of $440.51. The maximum shipping cost is $99.00 and the minimum is $0.49, giving a range of $98.51. The coefficient of variation is 153.27%. The location parameters show that the 1st quartile, median (2nd quartile) and 3rd quartile of Shipping cost are $1.99, $5.48 and $13.41. The interquartile range is only $11.42. A shipping cost of $24.49 occurs more often than any other shipping cost.
Statistic | Days to ship
Mean | 2.283333
Median | 2
Mode | 2
Standard Deviation | 2.067436525
Variance | 4.274293785
Minimum | 0
Maximum | 9
Range | 9
Coefficient of Variation | 90.54467%
1st quartile | 1
3rd quartile | 2.25
Inter quartile range | 1.25
The mean Days to ship for Hardware and Gardens supplies is 2.283 days, with a standard deviation of 2.0674 and a variance of 4.2743. The maximum number of days to ship is 9 and the minimum is 0, giving a range of 9 days. The coefficient of variation is 90.54%. The location parameters show that the 1st quartile, median (2nd quartile) and 3rd quartile of days to ship are 1 day, 2 days and 2.25 days. The interquartile range is 1.25 days. The most frequent shipping time is 2 days.
Order Priority | Frequency | Percentage
Not Specified | 10 | 16.67%
Low | 18 | 30.00%
Medium | 10 | 16.67%
High | 9 | 15.00%
Critical | 13 | 21.67%
Total | 60 | 100.00%
For the qualitative variable Order Priority, “Low” has the highest frequency (18, 30%), followed by “Critical” (13, 21.67%). “High” has the lowest frequency (9, 15%).
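The percentage column of a frequency table follows directly from the counts. The sketch below uses the Order Priority frequencies reported above; the helper name `frequency_table` is hypothetical.

```python
def frequency_table(counts):
    """Map each category to (frequency, percentage of total), rounded to 2 places."""
    total = sum(counts.values())
    return {
        category: (freq, round(100 * freq / total, 2))
        for category, freq in counts.items()
    }

# Frequencies as reported for Order Priority
order_priority = {
    "Not Specified": 10,
    "Low": 18,
    "Medium": 10,
    "High": 9,
    "Critical": 13,
}

table = frequency_table(order_priority)
for category, (freq, pct) in table.items():
    print(f"{category}: {freq} ({pct}%)")
```

The same helper applies unchanged to the Ship Mode, Region and Customer Segment counts.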
Ship Mode | Frequency | Percentage
Regular Air | 49 | 81.6667%
Delivery Truck | 7 | 11.6667%
Express Air | 4 | 6.6667%
Total | 60 | 100%
For the qualitative variable “Ship Mode”, “Regular Air” has by far the highest frequency (49, 81.667%). “Delivery Truck” (7, 11.667%) and “Express Air” (4, 6.667%) occur far less often than “Regular Air”.
Region | Frequency | Percentage
Eastern region | 39 | 65.0000%
Western region | 21 | 35.0000%
Total | 60 | 100%
The qualitative variable “Region” shows that “Eastern region” has more occurrences (39, 65%) than “Western region” (21, 35%).
Customer Segment | Frequency | Percentage
Consumer | 11 | 18.3333%
Corporate | 18 | 30.0000%
Small Business | 17 | 28.3333%
Home Office | 14 | 23.3333%
Total | 60 | 100.0000%
The “Customer Segment” variable shows that the largest group of customers is “Corporate” (18, 30%), followed by “Small Business” (17, 28.33%). The smallest Customer Segment is “Consumer” (11, 18.33%).
The line diagram shows the percentages of Order Priority.
The histogram shows the percentages of Ship Mode.
The pie chart shows the distribution of the two regions. The areas of the sectors are proportional to the frequencies of the two regions.
The bar plot of Customer Segment depicts the percentages of the four customer segments. The heights of the bars are proportional to the percentages of the customer segments (Jelen 2010).
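The margin of error of the 95% confidence interval, based on the Z-statistic, is:
$Z_{0.05} \cdot \frac{\sigma}{\sqrt{n}}$
The confidence interval obtained from the one-sample Z-test is:
$\bar{x} \pm Z_{0.05} \cdot \frac{\sigma}{\sqrt{n}}$ (Houghton Mifflin Harcourt, 2018)
where $\bar{x}$ is the sample mean, $\sigma$ is the observed standard deviation, $n$ is the sample size, and $Z_{0.05}$ is the critical value of the standard normal distribution at the 5% level of significance (1.96 for a two-sided interval). The ratio $\sigma / \sqrt{n}$ is the standard error.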
We first estimate the average sales amount of orders for the “Home Office” segment only. The average sale of the 14 Home Office orders, out of the 60 sampled orders across the four customer segments, is $677.95. The standard error is $186.23. At the 5% level of significance, the margin of error is $365.00. The calculated upper and lower confidence limits of “Sales” are $1042.95 and $312.94.
We are 95% confident that the average sales would lie in the interval ($312.94, $1042.95) for home office customers.
Next, we estimate the average shipping cost for all 60 sampled orders. The average Shipping Cost is $13.69. The standard error is $2.71. At the 5% level of significance, the margin of error is $5.31. The calculated upper and lower confidence limits of “Shipping Cost” are $19 and $8.38.
We are 95% confident that the average shipping cost would lie in the interval ($8.38, $19) for all sample orders.
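The two intervals can be checked from the reported means and standard errors. The sketch below assumes a two-sided interval, so the critical value is the 97.5th percentile of the standard normal distribution (about 1.96); the function name `z_confidence_interval` is hypothetical.

```python
from statistics import NormalDist

def z_confidence_interval(mean, std_error, confidence=0.95):
    """Return (lower, upper, margin) of a two-sided Z confidence interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * std_error
    return mean - margin, mean + margin, margin

# Reported mean and standard error for Home Office sales
sales_low, sales_high, sales_margin = z_confidence_interval(677.95, 186.23)

# Reported mean and standard error for shipping cost (all 60 orders)
ship_low, ship_high, ship_margin = z_confidence_interval(13.69, 2.71)

print(round(sales_margin, 2), round(ship_margin, 2))
```

Small differences in the last cent against the reported limits can arise because the inputs here are already rounded to two decimals.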
It is believed that orders with “Critical” Order Priority have higher shipping costs than orders with “Low” Order Priority.
Hypotheses:
Null Hypothesis (H0): The mean shipping cost of “Critical” Order Priority is higher than or equal to that of “Low” Order Priority.
Alternative Hypothesis (HA): The mean shipping cost of “Critical” Order Priority is lower than that of “Low” Order Priority.
Chosen Test:
A two-sample Z-test for equality of means is used here to compare the mean shipping costs of the two Order Priority groups.
Level of Significance:
α=0.05
Inferential Statistic:
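The Z-statistic is:
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
where,
$\bar{x}_1$ and $\bar{x}_2$ are the two observed sample means.
$\mu_1$ and $\mu_2$ are the two hypothesized population means.
$n_1$ and $n_2$ are the sizes of the two samples.
$\sigma_1$ and $\sigma_2$ are the observed standard deviations of the two samples.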
The calculated Z-score for means of two samples is 0.53116.
p-Value:
The two-tailed p-value, evaluated at the 5% level of significance, is 0.5953.
Conclusion:
Since the p-value (0.5953) is greater than 0.05, we fail to reject the null hypothesis that the mean shipping cost of “Critical” Order Priority is higher than or equal to that of “Low” Order Priority at the 5% level of significance (Gardner and Altman 1986).
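The Z-statistic and its p-values can be sketched as follows. The group means, standard deviations and sizes are not reported in the text, so `two_sample_z` is shown only with hypothetical inputs; the p-value helper is applied to the two Z-scores reported in this analysis (0.53116 and 1.7822). Both function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, hypothesized_diff=0.0):
    """Z-statistic for the difference of two independent sample means."""
    std_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / std_error

def p_values(z):
    """Two-tailed and one-tailed p-values for a standard normal Z-score."""
    cdf = NormalDist().cdf
    return {
        "two_tailed": 2 * (1 - cdf(abs(z))),
        "left_tailed": cdf(z),       # for HA: mean1 < mean2
        "right_tailed": 1 - cdf(z),  # for HA: mean1 > mean2
    }

# Hypothetical inputs, only to show the call: standard error is sqrt(1 + 1)
z_demo = two_sample_z(10, 8, 3, 4, 9, 16)

# Z-scores reported in the text
p_shipping = p_values(0.53116)
p_sales = p_values(1.7822)
print(round(p_shipping["two_tailed"], 4), round(p_sales["two_tailed"], 4))
```

For the one-sided shipping cost hypothesis, whichever one-tailed p-value applies (about 0.30 or 0.70, depending on the order of subtraction) is also above 0.05, so the decision does not change.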
It is often felt that the mean order sales, in dollars, differ between Eastern states (E) and Western states (W).
Hypotheses:
Null Hypothesis (H0): The average sales of Eastern states and Western states are equal.
Alternative Hypothesis (HA): The average sales of Eastern states and Western states are unequal.
Chosen Test:
A two-sample Z-test for equality of means is used here to compare the average sales of the two regions.
Level of Significance:
α=0.05
Inferential Statistic:
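The Z-statistic is:
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
where,
$\bar{x}_1$ and $\bar{x}_2$ are the two observed sample means.
$\mu_1$ and $\mu_2$ are the two hypothesized population means.
$n_1$ and $n_2$ are the sizes of the two samples.
$\sigma_1$ and $\sigma_2$ are the observed standard deviations of the two samples.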
The calculated Z-score for means of two samples is 1.7822.
p-Value:
The two-tailed p-value, evaluated at the 5% level of significance, is 0.07472.
Conclusion:
Since the p-value (0.07472) is greater than 0.05, we fail to reject the null hypothesis that the average sales of the Eastern and Western regions are equal at the 5% level of significance.
The scatter plot is constructed with Order Quantity as the independent variable and Sales as the dependent variable. The data points are widely scattered around the trend line, which indicates that the line does not fit the data well.
The linear regression model is:
Y = α + β*X
where Y = dependent variable, X = independent variable, α = intercept, β = slope (coefficient of the independent variable).
With “Sales” as the dependent variable and “Order Quantity” as the independent variable, the fitted linear regression model is:
Sales = 95.42293306 + 56.89578252 * Order Quantity
The coefficient of correlation (r) is 0.222958431. It indicates a weak positive correlation between Sales and Order Quantity (Samawi and Vogel 2014).
The coefficient of determination (R2) is 0.049710462. It shows that the linear association is weak: Order Quantity accounts for only about 5% of the variation in Sales.
The intercept (α) of the linear regression model is 95.42293306 and the slope (β) is 56.89578252. For an Order Quantity of zero, the predicted Sales would be $95.42. For each one-unit increase or decrease in Order Quantity, the predicted Sales increase or decrease by $56.90. The positive slope indicates a positive relationship between the dependent and independent variables.
The p-value for the F-statistic is 0.086836533, which is greater than 0.05. Therefore, Order Quantity does not have a statistically significant effect on Sales in the Hardware and Gardens sample at the 5% level of significance.
The value of multiple R-square is 0.049710462 (4.97%). Therefore, only 4.97% of the variability of Sales is explained by Order Quantity. The low R-square value indicates that the model fits the data poorly.
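The reported regression figures can be cross-checked against each other. The sketch below is a plain least-squares fit written from the textbook formulas; `fit_line` is a hypothetical helper shown on toy data, and the reported coefficients are then used to confirm two properties: R-square equals r squared, and the fitted line passes through the point of means (30.116667, $1,808.93).

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; also returns r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r

# Toy data lying exactly on y = 1 + 2x (hypothetical, for illustration only)
toy_intercept, toy_slope, toy_r = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Reported model: Sales = 95.42293306 + 56.89578252 * Order Quantity
def predict_sales(order_quantity):
    return 95.42293306 + 56.89578252 * order_quantity

r_reported = 0.222958431
r_squared = r_reported ** 2                 # should match the reported R-square
sales_at_mean = predict_sales(30.116667)    # should match the mean of Sales
print(round(r_squared, 6), round(sales_at_mean, 2))
```

Agreement on both checks suggests the reported coefficients, correlation and R-square come from the same fit.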
Conclusion:
The analysis produced descriptive statistics for the quantitative variables, and graphs and frequency tables for the qualitative variables. We constructed 95% confidence intervals for two quantitative variables: sales restricted to one customer segment, and shipping costs for all sampled orders. We tested whether shipping costs differ between two Order Priority levels. The test did not reject the null hypothesis that the shipping cost of “Critical” orders is higher than or equal to that of “Low” orders. We also tested the equality of average sales for the two regions, and found no significant difference between the average sales of the Eastern and Western regions. Finally, a linear regression model was fitted with “Sales” as the dependent variable and “Order Quantity” as the independent variable. The linear association between the two variables was found to be statistically insignificant.
References:
Gardner, M.J. and Altman, D.G., 1986. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J (Clin Res Ed), 292(6522), pp.746-750.
Houghton Mifflin Harcourt, 2018. Two-Sample z-test for Comparing Two Means. [online] Cliffsnotes.com. Available at: https://www.cliffsnotes.com/study-guides/statistics/univariate-inferential-tests/two-sample-z-test-for-comparing-two-means [Accessed 6 Feb. 2018].
Jelen, B., 2010. Charts and Graphs: Microsoft Excel 2010. Que Publishing.
Oja, H., 1983. Descriptive statistics for multivariate distributions. Statistics & Probability Letters, 1(6), pp.327-332.
Samawi, H.M. and Vogel, R., 2014. Notes on two sample tests for partially correlated (paired) data. Journal of Applied Statistics, 41(1), pp.109-117.