Data Analysis
This report analyses the available price data for 120 products spanning four brands and three styles, applying both descriptive and inferential statistical techniques. The analysis is intended to give the company a clearer understanding of existing competitor prices and so support prudent decisions on the launch of new product lines. Several common beliefs about pricing are examined through hypothesis testing, and the results should help in setting prices for the product launch. Descriptive analysis is used to characterise the data collected, particularly its shape, variability and underlying distribution. Conclusions are then drawn from the analysis.
A manufacturing company based in Australia and operating in the fashion sector intends to develop new product designs in order to increase its market share. In response to the company's wish to launch new product lines, the R&D department has prepared a proposal covering the clothing lines that could be launched. The main aim of this report is to analyse thoroughly the prices charged by existing competitors in the market so that the company can price the new product lines accordingly.
A detailed analysis of the price distribution for the various brands and styles needs to be carried out using both descriptive and inferential statistical techniques. Descriptive techniques summarise characteristics such as central tendency, variation and skew, and so give a useful picture of existing prices. Inferential techniques draw conclusions about population characteristics from sample information; here the objective is to infer whether prices differ significantly across brands, styles and gender.
The various tasks to be carried out for the analysis of the given data are performed below.
The descriptive statistics for the prices of Calvin Klein products are shown below.
Particulars | Amount ($)
Mean | 69.57
Standard Error | 4.27
Median | 65.06
Standard Deviation | 23.38
Sample Variance | 546.55
Kurtosis | 0.11
Skewness | 0.40
Range | 99.09
Minimum | 29.09
Maximum | 128.17
The visual description of the given prices for Calvin Klein (Brand 1) is indicated below through the usage of a histogram.
Comment
The distribution departs somewhat from a normal distribution. The mean ($69.57) exceeds the median ($65.06), whereas the two would coincide in a normal distribution, and the skewness of 0.40 indicates a slight positive (rightward) skew, so the shape is not perfectly symmetric. A mean above the median suggests that a few products with exceptionally high prices are pulling the mean upwards. Overall variation is moderate relative to the mean price, with the standard deviation about 34% of the mean.
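The measures in the table above follow the layout of Excel's descriptive statistics output. As a minimal sketch of how such a summary could be reproduced outside Excel, the function below computes the same ten measures using only the Python standard library. The function name `describe` and the sample prices are hypothetical and chosen for illustration; the skewness and kurtosis lines use the bias-adjusted sample formulas documented for Excel's SKEW and KURT functions.

```python
import math
import statistics


def describe(prices):
    """Summarise a list of prices with the measures used in this report."""
    n = len(prices)
    if n < 4:
        raise ValueError("at least four observations are needed for kurtosis")
    mean = statistics.mean(prices)
    sd = statistics.stdev(prices)  # sample standard deviation (divides by n - 1)
    z = [(x - mean) / sd for x in prices]  # standardised deviations
    # Bias-adjusted sample skewness, the formula documented for Excel's SKEW.
    skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
    # Bias-adjusted excess kurtosis, the formula documented for Excel's KURT.
    kurtosis = (
        n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    )
    return {
        "Mean": mean,
        "Standard Error": sd / math.sqrt(n),
        "Median": statistics.median(prices),
        "Standard Deviation": sd,
        "Sample Variance": statistics.variance(prices),
        "Kurtosis": kurtosis,
        "Skewness": skewness,
        "Range": max(prices) - min(prices),
        "Minimum": min(prices),
        "Maximum": max(prices),
    }


# Hypothetical prices for illustration only; these are NOT the report's data.
sample_prices = [29.0, 45.5, 52.0, 61.0, 65.0, 72.5, 88.0, 128.0]
for name, value in describe(sample_prices).items():
    print(f"{name:<20}{value:>10.2f}")
```

Applied to the 30 Calvin Klein prices, a function of this kind would be expected to return the figures in the table, subject to rounding.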
Brand-wise Analysis
The descriptive statistics for the prices of Zara products are shown below.
Particulars | Amount ($)
Mean | 51.11
Standard Error | 2.53
Median | 51.81
Standard Deviation | 13.87
Sample Variance | 192.37
Kurtosis | -0.78
Skewness | -0.05
Range | 50.02
Minimum | 25.07
Maximum | 75.09
The visual description of the given prices for Zara (Brand 2) is indicated below through the usage of a histogram.
Comment
The distribution appears approximately normal, as the mean ($51.11) and median ($51.81) are nearly equal. The shape is close to symmetric, with a skewness of -0.05. Variability is low, as is apparent from the range and from the standard deviation, which is about 27% of the mean.
The descriptive statistics for the prices of Tommy Hilfiger products are shown below.
Particulars | Amount ($)
Mean | 73.87
Standard Error | 5.90
Median | 76.16
Standard Deviation | 32.29
Sample Variance | 1042.67
Kurtosis | -0.87
Skewness | -0.06
Range | 115.82
Minimum | 14.38
Maximum | 130.20
The visual description of the given prices for Tommy Hilfiger (Brand 3) is indicated below through the usage of a histogram.
Comment
The distribution does not appear to be exactly normal, since the mean ($73.87) lies below the median ($76.16). The skewness of -0.06 indicates only a very slight negative (leftward) skew, so the departure from symmetry is small. A mean below the median suggests that a few products with exceptionally low prices are pulling the mean downwards. Overall variation is moderate relative to the mean price, with the standard deviation about 44% of the mean.
The descriptive statistics for the prices of Ralph Lauren products are shown below.
Particulars | Amount ($)
Mean | 71.50
Standard Error | 5.71
Median | 64.12
Standard Deviation | 31.26
Sample Variance | 977.42
Kurtosis | -0.08
Skewness | 0.60
Range | 126.94
Minimum | 22.08
Maximum | 149.02
The visual description of the given prices for Ralph Lauren (Brand 4) is indicated below through the usage of a histogram.
Comment
The distribution departs from a normal distribution. The mean ($71.50) is well above the median ($64.12), and the skewness of 0.60 indicates a positive (rightward) skew, so the shape is not symmetric. A mean above the median suggests that a few products with exceptionally high prices are pulling the mean upwards. Overall variation is moderate relative to the mean price, with the standard deviation about 44% of the mean.
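The brand comments above judge variability by the standard deviation as a percentage of the mean, and judge skew by comparing the mean with the median. The short sketch below makes both comparisons explicit using only the values reported in the four brand tables. The helper name `coefficient_of_variation` is introduced here for illustration.

```python
# Reported values copied from the four brand tables above:
# brand -> (mean, median, standard deviation), all in dollars.
brands = {
    "Calvin Klein": (69.57, 65.06, 23.38),
    "Zara": (51.11, 51.81, 13.87),
    "Tommy Hilfiger": (73.87, 76.16, 32.29),
    "Ralph Lauren": (71.50, 64.12, 31.26),
}


def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


summary = {}
for name, (mean, median, sd) in brands.items():
    summary[name] = {
        "cv_pct": coefficient_of_variation(mean, sd),
        # Positive gap: mean above median (right skew); negative: left skew.
        "mean_minus_median": mean - median,
    }

for name, row in summary.items():
    print(f"{name:<15} CV = {row['cv_pct']:.1f}%  "
          f"mean - median = {row['mean_minus_median']:+.2f}")
```

On these figures Zara has the lowest relative spread (about 27%) and Tommy Hilfiger and Ralph Lauren the highest (about 44% each), which is consistent with the comments above.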
The descriptive statistics for Style 1 (business) are shown in the table below.
Particulars | Amount ($)
Mean | 93.75
Standard Error | 3.62
Median | 94.55
Standard Deviation | 22.88
Sample Variance | 523.45
Kurtosis | -0.42
Skewness | 0.27
Range | 92.53
Minimum | 56.49
Maximum | 149.02
The visual description of the given prices for Style 1 or business is indicated below through the usage of a histogram.
Style-wise Analysis
Comment
The mean ($93.75) and the median ($94.55) are close, with the mean marginally below the median, while the skewness of 0.27 indicates a slight positive (rightward) skew. The distribution is therefore roughly symmetric but not exactly normal; the mild positive skew points to a small number of higher-priced products in the upper tail. Overall variation is low relative to the mean price, with the standard deviation about 24% of the mean.
The descriptive statistics for Style 2 (sports) are shown in the table below.
Particulars | Amount ($)
Mean | 64.86
Standard Error | 2.20
Median | 62.60
Standard Deviation | 13.91
Sample Variance | 193.47
Kurtosis | -0.39
Skewness | 0.23
Range | 62.72
Minimum | 36.22
Maximum | 98.93
The visual description of the given prices for Style 2 or sports is indicated below through the usage of a histogram.
Comment
The distribution departs slightly from a normal distribution. The mean ($64.86) exceeds the median ($62.60), and the skewness of 0.23 indicates a slight positive (rightward) skew, so the shape is not perfectly symmetric. A mean above the median suggests that a few products with relatively high prices are pulling the mean upwards. Overall variation is low relative to the mean price, with the standard deviation about 21% of the mean.
The descriptive statistics for Style 3 (casual) are shown in the table below.
Particulars | Amount ($)
Mean | 40.93
Standard Error | 1.94
Median | 41.17
Standard Deviation | 12.25
Sample Variance | 150.03
Kurtosis | -0.24
Skewness | -0.05
Range | 50.88
Minimum | 14.38
Maximum | 65.26
The visual description of the given prices for Style 3 or casual is indicated below through the usage of a histogram.
Comment
The distribution appears approximately normal, as the mean ($40.93) and median ($41.17) are nearly equal. The shape is close to symmetric, with a skewness of -0.05. Variability is fairly low, as is apparent from the range and from the standard deviation, which is about 30% of the mean.
The descriptive statistics for each style within the Zara brand are shown below.
Business
Particulars | Value
Mean | 65.86
Standard Error | 2.17
Median | 66.60
Standard Deviation | 6.86
Sample Variance | 47.06
Kurtosis | -1.73
Skewness | -0.07
Range | 18.61
Minimum | 56.49
Maximum | 75.09
Sports
Particulars | Value
Mean | 50.14
Standard Error | 2.21
Median | 51.81
Standard Deviation | 6.98
Sample Variance | 48.78
Kurtosis | 1.06
Skewness | -0.57
Range | 25.45
Minimum | 36.22
Maximum | 61.67
Casual
Particulars | Value
Mean | 37.32
Standard Error | 2.65
Median | 36.67
Standard Deviation | 8.37
Sample Variance | 70.06
Kurtosis | -1.33
Skewness | 0.10
Range | 23.80
Minimum | 25.07
Maximum | 48.87
The relevant visual representation of the prices of the products belonging to the various styles for Zara is indicated below.
Business – The distribution appears approximately normal, as the mean ($65.86) and median ($66.60) are nearly equal and the skewness of -0.07 is close to zero. Variability is low, with the standard deviation about 10% of the mean.
Sports – The distribution appears non-normal: the mean ($50.14) is below the median ($51.81) and the skewness of -0.57 indicates some negative skew, so the distribution is not symmetric. Variability, as measured by the standard deviation (about 14% of the mean) and the range, is low.
Casual – The distribution appears approximately normal, as the mean ($37.32) and median ($36.67) are nearly equal and the skewness of 0.10 is close to zero. Variability is low, with the standard deviation about 22% of the mean.
The objective is to ascertain whether the common belief that prices differ between male and female clothing lines is true. Hypothesis testing is the relevant technique here.
The relevant hypotheses are as stated below.
Null Hypothesis (H0): µM = µF, i.e. there is no significant difference between the mean prices of clothing lines for males and females.
Alternative Hypothesis (H1): µM ≠ µF, i.e. there is a significant difference between the mean prices of clothing lines for males and females.
The population standard deviation of prices for male and female clothing lines is unknown, so a t statistic is preferred to a z statistic. It is also assumed, as seems reasonable, that prices in the population are normally distributed. A two-sample t-test for the difference of means is therefore used, with Excel as the enabling tool.
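For reference, the two-sample t statistic in its unequal-variance form, together with the usual Welch-Satterthwaite approximation for its degrees of freedom, can be written as below. The notation is an assumption introduced here: \bar{x}, s^2 and n denote the sample mean, sample variance and sample size, with subscripts M and F for the male and female groups.

```latex
t = \frac{\bar{x}_M - \bar{x}_F}
         {\sqrt{\dfrac{s_M^2}{n_M} + \dfrac{s_F^2}{n_F}}},
\qquad
df \approx
\frac{\left(\dfrac{s_M^2}{n_M} + \dfrac{s_F^2}{n_F}\right)^{2}}
     {\dfrac{\left(s_M^2 / n_M\right)^{2}}{n_M - 1}
      + \dfrac{\left(s_F^2 / n_F\right)^{2}}{n_F - 1}}
```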
t-Test: Two-Sample Assuming Unequal Variances
Measure | Male | Female
Mean | 63.89 | 69.13
Variance | 604.73 | 903.04
Observations | 60 | 60
Hypothesized Mean Difference | 0 |
df | 114 |
t Stat | -1.04 |
P(T<=t) one-tail | 0.15 |
t Critical one-tail | 1.66 |
P(T<=t) two-tail | 0.30 |
t Critical two-tail | 1.98 |
The above output indicates that the two-tail p-value is 0.30. At a significance level of 5% (0.05), the p-value is greater than the significance level, so there is not enough statistical evidence to reject the null hypothesis, and the alternative hypothesis cannot be accepted. The data therefore provide no evidence of a significant difference between the mean prices of clothing lines for males and females.
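The t statistic and degrees of freedom in the output can be cross-checked from the reported means, variances and sample sizes alone. The sketch below is one way to do this, assuming the unequal-variance (Welch) form named in the output title; the function name `welch_t_from_summary` is hypothetical. Because the inputs are rounded to two decimals, small differences from the Excel figures are to be expected.

```python
import math


def welch_t_from_summary(mean1, var1, n1, mean2, var2, n2):
    """Welch t statistic and approximate degrees of freedom from summary data."""
    v1, v2 = var1 / n1, var2 / n2  # squared standard errors of the two means
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Values copied from the t-test output table above (male first, then female).
t_stat, df = welch_t_from_summary(63.89, 604.73, 60, 69.13, 903.04, 60)

T_CRITICAL_TWO_TAIL = 1.98  # two-tail critical value reported in the output
reject_null = abs(t_stat) > T_CRITICAL_TWO_TAIL

print(f"t = {t_stat:.3f}, df = {df:.1f}, reject H0: {reject_null}")
```

The recomputed statistic is about -1.05 on roughly 114 degrees of freedom, in line with the reported -1.04, and its absolute value is well below the critical value of 1.98, so the null hypothesis is not rejected.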
The objective here is to determine whether there is any statistically significant difference between the average prices of the four brands in the data. A t-test, as used for the comparison of means in Task 4, compares only two groups at a time and so cannot be applied to more than two categories simultaneously. Here there are four brands whose means need to be compared at once, so a one-factor ANOVA test is used.
The relevant hypotheses are as stated below.
Null Hypothesis (H0): µCalvin Klein = µZara = µTommy Hilfiger = µRalph Lauren, i.e. there is no statistically significant difference between the mean prices of the four brands.
Alternative Hypothesis (H1): At least one of the four means differs from the others, i.e. there is a statistically significant difference between the mean prices of the four brands.
The relevant Excel output from a single-factor ANOVA test is shown below.
SUMMARY
Groups | Count | Sum | Average | Variance
CALVIN KLEIN | 30 | 2087.07 | 69.57 | 546.55
ZARA | 30 | 1533.25 | 51.11 | 192.37
TOMMY HILFIGER | 30 | 2216.08 | 73.87 | 1042.67
RALPH LAUREN | 30 | 2145.05 | 71.50 | 977.42
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 9769.178 | 3 | 3256.39 | 4.72 | 0.00 | 2.68
Within Groups | 80011.35 | 116 | 689.75 | | |
Total | 89780.53 | 119 | | | |
From the above ANOVA output, the p-value is reported as 0.00, meaning that it is below 0.005 when rounded to two decimal places rather than exactly zero. As the p-value is less than the assumed significance level of 5% (0.05), and the F statistic of 4.72 exceeds the critical value of 2.68, there is enough statistical evidence to reject the null hypothesis in favour of the alternative hypothesis. Hence, it is fair to conclude that there is a statistically significant difference between the mean prices of the four brands. The test shows that at least one brand mean differs; it does not by itself identify which brands differ.
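The ANOVA table can be rebuilt from the SUMMARY block alone, since the between-group sum of squares depends only on the group counts and averages and the within-group sum of squares only on the counts and variances. The sketch below illustrates this; the function name `one_way_anova_from_summary` is hypothetical, and because the averages and variances are rounded to two decimals, the sums of squares are expected to differ slightly from the Excel figures.

```python
def one_way_anova_from_summary(counts, means, variances):
    """Single-factor ANOVA quantities from per-group counts, means, variances."""
    total_n = sum(counts)
    k = len(counts)
    # Grand mean is the count-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(counts, means)) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(counts, means))
    # Each sample variance times (n - 1) recovers that group's sum of squares.
    ss_within = sum((n - 1) * v for n, v in zip(counts, variances))
    df_between, df_within = k - 1, total_n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f_stat": ms_between / ms_within,
    }


# Values copied from the SUMMARY block above, in the order
# Calvin Klein, Zara, Tommy Hilfiger, Ralph Lauren.
brand_anova = one_way_anova_from_summary(
    counts=[30, 30, 30, 30],
    means=[69.57, 51.11, 73.87, 71.50],
    variances=[546.55, 192.37, 1042.67, 977.42],
)

F_CRITICAL = 2.68  # critical value reported in the ANOVA output
print(f"F = {brand_anova['f_stat']:.2f}, "
      f"exceeds F crit: {brand_anova['f_stat'] > F_CRITICAL}")
```

The recomputed F statistic agrees with the reported 4.72 to two decimal places, with 3 and 116 degrees of freedom.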
It is often claimed that prices differ across styles owing to a host of factors. This claim is put to the test here through hypothesis testing. A t-test, as used for the comparison of means in Task 4, compares only two groups at a time and so cannot be applied to more than two categories simultaneously. Here there are three styles whose means need to be compared at once, so a one-factor ANOVA test is used.
The relevant hypotheses are as stated below.
Null Hypothesis (H0): µBusiness = µSports = µCasual, i.e. there is no statistically significant difference between the mean prices of the three styles.
Alternative Hypothesis (H1): At least one of the three means differs from the others, i.e. there is a statistically significant difference between the mean prices of the three styles.
The relevant Excel output from a single-factor ANOVA test is shown below.
SUMMARY
Groups | Count | Sum | Average | Variance
Business | 40 | 3750.01 | 93.75 | 523.45
Sports | 40 | 2594.36 | 64.86 | 193.47
Casual | 40 | 1637.08 | 40.93 | 150.03
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 55969.68 | 2 | 27984.84 | 96.84 | 0.00 | 3.07
Within Groups | 33810.85 | 117 | 288.98 | | |
Total | 89780.53 | 119 | | | |
From the above ANOVA output, the p-value is reported as 0.00, meaning that it is below 0.005 when rounded to two decimal places rather than exactly zero. As the p-value is less than the assumed significance level of 5% (0.05), and the F statistic of 96.84 far exceeds the critical value of 3.07, there is enough statistical evidence to reject the null hypothesis in favour of the alternative hypothesis. Hence, it is fair to conclude that there is a statistically significant difference between the mean prices of the three styles.
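As a worked check on the table above, each mean square is the corresponding sum of squares divided by its degrees of freedom, and the F statistic is the ratio of the two mean squares. The subscripted symbols are notation introduced here for illustration; the numbers are those reported in the ANOVA output for the three styles.

```latex
MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}}
                    = \frac{55969.68}{2} = 27984.84,
\qquad
MS_{\text{within}} = \frac{SS_{\text{within}}}{df_{\text{within}}}
                   = \frac{33810.85}{117} \approx 288.98

F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{27984.84}{288.98} \approx 96.84 > F_{\text{crit}} = 3.07
```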
The objective of this task is to test the claim made by Zara that it has similar prices for its three styles, namely business, sports and casual. Since the means of three groups need to be compared simultaneously, an ANOVA test is a suitable choice. Further, since only one factor is involved (style, with price as the response), the test conducted is a single-factor ANOVA.
The relevant hypotheses are as stated below.
Null Hypothesis (H0): µBusiness(Z) = µSports(Z) = µCasual(Z), i.e. there is no statistically significant difference between the mean prices of the three styles for the Zara brand.
Alternative Hypothesis (H1): At least one of the three means differs from the others, i.e. there is a statistically significant difference between the mean prices of the three styles for the Zara brand.
The relevant Excel output from a single-factor ANOVA test is shown below.
SUMMARY
Groups | Count | Sum | Average | Variance
Business | 10 | 658.59 | 65.86 | 47.06
Sports | 10 | 501.45 | 50.14 | 48.78
Casual | 10 | 373.22 | 37.32 | 70.06
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 4085.695 | 2 | 2042.85 | 36.94 | 0.00 | 3.35
Within Groups | 1493.111 | 27 | 55.30 | | |
Total | 5578.806 | 29 | | | |
From the above ANOVA output, the p-value is reported as 0.00, meaning that it is below 0.005 when rounded to two decimal places rather than exactly zero. As the p-value is less than the assumed significance level of 5% (0.05), and the F statistic of 36.94 far exceeds the critical value of 3.35, there is enough statistical evidence to reject the null hypothesis in favour of the alternative hypothesis. Hence, it is fair to conclude that there is a statistically significant difference between the mean prices of the three styles for the Zara brand, and the claim made by the company is not supported by the data.
Conclusion
Based on the above analysis, there is no statistically significant difference between the mean prices of clothing lines for the two genders. However, mean prices differ significantly across both brands and styles. Business products are the most expensive on average (mean $93.75) and casual products the cheapest (mean $40.93), with sports products in between (mean $64.86). Zara, the one brand tested on this point, shows the same pattern of significantly different prices across its three styles, contrary to its claim of similar prices. It is therefore recommended that the company take brand and style differences into account when pricing the new clothing line, while also considering the prices of existing products in the market and their underlying distribution.