Executive Summary
This paper reports an analysis of the sales of three car types, namely BMW, Lexus and Mercedes. It seeks to identify and segment households into preference or customer groups for the cars. The analysis was carried out with respect to the age, income and educational qualification of the members of the household, and the demand for the cars among the households is analyzed in terms of these variables. The aim of this paper is therefore to determine which traits of a household or customer, in terms of age, economic condition and educational qualification, indicate their chances of buying a BMW, a Lexus or a Mercedes. The problem is thus one of market segmentation for the cars. The profiles of customers who may prefer a BMW, a Lexus or a Mercedes need to be defined to help the Automobile Association, and hence the automobile businesses, to narrow down business strategies.
The problem can be approached using a data-driven statistical approach, more specifically using inferential techniques involving the testing of hypotheses. The techniques of linear models and binary regression are also found to be relevant to the statistical problem, in ways described below. Data from 420 households were used, including the age of members, annual income and number of years of education, as well as the model of car owned. The data are first summarized and explored using descriptive statistical methods. To study the customer attributes on the basis of which car model preference profiles are defined, the distributions of age, income and number of years of education are studied for each car model type owned. The location, spread and shape of these distributions are then examined to identify differences and similarities. The objective is to discern whether buyers of a specific car type differ by income, age or educational qualification. Following this, hypothesis tests are carried out to determine whether the car buyer groups differ significantly from each other in age, income and education. Another research question addressed is whether older people tend to prefer Mercedes over BMW or Lexus.
Table 1: Ages of buyers of luxury car models
Age group (years) | 1 (BMW) | 2 (Lexus) | 3 (Mercedes) | Grand Total
35-44 | 56.86% | 19.61% | 23.53% | 100.00%
45-54 | 29.73% | 36.94% | 33.33% | 100.00%
55-64 | 6.67% | 37.78% | 55.56% | 100.00%
65-74 | 0.00% | 66.67% | 33.33% | 100.00%
Grand Total | 30.95% | 33.33% | 35.71% | 100.00%
Figure 1: Ages of buyers of different luxury cars
The ages, which range from a minimum of 35 to a maximum of 70, are divided into four age groups, each with an interval length of 10 years. The percentages in Table 1 are row percentages, that is, shares of the owners within each age group. Of the owners aged 35 to 44 years, 56.86% were BMW owners. BMW owners made up 29.73% of the owners aged 45 to 54 years and 6.67% of those aged 55 to 64 years, and no BMW owners were found in the age group above 65 years. Lexus was owned by 19.61% of the owners aged 35 to 44 years, 36.94% of those aged 45 to 54 years, 37.78% of those aged 55 to 64 years and 66.67% of those above 65 years. Mercedes was owned by 23.53% of the owners aged 35 to 44 years, 33.33% of those aged 45 to 54 years, 55.56% of those aged 55 to 64 years and 33.33% of those above 65 years. Therefore owners aged 65 to 74 primarily owned Lexus, followed by Mercedes. Those aged 55 to 64 years mostly owned Mercedes, followed by Lexus. The owners aged 35 to 44 years mostly owned BMW, followed by Mercedes and then Lexus.
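The row percentages in Table 1 can be illustrated with a short Python sketch. The owner counts below are an assumption: they are not reported in the paper, but were chosen so that they reproduce both the published percentages and the group sizes of 130 BMW, 140 Lexus and 150 Mercedes owners. The function names are illustrative only.

```python
# Owner counts per age group and car type.
# ASSUMPTION: these counts are inferred so that they reproduce the reported
# row percentages and the group sizes (130 BMW, 140 Lexus, 150 Mercedes);
# they are not taken from the raw data set.
counts = {
    "35-44": {"BMW": 58, "Lexus": 20, "Mercedes": 24},
    "45-54": {"BMW": 66, "Lexus": 82, "Mercedes": 74},
    "55-64": {"BMW": 6, "Lexus": 34, "Mercedes": 50},
    "65-74": {"BMW": 0, "Lexus": 4, "Mercedes": 2},
}


def row_percentages(table):
    """Return each cell as a percentage of its row (age group) total."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {car: 100.0 * n / total for car, n in row.items()}
    return result


def column_totals(table):
    """Return the number of owners of each car type across all age groups."""
    totals = {}
    for row in table.values():
        for car, n in row.items():
            totals[car] = totals.get(car, 0) + n
    return totals


percentages = row_percentages(counts)
totals = column_totals(counts)

for group, row in percentages.items():
    cells = ", ".join(f"{car} {pct:.2f}%" for car, pct in row.items())
    print(f"{group}: {cells}")
print("Owners per car type:", totals)
```

Because each row sums to 100%, the table answers "which car do owners of a given age hold?" rather than "how old are the owners of a given car?"; the latter question is addressed by the descriptive statistics for each car type.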
Table 2: Descriptive Statistics for ages with preference for BMW
Statistic | Age (in Years) (BMW)
Mean | 45.21538
Standard Error | 0.381908
Median | 45
Mode | 46
Standard Deviation | 4.354423
Sample Variance | 18.961
Kurtosis | 0.04742
Skewness | 0.508655
Range | 21
Minimum | 36
Maximum | 57
Sum | 5878
Count | 130
Figure 2: Age distribution of BMW
A total of 130 people were seen to be owners of the BMW model. The mean age for this ownership group is 45.22 years and the median age is 45 years. The mode of the age of BMW owners was found to be 46, meaning that 46 is the most frequently occurring age among BMW owners. The ages of BMW owners range from 36 to 57. The spread, measured by the standard deviation, is 4.35 years (variance 18.96). Since the standard deviation is small relative to the mean age (a coefficient of variation of about 9.6%), the ages of BMW owners are fairly tightly clustered around the mean. The skewness of 0.51 is positive, which indicates a mild right skew, so the shape of the distribution departs only moderately from symmetry.
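As a consistency check, the derived entries of Table 2 can be recomputed from its count, sum and standard deviation. The following is a minimal Python sketch using only values copied from the table; the variable names are illustrative.

```python
import math

# Values copied from Table 2 (age of BMW owners).
count = 130
total_age = 5878
std_dev = 4.354423

mean = total_age / count                  # arithmetic mean of the ages
variance = std_dev ** 2                   # sample variance
std_error = std_dev / math.sqrt(count)    # standard error of the mean
coeff_var = 100.0 * std_dev / mean        # coefficient of variation, in percent

print(f"mean = {mean:.5f}")
print(f"sample variance = {variance:.3f}")
print(f"standard error = {std_error:.6f}")
print(f"coefficient of variation = {coeff_var:.1f}%")
```

The recomputed mean, variance and standard error agree with the tabulated values, which suggests that the table is internally consistent.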
Table 3: Descriptive Statistics for ages with preference for Lexus
Statistic | Age (in Years) (Lexus)
Mean | 50.45714
Standard Error | 0.515472
Median | 50
Mode | 55
Standard Deviation | 6.099147
Sample Variance | 37.19959
Kurtosis | 0.611214
Skewness | 0.360262
Range | 32
Minimum | 36
Maximum | 68
Sum | 7064
Count | 140
Figure 3: Age distribution of Lexus
A total of 140 people were seen to be owners of the Lexus model. The mean age for this ownership group is 50.46 years and the median age is 50 years. The mode of the age of Lexus owners was found to be 55, meaning that 55 is the most frequently occurring age among Lexus owners. The ages of Lexus owners range from 36 to 68. The spread, measured by the standard deviation, is 6.10 years (variance 37.20). Since the standard deviation is small relative to the mean age (a coefficient of variation of about 12.1%), the age distribution is not very volatile for the group of people who own Lexus. The skewness of 0.36 is a small positive value, which implies a slight positive skew.
Table 4: Descriptive Statistics for ages with preference for Mercedes
Statistic | Age (in Years) (Mercedes)
Mean | 51.98667
Standard Error | 0.55037
Median | 53
Mode | 53
Standard Deviation | 6.740628
Sample Variance | 45.43606
Kurtosis | -0.0195
Skewness | -0.02894
Range | 35
Minimum | 35
Maximum | 70
Sum | 7798
Count | 150
Figure 4: Age distribution of Mercedes
A total of 150 people were seen to be owners of the Mercedes model. The mean age for this ownership group is 51.99 years and the median age is 53 years. The mode of the age of Mercedes owners was found to be 53, meaning that 53 is the most frequently occurring age among Mercedes owners. The ages of Mercedes owners range from 35 to 70. The spread, measured by the standard deviation, is 6.74 years (variance 45.44). Since the standard deviation is small relative to the mean age (a coefficient of variation of about 13.0%), the age distribution is not very volatile for the group of people who own Mercedes. The skewness of -0.03 is very close to zero, so the distribution is nearly symmetric, with only a negligible negative skew.
Table 5: Income of buyers of different luxury cars
Annual Income ($) | BMW | Lexus | Mercedes | Grand Total
46068-121067 | 62.16% | 27.03% | 10.81% | 100.00%
121068-196067 | 28.78% | 39.57% | 31.65% | 100.00%
196068-271067 | 6.45% | 16.13% | 77.42% | 100.00%
271068-346067 | 0.00% | 0.00% | 100.00% | 100.00%
Grand Total | 30.95% | 33.33% | 35.71% | 100.00%
Figure 5: Income of buyers of different luxury cars
The 420 sampled households were then divided on the basis of their annual income into four groups. Group 1 had income between $46068 and $121067, group 2 between $121068 and $196067, group 3 between $196068 and $271067, and group 4 between $271068 and $346067 in a year. In income group 1, 62.16% of the people owned a BMW, 27.03% owned a Lexus and 10.81% had a Mercedes. In group 2, 28.78% had a BMW, 39.57% had a Lexus and 31.65% had a Mercedes. In group 3, 6.45% had a BMW, whereas 16.13% owned a Lexus and 77.42% had a Mercedes. Nobody in income group 4 had a BMW or a Lexus: all 100% of the people with income between $271068 and $346067 owned a Mercedes. Therefore it is seen that higher income groups are most likely to own a Mercedes, whereas BMW is more common among the relatively lower income groups.
Table 6: Descriptive Statistics for income group preferring BMW
Statistic | Annual Income ($) (BMW)
Mean | 139271.3
Standard Error | 2907.846
Median | 138512
Mode | 109568
Standard Deviation | 33154.54
Sample Variance | 1.1E+09
Kurtosis | -0.22439
Skewness | -0.03855
Range | 170652
Minimum | 46068
Maximum | 216720
Sum | 18105274
Count | 130
Figure 6: Income distribution for BMW
The mean annual income of people owning BMW cars was found to be 139271 dollars. The median income was found to be 138512 dollars, implying that half of the people in this group had an annual income of at most 138512 dollars and half had at least that amount. The range of the annual income of the people owning a BMW was found to be 170652 dollars, with the minimum being 46068 dollars and the maximum being 216720 dollars. The variation in the data, measured by the standard deviation of the income distribution of this group, is 33154.54 dollars. As the standard deviation is less than the mean, the coefficient of variation is below 100%. It is about 23.8%, which is larger than the roughly 9.6% implied by Table 2 for the age of BMW owners, so income varies more in relative terms than age does. The skewness was found to be -0.0386, which indicates a very slight negative skew in the distribution.
Table 7: Descriptive Statistics for income group preferring Lexus
Statistic | Annual Income ($) (Lexus)
Mean | 154186.9
Standard Error | 2556.425
Median | 154492
Mode | 179617
Standard Deviation | 30248.02
Sample Variance | 9.15E+08
Kurtosis | 0.963641
Skewness | 0.693685
Range | 152065
Minimum | 96069
Maximum | 248134
Sum | 21586160
Count | 140
Figure 7: Income Distribution of Lexus
The mean annual income of people owning Lexus cars was found to be 154186 dollars. The median income was found to be 154492 dollars, implying that half of the people in this group had an annual income of at least 154492 dollars. The range of the annual income of the people owning a Lexus was found to be 152065 dollars, with the minimum being 96069 dollars and the maximum being 248134 dollars. The variation in the data, measured by the standard deviation of the income distribution of this group, is 30248.02 dollars. As the standard deviation is less than the mean, the coefficient of variation is below 100%. It is about 19.6%, which is larger than the roughly 12.1% implied by Table 3 for the age of Lexus owners, so income varies more in relative terms than age does. The skewness was found to be 0.694, which indicates a moderate positive skew in the distribution.
Table 8: Descriptive Statistics for income group preferring Mercedes
Statistic | Annual Income ($) (Mercedes)
Mean | 184423.9
Standard Error | 3845.333
Median | 186070
Mode | 161590
Standard Deviation | 47095.52
Sample Variance | 2.22E+09
Kurtosis | 0.987178
Skewness | 0.273966
Range | 284882
Minimum | 49941
Maximum | 334823
Sum | 27663592
Count | 150
Figure 8: Income Distribution of Mercedes
The mean annual income of people owning Mercedes cars was found to be 184423.9 dollars. The median income was found to be 186070 dollars, implying that half of the people in this group had an annual income of at least 186070 dollars. The range of the annual income of the people owning a Mercedes was found to be 284882 dollars, with the minimum being 49941 dollars and the maximum being 334823 dollars. The variation in the data, measured by the standard deviation of the income distribution of this group, is 47095.52 dollars. As the standard deviation is less than the mean, the coefficient of variation is below 100%. It is about 25.5%, which is larger than the roughly 13.0% implied by Table 4 for the age of Mercedes owners, so income varies more in relative terms than age does. The skewness was found to be 0.274, which indicates a slight positive skew in the distribution.
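The comparison of relative spread between age and income can be made explicit with the coefficient of variation. The sketch below is illustrative only: it uses the means and standard deviations copied from the descriptive statistics tables for age and income, and the helper name `coefficient_of_variation` is an assumption, not something used in the original analysis.

```python
# (mean, standard deviation) pairs copied from the descriptive statistics
# tables for age (years) and annual income (dollars) of each owner group.
summary = {
    "BMW": {"age": (45.21538, 4.354423), "income": (139271.3, 33154.54)},
    "Lexus": {"age": (50.45714, 6.099147), "income": (154186.9, 30248.02)},
    "Mercedes": {"age": (51.98667, 6.740628), "income": (184423.9, 47095.52)},
}


def coefficient_of_variation(mean, std_dev):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * std_dev / mean


cv = {
    car: {name: coefficient_of_variation(*stats) for name, stats in variables.items()}
    for car, variables in summary.items()
}

for car, values in cv.items():
    print(f"{car}: age CV = {values['age']:.1f}%, income CV = {values['income']:.1f}%")
```

Because the coefficient of variation is unit-free, it allows spread in years and spread in dollars to be compared directly; for all three groups income is more variable than age in relative terms.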
Table 9: Education years of buyers of different luxury cars
Education (years) | 1 (BMW) | 2 (Lexus) | 3 (Mercedes) | Grand Total
11-13 | 25.00% | 70.83% | 4.17% | 100.00%
14-16 | 42.31% | 33.33% | 24.36% | 100.00%
17-19 | 27.37% | 23.16% | 49.47% | 100.00%
20-22 | 0.00% | 38.46% | 61.54% | 100.00%
Grand Total | 30.95% | 33.33% | 35.71% | 100.00%
Figure 9: Education years of buyers of different luxury cars
The households were then divided into four intervals of years of education, viz., group 1 with between 11 and 13 years of education, group 2 with 14 to 16 years, group 3 with 17 to 19 years and group 4 with 20 to 22 years. In group 1, 25% of the households were seen to have a BMW, 70.83% had a Lexus and 4.17% had a Mercedes. In group 2, 42.31% owned a BMW, 33.33% owned a Lexus and 24.36% owned a Mercedes. In group 3, 27.37% owned a BMW, 23.16% owned a Lexus and 49.47% owned a Mercedes. In group 4, 61.54% of the households owned a Mercedes and 38.46% owned a Lexus, but none owned a BMW.
Table 10: Descriptive Statistics for education years showing preference for BMW
Statistic | Education (Years) (BMW)
Mean | 15.83077
Standard Error | 0.160923
Median | 16
Mode | 16
Standard Deviation | 1.834799
Sample Variance | 3.366488
Kurtosis | -0.17288
Skewness | -0.4345
Range | 8
Minimum | 11
Maximum | 19
Sum | 2058
Count | 130
Figure 10: Distribution of education years for BMW
The mean number of years of education of the households that have a BMW is 15.83, or approximately 16. The median of the distribution for BMW owners was found to be 16, that is, half of the households have at least 16 years of education. The mode was also 16, that is, 16 years is the most common length of education among households that own a BMW. The maximum number of years of education was found to be 19 and the minimum among those owning a BMW was 11, so the range is 8. The variability in the data, measured by the standard deviation, was found to be 1.835. Since the standard deviation is much lower than the mean, the coefficient of variation is well below 1 (about 0.116, or 11.6%) and hence the data show low volatility. The skewness is -0.43, indicating a mild negative skew, in other words a distribution that is reasonably close to symmetric.
Table 11: Descriptive Statistics for education years showing preference for Lexus
Statistic | Education (Years) (Lexus)
Mean | 15.8
Standard Error | 0.204069593
Median | 16
Mode | 16
Standard Deviation | 2.414583986
Sample Variance | 5.830215827
Kurtosis | -0.977282698
Skewness | 0.169719918
Range | 9
Minimum | 12
Maximum | 21
Sum | 2212
Count | 140
Figure 11: Distribution of education years for Lexus
A total of 140 people were seen to be owners of the Lexus model. The mean number of years of education for this ownership group is 15.8 and the median is 16. The mode was found to be 16, meaning that 16 years is the most common length of education among Lexus owners. The range is 9, with values lying between 12 and 21. The standard deviation is 2.415 (variance 5.830). Since the standard deviation is small relative to the mean (a coefficient of variation of about 15.3%), the years of education are fairly consistent for the group of people who own Lexus, although more spread out than for the other two groups. The skewness of 0.170 is a small positive value, which implies a slight positive skew and a shape close to symmetric.
Table 12: Descriptive Statistics for education years showing preference for Mercedes
Statistic | Education (Years) (Mercedes)
Mean | 17.29333
Standard Error | 0.142067
Median | 17
Mode | 17
Standard Deviation | 1.739963
Sample Variance | 3.027472
Kurtosis | 0.039633
Skewness | 0.081676
Range | 9
Minimum | 13
Maximum | 22
Sum | 2594
Count | 150
Figure 12: Distribution of education years for Mercedes
A total of 150 people were seen to be owners of the Mercedes model. The mean number of years of education for this ownership group is 17.29 and the median is 17. The mode was found to be 17, meaning that 17 years is the most common length of education among Mercedes owners. The range is 9, with values lying between 13 and 22. The standard deviation is 1.740 (variance 3.027). Since the standard deviation is small relative to the mean (a coefficient of variation of about 10.1%), the years of education are consistent for the group of people who own Mercedes. The skewness of 0.082 is very close to zero, which implies a nearly symmetric distribution with only a negligible positive skew.
It is then of interest to check whether there is any statistically significant difference between the ages of the owners of the three car model types, namely BMW, Lexus and Mercedes.
Null Hypothesis: mean age of BMW owners = mean age of Lexus owners = mean age of Mercedes owners.
Alternative Hypothesis: At least one of the group means differs from the others.
Table 13: Hypothesis testing for average ages of three different group of buyers
SUMMARY
Groups | Count | Sum | Average | Variance
BMW 1 | 130 | 5878 | 45.21538 | 18.961
LEXUS 2 | 140 | 7064 | 50.45714 | 37.19959
MERCEDES 3 | 150 | 7798 | 51.98667 | 45.43606
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 3436.362 | 2 | 1718.181 | 49.80171 | 4.02787E-20 | 3.017357
Within Groups | 14386.69 | 417 | 34.50044
Total | 17823.05 | 419
The ANOVA method for comparison of means was employed to test the hypotheses. The assumed level of significance was 0.05. The p-value of the test (4.03E-20) was far below 0.05, and the F statistic of 49.80 exceeds the critical value of 3.02, hence the null was rejected at the 5% level of significance. Thus it was concluded that there exists a significant difference among the car owner groups in terms of mean age.
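The ANOVA table for age can be reproduced from the group summaries alone, without the raw data. The following Python sketch shows the arithmetic of a one-way ANOVA under that assumption; the function name and dictionary layout are illustrative, and the counts, sums and variances are copied from the SUMMARY block of Table 13.

```python
# Group summaries copied from Table 13: (count, sum of ages, sample variance).
groups = {
    "BMW": (130, 5878, 18.961),
    "Lexus": (140, 7064, 37.19959),
    "Mercedes": (150, 7798, 45.43606),
}


def one_way_anova(summary):
    """Compute the one-way ANOVA F statistic from per-group summaries."""
    n_total = sum(n for n, _, _ in summary.values())
    grand_mean = sum(total for _, total, _ in summary.values()) / n_total

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        n * (total / n - grand_mean) ** 2 for n, total, _ in summary.values()
    )
    # Within-group sum of squares: pooled spread inside each group.
    ss_within = sum((n - 1) * var for n, _, var in summary.values())

    df_between = len(summary) - 1
    df_within = n_total - len(summary)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

The same function can be applied to the income and education summaries by swapping in the corresponding counts, sums and variances.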
It is then of interest to check whether there is any statistically significant difference in the annual income levels of the owners of the three car model types, namely BMW, Lexus and Mercedes.
Null Hypothesis: mean annual income of BMW owners = mean annual income of Lexus owners = mean annual income of Mercedes owners.
Alternative Hypothesis: At least one of the group means differs from the others.
Table 14: Hypothesis testing for average income of three different group of buyers
SUMMARY
Groups | Count | Sum | Average | Variance
1 (BMW) | 130 | 18105274 | 139271.3 | 1.1E+09
2 (Lexus) | 140 | 21586160 | 154186.9 | 9.15E+08
3 (Mercedes) | 150 | 27663592 | 184423.9 | 2.22E+09
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 1.5E+11 | 2 | 7.5E+10 | 52.1761 | 5.98E-21 | 3.017357
Within Groups | 5.99E+11 | 417 | 1.44E+09
Total | 7.49E+11 | 419
The ANOVA method for comparison of means was employed to test the hypotheses. The assumed level of significance was 0.05. The p-value of the test (5.98E-21) was far below 0.05, and the F statistic of 52.18 exceeds the critical value of 3.02, hence the null was rejected at the 5% level of significance. Thus it was concluded that there exists a significant difference among the car owner groups in terms of mean annual income.
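It is then of interest to check whether there is any statistically significant difference in the years of education of the owners of the three car model types, namely BMW, Lexus and Mercedes.
Null Hypothesis: mean years of education of BMW owners = mean years of education of Lexus owners = mean years of education of Mercedes owners.
Alternative Hypothesis: At least one of the group means differs from the others.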
Table 15: Hypothesis testing for years of education of three different group of buyers
SUMMARY
Groups | Count | Sum | Average | Variance
1 (BMW) | 130 | 2058 | 15.83077 | 3.366488
2 (Lexus) | 140 | 2212 | 15.8 | 5.830216
3 (Mercedes) | 150 | 2594 | 17.29333 | 3.027472
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 210.8583 | 2 | 105.4292 | 25.92566 | 2.44085E-11 | 3.017357
Within Groups | 1695.77 | 417 | 4.066595
Total | 1906.629 | 419
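The ANOVA method for comparison of means was employed to test the hypotheses. The assumed level of significance was 0.05. The p-value of the test (2.44E-11) was far below 0.05, and the F statistic of 25.93 exceeds the critical value of 3.02, hence the null was rejected at the 5% level of significance. Thus it was concluded that there exists a significant difference among the car owner groups in terms of mean years of education. The group averages suggest that this difference is driven mainly by Mercedes owners (17.29 years), since the averages for BMW (15.83) and Lexus (15.80) are almost identical, although the ANOVA itself does not identify which groups differ.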
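A significant F test says that the group means differ but not by how much. One common way to gauge this, which is not part of the original analysis and is offered here only as a supplementary sketch, is eta-squared: the share of the total sum of squares that lies between groups. The sums of squares below are copied from Tables 13, 14 and 15; the income figures are rounded in the table, so that ratio is approximate.

```python
# Between-group and total sums of squares copied from Tables 13, 14 and 15.
# The income values are rounded in the source table, so that ratio is approximate.
anova_tables = {
    "age": {"ss_between": 3436.362, "ss_total": 17823.05},
    "income": {"ss_between": 1.5e11, "ss_total": 7.49e11},
    "education": {"ss_between": 210.8583, "ss_total": 1906.629},
}


def eta_squared(ss_between, ss_total):
    """Proportion of total variation attributable to group membership."""
    return ss_between / ss_total


effect_sizes = {name: eta_squared(**ss) for name, ss in anova_tables.items()}

for name, value in effect_sizes.items():
    print(f"{name}: eta-squared = {value:.3f}")
```

On these figures, car type accounts for roughly a fifth of the variation in age and in income, and about a tenth of the variation in years of education.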
Finally, it is of interest to check whether people in older age groups tend to buy Mercedes more than BMW or Lexus. To address this problem, the car type is re-coded as 0 when the car type is either Lexus or BMW and as 1 when it is Mercedes. The relationship between age and car type is then analysed using logistic regression. The results of the logistic regression are given in the following tables:
Variable | Categories | Frequencies | %
Car | 0 | 270 | 64.286
Car | 1 | 150 | 35.714
The above table shows the frequency of Mercedes cars (coded 1) and of cars other than Mercedes (coded 0) in the re-coded data.
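The re-coding step can be sketched in a few lines of Python. The list of car types is constructed here from the group sizes reported in the paper (130 BMW, 140 Lexus, 150 Mercedes) rather than read from the original data file, and the function name `recode` is illustrative.

```python
from collections import Counter

# Car types for the 420 households, built from the reported group sizes.
# The order of the records is arbitrary and does not affect the frequencies.
car_types = ["BMW"] * 130 + ["Lexus"] * 140 + ["Mercedes"] * 150


def recode(car):
    """Return 1 for Mercedes and 0 for BMW or Lexus."""
    return 1 if car == "Mercedes" else 0


recoded = [recode(car) for car in car_types]
frequencies = Counter(recoded)
shares = {code: 100.0 * n / len(recoded) for code, n in frequencies.items()}

for code in sorted(frequencies):
    print(f"Car = {code}: {frequencies[code]} households ({shares[code]:.3f}%)")
```

The share of ones, about 35.7%, is the baseline probability of owning a Mercedes before age is taken into account.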
The fitted model parameters of the logistic regression are given in the following table:
Model parameters (Variable Car)
Source | Value | Standard error | Wald Chi-Square | Pr > Chi² | Odds ratio
Intercept | -5.647 | 0.881 | 41.037 | < 0.0001 |
Age (Years) | 0.101 | 0.017 | 33.942 | < 0.0001 | 1.107
The level of significance was assumed to be 0.05. The model contains a single covariate, age (in years), which is significant at the 0.05 level since its p-value is less than 0.0001. The effect is positive, with a coefficient of 0.101 and an odds ratio of 1.107. This means that each additional year of age multiplies the odds of the car being a Mercedes rather than a BMW or Lexus by about 1.107, that is, it raises the odds by roughly 10.7%.
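The link between the coefficient and the odds ratio is the exponential function: the odds ratio for a one-year increase in age is exp(0.101). The short Python sketch below uses the unrounded age coefficient from the fitted model equation reported in the paper; the ten-year comparison is added purely for illustration.

```python
import math

# Age coefficient from the fitted logistic regression equation in the paper.
AGE_COEFFICIENT = 0.101415874522654


def odds_multiplier(years):
    """Factor by which the odds of owning a Mercedes change over `years` of age."""
    return math.exp(AGE_COEFFICIENT * years)


one_year = odds_multiplier(1)
ten_years = odds_multiplier(10)   # illustrative horizon, not from the paper

print(f"Odds ratio per year of age: {one_year:.3f}")
print(f"Odds ratio per ten years of age: {ten_years:.3f}")
```

Note that the multiplier applies to the odds, not to the probability itself, and that it compounds: ten years of age correspond to the one-year odds ratio raised to the tenth power.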
The goodness of fit of the fitted model was checked using the Hosmer-Lemeshow test. The p-value was found to be 0.058, which is greater than 0.05, so the null hypothesis of adequate fit is not rejected. Since the p-value is only marginally above 0.05, the fit can be regarded as acceptable rather than strong.
Statistic | Chi-square | DF | Pr > Chi²
Hosmer-Lemeshow Statistic | 13.657 | 7 | 0.058
The following table tests the fitted model against the null model in which the probability of a Mercedes is constant at 0.357, the sample proportion of Mercedes owners (150 out of 420). The p-values of the likelihood ratio, Score and Wald tests are all less than 0.05, so age contributes significantly and the model that includes age is preferred to the null model.
Test of the null hypothesis H0: Y=0.357 (Variable Car)
Statistic | DF | Chi-square | Pr > Chi²
-2 Log(Likelihood) | 1 | 38.208 | < 0.0001
Score | 1 | 37.333 | < 0.0001
Wald | 1 | 33.942 | < 0.0001
Equation of the model (Variable Car):
Pred(Car) = 1 / (1 + exp(-(-5.64688902487644+0.101415874522654*Age (Years))))
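The model equation can be evaluated directly to obtain predicted probabilities. The following Python sketch implements the equation above as written; the function name and the ages chosen for the loop are illustrative assumptions, not values from the paper.

```python
import math

# Coefficients from the fitted model equation reported in the paper.
INTERCEPT = -5.64688902487644
AGE_COEFFICIENT = 0.101415874522654


def predict_mercedes_probability(age_years):
    """Predicted probability that a household of the given age owns a Mercedes."""
    linear_predictor = INTERCEPT + AGE_COEFFICIENT * age_years
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Age at which the predicted probability equals 0.5 (linear predictor is zero).
break_even_age = -INTERCEPT / AGE_COEFFICIENT

for age in (35, 45, 55, 65):   # illustrative ages within the observed range
    print(f"Age {age}: P(Mercedes) = {predict_mercedes_probability(age):.3f}")
print(f"Predicted probability reaches 0.5 at about {break_even_age:.1f} years")
```

Under this model the predicted probability of owning a Mercedes rises steadily with age and passes one half in the mid-fifties, which agrees with the age-group table, where Mercedes is the most common car among owners aged 55 to 64.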
Conclusion
The analysis showed that older people were more likely to have a Mercedes than a Lexus or a BMW. Customers with higher annual incomes were seen mostly to own Mercedes cars rather than BMW cars. Again, more years of education were seen to correspond to owning a Mercedes rather than a BMW or Lexus.
Therefore, it is recommended that businesses aim their promotional efforts for Mercedes at older and higher-paid customers, or at higher-paid and highly educated customers, and target comparatively younger customers and those with lower annual incomes with BMW, followed by Lexus. In the sample, the mean annual income was about 139,000 USD for BMW owners and about 154,000 USD for Lexus owners, against about 184,000 USD for Mercedes owners.