Analysis of Star Ratings
Yelp is an online platform that accepts and publishes reviews of local businesses from any user. The businesses reviewed range from hotels, shops and restaurants to dentists, hairstylists and many more.
Restaurants, cafes and other establishments that serve food are particularly popular subjects of Yelp reviews. This paper is an objective exploration of the reviews of businesses such as restaurants, cafes and hotels that are available via Yelp. Data mining techniques are used to gather the data and provide key statistical insights. Specifically, the analysis summarises the available reviews using sentiment mining. The analysis was carried out in R.
Analysis
The statistical summary of number of stars is given in the following table:
Statistic | Value
Mean | 3.743
Standard deviation | 1.311
Minimum | 1.000
1st quartile | 3.000
Median | 4.000
3rd quartile | 5.000
Maximum | 5.000
Table 1: Summary of Star ratings
The mean star rating of businesses on Yelp is 3.74 and the median is 4. The third quartile equals the maximum of 5, so at least a quarter of the ratings are 5 stars. The standard deviation of 1.311 is moderate for a 1 to 5 scale, and together with the high median it suggests that ratings on Yelp are concentrated at the upper end of the scale.
The statistical summary of review length is given as:
Statistic | Value
Mean | 125.6013
Standard deviation | 115.4981
Minimum | 0.0
1st quartile | 48.0
Median | 92.0
3rd quartile | 165.0
Maximum | 1047.0
Table 2: Summary statistics of review lengths
The mean review length is 125.60 words, but the standard deviation is high at 115.50. The minimum is 0, the median is 92 and the maximum is 1047, while the third quartile is only 165, far below the maximum. Review lengths are thus quite varied. The distribution appears positively skewed, since the median is lower than the mean and the third quartile is much lower than the maximum; that is, at least half of the reviews are shorter than the mean of 125.60.
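As an illustration of how a summary like the ones in Tables 1 and 2 can be produced, the following is a minimal sketch in Python (the original analysis was done in R, and its code is not shown in the paper). The function name `summarize` and the sample values are hypothetical and are not the Yelp data. The last line applies the same rule of thumb used in the text: a mean above the median hints at positive skew.

```python
import statistics

def summarize(values):
    """Return the statistics reported in the summary tables for a list of numbers."""
    # The "inclusive" method interpolates between the sorted data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

# Hypothetical review lengths in words, chosen only for illustration.
lengths = [0, 20, 48, 60, 92, 110, 165, 300, 1047]
summary = summarize(lengths)

# Rule of thumb from the text: mean above median suggests positive skew.
right_skew_hint = summary["mean"] > summary["median"]
```

With a few very long reviews in the sample, the mean is pulled well above the median, which is the pattern described for the real review lengths.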
The statistical summary of positive words is given as:
Statistic | Value
Mean | 7.071689
Standard deviation | 5.927295
Minimum | 0.000
1st quartile | 3.000
Median | 6.000
3rd quartile | 9.000
Maximum | 94.000
Table 3: Summary Statistics of positive words used in review
The mean number of positive words per review is 7.07, with a moderately high standard deviation of 5.93. The minimum is 0, the median is 6 and the maximum is 94, while the third quartile is only 9, far below the maximum. The reviews are thus quite varied in the number of positive words used. The distribution appears positively skewed, since the median is lower than the mean and the third quartile is much lower than the maximum.
The statistical summary of negative words used per review is given as:
Statistic | Value
Mean | 2.54982
Standard deviation | 3.250065
Minimum | 0.00
1st quartile | 0.00
Median | 2.00
3rd quartile | 4.00
Maximum | 65.00
Table 4: Summary Statistics of negative words used per review
The mean number of negative words per review is 2.55, with a relatively high standard deviation of 3.25. The minimum and the first quartile are both 0, the median is 2 and the maximum is 65, while the third quartile is only 4, far below the maximum. A typical review therefore contains few negative words. The distribution appears positively skewed, since the median is lower than the mean and the third quartile is much lower than the maximum; at least half of the reviews contain 2 or fewer negative words.
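The paper does not say which sentiment lexicon or tokenisation was used to count positive and negative words. The sketch below shows one simple way such counts could be obtained in Python. The two word lists, the function name `sentiment_counts` and the example sentence are all hypothetical. Computing the net score as positive minus negative is also an assumption, although the means in Tables 3, 4 and 5 are consistent with it (7.07 - 2.55 = 4.52).

```python
import re

# Hypothetical toy lexicons; the lexicon used in the original analysis is not stated.
POSITIVE = {"good", "great", "friendly", "delicious", "love"}
NEGATIVE = {"bad", "rude", "slow", "cold", "terrible"}

def sentiment_counts(review):
    """Count words, positive words and negative words in one review."""
    # Lowercase the text and keep runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", review.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    # Net sentiment as positive minus negative count (an assumption, see above).
    return {"length": len(words), "positive": pos, "negative": neg, "net": pos - neg}

example = sentiment_counts("Great food and friendly staff, but the service was slow.")
```

A real lexicon contains thousands of entries, so actual counts depend heavily on which word lists are chosen.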
The statistical summary of net sentiment scores is given by:
Statistic | Value
Mean | 4.521869
Standard deviation | 5.240387
Minimum | -59.000
1st quartile | 1.000
Median | 4.000
3rd quartile | 7.000
Maximum | 80.000
Table 5: Summary Statistics of net sentiment scores of reviews
The mean score is 4.52 and the median is close to it at 4, with a standard deviation of 5.24. The minimum is -59 and the maximum is 80, so the range is very wide at 80 - (-59) = 139. The main problem lies in the nature of the measure itself. The number of stars, the positive and negative word counts and the review length are all discrete, non-negative measures that characterise a review. The net sentiment score, by contrast, can be negative as well as positive, so positive and negative reviews can cancel each other out when the scores are averaged. For example, three moderately positive reviews with sentiment scores of 2, 3 and 4 and one negative review with a score of -9 give a mean sentiment score of 0. The mean is driven by the magnitude of the scores rather than by the number of positive and negative reviews, and so fails to show that the positive reviews outnumber the negative one three to one.
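The cancellation example above can be checked in a few lines of Python. The share of reviews with a positive score is included here only as one possible complementary measure; it is a suggestion and is not used in the original analysis.

```python
from statistics import mean

# The four sentiment scores from the example in the text.
scores = [2, 3, 4, -9]

# The single strongly negative score cancels the three positive ones.
mean_score = mean(scores)

# Counting signs instead of averaging magnitudes keeps the 3-to-1 majority visible.
share_positive = sum(s > 0 for s in scores) / len(scores)
```

Here `mean_score` is 0 while `share_positive` is 0.75, which shows how the two measures can tell different stories about the same reviews.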
The following table gives the frequency distribution of positive word count per review ranging from 0 to 19:
Observation | Positive words | Frequency
1 | 0 | 50339
2 | 1 | 103321
3 | 2 | 140506
4 | 3 | 162600
5 | 4 | 164596
6 | 5 | 151969
7 | 6 | 132809
8 | 7 | 113252
9 | 8 | 94409
10 | 9 | 78414
11 | 10 | 64375
12 | 11 | 52484
13 | 12 | 43092
14 | 13 | 35415
15 | 14 | 29489
16 | 15 | 24467
17 | 16 | 20168
18 | 17 | 16812
19 | 18 | 13975
20 | 19 | 11732
Table 6: Frequency table of positive words
The following figure gives the frequency distribution of positive words per review. The distribution is positively skewed: most reviews contain between 0 and 12 positive words, and the frequencies taper off beyond that.
Figure 1: Distribution of Positive words
The following table gives the frequency distribution of negative word count per review ranging from 0 to 19:
Observation | Negative words | Frequency
1 | 0 | 436419
2 | 1 | 340959
3 | 2 | 235540
4 | 3 | 161944
5 | 4 | 112062
6 | 5 | 77610
7 | 6 | 54395
8 | 7 | 38704
9 | 8 | 28010
10 | 9 | 20340
11 | 10 | 14880
12 | 11 | 10980
13 | 12 | 8302
14 | 13 | 6341
15 | 14 | 4749
16 | 15 | 3744
17 | 16 | 2938
18 | 17 | 2164
19 | 18 | 1810
20 | 19 | 1459
Table 7: Frequency table of negative words
The following figure gives the frequency distribution of negative words per review. The distribution is highly positively skewed: most reviews contain fewer than 5 negative words, and the frequencies fall steadily as the count rises.
Figure 2: Distribution of negative words
The following table gives the frequency distribution of net sentiment per review for the 20 lowest net sentiment scores observed in the dataset:
Observation | Net sentiment | Frequency
1 | -59 | 1
2 | -44 | 1
3 | -41 | 1
4 | -38 | 1
5 | -36 | 2
6 | -35 | 2
7 | -34 | 1
8 | -33 | 4
9 | -32 | 1
10 | -31 | 1
11 | -30 | 4
12 | -29 | 9
13 | -28 | 5
14 | -27 | 7
15 | -26 | 9
16 | -25 | 7
17 | -24 | 16
18 | -23 | 18
19 | -22 | 32
20 | -21 | 21
Table 8: Frequency table of net sentiment score
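The bulk of the net sentiment distribution lies at positive values: the first quartile in Table 5 is 1, so at least three quarters of the reviews have a net score of 1 or higher. The strongly negative scores in Table 8 are rare, with single-digit frequencies for every score of -25 or below, and form a long, thin left tail. Overall sentiment is therefore expected to be positive.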
Figure 3: Distribution of net sentiment
The following table gives the mean, standard deviation, median and inter-quartile range of review lengths categorised as per the number of star ratings that they have received.
Stars | Mean | SD | Median | IQR
1 | 152.7152 | 142.865 | 109 | 141
2 | 153.9839 | 131.189 | 117 | 138
3 | 140.5944 | 118.4173 | 109 | 126
4 | 125.0374 | 109.3754 | 95 | 116
5 | 105.93 | 102.1045 | 75 | 97
Table 9: Summary of review length over the star ratings
The following figure compares review length across the different star ratings. The median was used for the comparison because, unlike the mean, it is largely unaffected by outliers.
Figure 4: Review length against star rating
Reviews rated 2 stars have the longest median length at 117 words, followed by 1 star and 3 star reviews, which are tied at 109 words. Reviews rated 4 and 5 stars are shorter, with medians of 95 and 75 words respectively. This suggests that the length of a review is influenced more by dissatisfaction than by satisfaction on the part of the user, so that negative reviews tend to be lengthier than positive ones.
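A grouped comparison of this kind can be sketched in Python as follows. The `(stars, length)` pairs and the function name `median_length_by_stars` are hypothetical; the values were chosen only so that the toy medians echo three of the medians in Table 9.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (stars, review length in words) pairs, not the Yelp data.
reviews = [(1, 120), (1, 98), (2, 130), (2, 104), (5, 60), (5, 90), (5, 75)]

def median_length_by_stars(rows):
    """Group review lengths by star rating and return the median of each group."""
    groups = defaultdict(list)
    for stars, length in rows:
        groups[stars].append(length)
    return {stars: median(lengths) for stars, lengths in sorted(groups.items())}

medians = median_length_by_stars(reviews)
```

Using the median per group keeps a handful of very long reviews from dominating the comparison, which is the reason given in the text for preferring it over the mean.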
A total of 5880 businesses were found to have a mean rating of 5 stars and 759 businesses a mean rating of 1 star. There are thus 5880 top rated businesses tied at the highest possible rating and 759 worst rated businesses tied at the lowest. The following figure gives a subset of the top rated businesses that were identified.
Figure 5: Best Rated Businesses
The following figure gives a subset of worst rated businesses that were identified.
Figure 6: Worst Rated Businesses
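Identifying the best and worst rated businesses amounts to averaging the star ratings per business and selecting those whose mean equals 5 or 1. The sketch below illustrates the idea in Python; the business ids, the ratings and the function name `mean_rating_by_business` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (business id, stars) pairs, one per review.
reviews = [
    ("biz_a", 5), ("biz_a", 5),
    ("biz_b", 1),
    ("biz_c", 4), ("biz_c", 5),
    ("biz_d", 1), ("biz_d", 1),
]

def mean_rating_by_business(rows):
    """Average the star ratings of each business."""
    ratings = defaultdict(list)
    for business, stars in rows:
        ratings[business].append(stars)
    return {business: mean(stars) for business, stars in ratings.items()}

means = mean_rating_by_business(reviews)

# A mean of exactly 5 (or 1) requires every review of the business to be 5 (or 1) stars.
top_rated = sorted(b for b, m in means.items() if m == 5)
worst_rated = sorted(b for b, m in means.items() if m == 1)
```

Note that under this rule a business with a single 5 star review ties with one that has many, which may help explain why so many businesses share the top and bottom ratings.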
114023 users were found to have given out 5 star ratings and 42004 were found to have given out 1 star ratings. The user ids of some of those users are given in Figures 7 and 8 respectively. Hence more users have given the highest rating than have given the lowest.
Figure 7: Top rating users
Figure 8: Worst rating users
The following figure shows that reviews rated either 5 stars or 1 star are voted useful more often than those that fall in between. This suggests that readers value reviews describing extreme experiences, whether very good or very bad, more than moderate or ambiguous ones. People are more interested in learning that a business is really good or really bad than in reading about middling experiences. Clear cut, specific reviews are appreciated more.
Figure 9: Relationship between being voted useful and star rating
Conclusion
The mean review length on Yelp was found to be about 125 words, although at least half of the reviews are 92 words or shorter. Positive words are more common than negative words, and the net sentiment of the reviews is positive overall. There are 5880 businesses with a mean rating of 5 stars and 759 with a mean rating of 1 star that have been reviewed on Yelp. Likewise, 114023 users had given out 5 star ratings while 42004 had given out 1 star ratings, so ratings on the site lean towards the high end. Another finding was that reviews with lower star ratings tend to be longer, and that reviews at the extremes of the scale, with either 5 stars or 1 star, are voted useful more often than those with moderate ratings.