Commit b0e181b

Modified figure
1 parent 335af13 commit b0e181b

5 files changed: 6 additions, 24 deletions

Report/report.log

Lines changed: 5 additions & 23 deletions
@@ -1,4 +1,4 @@
-This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) (preloaded format=pdflatex 2016.5.22) 30 NOV 2016 18:19
+This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) (preloaded format=pdflatex 2016.5.22) 30 NOV 2016 18:33
 entering extended mode
 restricted \write18 enabled.
 %&-line parsing enabled.
@@ -494,21 +494,6 @@ Underfull \hbox (badness 10000) in paragraph at lines 201--203
 
 []
 
-
-LaTeX Warning: Reference `wc1' on page 3 undefined on input line 207.
-
-
-LaTeX Warning: Reference `wc2' on page 3 undefined on input line 207.
-
-
-LaTeX Warning: Reference `wc3' on page 3 undefined on input line 207.
-
-
-LaTeX Warning: Reference `wc4' on page 3 undefined on input line 207.
-
-
-LaTeX Warning: Reference `wc5' on page 3 undefined on input line 207.
-
 LaTeX Font Info: Try loading font information for U+msa on input line 213.
 (/usr/local/texlive/2016/texmf-dist/tex/latex/amsfonts/umsa.fd
 File: umsa.fd 2013/01/14 v3.01 AMS symbols A
@@ -536,12 +521,12 @@ File: len_count.png Graphic file (type png)
 Package pdftex.def Info: len_count.png used on input line 266.
 (pdftex.def) Requested size: 250.9383pt x 200.75136pt.
 
-<star_review_count.png, id=24, 521.0667pt x 359.1819pt>
+<star_review_count.png, id=24, 721.2546pt x 506.6127pt>
 File: star_review_count.png Graphic file (type png)
 
 <use star_review_count.png>
 Package pdftex.def Info: star_review_count.png used on input line 273.
-(pdftex.def) Requested size: 250.93605pt x 200.75136pt.
+(pdftex.def) Requested size: 250.93513pt x 200.74756pt.
 
 <1_star_wordcloud_500k.png, id=25, 963.6pt x 963.6pt>
 File: 1_star_wordcloud_500k.png Graphic file (type png)
@@ -634,16 +619,13 @@ ne 595.
 
 [15] (./report.aux)
 
-LaTeX Warning: There were undefined references.
-
-
 LaTeX Warning: There were multiply-defined labels.
 
 )
 Here is how much of TeX's memory you used:
 7708 strings out of 493014
 116629 string characters out of 6133351
-444739 words of memory out of 5000000
+444795 words of memory out of 5000000
 11100 multiletter control sequences out of 15000+600000
 22029 words of font info for 51 fonts, out of 8000000 for 9000
 1141 hyphenation exceptions out of 8191
@@ -661,7 +643,7 @@ live/2016/texmf-dist/fonts/type1/public/cm-super/sfrm2074.pfb></usr/local/texli
 ve/2016/texmf-dist/fonts/type1/public/cm-super/sfti1095.pfb></usr/local/texlive
 /2016/texmf-dist/fonts/type1/public/cm-super/sfti1440.pfb></usr/local/texlive/2
 016/texmf-dist/fonts/type1/public/cm-super/sftt1095.pfb>
-Output written on report.pdf (15 pages, 1338214 bytes).
+Output written on report.pdf (15 pages, 1338768 bytes).
 PDF statistics:
 125 PDF objects out of 1000 (max. 8388607)
 73 compressed objects within 1 object stream

Report/report.pdf

554 Bytes
Binary file not shown.

Report/report.synctex.gz

66 Bytes
Binary file not shown.

Report/report.tex

Lines changed: 1 addition & 1 deletion
@@ -201,7 +201,7 @@ \section{Exploratory Analysis of Yelp Dataset}
 Since there are so many reviews for restaurants, we decided to focus on the subset of restaurant reviews (obtained in the pre-processing step) for further analysis.
 Next we observed the variation of the average star rating against the \textit{length of the reviews (in characters)}. The average rating varies considerably for reviews with a higher number of characters, indicating that the polarity of reviews fluctuates much more from one length to another. There is less fluctuation in the average rating of reviews from $\sim$50 to $\sim$500 characters in length. Hence we chose to subset the reviews based on minimum and maximum length thresholds. Figure \ref{average_length} shows this trend. Another motive for reducing the subset of reviews comes from Figure \ref{length_count}, which shows that the number of reviews with length $>$700 is very small compared to the overall size of the review corpus. Combining the analysis from this and the previous figure, we subsetted the reviews.\\
 
-Next we observed the distribution of reviews across star ratings, i.e. which star rating each review was given by a user. From Figure \ref{star_distribution}, we can see that the distribution of reviews is skewed in terms of the star ratings they received. A majority of reviews have a 5-star rating, while the count for 2-star reviews is the lowest. This later forms the basis for creating training datasets from data sampled from each star rating, in order to ensure even representation in the training corpus.
+Next we observed the distribution of reviews across star ratings, i.e. which star rating each review was given by a user. From Figure \ref{star_distribution}, we can see that the distribution of reviews is skewed in terms of the star ratings they received. A majority of reviews have a 5-star or a 4-star rating, while the count for 1-star reviews is the lowest. This later forms the basis for creating training datasets from data sampled from each star rating, in order to ensure even representation in the training corpus.
 
 \subsection{Word Cloud}
 Given the distribution of reviews, we decided to capture the common sentiment of the reviews for each star rating individually. We did this by plotting a word cloud of the review texts grouped by star rating. To do this, we preprocessed the review texts to remove common stop words, tokenized them, and then plotted the resulting words. Figures \ref{wc1}, \ref{wc2}, \ref{wc3}, \ref{wc4}, \ref{wc5} show these word clouds.

Report/star_review_count.png

1.35 KB