Yelp Rating Prediction using Sentiment Analysis
A course project for Artificial Intelligence
---
Description
On Yelp.com, customers can rate a business on a scale of 1 to 5 stars and can also write a text review describing their experience. Customers can optionally provide three other rating scores: useful, cool, and funny. Each of these fields can also be rated on a scale of 1 to 5.
This project aims to develop a sentiment analysis model that can predict any of these rating scores for a business on Yelp.com from the content of the review text. Specifically, the star rating is treated as a classification task, while the other three rating scores are treated as regression tasks.
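As a rough illustration of this framing, the sketch below combines a cross-entropy term for the star rating with a mean-squared-error term for the useful, cool, and funny scores. The function names, the `reg_weight` parameter, and the choice of loss terms are assumptions made for illustration; the objective actually used in this project may differ.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(star_logits, star_label, reg_preds, reg_targets, reg_weight=1.0):
    """Combine a classification loss and a regression loss for one review.

    star_logits: five raw scores, one per star class (1 to 5 stars).
    star_label:  the true star rating, an integer from 1 to 5.
    reg_preds / reg_targets: predicted and true (useful, cool, funny) scores.
    reg_weight:  hypothetical weight balancing the two tasks (an assumption).
    """
    # Classification term: cross-entropy on the star rating.
    probs = softmax(star_logits)
    cross_entropy = -math.log(probs[star_label - 1])

    # Regression term: mean squared error over the three extra scores.
    squared_errors = [(p - t) ** 2 for p, t in zip(reg_preds, reg_targets)]
    mse = sum(squared_errors) / len(squared_errors)

    return cross_entropy + reg_weight * mse
```

Summing the two terms lets a single model be trained on both tasks at once; `reg_weight` would need tuning so that neither task dominates.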
The data used in this project can be found here. Please download the JSON file and locate yelp_academic_dataset_review.json.
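The review file stores one JSON object per line, so it can be streamed rather than loaded in one piece. The sketch below shows one way to read it, assuming the fields `text`, `stars`, `useful`, `cool`, and `funny` are present in each record; the helper name `iter_reviews` is hypothetical, and skipping incomplete records is only one possible way to handle missing values.

```python
import json

# Target fields this project predicts from the review text.
FIELDS = ("stars", "useful", "cool", "funny")


def iter_reviews(lines):
    """Yield one dict per usable review from an iterable of JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        text = record.get("text")
        # Skip records with an empty review or a missing target value.
        if not text or any(record.get(field) is None for field in FIELDS):
            continue
        yield {"text": text, **{field: record[field] for field in FIELDS}}
```

For example, opening the file with `open("yelp_academic_dataset_review.json", encoding="utf-8")` and passing the file object to `iter_reviews` processes the reviews one at a time without holding the whole dataset in memory.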
---
About
This is a course project for my Artificial Intelligence class. In this project, I:
- Processed 6,990,280 data entries, including dataset splitting, handling of missing values, text preprocessing, and text tokenization.
- Developed and trained a multi-task CNN text model, capable of classification and regression, to predict review scores from Yelp comments.
- Fine-tuned the model with random search and early stopping.
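
The tuning step in the last bullet can be sketched roughly as follows. This is a minimal, framework-free illustration, not the code used in this project: the search space in `sample_config`, the `patience` value, and the `validation_curve` callable (assumed to train a model for a given configuration and yield one validation loss per epoch) are all hypothetical.

```python
import random


def sample_config(rng):
    """Draw one random hyperparameter configuration (assumed search space)."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -2),  # log-uniform in [1e-4, 1e-2]
        "num_filters": rng.choice([64, 128, 256]),
        "dropout": rng.uniform(0.1, 0.5),
    }


def run_with_early_stopping(epoch_losses, patience=3):
    """Consume per-epoch validation losses until they stop improving.

    Stops after `patience` consecutive epochs without a new best loss.
    Returns (best_loss, epochs_run).
    """
    best_loss = float("inf")
    stale_epochs = 0
    epochs_run = 0
    for loss in epoch_losses:
        epochs_run += 1
        if loss < best_loss:
            best_loss = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break  # no improvement for `patience` epochs
    return best_loss, epochs_run


def random_search(validation_curve, n_trials=10, patience=3, seed=0):
    """Try `n_trials` random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        loss, _ = run_with_early_stopping(validation_curve(config), patience)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss
```

If `validation_curve` is written as a generator that trains one epoch per yielded value, breaking out of the loop also stops training, which is where early stopping saves compute during the search.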
---