This project combines web scraping with clustering and classification models. The ML models predict missing course types to complete the dataset for further analysis. The aim is to evaluate course reviews for an online education platform.

MaggieUBC/Product-Review-Project

Product-Review-Project

Summary

  • This project contains two notebooks: one for web scraping customer reviews, and one for developing the clustering and classification models that predict missing course values.
  • Note that data.csv is the overall raw dataset, and pred_data is the dataset used for visualization, with the ML predictions included.
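
The idea of filling in missing course types with a classifier can be sketched as below. This is only an illustration under stated assumptions: the rows are made up, and the choice of TF-IDF features with logistic regression from scikit-learn is an assumption, since the classifier used in the notebook is not named here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical rows: (review title, review text, course type or None if missing).
rows = [
    ("Great data course", "Learned pandas and machine learning", "Data Science"),
    ("Solid ML program", "Good coverage of statistics and machine learning", "Data Science"),
    ("Loved the web bootcamp", "Built an app with HTML and JavaScript", "Web Development"),
    ("Career change", "Learned JavaScript, HTML and CSS for the web", "Web Development"),
    ("Worth it", "The machine learning and statistics projects were great", None),
]


def to_text(title, review):
    """Combine title and review into one text feature."""
    return f"{title} {review}"


# Rows with a known course type are used for training.
labeled = [(to_text(t, r), y) for t, r, y in rows if y is not None]
# Rows with a missing course type are the ones to predict.
unlabeled = [to_text(t, r) for t, r, y in rows if y is None]

# Assumed model: TF-IDF features followed by logistic regression.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit([text for text, _ in labeled], [label for _, label in labeled])

predicted = model.predict(unlabeled)
print(list(predicted))
```

The predicted labels would then be written back into the rows where the course type was missing, giving a completed dataset in the spirit of pred_data.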

Challenges

  • The first challenge was scraping the review stars, which are shown as images inside pop-up windows; this requires strong coding skills and a good understanding of HTML.
  • Another challenge was cleaning the messy course names and using KMeans to cluster 180 course types into 9 distinct segments.
  • The last challenge was using the title and review text as features and developing a suitable ML classification model.
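
A minimal sketch of the KMeans step is shown below. It assumes scikit-learn and uses a handful of made-up course names with 3 clusters; the project itself groups about 180 course types into 9 segments. The character n-gram TF-IDF features are an assumption, not necessarily the cleaning and vectorization used in the notebook.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical raw course names; the real data has about 180 distinct values.
course_names = [
    "Data Science Bootcamp",
    "data science - part time",
    "Full Stack Web Development",
    "Web Development (Full-Stack)",
    "UX Design",
    "UX/UI Design Immersive",
]


def cluster_course_names(names, n_clusters, seed=0):
    """Group free-text course names into n_clusters segments."""
    # Character n-grams tolerate case, punctuation and spelling differences.
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    features = vectorizer.fit_transform(names)
    # A fixed seed keeps the cluster assignment reproducible.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(features)


labels = cluster_course_names(course_names, n_clusters=3)
for name, label in zip(course_names, labels):
    print(label, name)
```

For the full dataset the call would use n_clusters=9, and each cluster would still need a manual look to assign it a readable segment name.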

Findings and Future work

  • Based on an evaluation of curriculum, instructor, job assistance, and overall rating, the top two institutes are Le Wagon and WeCloudData. Le Wagon mainly focuses on Full Stack and Web Development, while WeCloudData focuses on Data Science.
  • As a next step, the predicted dataset could be used to understand the pros and cons of the courses at each institute by analyzing word frequencies in reviewers' comments.
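
The word-frequency analysis proposed as future work could start from something like the sketch below. The reviews and the stopword list are made up for illustration, and only the Python standard library is used.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "and", "the", "was", "were", "but"}

# Hypothetical (institute, comment) pairs standing in for the scraped reviews.
reviews = [
    ("Le Wagon", "Great teachers and a great community"),
    ("Le Wagon", "The pace was fast but the teachers were helpful"),
    ("WeCloudData", "Practical projects and helpful career support"),
]


def word_frequencies(reviews, top_n=3):
    """Return the top_n most frequent non-stopwords per institute."""
    counts = defaultdict(Counter)
    for institute, comment in reviews:
        # Lowercase and keep alphabetic tokens only.
        words = re.findall(r"[a-z']+", comment.lower())
        counts[institute].update(w for w in words if w not in STOPWORDS)
    return {inst: c.most_common(top_n) for inst, c in counts.items()}


top_words = word_frequencies(reviews, top_n=2)
print(top_words["Le Wagon"])  # [('great', 2), ('teachers', 2)]
```

Splitting the comments by rating before counting (for example, high versus low overall rating) would be one way to separate pros from cons.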
