This project aims to find the reasons for lululemon's success by analyzing its products, prices, and reviews. The main tools used include Python, Pandas, BeautifulSoup, Selenium, Matplotlib, and Seaborn.

MaggieUBC/Lululemon-Webscraping

Lululemon-Webscraping-Project

Summary

This project provides an example notebook showing how to use BeautifulSoup and Selenium to scrape data from the web. Using the resulting dataset, it also presents data analysis and visualizations of the product distribution, the average price of each category, the most popular products, and the relationship between prices and review ratings.

Major Challenge

Lululemon's product listing is a semi-dynamic page, which is the main difficulty in collecting the data. The page loads products automatically up to a certain point and then stops; loading more requires a click. Another challenge is that the tags for product info and reviews are not located close together. The reviews only become available after scrolling down until they are on screen, which is necessary before Selenium can grab them.
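The scroll-then-click loop described above could be sketched roughly as follows. This is a minimal sketch and not the notebook's actual code: the function name `load_all_products`, the `button_selector` argument, and the `max_clicks` and `pause` parameters are all assumptions made for illustration. The function expects a Selenium WebDriver and passes the locator strategy as the plain string `"css selector"`, which is the value of Selenium's `By.CSS_SELECTOR`.

```python
import time

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def load_all_products(driver, button_selector, max_clicks=50, pause=1.0):
    """Scroll and click a "load more" button until it no longer appears.

    `button_selector` is a hypothetical CSS selector; the real one has to be
    read from the page. Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        # Scroll to the bottom so lazily loaded content is requested.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)

        buttons = driver.find_elements(CSS_SELECTOR, button_selector)
        if not buttons:
            break  # no button left, so everything is assumed to be loaded

        # Click via JavaScript so an overlapping element cannot block the click.
        driver.execute_script("arguments[0].click();", buttons[0])
        clicks += 1
        time.sleep(pause)
    return clicks


# Example usage (needs selenium and a browser driver installed):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get(listing_url)  # listing_url: the product listing page to scrape
#   load_all_products(driver, "button.load-more")  # selector is a placeholder
#   html = driver.page_source  # can then be parsed with BeautifulSoup
```

Fixed pauses keep the sketch short; an explicit wait on the button or on the product count would probably be more robust on a slow connection.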

Business insights

Based on the visualizations, 62.2% of the products are women's, and shirts and leggings are the largest women's categories, with a 51.4% share. Most of the revenue presumably comes from pants, which have the highest average price in both the women's and men's lines. The average rating of the most popular products is nearly 4.0/5.0, even for products in the higher price range.
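Figures of this kind can be computed with a few Pandas calls. The sketch below uses made-up rows, not the project's dataset, and the column names `gender`, `category`, `price`, and `rating` are assumptions; the scraped CSV may be laid out differently.

```python
import pandas as pd

# Made-up rows for illustration only; these are not the scraped data.
df = pd.DataFrame(
    {
        "gender": ["women", "women", "women", "men", "men"],
        "category": ["leggings", "shirts", "pants", "pants", "shirts"],
        "price": [98.0, 58.0, 128.0, 118.0, 68.0],
        "rating": [4.2, 4.0, 3.9, 4.1, 3.8],
    }
)

# Share of products per gender, in percent.
gender_share = df["gender"].value_counts(normalize=True).mul(100).round(1)

# Average price per gender and category, most expensive first.
avg_price = (
    df.groupby(["gender", "category"])["price"]
    .mean()
    .sort_values(ascending=False)
)

# Linear correlation between price and review rating.
price_rating_corr = df["price"].corr(df["rating"])

print(gender_share)
print(avg_price)
print(price_rating_corr)
```

With the real data, `pd.read_csv` on the repository's CSV file would replace the hand-built frame.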

How to use

To try out this project, run the code in the notebook with the data in the CSV file that was scraped from Lululemon.