A data analysis and predictive modeling project built using Python and Google Colab. The project focuses on understanding Pokémon encounter patterns in the mobile game Pokémon GO by analyzing a large dataset of user catch behavior.
This project involves:
- Data Cleaning & Preparation: Preprocessing a dataset of 300k Pokémon encounters.
- Exploratory Data Analysis (EDA): Visualizing Pokémon rarity and user catch behavior.
- Predictive Modeling: Using Classification and Clustering methods to identify rare Pokémon from catch behavior, reaching 63% accuracy.
- Comprehensive Data Cleaning: Ensures dataset quality for robust analysis.
- EDA Visualizations: Highlights Pokémon rarity trends and catch patterns.
- Machine Learning Models: Implements Classification and Clustering techniques for prediction.
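As a rough illustration of the clustering side, here is a minimal k-means sketch in plain Python. The source does not say which clustering algorithm the notebook uses, so k-means is only a familiar stand-in. The two features per point and the sample values are made up for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means on tuples of floats, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignments

# Hypothetical features per user: (catches per day, distinct species caught).
points = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, centroids=[points[0], points[3]])
```

In a Colab notebook a library implementation would normally replace this loop. The sketch only shows the assign-then-update cycle that such implementations follow.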
- Python
- Google Colab
- Machine Learning (Classification & Clustering Methods)
- Data Visualization (e.g., Matplotlib, Seaborn)
The dataset consists of 300k Pokémon encounters in Pokémon GO, including user behavior and Pokémon characteristics.
⚠️ Note: The dataset is not included in this repository. You must download the dataset yourself from the following link: Pokémon GO Dataset.
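The description above does not say how "rare" is defined. One possible approach, sketched here under that assumption, is to label a species as rare when it makes up less than a chosen share of all encounters. The `rare_share` cutoff and the sample encounters are hypothetical and are not taken from the dataset.

```python
from collections import Counter

def label_rarity(encounters, rare_share=0.01):
    """Map each species name to True (rare) or False (common).

    `encounters` is an iterable of species names, one per encounter.
    `rare_share` is a hypothetical cutoff: a species counts as rare when
    it accounts for less than this fraction of all encounters.
    """
    counts = Counter(encounters)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {species: n / total < rare_share for species, n in counts.items()}

# Made-up sample of 100 encounters, for illustration only.
sample = ["Pidgey"] * 60 + ["Rattata"] * 35 + ["Dratini"] * 4 + ["Snorlax"] * 1
labels = label_rarity(sample, rare_share=0.05)
```

With this sample, `Dratini` and `Snorlax` fall under the 5% cutoff and are labeled rare, while `Pidgey` and `Rattata` are labeled common.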
- Clone the repository:
git clone <repository_url>
cd gotta-predict-em-all
- Download the dataset from Pokémon GO Dataset and place it in the appropriate directory.
- Open the project in Google Colab.
- Follow the steps in the Jupyter Notebook to:
  - Clean and preprocess the dataset.
  - Perform EDA on Pokémon rarity and catch behavior.
  - Train and evaluate the prediction models.
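The cleaning step could look something like the sketch below. It uses only the standard library to stay dependency-free, though a notebook would more likely use a data-frame library. The column names and sample rows are hypothetical, and the notebook's actual cleaning rules may differ.

```python
import csv
import io

def clean_rows(csv_text, required):
    """Return rows with every required field filled, dropping exact duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Trim whitespace; ignore overflow columns that DictReader stores under None.
        row = {k: (v or "").strip() for k, v in row.items() if k is not None}
        if any(not row.get(col) for col in required):
            continue  # skip rows missing a required value
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            continue  # skip exact duplicates
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned

# Hypothetical columns and values, for illustration only.
raw = """species,latitude,longitude
Pidgey,40.71,-74.00
Pidgey,40.71,-74.00
Dratini,,-73.98
 Snorlax ,40.73,-73.99
"""
cleaned = clean_rows(raw, required=["species", "latitude", "longitude"])
```

Here the duplicate `Pidgey` row and the `Dratini` row with a missing latitude are dropped, leaving two cleaned rows.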
The project's outcomes:
- Achieved 63% accuracy in predicting rare Pokémon based on user catch behavior.
- Identified key factors influencing Pokémon rarity.
- Improved understanding of user behavior in Pokémon GO.
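For reference, accuracy is the fraction of predictions that match the true labels. The sketch below computes it alongside a majority-class baseline, which is one common way to put an accuracy figure in context. The labels are made up and do not reproduce the project's results.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    if not y_true:
        raise ValueError("at least one label is required")
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)

# Made-up labels: 1 = rare, 0 = not rare.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
model_acc = accuracy(y_true, y_pred)
baseline_acc = majority_baseline(y_true)
```

On these made-up labels the predictions are right 6 times out of 8, while always guessing "not rare" would be right 5 times out of 8.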
- Google Colab for providing a robust platform for analysis and modeling.
- Pokémon GO and its player community for inspiration.