Supervised-Learning-Classification

  • This repository includes two files:
    • A Jupyter notebook with the Python code for data analysis and model building
    • A CSV file containing the data imported into the notebook

Problem Statement

  • Analyze INN Hotels' booking data to identify the factors that most strongly influence cancellations, build a model that predicts in advance which bookings will be canceled, and use the findings to help formulate profitable cancellation and refund policies.

Skills and Tools

  • Exploratory data analysis (variable identification, univariate analysis, bivariate analysis)
  • Data Pre-processing
  • Logistic regression
    • Multicollinearity
    • Optimal classification threshold chosen from the ROC curve (AUC-ROC)
  • Decision trees
    • Pruning
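
To illustrate the "optimal threshold" item above, here is a minimal, dependency-free sketch of picking a classification threshold from ROC points. It assumes the threshold is chosen by maximizing Youden's J (TPR minus FPR), which is one common criterion and not necessarily the one used in the notebook. The function names, labels, and scores below are hypothetical and only for illustration; in practice a library routine such as scikit-learn's `roc_curve` would supply the ROC points.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) triples, one per distinct score."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # Predict "canceled" (1) when the score is at or above the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def best_threshold(y_true, scores):
    """Pick the threshold that maximizes Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(y_true, scores), key=lambda p: p[2] - p[1])[0]


# Hypothetical labels (1 = canceled) and predicted cancellation probabilities.
y_true = [0, 0, 1, 1, 1, 0, 0, 1]
scores = [0.10, 0.30, 0.40, 0.55, 0.60, 0.70, 0.20, 0.90]

threshold = best_threshold(y_true, scores)
print(threshold)  # prints 0.4
```

On this toy data a threshold of 0.4 catches all four cancellations at the cost of one false alarm. A hotel that loses more from a missed cancellation than from a false alarm might prefer a criterion weighted toward recall instead.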