Advanced Regression (Regularization) on Surprise House Dataset

This project applies L1 (lasso) and L2 (ridge) regularization to the Surprise House Dataset, compares the two, and identifies which variables are significant in predicting the price of a house and how well those variables describe the price.

General Information

Business Problem

A US-based housing company named Surprise Housing has decided to enter the Australian market. The company uses data analytics to purchase houses at a price below their actual value and flip them at a higher price. To that end, the company has collected a data set from the sale of houses in Australia and is looking at prospective properties to buy in order to enter the market. You are required to build a regression model using regularisation in order to predict the actual value of the prospective properties and decide whether to invest in them or not. The company wants to know which variables are significant in predicting the price of a house, and how well those variables describe the price of a house.

Goal of the Project

Build a regression model using regularisation in order to predict the actual value of the prospective properties and decide whether to invest. Determine the optimal value of lambda for ridge and lasso regression. This model will then be used by the management to understand how exactly the prices vary with the variables.

Business Requirement

You are required to model the price of houses with the available independent variables. This model will then be used by the management to understand how exactly the prices vary with the variables. They can accordingly manipulate the strategy of the firm and concentrate on areas that will yield high returns. Further, the model will be a good way for management to understand the pricing dynamics of a new market.

Steps involved

Data Load and Analysis
Data Wrangling
Exploratory Data Analysis
Splitting the dataset
Scaling of the variables (RobustScaler)
Modelling
Tuning with Regularization (Ridge & Lasso)
Model Evaluation
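The splitting, scaling, and tuning steps above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the data is synthetic (not `train.csv`), the `ALPHAS` grid, the `tune` helper, the 70/30 split, and the 5-fold cross-validation are all assumptions made for the example. Note that scikit-learn names the regularization strength (lambda) `alpha`.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for the cleaned features and target (not the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
true_coef = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ true_coef + rng.normal(scale=0.5, size=300)

# Split first so the scaler is only ever fitted on training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Hypothetical search grid for the regularization strength (lambda).
ALPHAS = np.logspace(-4, 1, 20)


def tune(estimator):
    """Scale with RobustScaler, then pick alpha by 5-fold cross-validation."""
    pipe = Pipeline([("scale", RobustScaler()), ("model", estimator)])
    search = GridSearchCV(pipe, {"model__alpha": ALPHAS}, cv=5, scoring="r2")
    return search.fit(X_train, y_train)


ridge_search = tune(Ridge())
lasso_search = tune(Lasso(max_iter=10_000))

for name, search in [("Ridge", ridge_search), ("Lasso", lasso_search)]:
    best_alpha = search.best_params_["model__alpha"]
    test_r2 = search.score(X_test, y_test)  # R^2 on the held-out split
    print(f"{name}: best alpha = {best_alpha:.6f}, test R^2 = {test_r2:.4f}")
```

Wrapping the scaler and the model in one pipeline means each cross-validation fold refits the scaler on its own training portion, which avoids leaking information from the validation fold into the scaling statistics.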

Result

We achieved the following results:

Ridge
- Test Accuracy : 0.867015
- Mean Absolute Error : 0.090746
- Durbin-Watson value : 1.9778
- Lambda : 0.00079901
Lasso
- Test Accuracy : 0.894023
- Mean Absolute Error : 0.072316
- Durbin-Watson value : 1.9778
- Lambda : 0.00058701
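For reference, the three kinds of metric reported above can be computed as in the sketch below. It assumes that "Test Accuracy" refers to the coefficient of determination (R^2) on the test split, which this README does not state explicitly. The function names and the toy arrays are illustrative only and are unrelated to the reported numbers.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def durbin_watson(residuals):
    """Sum of squared successive residual differences over sum of squared residuals.

    Values near 2 suggest little first-order autocorrelation in the residuals.
    """
    residuals = np.asarray(residuals, float)
    return float(np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2))


# Toy example (made-up values).
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 1.5, 3.5, 3.5])

print("R^2:", r2_score(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("Durbin-Watson:", durbin_watson(y_true - y_pred))
```

A Durbin-Watson value close to 2, as reported for both models, is usually read as support for the independence-of-errors assumption of linear regression.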

Conclusion

As can be observed, Lasso performed better than Ridge: it has a higher test accuracy (0.894 vs. 0.867) and a lower mean absolute error (0.072 vs. 0.091). Both models are valid, as they follow the assumptions of linear regression. For the final conclusion we will use Lasso, because its built-in feature selection reduces the number of variables and makes the model easier to understand.
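The feature-selection behaviour mentioned here can be illustrated with a small sketch. It uses synthetic data in which only three of ten features carry signal, and a fixed `alpha=0.1` chosen for the illustration (not the tuned lambda values reported above): the L1 penalty drives some coefficients exactly to zero, whereas the L2 penalty only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only the first three of ten features influence the target.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
true_coef = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ true_coef + rng.normal(scale=0.5, size=300)

# Same (illustrative) regularization strength for both models.
ridge = Ridge(alpha=0.1).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# Count coefficients that are not exactly zero.
n_ridge = int(np.count_nonzero(ridge.coef_))
n_lasso = int(np.count_nonzero(lasso.coef_))

print(f"Ridge keeps {n_ridge} of {X.shape[1]} features")
print(f"Lasso keeps {n_lasso} of {X.shape[1]} features")
```

On the real dataset, the variables with non-zero Lasso coefficients would be the candidates for the "significant variables" that the business problem asks about.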

Technologies Used

Pandas - version 1.3.4
NumPy - version 1.20.3
MatplotLib - version 3.4.3
Seaborn - version 0.11.2
Scikit-Learn - version 0.24.2

Acknowledgements

This project was inspired by the UpGrad IITB Programme, as a case study for the Machine Learning and Artificial Intelligence course.

Contact

Created by [@sukhijapiyush] - feel free to contact me!

License

This project is open source and available without restrictions.
