Random Forest Classification on Water Pump Dataset

The accompanying paper is DMA_Project-1.pdf.

This began as a paired data science project at university. The other group member, @pointonjoel, implemented a logistic regression model on the same dataset, while I implemented a random forest model.

The code for the random forest classifier is in rfCLASSIFIER.ipynb.

The highest score achieved is 80.21%, measured by the classification rate: the percentage of rows where the predicted class ŷ in the submission matches the actual class y in the test set.
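As a rough illustration of the metric described above, the classification rate is the number of matching predictions divided by the total number of rows. The sketch below is not taken from rfCLASSIFIER.ipynb; the function name `classification_rate` and the sample labels are assumptions made up for this example.

```python
def classification_rate(y_true, y_pred):
    """Return the fraction of rows where the predicted label equals the actual label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot score an empty set of labels")
    # Count the rows where prediction and ground truth agree.
    matches = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return matches / len(y_true)


# Hypothetical labels for illustration only; not drawn from the project's data.
actual = ["functional", "non functional", "functional", "needs repair", "functional"]
predicted = ["functional", "non functional", "non functional", "needs repair", "functional"]

rate = classification_rate(actual, predicted)
print(f"{rate:.2%}")  # 4 of the 5 rows match
```

Multiplying the returned fraction by 100 gives a percentage in the same form as the score reported above.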

Repository contents:

Braganza_Predictions
FeatureImportance
PreprocessingRF
DMA_Project-1.pdf
LICENSE
README.md
Test set values.csv
Training set labels.csv
Training set values.csv
analysis.ipynb
analysis1.ipynb
rfCLASSIFIER.ipynb
training_altitude_data.csv
tza_adm2.shp
tza_adm2.shx
