Fraud Detection Using Machine Learning

Overview

This project compares machine learning models for detecting fraudulent transactions in a highly imbalanced dataset. The dataset contains anonymized transaction details, and the goal is to identify the fraudulent transactions (Class = 1).

Dataset

The dataset consists of:

Time: Time elapsed since the first transaction.
V1-V28: Anonymized features.
Amount: Transaction amount.
Class: Target variable indicating fraud (1) or non-fraud (0).

Class Distribution:

Fraudulent Transactions: 492 instances (0.17%)
Non-Fraudulent Transactions: 284,315 instances (99.83%)
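
To make the imbalance concrete, the short sketch below recomputes the percentages from the two counts listed above. It also reports the accuracy of a trivial classifier that always predicts non-fraud, which suggests why accuracy alone says little on this dataset. The variable names are illustrative and are not taken from the notebook.

```python
# Class counts as listed in this README.
n_fraud = 492
n_legit = 284_315
n_total = n_fraud + n_legit

# Share of each class in the full dataset.
fraud_share = n_fraud / n_total
legit_share = n_legit / n_total

# Number of non-fraudulent transactions per fraudulent one.
imbalance_ratio = n_legit / n_fraud

# A classifier that always predicts class 0 is right on every
# non-fraudulent transaction, so its accuracy equals legit_share.
baseline_accuracy = legit_share

print(f"fraud share:         {fraud_share:.2%}")
print(f"non-fraud share:     {legit_share:.2%}")
print(f"non-fraud per fraud: {imbalance_ratio:.0f}")
print(f"always-0 accuracy:   {baseline_accuracy:.2%}")
```

Because the always-0 baseline already scores close to 100% accuracy while catching no fraud, metrics that focus on the fraud class, such as precision and recall, are more informative here.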

Key Models and Results

Model	ROC AUC	Precision	Recall	Notes
Logistic Regression	0.9559	0.83	0.65	Simple and interpretable.
Decision Tree	0.8875	0.75	0.78	Prone to overfitting.
Random Forest	0.9578	0.94	0.82	Robust and feature-rich.
SVM	0.9646	0.96	0.67	Effective but complex.
XGBoost	0.9711	0.89	0.83	Highest ROC AUC and recall; precision trails SVM and Random Forest.
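
For readers unfamiliar with the three metric columns, here is a minimal pure-Python sketch of how precision, recall, and ROC AUC are defined. It is not the notebook's code: the helper names `precision_recall` and `roc_auc`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. In practice, scikit-learn's `precision_score`, `recall_score`, and `roc_auc_score` compute the same quantities.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (fraud) class, label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Precision: of the transactions flagged as fraud, how many were fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the actual fraud cases, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive example
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 fraud cases among 10 transactions.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.70, 0.90, 0.15]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.2f} recall={recall:.2f} roc_auc={auc:.4f}")
```

Precision and recall depend on the chosen threshold, while ROC AUC is computed from the raw scores and summarizes ranking quality across all thresholds. This is why a model can lead on one column of the table and trail on another.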

Files

analysis/DSC478_Final_Project.ipynb: Jupyter notebook containing the analysis and code.
report/Finalreport-PML.pdf: Comprehensive project report.

How to Use

Clone the repository:

git clone https://github.com/tejas-1911/Fraud-Detection-Using-Machine-Learning.git
