This project builds a sentiment classifier from customer messages. The task is binary classification: each stock-related message is labeled 1 (positive sentiment) or 0 (negative sentiment).
The primary resource used for this project is the Python & Machine Learning for Financial Analysis course on Udemy.
- Importing Required Libraries:
  - Libraries such as pandas, numpy, seaborn, matplotlib, nltk, gensim, and tensorflow were used for data manipulation, visualization, and model building.
- Exploratory Data Analysis (EDA):
  - Analyze the dataset to understand its structure and explore patterns.
- Data Cleaning:
  - Clean the text data by removing unnecessary punctuation and stopwords, ensuring the dataset is ready for analysis.
- Data Visualization:
  - Visualize the cleaned dataset and plot a word cloud to identify the most frequent terms.
- Data Preparation:
  - Tokenize the text and apply padding to handle varying text lengths for model training.
- Building the Sentiment Classifier:
  - Develop a custom deep neural network for sentiment analysis using an embedding layer and LSTM (Long Short-Term Memory) network.
- Prediction & Model Evaluation:
  - Use the trained model to make predictions and assess performance using metrics like the confusion matrix.
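
The cleaning and preparation steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the project relies on nltk and gensim for this work, while the small `STOPWORDS` set, the helper names (`clean`, `build_vocab`, `encode_and_pad`), the sample messages, and `max_len=5` are all assumptions made for the example.

```python
import string

# Tiny illustrative stopword set (assumption); a real run would use a
# fuller list such as the one shipped with nltk.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "in", "on", "for"}


def clean(text):
    """Strip punctuation, lowercase, split on whitespace, drop stopwords."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return [w for w in no_punct.lower().split() if w not in STOPWORDS]


def build_vocab(token_lists):
    """Assign each new word the next integer id; 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for word in tokens:
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode_and_pad(token_lists, vocab, max_len):
    """Convert tokens to ids, then truncate or right-pad with 0 to max_len."""
    padded = []
    for tokens in token_lists:
        ids = [vocab[w] for w in tokens][:max_len]
        padded.append(ids + [0] * (max_len - len(ids)))
    return padded


# Hypothetical sample messages, not taken from the project's dataset.
messages = ["The stock is going UP!", "Sell now, earnings missed."]
tokens = [clean(m) for m in messages]
vocab = build_vocab(tokens)
sequences = encode_and_pad(tokens, vocab, max_len=5)

print(tokens)     # [['stock', 'going', 'up'], ['sell', 'now', 'earnings', 'missed']]
print(sequences)  # [[1, 2, 3, 0, 0], [4, 5, 6, 7, 0]]
```

Padding every sequence to the same length is what lets the messages be stacked into a single array for the embedding layer; reserving id 0 keeps padding distinct from real words.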
- pandas: For data manipulation and analysis.
- numpy: For numerical operations and handling data arrays.
- seaborn, matplotlib: For visualizations and data plotting.
- nltk, gensim: For text processing tasks like tokenization and stopword removal.
- tensorflow: For building and training the deep learning model (LSTM network).
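
As a rough sketch of how tensorflow could be used for the embedding-plus-LSTM classifier and its confusion matrix, consider the following. The layer sizes (`VOCAB_SIZE`, `MAX_LEN`, `EMBED_DIM`, `LSTM_UNITS`), the random demo input, the hypothetical labels, and the `confusion_matrix` helper are assumptions for illustration; the project's actual architecture and hyperparameters are not given here, and the model below is untrained.

```python
import numpy as np
import tensorflow as tf

# Assumed sizes for illustration only.
VOCAB_SIZE = 1000   # number of distinct token ids, with 0 reserved for padding
MAX_LEN = 20        # padded sequence length
EMBED_DIM = 32      # embedding vector size
LSTM_UNITS = 64     # LSTM hidden state size

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,), dtype="int32"),
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),
    tf.keras.layers.LSTM(LSTM_UNITS),
    # A single sigmoid unit outputs the probability of positive sentiment (label 1).
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])


def confusion_matrix(y_true, y_pred):
    """2x2 count matrix: rows are true labels, columns are predicted labels."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


# Random token ids stand in for real padded sequences (assumption).
rng = np.random.default_rng(0)
x_demo = rng.integers(1, VOCAB_SIZE, size=(4, MAX_LEN))
y_true = np.array([1, 0, 1, 0])  # hypothetical labels

probs = model.predict(x_demo, verbose=0)       # shape (4, 1), values in [0, 1]
y_pred = (probs[:, 0] >= 0.5).astype(int)      # threshold at 0.5
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

In practice the model would first be fitted with `model.fit` on the padded training sequences; the forward pass here only shows the shapes involved and how predictions feed into the confusion matrix.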
This project applies sentiment analysis to stock-related data, classifying messages as either positive or negative. The trained model can help identify the sentiment of customer feedback, which can be valuable for financial analysis, trading strategies, and decision-making based on market sentiment.