An End-to-End Pipeline for Text Modelling
This repository implements all the steps required for building a Text Classifier in Python. Each step is explained in the Jupyter Notebook (fit_nlp.ipynb).
Building a Text Classifier involves the following steps:
- Data Analysis: This step involves exploring the raw data, cleaning it, and transforming it to bring out valuable insights (see the cleaning sketch after this list).
- Feature Engineering: Feature engineering involves creating a representation of the input data from which the model can learn more effectively. In this repository, I have used the Bag of Words approach to create feature vectors for the input data. Bag of Words is a representation used in Natural Language Processing in which each training example is represented as a multiset of its words (see the Bag of Words sketch after this list).
- Model Selection, Training and Evaluation: In this step, a model is first selected based on the properties of the data and then trained on the training sample. To guard against underfitting and overfitting, the model is evaluated on held-out data using evaluation metrics (see the training and evaluation sketch after this list).
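As a rough illustration of the cleaning step, the snippet below lowercases text, strips digits and punctuation, and drops stopwords. The tiny `STOPWORDS` set and the `clean_text` helper are illustrative assumptions; the actual preprocessing in fit_nlp.ipynb may differ.

```python
import re
import string

# Illustrative stopword list; a fuller list (e.g. from NLTK) would normally be used.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "or", "of", "to", "in", "was"}

def clean_text(text: str) -> str:
    """Lowercase, strip digits and punctuation, and drop stopwords."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)  # drop digits
    text = text.translate(str.maketrans("", "", string.punctuation))  # drop punctuation
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)

print(clean_text("The 2 movies were AMAZING, and the acting was great!"))
# movies were amazing acting great
```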
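For the Bag of Words representation, here is a minimal sketch using scikit-learn's `CountVectorizer`; the notebook may build the count vectors differently, so treat this only as an illustration of the idea.

```python
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "the movie was great",
    "the movie was terrible",
    "great acting and a great plot",
]

# Each document becomes a vector of word counts over the shared vocabulary,
# i.e. a multiset ("bag") of its words.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)

print(vectorizer.get_feature_names_out())  # vocabulary learned from the corpus
print(X.toarray())                         # one count vector per document
```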
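Finally, a hedged sketch of the selection, training and evaluation step: it assumes a logistic regression classifier, a simple train/test split, and accuracy plus a classification report as metrics, none of which are prescribed by the repository; the toy texts and labels are placeholders for a real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy labelled data; the notebook works on a real dataset.
texts = [
    "great movie loved it", "fantastic plot and acting",
    "terrible movie waste of time", "boring plot bad acting",
    "really enjoyable film", "awful would not recommend",
]
labels = [1, 1, 0, 0, 1, 0]

# Hold out part of the data so evaluation reflects generalisation, not memorisation.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, random_state=42, stratify=labels
)

# Bag of Words features followed by a simple linear classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

preds = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, preds))
print(classification_report(y_test, preds))
```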