Updated repo structure
jacobceles committed Jul 12, 2022
1 parent 7ddbf7e commit ca0b96d
Showing 74 changed files with 20,773 additions and 1 deletion.
3 changes: 2 additions & 1 deletion README.md
@@ -1,4 +1,5 @@
# Topic modeling for NYT articles

## Overview
In this project, we try to identify how trends have changed for 'automation' across decades by analyzing [The New York Times](https://www.nytimes.com/) articles from 1950 to 2021. We also try to identify other topics that consistently attracted media attention in each decade. We use a variety of approaches and techniques for data cleaning and standardization, and use packages such as Gensim, spaCy, NLTK, and BERTopic to complete this analysis.

@@ -33,7 +34,7 @@ To understand how the trends have changed for 'automation' across decades, we lo
</ol>
We build a visualization on top of count/lemmatized to get a better sense of the distribution of the top 50 words in the corpus across the decades.

-#### Trend for other topics
+#### Trend for Other Topics
This project approaches the problem using 4 different techniques:
Decade wise topic modeling using LDA
Decade wise topic modeling using BERTopic
File renamed without changes. (7 files)
Binary file removed results/BERTopic-DTM.zip
Binary file not shown.
Binary file removed results/BERTopic.zip
Binary file not shown.
Binary file removed results/Count Visualization.zip
Binary file not shown.
Binary file removed results/LDA.zip
Binary file not shown.
71 changes: 71 additions & 0 deletions results/automation/bigram.html

Large diffs are not rendered by default.

71 changes: 71 additions & 0 deletions results/automation/trigram.html

Large diffs are not rendered by default.

File renamed without changes.