The two transcripts are from AXRP and 80,000 Hours.

I included both the questions and Jan Leike's answers in the transcripts because I wanted to represent the conversations.

Cleaning

I did some cleaning on the transcripts.

  • Removed cold open
  • Removed punctuation
  • Removed headings
  • Changed to US spelling in 80,000 Hours transcript:
    • generalize
    • organization
  • Converted to lower case (except OpenAI, because why not)
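The automatable steps above could look roughly like the sketch below. It is not the script used for this repo: the function name `clean_transcript`, the `UK_TO_US` mapping, the `us_spelling` flag, and the assumption that headings start with `#` are all hypothetical. Removing the cold open is left out because it depends on where each transcript starts.

```python
import string

# Hypothetical mapping: only the two words named in the list above.
UK_TO_US = {"generalise": "generalize", "organisation": "organization"}


def clean_transcript(text: str, us_spelling: bool = False) -> str:
    """Return a cleaned, mostly lower-case version of a transcript."""
    # Drop headings first (assumption: heading lines start with '#'),
    # because '#' itself is punctuation and would be stripped below.
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    text = " ".join(lines)

    # Remove ASCII punctuation; curly quotes and dashes are not covered here.
    text = text.translate(str.maketrans("", "", string.punctuation))

    # Lower-case every word except the exact token "OpenAI".
    words = [w if w == "OpenAI" else w.lower() for w in text.split()]

    # Optionally switch the listed words to US spelling.
    if us_spelling:
        words = [UK_TO_US.get(w, w) for w in words]

    return " ".join(words)
```

For example, `clean_transcript(raw_text, us_spelling=True)` would be the call for the 80,000 Hours transcript, and the default call for the AXRP one.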

Stopwords

I used the standard stopwords from the Python library plus those in the file custom_stopwords.xlsx.
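A minimal sketch of combining the two stopword sources, under stated assumptions: the helper names are hypothetical, the standard list is passed in by the caller (this README does not say which library it came from), and the custom words are assumed to sit in the first column of custom_stopwords.xlsx with no header row. Reading the file this way needs pandas and openpyxl.

```python
def build_stopwords(standard, custom):
    """Merge two iterables of stopwords into one set, skipping blanks."""
    combined = set()
    for word in list(standard) + list(custom):
        word = str(word).strip()
        if word:
            combined.add(word)
    return combined


def read_custom_stopwords(path="custom_stopwords.xlsx"):
    """Read stopwords from the first column of an Excel sheet (assumed layout)."""
    import pandas as pd  # imported here so build_stopwords works without pandas

    frame = pd.read_excel(path, header=None)
    return frame.iloc[:, 0].dropna().astype(str).tolist()
```

A set is used so that a word appearing in both lists is kept only once and membership checks stay fast.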

Result

The word cloud below shows the top 75 words from the transcripts. See top-words.txt for the list.
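One way to arrive at such a list is a plain frequency count over the cleaned text after dropping stopwords. The sketch below assumes whitespace tokenization and uses a hypothetical function name, `top_words`; it is an illustration, not necessarily how top-words.txt was produced.

```python
from collections import Counter


def top_words(text, stopwords, n=75):
    """Return the n most frequent non-stopword tokens as (word, count) pairs."""
    counts = Counter(w for w in text.split() if w not in stopwords)
    return counts.most_common(n)
```

The resulting pairs can be turned into a dict with `dict(top_words(...))`, which is the kind of word-to-frequency input a word cloud renderer typically accepts.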

[Image: a word cloud]