Replication repository for "Integrating Semantic Directions with Concept Mover's Distance to Measure Binary Concept Engagement."

Marshall-Soc/cmd_geometry

Integrating Semantic Directions with Concept Mover's Distance: Reproduction Guide

Marshall A. Taylor and Dustin S. Stoltz

This repository contains all R code and data necessary to reproduce the analyses in our paper "Integrating Semantic Directions with Concept Mover's Distance to Measure Binary Concept Engagement," forthcoming in the Journal of Computational Social Science. The paper is a short-note follow-up to our earlier paper in the same journal, "Concept Mover's Distance: Measuring Concept Engagement via Word Embeddings in Texts" (Stoltz and Taylor 2019).

In the original JCSS paper, we put forth a method for measuring concept engagement in texts. The method uses word embeddings to find the minimum cost necessary for words in an observed document to "travel" to words in a pseudo-document, that is, a document consisting only of words denoting a concept of interest. One potential limitation of the method is that words associated with opposing concepts tend to be located close to one another in the underlying embedding space, so a document that is close to one concept will likely be similarly close to a starkly opposing concept (e.g., "life" and "death"). In this short note, we propose a method for dealing with this "binary concept problem" in Concept Mover's Distance (CMD) by incorporating recent work on word embeddings in cultural sociology. Using aggregate vector differences between antonym pairs to extract a direction in the semantic space pointing toward one pole of the binary opposition ("The Geometry of Culture," American Sociological Review, 2019), we illustrate how CMD can be used to measure a document's engagement with binary concepts.
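As a rough illustration of the semantic-direction idea, the sketch below averages the vector differences between antonym pairs and checks which pole other words lean toward. It is a minimal base-R sketch and not the implementation in our CMDist function: the three-dimensional embedding values, the word list, and the helper name `cosine` are all made up for illustration, and real analyses would use pretrained embeddings with hundreds of dimensions.

```r
# Toy 3-dimensional "embeddings" (made-up values, for illustration only).
emb <- rbind(
  life    = c( 0.9, 0.2, 0.1),
  death   = c(-0.8, 0.3, 0.1),
  alive   = c( 0.7, 0.1, 0.3),
  dead    = c(-0.9, 0.2, 0.2),
  birth   = c( 0.6, 0.4, 0.0),
  funeral = c(-0.5, 0.5, 0.1)
)

# Antonym pairs: pos[i] is paired with neg[i].
pos <- c("life", "alive")
neg <- c("death", "dead")

# One difference vector per pair, pointing from the "death" pole
# toward the "life" pole.
diffs <- emb[pos, , drop = FALSE] - emb[neg, , drop = FALSE]

# Aggregate the differences by averaging, then scale to unit length.
direction <- colMeans(diffs)
direction <- direction / sqrt(sum(direction^2))

# Hypothetical helper: cosine similarity between two vectors.
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Positive values lean toward the "life" pole, negative toward "death".
sims <- apply(emb, 1, cosine, b = direction)
print(round(sims, 2))
```

In this toy setup, words that were not used to build the direction (here "birth" and "funeral") still fall on opposite sides of it, which is the property that lets a direction separate the two poles of a binary concept.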

To reproduce the figures and regression models in the paper, download all scripts and CSVs to a local folder, and load the packages listed in the 1_cmdgeo_prep_functions.R script. The remaining scripts are self-contained, and each corresponds to a section of the note. Some of the figures require downloading text from Project Gutenberg, which may take some time. Note also that our CMDist function has been updated to include semantic directions; as such, you will need to update the CMDist package to the most recent version (0.4.1 as of March 25, 2020) in order to replicate the analyses.
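Before running the scripts, it may help to confirm that the installed version meets the requirement above. The following is a small sketch that assumes the package is installed under the name `CMDist`; the variable names and messages are illustrative, and the check only reports status without installing anything.

```r
# Minimum version needed for the semantic-direction features.
required <- "0.4.1"

# requireNamespace() returns FALSE instead of erroring when the
# package is missing. The package name "CMDist" is an assumption.
installed <- requireNamespace("CMDist", quietly = TRUE)

if (!installed) {
  message("CMDist is not installed; install it before running the scripts.")
} else if (utils::packageVersion("CMDist") < required) {
  message("CMDist ", utils::packageVersion("CMDist"),
          " found; please update to ", required, " or later.")
} else {
  message("CMDist ", utils::packageVersion("CMDist"), " is recent enough.")
}
```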

