Futrell2018 SPRT benchmark using GAMs + control predictors #107

Open: wants to merge 7 commits into base: main

Contributor

@hans hans commented Nov 4, 2022

I'm starting a benchmark implementation for reading time evaluation that uses control predictors (word length and frequency; spillover effects from previous word(s)) as well as a more advanced statistical model (GAMs).

FWIW this PR is also a fun test case of a benchmark with Conda dependencies (needs R and an R package, which obviously can't be installed via pip).

Still to-do (& happy to accept help if anyone is interested):

  • Predict both RT mean and variance. Recent studies have argued that between-subject RT variance is meaningfully related to surprisal. Use the Gaussian location-scale family (gaulss) included in mgcv.
  • Held-out evaluation. Currently the benchmark evaluates on the training data, yikes
  • Test code
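
For the held-out evaluation item, a minimal sketch of grouped k-fold scoring might look like the following. It assumes folds should keep all words of a sentence together (so spillover predictors do not leak between train and test), and it uses a plain least-squares line as a stand-in for the GAM fit. The names `kfold_by_group`, `fit_line`, and `heldout_mse` are hypothetical and not part of this PR.

```python
import random
from statistics import mean


def kfold_by_group(groups, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs, keeping each group in a single fold."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    # Assign shuffled groups to folds round-robin.
    fold_of = {g: i % k for i, g in enumerate(unique)}
    for fold in range(k):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test


def fit_line(xs, ys):
    """Least-squares line; an assumed stand-in for the GAM fit."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def heldout_mse(xs, ys, groups, k=5, seed=0):
    """Mean squared error computed on held-out folds only."""
    errors = []
    for train, test in kfold_by_group(groups, k, seed):
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors += [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test]
    return mean(errors)
```

In the benchmark itself, `fit_line` would be replaced by the mgcv fit on the training rows and the error by whatever metric the score is built from; the grouping key (sentence vs. item vs. subject) is a design choice this sketch does not settle.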

data_mask = ~data.isna().any(axis=1)
data = data[data_mask]

# TODO check that columns match formula variable names
Member

todo
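
One way the column check in that TODO could be sketched: pull candidate variable names out of the R-style formula string and compare them with the data frame's columns. This is a rough regex heuristic, not a real formula parser, and the helper names `formula_variables` and `missing_columns` as well as the example formula are hypothetical.

```python
import re

# R literals that can appear as argument values and are not column names.
R_CONSTANTS = {"TRUE", "FALSE", "NULL", "NA", "T", "F"}


def formula_variables(formula):
    """Heuristically list variable names referenced in an R-style formula."""
    # Drop quoted string literals such as bs='cr'.
    stripped = re.sub(r"(['\"]).*?\1", "", formula)
    names = []
    for match in re.finditer(r"(?<![A-Za-z0-9_.])[A-Za-z_][A-Za-z0-9_.]*", stripped):
        name = match.group(0)
        rest = stripped[match.end():].lstrip()
        if rest.startswith("("):
            continue  # function call such as s(...) or te(...)
        if rest.startswith("=") and not rest.startswith("=="):
            continue  # named argument such as k=20
        if name in R_CONSTANTS:
            continue
        if name not in names:
            names.append(name)
    return names


def missing_columns(formula, columns):
    """Return formula variables that are absent from the given column names."""
    available = set(columns)
    return [name for name in formula_variables(formula) if name not in available]
```

A call such as `missing_columns(formula, data.columns)` before fitting could then raise a clear error on the Python side instead of failing inside R.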

data["prev_surp"] = data["surprisal"].shift(1)
data["len"] = self.data[data_mask].word_core.str.len()
data["prev_len"] = data["len"].shift(1)
data["freq"] = surprisals # HACK need to look this up.
Member

todo?
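
To replace the `freq` placeholder, one possible sketch is a smoothed log-frequency lookup against a word-count table. The count source is an assumption (any corpus count table keyed by lowercase word would do), and `log_frequencies` with its add-one smoothing is a hypothetical helper, not something this PR defines.

```python
import math
from collections import Counter


def log_frequencies(words, corpus_counts, smoothing=1.0):
    """Return smoothed log2 relative frequency for each word.

    corpus_counts maps lowercase words to counts (assumed format).
    Unseen words get the smoothing mass instead of log(0).
    """
    # One extra vocabulary slot reserves probability mass for unseen words.
    total = sum(corpus_counts.values()) + smoothing * (len(corpus_counts) + 1)
    return [
        math.log2((corpus_counts.get(word.lower(), 0) + smoothing) / total)
        for word in words
    ]


# Toy count table for illustration only; real counts would come from a corpus.
toy_counts = Counter("the cat saw the dog".split())
toy_freqs = log_frequencies(["The", "cat", "zebra"], toy_counts)
```

The resulting list could be assigned to `data["freq"]` in place of the surprisal values, provided the words are in the same row order as `data`.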

r_mgcv = importr("mgcv")
model = r_mgcv.gam(formula, data=data)

# TODO held out data
Member

todo

Comment on lines +81 to +89
surprisals = candidate.digest_text(stimuli)['behavior']
attach_presentation_meta(surprisals, self.data['presentation'])

# exclude first words
surprisals = surprisals[surprisals['word_within_sentence_id'] != 1]
data_mask = self.data['word_within_sentence_id'] != 1

# Fit and evaluate GAM model
model, predictions, targets = self.fit(surprisals, data_mask)
Member

Suggested change
- surprisals = candidate.digest_text(stimuli)['behavior']
- attach_presentation_meta(surprisals, self.data['presentation'])
- # exclude first words
- surprisals = surprisals[surprisals['word_within_sentence_id'] != 1]
- data_mask = self.data['word_within_sentence_id'] != 1
- # Fit and evaluate GAM model
- model, predictions, targets = self.fit(surprisals, data_mask)
+ model_reading_times = candidate.digest_text(stimuli)['behavior']
+ attach_presentation_meta(model_reading_times, self.data['presentation'])
+ # exclude first words
+ model_reading_times = model_reading_times[model_reading_times['word_within_sentence_id'] != 1]
+ data_mask = self.data['word_within_sentence_id'] != 1
+ # Fit and evaluate GAM model
+ model, predictions, targets = self.fit(model_reading_times, data_mask)

return score


class SplitHalvesConsistency:
Member

We could from ../futrell2018.benchmark import SplitHalvesConsistency, since the class is identical. Or should we put both benchmarks inside the benchmarks/futrell2018 plugin? I'm fine with either, slightly leaning towards adding this to the futrell2018 plugin.

@mschrimpf
Member

Hi @hans just checking in on this PR


2 participants