YuxiChen25 / TF-MDP Public

Transformers Learn Transition Dynamics

GPT
Probe
Probe_training
RL_Training_ConnectFour
transformer_training
transformer_training_mcts
transformers_trained
transformers_trained_mcts
README.md
parse_probe_data.py
parse_probe_data_random.py

Transformers Learn MDP Transitions

This is the codebase for the paper Transformers Learn Transition Dynamics when Trained to Predict Markov Decision Processes.

The code in this repository covers training and testing the probes used in the experiments. The process is outlined below.

Experiment Design

The steps of the experiment are:

Gridworld

Generate training data for the transformer by playing through Gridworld (in RL_Training_Gridworld)
Train transformers on generated training data (in GPT/GridWorld)
Generate embeddings using transformers (in Probe)
Train probes on embeddings and collect data (in Probe/probe.py)
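
The first Gridworld step, generating training sequences by playing through the environment, can be sketched as follows. This is a minimal illustration, not the code in RL_Training_Gridworld: the grid size, the four-action set, the random policy, the border behavior, and the token format (`s<row>_<col>` interleaved with action names) are all assumptions made for the example.

```python
import random

# Hypothetical action set: each action moves one cell on the grid.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, size):
    """Deterministic transition: move one cell, staying in place at the border."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row = min(max(row + d_row, 0), size - 1)
    new_col = min(max(col + d_col, 0), size - 1)
    return (new_row, new_col)


def rollout(size, length, seed=0):
    """Play `length` random moves from (0, 0) and return the token sequence."""
    rng = random.Random(seed)
    state = (0, 0)
    tokens = [f"s{state[0]}_{state[1]}"]
    for _ in range(length):
        action = rng.choice(sorted(ACTIONS))  # assumed uniform random policy
        state = step(state, action, size)
        tokens += [action, f"s{state[0]}_{state[1]}"]
    return tokens


if __name__ == "__main__":
    print(" ".join(rollout(size=4, length=5)))
```

A sequence of this shape (state, action, state, ...) is the kind of input a transformer could be trained on to predict the next token, which requires it to model the transition from one state to the next.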

ConnectFour

Generate training data for the transformer by playing through ConnectFour (in RL_Training_ConnectFour)
Train transformers on generated training data (in transformer_training and transformer_training_mcts)
Generate embeddings using transformers (in transformers_trained and transformers_trained_mcts)
Train probes on embeddings and collect data (in Probe_training)
Parse data (in parse_probe_data.py)
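
The probe-training step in both pipelines can be illustrated with a generic linear probe. The sketch below is not the implementation in Probe or Probe_training, whose architecture this README does not describe. It assumes a binary logistic-regression probe fitted by full-batch gradient descent, and the toy two-dimensional "embeddings" and labels are made up for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit a binary logistic-regression probe with full-batch gradient descent."""
    dim = len(embeddings[0])
    n = len(embeddings)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            logit = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = y - sigmoid(logit)  # gradient of the log-likelihood
            for i in range(dim):
                grad_w[i] += error * x[i]
            grad_b += error
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the probe's logit is non-negative, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else 0


def accuracy(weights, bias, embeddings, labels):
    hits = sum(predict(weights, bias, x) == y for x, y in zip(embeddings, labels))
    return hits / len(labels)


# Made-up stand-ins for transformer embeddings and a binary state feature.
EMBEDDINGS = [(-2.0, 1.0), (-1.0, -1.0), (-1.5, 0.5),
              (2.0, 1.0), (1.0, -1.0), (1.5, 0.5)]
LABELS = [0, 0, 0, 1, 1, 1]

weights, bias = train_probe(EMBEDDINGS, LABELS)
probe_accuracy = accuracy(weights, bias, EMBEDDINGS, LABELS)
```

If a simple probe like this recovers a state feature from the embeddings with high accuracy, that suggests the feature is linearly decodable from the transformer's internal representation. In practice the accuracy would be measured on held-out embeddings rather than on the training set used here.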
