Gambit

A real-time Multi-Armed Bandit implementation that uses Thompson Sampling.

To run:

mvn clean install
mvn exec:java -Dexec.mainClass="mab.Gambit"

This implementation of Gambit is a proof of concept for basic real-time reinforcement learning with a multi-armed bandit, using Thompson Sampling as the explore/exploit policy. Running Gambit prints a series of steps and decisions to the console. Gambit begins by trying different actions at random; as it gains confidence about the performance of a particular action, it selects that action more frequently, balancing learning how each action performs against exploiting the highest-performing action.
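To make the explore/exploit behavior concrete, here is a minimal, self-contained sketch of Thompson Sampling. It is not taken from the Gambit source: the class name `ThompsonSketch`, its methods, the 0/1 reward model, the uniform Beta(1, 1) prior, and the simulated reward rates are all assumptions chosen for illustration, and Gambit's own classes may be organized quite differently.

```java
import java.util.Random;

// Hypothetical sketch; none of these names come from the Gambit source.
// Assumes each action yields a reward of 0 or 1 (a Bernoulli bandit).
final class ThompsonSketch {
    private final int[] successes; // count of reward = 1 per arm
    private final int[] failures;  // count of reward = 0 per arm
    private final Random rng;

    ThompsonSketch(int arms, long seed) {
        this.successes = new int[arms];
        this.failures = new int[arms];
        this.rng = new Random(seed);
    }

    // Gamma(shape, 1) for an integer shape >= 1, as a sum of Exp(1) draws.
    private double sampleGamma(int shape) {
        double sum = 0.0;
        for (int i = 0; i < shape; i++) {
            // 1 - nextDouble() lies in (0, 1], so the logarithm is finite.
            sum -= Math.log(1.0 - rng.nextDouble());
        }
        return sum;
    }

    // Beta(a, b) draw via X / (X + Y) with X ~ Gamma(a), Y ~ Gamma(b).
    private double sampleBeta(int a, int b) {
        double x = sampleGamma(a);
        double y = sampleGamma(b);
        return x / (x + y);
    }

    // Draw one plausible success rate per arm from its posterior
    // and play the arm whose draw is largest.
    int selectArm() {
        int best = 0;
        double bestDraw = -1.0;
        for (int arm = 0; arm < successes.length; arm++) {
            // Beta(1, 1) prior plus the observed counts.
            double draw = sampleBeta(successes[arm] + 1, failures[arm] + 1);
            if (draw > bestDraw) {
                bestDraw = draw;
                best = arm;
            }
        }
        return best;
    }

    // Record the observed reward for the arm that was played.
    void update(int arm, boolean reward) {
        if (reward) {
            successes[arm]++;
        } else {
            failures[arm]++;
        }
    }

    // Run the bandit against simulated arms and return pull counts per arm.
    static int[] simulate(double[] trueRates, int steps, long seed) {
        ThompsonSketch bandit = new ThompsonSketch(trueRates.length, seed);
        Random environment = new Random(seed + 1);
        int[] pulls = new int[trueRates.length];
        for (int step = 0; step < steps; step++) {
            int arm = bandit.selectArm();
            boolean reward = environment.nextDouble() < trueRates[arm];
            bandit.update(arm, reward);
            pulls[arm]++;
        }
        return pulls;
    }

    public static void main(String[] args) {
        // The reward rates below are made up for the simulation.
        int[] pulls = simulate(new double[] {0.2, 0.5, 0.8}, 2000, 42L);
        for (int arm = 0; arm < pulls.length; arm++) {
            System.out.println("arm " + arm + " pulled " + pulls[arm] + " times");
        }
    }
}
```

Saved as `ThompsonSketch.java`, the sketch can be run on Java 11 or later with `java ThompsonSketch.java`. Early on, every arm's posterior is wide, so the draws overlap and all arms get tried; as counts accumulate, the posterior of the best arm narrows around a higher value and its draw wins most rounds. The integer-shape Gamma sampler keeps the example free of external dependencies, at the cost of work that grows with the number of observations.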

Contact: Brandon O'Brien @hakczar
