Imaginary-MCTS: Autonomous Computer Use Agents with Enhanced Online Planning

To improve computer agent reasoning and boost task completion rates, we propose an online planning algorithm inspired by Monte Carlo Tree Search. The agent is prompted to imagine the state changes associated with each candidate action. These imagined outcomes effectively serve as trajectory rollouts, avoiding the need to actually execute candidate actions and then backtrack, which is challenging, slow, and error-prone. The agent then scores each candidate action based on its imagined rollout and proceeds with the highest-scoring action. For more information on the algorithm and its results, please view our write-up.

The virtualization, evaluation_examples, and baseline agent in this repository adapt or draw code from the OSWorld repository. This repository forgoes the evaluation scripts and the variety of supported virtualization and foundation model providers present in the OSWorld repo in favor of a maximally lightweight computer agent that's easy to understand and experiment with.
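The planning step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the `propose`, `imagine`, and `score` callables and the `Candidate` class are hypothetical stand-ins for the model calls, and the toy functions at the bottom exist only to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    action: str          # candidate action proposed for the current state
    imagined_state: str  # the agent's imagined outcome of taking that action
    score: float         # score assigned to the imagined outcome


def select_action(
    state: str,
    propose: Callable[[str], List[str]],
    imagine: Callable[[str, str], str],
    score: Callable[[str, str, str], float],
) -> Candidate:
    """Pick the candidate action whose imagined outcome scores highest."""
    candidates = []
    for action in propose(state):
        # Imagined rollout: nothing is executed in the real environment,
        # so there is nothing to undo or backtrack from.
        imagined = imagine(state, action)
        candidates.append(Candidate(action, imagined, score(state, action, imagined)))
    if not candidates:
        raise ValueError("propose() returned no candidate actions")
    return max(candidates, key=lambda c: c.score)


# Toy stand-ins (assumptions for illustration only; a real agent would
# query a foundation model in each of these three functions).
def toy_propose(state: str) -> List[str]:
    return ["click File menu", "press Escape"]


def toy_imagine(state: str, action: str) -> str:
    return f"{state} -> after '{action}'"


def toy_score(state: str, action: str, imagined: str) -> float:
    return 0.9 if "File" in action else 0.1


best = select_action("desktop with editor open", toy_propose, toy_imagine, toy_score)
print(best.action)
```

Passing the three steps in as callables keeps the selection logic independent of any particular model provider, which makes it easy to swap in stubs like the ones above when experimenting.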

OSWorld is a popular computer agent benchmark.

Installation

Clone the repo and install packages

# Clone the repository
git clone https://github.com/brendanm12345/imcts_computer_agent

# Change directory into the cloned repository
cd imcts_computer_agent

# Optional: Create a Conda environment for this project
# conda create -n imcts
# conda activate imcts

# Install required dependencies
pip install -r requirements.txt

(from OSWorld) Install VMware Workstation Pro (on systems with Apple chips, install VMware Fusion instead) and configure the vmrun command. For the installation process, refer to How to install VMware Workstation Pro. Verify the installation by running the following:

vmrun -T ws list

If the installation succeeded and the environment variable is set correctly, you will see a message listing the currently running virtual machines.
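The same check can be scripted from Python. The sketch below is an assumption-laden convenience, not part of this repository: the helper names are hypothetical, and the choice of the `fusion` host type on macOS is based on vmrun's documented `-T` values rather than on anything stated above, which only shows `ws`.

```python
import platform
import shutil
import subprocess
from typing import List


def vmrun_list_command(system: str) -> List[str]:
    """Build the vmrun command that lists running VMs.

    Assumption: macOS hosts run VMware Fusion ("fusion"); other hosts
    run VMware Workstation ("ws").
    """
    host_type = "fusion" if system == "Darwin" else "ws"
    return ["vmrun", "-T", host_type, "list"]


def check_vmrun() -> bool:
    """Return True if vmrun is on PATH and the list command succeeds."""
    if shutil.which("vmrun") is None:
        print("vmrun was not found on PATH")
        return False
    result = subprocess.run(
        vmrun_list_command(platform.system()),
        capture_output=True,
        text=True,
    )
    print(result.stdout.strip())
    return result.returncode == 0


if __name__ == "__main__":
    check_vmrun()
```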

Verify Correct VM Setup

To verify that your virtualization has been done correctly, run

python3 quickstart.py

If things are working correctly, you should see:

The VMware Fusion application opens, showing a desktop screen
A right-click is executed in the middle of the desktop screen, showing the Ubuntu pop-up menu as in the image below:

If you see a desktop screen prompting you for a password, enter "password" as the password and run the quickstart.py script again.

Now that we have a VM to use, let's run the agent!

Run the Agent

Set the ANTHROPIC_API_KEY environment variable to your API key

export ANTHROPIC_API_KEY='changeme'
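Before launching a run, it can help to confirm that the key is actually set and is not still the placeholder. The check below is a small sketch; the `api_key_problem` helper is hypothetical and is not part of run.py.

```python
import os
from typing import Mapping, Optional

PLACEHOLDER = "changeme"  # the placeholder value used in the export example


def api_key_problem(env: Mapping[str, str]) -> Optional[str]:
    """Return a description of what is wrong with the key, or None if it looks set."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        return "ANTHROPIC_API_KEY is not set"
    if key == PLACEHOLDER:
        return "ANTHROPIC_API_KEY still holds the placeholder value"
    return None


problem = api_key_problem(os.environ)
print(problem or "ANTHROPIC_API_KEY looks set")
```

This only checks that a non-placeholder value is present; it does not verify that the key is valid.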

Run the baseline agent

python3 run.py --path_to_vm vmware_vm_data/Ubuntu0/Ubuntu0.vmx --model claude-3-5-sonnet-latest --result_dir ./results

Run the IMCTS agent

python3 run.py --path_to_vm vmware_vm_data/Ubuntu0/Ubuntu0.vmx --model claude-3-5-sonnet-latest --result_dir ./results --imcts

Note: you may need to update the --path_to_vm argument to match the location of your VM's .vmx file.
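If you are unsure which path to pass, a short script can list the .vmx files it finds. This is a sketch under the assumption that your VM files live under vmware_vm_data, as in the commands above; the `find_vmx_files` helper is hypothetical.

```python
from pathlib import Path
from typing import List


def find_vmx_files(root: str = "vmware_vm_data") -> List[Path]:
    """Return all .vmx files found under root, sorted by path."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(root_path.rglob("*.vmx"))


# Print each candidate so it can be copied into --path_to_vm.
for vmx in find_vmx_files():
    print(vmx)
```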

The results, which include screenshots, actions, and video recordings of the agent's task completion, will be saved in the ./results directory in this case. The logs containing the agent's reasoning will be saved in the .logs directory.

Stanford University, CS 238 Final Project. Authors: Brendan McLaughlin (BS'24, MS'25), Michael Maffezzoli (BS'23, MS'24), under the guidance of Professor Mykel J. Kochenderfer. Grade: A+
