Set default value of use_microagents to False to prevent breaking eval #5976

xingyaoww · 2025-01-02T16:07:04Z

Give a summary of what the PR does, explaining any non-trivial design decisions

use_microagents unintentionally defaulted to True in evaluations, which can potentially hurt results there (e.g., we don't need to mess with GitHub in an evaluation setting).

This PR sets the default use_microagents value to False across the system, but enables it for the session (used by the UI) and the CLI.

For evaluations that actually use microagents, it is easy to re-enable them by tweaking the config.

Tested: microagents still work in the UI.

To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:3509456-nikolaik   --name openhands-app-3509456   docker.all-hands.dev/all-hands-ai/openhands:3509456

enyst · 2025-01-02T16:10:49Z

openhands/core/cli.py

@@ -116,6 +116,7 @@ async def main(loop):

    agent_cls: Type[Agent] = Agent.get_cls(config.default_agent)
    agent_config = config.get_agent_config(config.default_agent)
+    agent_config.use_microagents = True  # hard-coded to true since it is user-facing


This means it doesn't obey the config.toml value, though. 🤔

Hmm that's true.. Do you think maybe we can just leave this one off for ppl?

We can:

go the other way around: set it to False in evals and leave it configurable for the rest

if that becomes too much to change in the evals code whenever we hit an issue like this, maybe we can restore the idea we were using before and write a config.toml for evals

hmm maybe we should turn this off in eval then!
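
The concern in this thread can be illustrated with a minimal sketch. `AgentConfig` below is a hypothetical stand-in dataclass, not the real OpenHands class, and `load_agent_config` is an assumed helper that mimics reading a value the user set in config.toml. It only shows why an assignment made after loading overrides whatever the user configured.

```python
from dataclasses import dataclass


@dataclass
class AgentConfig:
    """Hypothetical stand-in for the real agent config (assumption)."""
    use_microagents: bool = True


def load_agent_config(user_settings: dict) -> AgentConfig:
    """Assumed helper: build a config from values a user put in config.toml."""
    return AgentConfig(**user_settings)


# The user explicitly disables microagents in their config file.
user_settings = {"use_microagents": False}

# Hard-coded assignment after loading: the user's choice is silently lost.
hard_coded = load_agent_config(user_settings)
hard_coded.use_microagents = True

# No assignment after loading: the configured value is respected.
respected = load_agent_config(user_settings)

print(hard_coded.use_microagents, respected.use_microagents)
```

Removing the hard-coded line, as the thread goes on to suggest, leaves the loaded value as the single source of truth for user-facing entry points.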

Openhands fix success summary

The AI has successfully incorporated the feedback by making changes that:

Remove the hard-coded use_microagents = True from the CLI code

Set use_microagents=False explicitly in all evaluation benchmarks

This directly addresses the feedback requesting to "turn this off in eval" while maintaining the functionality elsewhere. The solution is comprehensive as it:

Respects the config.toml values by removing hard-coding

Explicitly disables microagents in evaluation contexts

Maintains the feature for non-evaluation use cases

The changes can be summarized for a reviewer as:
"Modified the code to explicitly disable microagents in evaluation benchmarks while maintaining the feature for general use. Removed hard-coding from CLI to respect configuration values, and added explicit use_microagents=False settings in all evaluation benchmark configurations."

rbren · 2025-01-02T16:15:00Z

openhands/core/config/agent_config.py

@@ -27,7 +27,7 @@ class AgentConfig:
    memory_enabled: bool = False
    memory_max_threads: int = 3
    llm_config: str | None = None
-    use_microagents: bool = True
+    use_microagents: bool = False


I think this should default to True, and that eval should override it.

Openhands fix success summary

The feedback has been successfully incorporated. The AI made the following changes:

Changed the default value of use_microagents from False to True in the agent configuration file

Removed hard-coded override in the CLI to respect the configuration value

Added explicit overrides in all evaluation benchmarks to set use_microagents=False

This directly addresses the feedback request that the feature should default to True but be overridden in evaluation contexts. The changes are comprehensive and maintain consistency across the codebase while implementing the requested behavior.

The explanation provided by the AI clearly outlines the changes made and confirms they align with the feedback requirements. This can be confidently marked as resolved.

rbren

I disagree with the change here. I think it's eval's responsibility to turn this off

rbren · 2025-01-02T16:16:44Z

sorry didn't mean to close!

xingyaoww · 2025-01-02T16:28:08Z

OpenHands agent if you see this, please read the diff against main first to understand the purpose of the change. Then, you should revert existing changes, and manually set use_microagents=False for agent_config in get_config function for all evaluation benchmarks under evaluation/benchmarks.

openhands-agent · 2025-01-02T16:28:33Z

OpenHands started fixing the pr! You can monitor the progress here.

openhands-agent · 2025-01-02T16:33:50Z

Overview of Changes:

All issues appear to have been successfully resolved. The changes address two main requirements:

Making microagents enabled by default while allowing configuration

Changed default to use_microagents=True in config
Removed hard-coded CLI override
Allows configuration flexibility

Disabling microagents specifically for evaluation

Added explicit use_microagents=False in all evaluation benchmarks
Maintains feature availability for non-evaluation use

Status: ✓ FULLY RESOLVED
No remaining issues are apparent from the provided information. The solution is comprehensive and implements the requested behavior while maintaining proper configuration management.

xingyaoww · 2025-01-02T16:57:06Z

@rbren this should be ready for another look!

enyst

LGTM!

rbren · 2025-01-02T21:17:18Z

evaluation/benchmarks/gorilla/run_infer.py

@@ -55,6 +55,8 @@ def get_config(
        workspace_mount_path=None,
    )
    config.set_llm_config(metadata.llm_config)
+    agent_config = config.get_agent_config(metadata.agent_class)


there's probably a better way to DRY this out, but I'm not familiar enough with eval. E.g. maybe a shared config.toml? I dunno

I think it was intentional to explicitly write these out - so eval is not gonna be mysteriously impacted by a local config.toml (which did happen in the early days and took me a while to figure out 😢 )
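
The explicit per-benchmark override discussed here might look roughly like the sketch below. `AgentConfig`, `AppConfig`, and `EvalMetadata` are simplified, hypothetical stand-ins (the real classes have more fields), and the behavior of `get_agent_config` returning a stored, mutable object is an assumption made for illustration. The agent name is only an example value.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Hypothetical stand-in; the real class has more fields."""
    use_microagents: bool = True  # default stays on for user-facing entry points


@dataclass
class AppConfig:
    """Hypothetical stand-in holding one AgentConfig per agent class name."""
    agents: dict[str, AgentConfig] = field(default_factory=dict)

    def get_agent_config(self, name: str) -> AgentConfig:
        # Assumption: returns the stored (mutable) config, creating it if missing.
        return self.agents.setdefault(name, AgentConfig())


@dataclass
class EvalMetadata:
    """Hypothetical stand-in for the benchmark metadata object."""
    agent_class: str


def get_config(metadata: EvalMetadata) -> AppConfig:
    config = AppConfig()
    agent_config = config.get_agent_config(metadata.agent_class)
    # Written out in the benchmark itself so a local config.toml cannot change it.
    agent_config.use_microagents = False
    return config


config = get_config(EvalMetadata(agent_class="CodeActAgent"))
print(config.get_agent_config("CodeActAgent").use_microagents)
```

Repeating the two lines in each benchmark is less DRY than a shared file, but it keeps the evaluation setting visible in the code that runs the evaluation.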

xingyaoww and others added 2 commits January 2, 2025 10:55

only enable microagent in session by default

9a8878d

change default value of use_microagents

38f806c

enyst reviewed Jan 2, 2025

rbren reviewed Jan 2, 2025

rbren requested changes Jan 2, 2025

rbren closed this Jan 2, 2025

rbren reopened this Jan 2, 2025

xingyaoww added the fix-me label (Attempt to fix this issue with OpenHands) Jan 2, 2025

Fix pr #5976: Set default value of use_microagents to False to prevent breaking eval

fe7298e

xingyaoww commented Jan 2, 2025

evaluation/benchmarks/agent_bench/run_infer.py (outdated, resolved)

Update evaluation/benchmarks/agent_bench/run_infer.py

24c39b2

xingyaoww commented Jan 2, 2025

openhands/server/session/session.py (outdated, resolved)

Update openhands/server/session/session.py

e3bd35f

xingyaoww commented Jan 2, 2025

openhands/core/cli.py (outdated, resolved)

Update openhands/core/cli.py

3509456

enyst approved these changes Jan 2, 2025

rbren approved these changes Jan 2, 2025

xingyaoww merged commit 9dd5463 into main Jan 2, 2025
18 checks passed

xingyaoww added a commit that referenced this pull request Jan 2, 2025

Set default value of use_microagents to False to prevent breaking eval (#5976) Co-authored-by: openhands <[email protected]>

9dd5463

xingyaoww deleted the xw/default-microagent branch January 2, 2025 21:39
