Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set default value of use_microagents to False to prevent breaking eval #5976

Merged
merged 6 commits into from
Jan 2, 2025

Conversation

xingyaoww
Copy link
Collaborator

@xingyaoww xingyaoww commented Jan 2, 2025

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions

Unintentionally, use_microagent was default to True in evaluations, which is definitely unintended and can potentially hurt results there (eg we don't need to mess with github in evaluation setting).

This PR set the default use_microagents value to false across the system, but do enable it for session (used by UI) and CLI.

For evaluation that actually use this, it is easy to re-enable this by tweaking config.

Tested: microagent still work for UI
image


Link of any specific issues this addresses


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:3509456-nikolaik   --name openhands-app-3509456   docker.all-hands.dev/all-hands-ai/openhands:3509456

@@ -116,6 +116,7 @@ async def main(loop):

agent_cls: Type[Agent] = Agent.get_cls(config.default_agent)
agent_config = config.get_agent_config(config.default_agent)
agent_config.use_microagents = True # hard-coded to true since it is user-facing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means it doesn't obey the config.toml value, though. 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that's true.. Do you think maybe we can just leave this one off for ppl?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can:

  • go the other way around, set it to False in evals and let it configurable for the rest
  • if that's becoming too much to change in evals code when we have such issue, maybe we can restore the idea we were using before, to write a config.toml for evals

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm maybe we should turn this off in eval then!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Openhands fix success summary

The AI has successfully incorporated the feedback by making changes that:

  1. Remove the hard-coded use_microagents = True from the CLI code
  2. Set use_microagents=False explicitly in all evaluation benchmarks

This directly addresses the feedback requesting to "turn this off in eval" while maintaining the functionality elsewhere. The solution is comprehensive as it:

  • Respects the config.toml values by removing hard-coding
  • Explicitly disables microagents in evaluation contexts
  • Maintains the feature for non-evaluation use cases

The changes can be summarized for a reviewer as:
"Modified the code to explicitly disable microagents in evaluation benchmarks while maintaining the feature for general use. Removed hard-coding from CLI to respect configuration values, and added explicit use_microagents=False settings in all evaluation benchmark configurations."

@@ -27,7 +27,7 @@ class AgentConfig:
memory_enabled: bool = False
memory_max_threads: int = 3
llm_config: str | None = None
use_microagents: bool = True
use_microagents: bool = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should default to True, and that eval should override it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Openhands fix success summary

The feedback has been successfully incorporated. The AI made the following changes:

  1. Changed the default value of use_microagents from False to True in the agent configuration file
  2. Removed hard-coded override in the CLI to respect the configuration value
  3. Added explicit overrides in all evaluation benchmarks to set use_microagents=False

This directly addresses the feedback request that the feature should default to True but be overridden in evaluation contexts. The changes are comprehensive and maintain consistency across the codebase while implementing the requested behavior.

The explanation provided by the AI clearly outlines the changes made and confirms they align with the feedback requirements. This can be confidently marked as resolved.

Copy link
Collaborator

@rbren rbren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with the change here. I think it's eval's responsibility to turn this off

@rbren rbren closed this Jan 2, 2025
@rbren rbren reopened this Jan 2, 2025
@rbren
Copy link
Collaborator

rbren commented Jan 2, 2025

sorry didn't mean to close!

@xingyaoww
Copy link
Collaborator Author

OpenHands agent if you see this, please read the diff against main first to understand the purpose of the change. Then, you should revert existing changes, and manually set use_microagents=False for agent_config in get_config function for all evaluation benchmarks under evaluation/benchmarks.

@xingyaoww xingyaoww added the fix-me Attempt to fix this issue with OpenHands label Jan 2, 2025
@openhands-agent
Copy link
Contributor

OpenHands started fixing the pr! You can monitor the progress here.

@openhands-agent
Copy link
Contributor

Overview of Changes:

All issues appear to have been successfully resolved. The changes address two main requirements:

  1. Making microagents enabled by default while allowing configuration
  • Changed default to use_microagents=True in config
  • Removed hard-coded CLI override
  • Allows configuration flexibility
  1. Disabling microagents specifically for evaluation
  • Added explicit use_microagents=False in all evaluation benchmarks
  • Maintains feature availability for non-evaluation use

Status: ✓ FULLY RESOLVED
No remaining issues are apparent from the provided information. The solution is comprehensive and implements the requested behavior while maintaining proper configuration management.

openhands/core/cli.py Outdated Show resolved Hide resolved
@xingyaoww
Copy link
Collaborator Author

@rbren this should be ready for another look!

Copy link
Collaborator

@enyst enyst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -55,6 +55,8 @@ def get_config(
workspace_mount_path=None,
)
config.set_llm_config(metadata.llm_config)
agent_config = config.get_agent_config(metadata.agent_class)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's probably a better way to DRY this out, but I'm not familiar enough with eval. E.g. maybe a shared config.toml? I dunno

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was intentional to explicitly write these out - so eval is not gonna be mysteriously impacted by a local config.toml (which did happen in the early days and took me a while to figure out 😢 )

@xingyaoww xingyaoww merged commit 9dd5463 into main Jan 2, 2025
18 checks passed
xingyaoww added a commit that referenced this pull request Jan 2, 2025
@xingyaoww xingyaoww deleted the xw/default-microagent branch January 2, 2025 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix-me Attempt to fix this issue with OpenHands
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants