Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor navigation commands into its own tool #35

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

ryanhoangt
Copy link
Collaborator

@ryanhoangt ryanhoangt commented Dec 23, 2024

Description

This PR is to:

  • Create a new tool for the 2 nav commands. No changes are made to the str_replace_editor.

TODO:

  • Run eval on OH
  • Add tests for navigator

This is an alternative to #5.

Related Issue

Close #28

@ryanhoangt ryanhoangt marked this pull request as ready for review December 24, 2024 07:16
@ryanhoangt ryanhoangt requested a review from xingyaoww December 24, 2024 07:17
Copy link

@jpollard-cs jpollard-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this! I've left some comments and questions - hopefully they are helpful! I want to dig a bit further into some other things such as output tree rendering, but thought I would share what I have so far.


TOOL_DESCRIPTION = """Custom navigation tool for navigating to symbols in a codebase
* If there are multiple symbols with the same name, the tool will print all of them
* The `jump_to_definition` command will print the FULL definition of the symbol, along with the absolute path to the file

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you tested cases where the definition is for a dependency which has not yet been resolved / downloaded? I don't mean anything exotic like dynamically downloaded dependencies, but a simple case of finding the definition from an external dependency which hasn't yet been installed by a package manager or what have you? just curious if this can be surfaced in a meaningful way that OH is able to reason about and install the dependencies itself or prompt the user to do so?

with pytest.raises(Exception) as exc_info:
navigator(command='invalid_command', symbol_name='MyClass')

assert 'Unrecognized command' in str(exc_info.value)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there further plans to test E2E that OH is using the Navigator CLI when it would by expected to do so or does that introduce too much bias / reduce agency? It would seem that there might at least be some basic tests along these lines where the obvious choice should be to use this tool. I'm not really concerned with influencing the agency of the OH "agent", but more saying that it would be good to have a way to test the efficacy of this tool and it's relevant prompts such that we have confidence that 1.) OH is well aware of it as a valid option when it is contextually appropriate and that 2.) it is able to appropriately call the right CLI commands and effectively navigate the results (understanding this may not be perfect perhaps this could be based on some threshold percentage?). My thinking here is that this would help build confidence that the OH eval results are actually testing the desired functionality. Also are eval results the sole arbiter of whether or not a new feature will make the cut and if so where can I learn more about how that process works and how it is measured? A concern is what might happen if, for instance, this led to significantly lower token usage, but performed slightly worse perhaps because more brute force methods pull more code into the context. It could be that ancillary work to endow OH with a better ability to logically reason about code could unlock improved eval scores. In other words how do you avoid throwing away perfectly good tools?


Command = Literal[
'view',
'create',
'str_replace',
'insert',
'undo_edit',
# 'jump_to_definition', TODO:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if there's a formal board and backlog item to link these TODO items to, but in my experience these kinds of things often never get done otherwise

def get_definitions_tree(
self, symbol: str, rel_file_path: str | None = None, use_end_line=True
):
if not self.git_utils:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why this is required for Navigation commands? perhpaps that will become more obvious as I read on..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see this as being a useful way for eagerly anticipating files that may need to be refreshed from the cache although perhaps that would only make sense if you're running some background indexing process

) # (relative file, symbol identifier) -> set of its REF tags

all_abs_files_iter = (
tqdm(all_abs_files, desc='Parsing tags', unit='file')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tqdm(all_abs_files, desc='Parsing tags', unit='file')
# tdqm is a library which helps us track progress as we're
# iterating through the list of files we got from git utils
tqdm(all_abs_files, desc='Parsing tags', unit='file')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps this library is more well-known than I'm aware, but I thought it might be helpful to add some context here given the naming does not convey a clear utility for this import - alternatively alias the import as something like progress_tracker?

parsed_tags = self.ts_parser.get_tags_from_file(abs_file, rel_file)

for parsed_tag in parsed_tags:
if parsed_tag.tag_kind == TagKind.DEF:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a nitpick, but you could separate the file finding and iteration using a generator and then doing the actual processing on the relative files as you iterate through the rel_file yielded by the generator. This would allow you to have separate methods for getting definition and reference tags without having to repeat code. I understand you may still need to go through parsed_tags = self.ts_parser.get_tags_from_file(abs_file, rel_file) for now, but it would allow these methods to evolve independently and make the methods and usages more readable. Again - bit of a nitpick. If there were separate methods to get the definition tags and reference tags perhaps I'd feel a bit more strongly about this 🙃.

Comment on lines +114 to +118
if not def_tags:
# Perform a fuzzy search for the symbol
choices = list(ident2defrels.keys())
suggested_matches = process.extract(symbol, choices, limit=5)
return f"No definitions found for `{symbol}`. Maybe you meant one of these: {', '.join(match[0] for match in suggested_matches)}?"
Copy link

@jpollard-cs jpollard-cs Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this and think it's a thoughtful follow-up as opposed to just bailing with "not found", but could it actually be better to just bail or just hint to the agent that a fuzzy search (or some less biased language)might be a good next step? Sort of playing devils advocate here. In a more traditional program I'd say this is doing too much, but this is a different paradigm I'm still getting used to.

)

# Truncate long lines in case we get minified js or something else crazy
output = '\n'.join(line[:150] for line in output.splitlines())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you get start and end positions from the parsed tags to help with this sort of thing (minified js or what have you)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor navigation commands into its own tool
3 participants