Allow fetching all references for a file, or all files, using public API #397
private

def make_fake_reference
  package_name = Array("ilikeletters".chars.sample(5)).join
The Array call was put in to satisfy Sorbet.
@gmcgibbon any interest in merging this?
@rafaelfranca any interest in merging this? It would make analysis of the (actual) dependency graph considerably easier.
  references_result.references.flat_map { |reference| reference_checker.call(reference) }
end

class FileReferencesResult < T::Struct
I believe T::Struct is slower than a normal struct. Please provide a benchmark, or switch to a plain old Struct (I think we only use Struct in other files).
There are 4 other uses of T::Struct around the codebase and 4 uses of Struct.
I did not think about performance differences and chose T::Struct because it's typed, and I appreciate the added documentation and type safety, especially within this class, which is a little messy and doesn't have great tests.
A quick benchmark shows that attribute access is about the same, while instantiation is ~5x slower for T::Struct. I was still able to instantiate 380k T::Struct objects within a second on my laptop.
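For anyone who wants to reproduce this kind of micro-benchmark, here is a minimal sketch. It is not the script used for the numbers above. The struct names and fields (PlainResult, TypedResult, file_offenses, references) are hypothetical stand-ins, and the T::Struct half only runs if the sorbet-runtime gem is installed.

```ruby
require "benchmark"

# Hypothetical plain Struct with keyword arguments, used as the baseline.
PlainResult = Struct.new(:file_offenses, :references, keyword_init: true)

# The typed variant is optional: it is only defined when sorbet-runtime loads.
begin
  require "sorbet-runtime"

  class TypedResult < T::Struct
    const :file_offenses, T::Array[String]
    const :references, T::Array[String]
  end
rescue LoadError
  TypedResult = nil
end

# Returns the wall-clock seconds needed to instantiate klass `iterations` times.
def time_instantiation(klass, iterations)
  Benchmark.realtime do
    iterations.times { klass.new(file_offenses: [], references: []) }
  end
end

ITERATIONS = 100_000

timings = { "Struct" => time_instantiation(PlainResult, ITERATIONS) }
timings["T::Struct"] = time_instantiation(TypedResult, ITERATIONS) if TypedResult

timings.each do |label, seconds|
  puts format("%-10s %.3fs for %d instantiations", label, seconds, ITERATIONS)
end
```

Absolute numbers will vary by machine and Ruby version, so only the ratio between the two lines is worth comparing.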
Ah, I guess you were looking for benchmarks of the whole thing, running packwerk on a real code base. Will do. I'm curious myself.
So, the codebase I'm currently looking at has 800k lines of Ruby according to cloc and 164 packages. I tested on a GitHub Codespace.
There is no measurable difference between uncached packwerk check runs on this branch vs. on Shopify/packwerk main. Packwerk reports ~18s, real is ~20s.
I also tested with the cache, and there is no measurable difference either. Packwerk reports ~3.6s, real is ~6.5s.
lib/packwerk/references_from_file.rb
Outdated
end

sig { params(relative_file_paths: T::Array[String]).returns(T::Array[Packwerk::Reference]) }
def list_all(relative_file_paths: [])
So is this the main method you use? Do we need to make the others public?
I think list is still useful to get references from a specific file. list_all goes through FilesForProcessing and thus respects includes and excludes, which could be confusing if you want to analyze a specific file that may be excluded.
files is public because it makes testing easier... I'm going to see how much uglier it would be to mock FilesForProcessing directly.
I pushed a commit that gets rid of the public files method in favour of some more elaborate stubbing in the test. It also renames the two remaining public methods so that hopefully their purpose is clearer:
list -> list_for_file
list_all -> list_for_all
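To illustrate the distinction between the two renamed methods, here is a self-contained toy sketch. It does not use Packwerk's classes: ToyReference, ToyReferenceLister, the constants_by_file hash, and the glob-based exclude list are all hypothetical stand-ins for the real reference extraction and for FilesForProcessing. Treating an empty relative_file_paths as "all known files" is also an assumption.

```ruby
# Hypothetical stand-in for a constant reference; not Packwerk::Reference.
ToyReference = Struct.new(:relative_path, :constant_name, keyword_init: true)

class ToyReferenceLister
  # constants_by_file: maps a relative path to the constants it references.
  # exclude: glob patterns that list_for_all skips (stand-in for excludes).
  def initialize(constants_by_file:, exclude: [])
    @constants_by_file = constants_by_file
    @exclude = exclude
  end

  # Analyzes exactly the given file, even if an exclude pattern matches it.
  def list_for_file(relative_file_path)
    @constants_by_file.fetch(relative_file_path, []).map do |constant_name|
      ToyReference.new(relative_path: relative_file_path, constant_name: constant_name)
    end
  end

  # Analyzes the given files (or all known files) minus the excluded ones.
  def list_for_all(relative_file_paths: [])
    candidates = relative_file_paths.empty? ? @constants_by_file.keys : relative_file_paths
    candidates
      .reject { |path| excluded?(path) }
      .flat_map { |path| list_for_file(path) }
  end

  private

  def excluded?(path)
    @exclude.any? { |pattern| File.fnmatch?(pattern, path) }
  end
end

lister = ToyReferenceLister.new(
  constants_by_file: {
    "app/models/order.rb" => ["::Billing::Invoice", "::Users::Account"],
    "vendor/legacy/report.rb" => ["::Billing::Invoice"],
  },
  exclude: ["vendor/*"],
)

all_references = lister.list_for_all
excluded_file_references = lister.list_for_file("vendor/legacy/report.rb")

puts all_references.map(&:constant_name).inspect
puts excluded_file_references.map(&:constant_name).inspect
```

In this toy version, list_for_all silently drops the vendor file while list_for_file still analyzes it, which is the behaviour difference described above.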
Hmm, I wonder now whether it makes sense to inject a FilesForProcessing instance in the test instead.
Thank you for your review, Gannon!
What are you trying to accomplish?
For introspection (e.g. network graphs and graphs over time), as well as advanced enforcement regimes that may be difficult to implement in a packwerk checker extension, it would be great to be able to get a list of all static constant references from a file or between all files.
Extracting constant references is where packwerk's core complexity lies, and we should allow its reuse directly, without having to go through the more opinionated and specific logic built on top of it.
What approach did you choose and why?
I listed a naive approach in the first commit that doesn't require any modifications to existing packwerk code. This is the approach I currently use to generate detailed statistics about a monolith that I am working on. However, it uses a slew of private APIs.
In the second commit I propose a new module that creates public API for this functionality.
What should reviewers focus on?
Caveats
Type of Change
Checklist
Additional Notes
This work is sponsored by https://www.onemedical.com/