Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Threadsafe mojos #474

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from
Draft

Conversation

timw
Copy link

@timw timw commented May 12, 2023

Make all Mojos threadsafe (some had already been marked as thread safe in previous work).

This primarily addresses #86, where the static artifact cache in DefaultThirdPartyHelper could be concurrently read/updated.
This patch retains the lazy initialisation of the cache, using the singleton holder approach, and I updated the similar instantiation in SpdxLicenseList to use the same pattern to be consistent (instead of the double checked locking).

I've reviewed the fields used by all the other mojos that are newly marked as thread-safe, and they all look OK.
Mainly this means they don't make unsafe use of statics or other potentially shared data structures (there are plenty of areas where the Mojos would not be safe for a single instance to be used on multiple threads, but the default Mojo instantiation strategy should preclude those cases)

timw added 3 commits May 12, 2023 12:33
Use singleton holder pattern instead of double checked locking.
Use singleton holder pattern for lazy initialisation of static cache, and make the cache concurrency safe as it can be read and updated simultaneously in multi module builds.
@timw timw force-pushed the threadsafe-mojos branch from 90c5baf to 8dfc370 Compare May 12, 2023 00:33
@olamy
Copy link
Member

olamy commented May 15, 2023

@timw Thanks for the PR. Everything looks great.
Another PR related to concurrent build has been merged. #354
Can you please rebase and fix the conflicts?
Thanks

@timw
Copy link
Author

timw commented May 15, 2023

@olamy - I've had a look at that other commit and had a look at reworking the patch I made.

There is a lot of external synchronization in that commit, following the pattern for existing uses of the cached dependencies, which along with the various interpretations of the dependency map (as a cache or a project dependencies) makes it quite hard to follow and reason about.

The synchronization seems to be in place to make some of the bulk cache reads/updates atomic, although that's not 100% applied across all uses. The change to a synchronized map will (I think) avoid most of the concurrent modification exceptions, but there are remaining cases where bulk reads (which iterate internally) might occur concurrently with map mutations.

We can wrap this up in a couple of ways:

  • encapsulate the dependency map and the various operations operations into a new type (e.g. DependencySet). This is a lot clearer and easier to reason about in terms of data structure consistency and purpose, but changes the ThirdPartyHelper and ThirdPartyTool signatures, which are in an api package, so you may not want to do that. Notably, these interfaces require any implementations to implement external synchronization on access/mutation of the dependencies to be safe against concurrent modifications, so it's not a safe API contract, but I don't know if that api is API and if changing it would break third party extensions.
  • rely on the artifact cache map now being synchronized for data structure integrity, and review/correct all of the uses to apply external synchronization required to avoid concurrent modification issues.
  • change the artifact cache map to a concurrent map to avoid concurrent modification issues, but still rely on external synchronization for atomicity on bulk reads/updates.

The first option is much more maintainable, but I can appreciate if the maintainers need to live with the code they have and can't break API, in which case I'd probably recommend the third option.

@slawekjaranowski
Copy link
Member

@timw I think you can go with first option.

I tried to search who use api package:
https://github.com/search?q=%22import+org.codehaus.mojo.license.api.ThirdPartyHelper%22+language%3AJava&type=code

and I don't see a reason to have strict api without breaks ...

I consider that it is rather a util than api - but for me is for internal use only.
There was never documentation published about such api.

@slachiewicz slachiewicz marked this pull request as draft December 22, 2024 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants