Update acronym filter to ignore word boundaries. #480

Open · jsuereth wants to merge 6 commits into main

Conversation

jsuereth
Contributor

Fixes #415

@jsuereth jsuereth requested a review from a team as a code owner November 27, 2024 16:46

codecov bot commented Nov 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.0%. Comparing base (9f1d9a5) to head (7e12de7).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##            main    #480     +/-   ##
=======================================
+ Coverage   73.9%   74.0%   +0.1%     
=======================================
  Files         50      50             
  Lines       3903    3938     +35     
=======================================
+ Hits        2885    2917     +32     
- Misses      1018    1021      +3     


@lquerel
Contributor

lquerel commented Nov 29, 2024

Thinking about this problem again, I believe this approach is problematic. For example, consider these two metrics:

  • radios.signal.strength
  • fluid.pressure

If we apply the filters title_case | acronym to these examples (presumably with OS and UID among the configured acronyms), we get:

  • RadiOSSignalStrength
  • FlUIDPressure

Which is probably not what we want either.
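
The failure mode is easy to reproduce with a plain substring replacement. The following is a minimal sketch, not the filter code from this PR: `naive_acronym` is a hypothetical helper, and it assumes that OS and UID are in the configured acronym list and that the input has already been cased by an earlier filter.

```rust
/// Hypothetical helper: replaces every case-insensitive occurrence of each
/// acronym, ignoring word boundaries entirely (ASCII-only for simplicity).
fn naive_acronym(input: &str, acronyms: &[&str]) -> String {
    let mut out = input.to_string();
    for acronym in acronyms {
        let needle = acronym.to_ascii_lowercase();
        if needle.is_empty() {
            continue;
        }
        // ASCII lowercasing preserves byte offsets, so indices found in
        // `haystack` are valid slice positions in `out`.
        let haystack = out.to_ascii_lowercase();
        let mut result = String::with_capacity(out.len());
        let mut last = 0;
        for (idx, _) in haystack.match_indices(needle.as_str()) {
            result.push_str(&out[last..idx]);
            result.push_str(acronym);
            last = idx + needle.len();
        }
        result.push_str(&out[last..]);
        out = result;
    }
    out
}
```

With acronyms `["OS", "UID"]`, `naive_acronym("RadiosSignalStrength", ...)` returns `RadiOSSignalStrength` and `naive_acronym("FluidPressure", ...)` returns `FlUIDPressure`, because "os" and "uid" match in the middle of ordinary words.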

I'm thinking about these two approaches:

  • We add an additional parameter to the *_case filters that corresponds to an acronym flag (false by default). When true, acronym handling is applied inside the casing logic itself to produce the desired effect. Whether it's possible to configure the crate we're using for this, I don't know.
  • We keep the current filters "as is", but instead of producing character strings they produce an object containing a list of tokens, with properties specifying the forbidden transformations per token. This approach is more generic but more complicated. This object would probably need to implement Display for the final rendering.
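
The second approach could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Token` and `Tokens` types, the `keep_case` flag, and the method names are hypothetical and do not come from the existing code base. Tokens are joined without separators only to mirror the outputs quoted above.

```rust
use std::fmt;

/// One word of an identifier, plus a flag that forbids later case changes.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    text: String,
    keep_case: bool,
}

/// Hypothetical filter value: a token list instead of a plain string.
#[derive(Debug, Clone, PartialEq)]
struct Tokens(Vec<Token>);

impl Tokens {
    /// Splits the input on '.', '_' and whitespace.
    fn parse(input: &str) -> Self {
        Tokens(
            input
                .split(|c: char| c == '.' || c == '_' || c.is_whitespace())
                .filter(|w| !w.is_empty())
                .map(|w| Token { text: w.to_string(), keep_case: false })
                .collect(),
        )
    }

    /// Rewrites whole tokens that match an acronym and locks their casing.
    fn acronym(mut self, acronyms: &[&str]) -> Self {
        for token in &mut self.0 {
            let hit = acronyms
                .iter()
                .copied()
                .find(|a| a.eq_ignore_ascii_case(&token.text));
            if let Some(a) = hit {
                token.text = a.to_string();
                token.keep_case = true;
            }
        }
        self
    }

    /// Capitalizes every token whose casing has not been locked.
    fn title_case(mut self) -> Self {
        for token in self.0.iter_mut().filter(|t| !t.keep_case) {
            let cased: String = {
                let mut chars = token.text.chars();
                match chars.next() {
                    Some(first) => first
                        .to_uppercase()
                        .chain(chars.flat_map(|c| c.to_lowercase()))
                        .collect(),
                    None => String::new(),
                }
            };
            token.text = cased;
        }
        self
    }
}

impl fmt::Display for Tokens {
    /// Final rendering: concatenates the tokens.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for token in &self.0 {
            f.write_str(&token.text)?;
        }
        Ok(())
    }
}
```

Because acronyms are matched against whole tokens and locked tokens are skipped by the casing step, the two filters can run in either order: `Tokens::parse("radios.signal.strength").title_case().acronym(&["OS"])` renders as `RadiosSignalStrength`, while `Tokens::parse("os.version").acronym(&["OS"]).title_case()` renders as `OSVersion`.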

There might be other, simpler approaches.

@jsuereth
Contributor Author

I see two options here:

  • We find a way for acronyms to interact directly with the *_case filters.
  • We add configuration options to the acronym filter that specify how to split word boundaries.

Which one would you rather I explore?
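
As a rough illustration of the second option, the sketch below makes the acronym filter split its input into words before matching, so only whole words are rewritten. The function names and the splitting rule (a lowercase-or-digit to uppercase transition, with non-alphanumeric characters kept as their own segments) are assumptions; a real configuration option would presumably let users choose the boundary rule.

```rust
/// Hypothetical splitter: breaks before an uppercase letter that follows a
/// lowercase letter or digit, and keeps each separator as its own segment.
fn split_words(input: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for ch in input.chars() {
        if !ch.is_alphanumeric() {
            // Separator: flush the pending word, then keep the separator.
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            words.push(ch.to_string());
            prev_lower = false;
            continue;
        }
        if ch.is_uppercase() && prev_lower {
            words.push(std::mem::take(&mut current));
        }
        current.push(ch);
        prev_lower = ch.is_lowercase() || ch.is_numeric();
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}

/// Replaces only whole words that match an acronym (ASCII case-insensitive).
fn acronym_whole_words(input: &str, acronyms: &[&str]) -> String {
    split_words(input)
        .into_iter()
        .map(|word| {
            acronyms
                .iter()
                .copied()
                .find(|a| a.eq_ignore_ascii_case(&word))
                .map(str::to_string)
                .unwrap_or(word)
        })
        .collect()
}
```

Under these assumptions, `acronym_whole_words("RadiosSignalStrength", &["OS"])` leaves the name unchanged, while `acronym_whole_words("OsVersion", &["OS"])` returns `OSVersion`.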

@lquerel
Contributor

lquerel commented Dec 1, 2024

I’m not sure which one will be the easiest to implement. I think I would start by checking if there is a hook mechanism in the library used for casing, and if there isn’t, I would probably try to address the issue in the acronym filter.


Successfully merging this pull request may close these issues.

Custom Capitalizations