Support absence groups #13

Open
slevithan opened this issue Dec 27, 2024 · 0 comments


slevithan commented Dec 27, 2024

Oniguruma supports multiple forms of "absence" operators/functions/groups. The basic form (?~…) is very rarely used, but it does have good use cases. The other forms (those starting with (?~|) are so rare that they're probably not worth supporting (and some are likely not emulatable anyway).

On rarity: The basic form was used by two regexes out of tens of thousands in a sample of real-world Oniguruma regexes used in TextMate grammars. The other forms were not used at all.

If I understand Oniguruma's basic form (?~…) correctly, it can be emulated in JS as (?:(?!…)\p{Any})*. A few basic tests in Oniguruma show that this produces the same results. The transformation is trivial to do in src/transform.js once support for parsing absence operators has been added to src/tokenize.js and src/parse.js.
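As a rough illustration of that emulation, here is a minimal sketch in plain JS. The helper name `emulateAbsence` is hypothetical and is not the library's actual transform code. It assumes the absent part has already been translated to JS syntax, and it does not address the open questions about quantification or atomicity.

```javascript
// Hypothetical helper (not part of the library): wraps an already-translated
// JS pattern for the absent part in the emulation (?:(?!…)\p{Any})*.
function emulateAbsence(absentPattern) {
  return String.raw`(?:(?!${absentPattern})\p{Any})*`;
}

// Oniguruma pattern being emulated: /\*(?~\*/)\*/
// (a C-style comment whose body cannot contain "*/").
// \p{Any} requires the `u` (or `v`) flag in JS.
const commentRe = new RegExp(
  String.raw`/\*` + emulateAbsence(String.raw`\*/`) + String.raw`\*/`,
  'u'
);

const match = commentRe.exec('/* a */ b */');
console.log(match && match[0]);
```

With this input, the match should stop at the first `*/`, since the lookahead prevents the repeated `\p{Any}` from consuming any position where `*/` begins.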

Additionally:

  • Absence operators currently throw a custom error here.
  • The behavior of (?~…) differs between Oniguruma and Onigmo, so Ruby regex testers like rubular.com are not helpful.
  • Need to test the effects of quantification, whether absence operators are atomic, etc.
  • Non-basic forms that aren't supported (those starting with (?~|) should be caught and reported as an error.
    • It might be easy to also support the "absent expression" form (?~|absent|exp).
  • Oniguruma itself doesn't support "absent stopper" and "range clear" forms within lookbehind.
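
The check for unsupported non-basic forms could look something like the following sketch. The function name, return values, and error message are all hypothetical and do not reflect the actual code in src/tokenize.js. The only thing it demonstrates is that the `(?~|` prefix has to be tested before the shorter `(?~` prefix.

```javascript
// Hypothetical sketch (not the library's tokenizer): classify a group opener
// at `index`, rejecting the non-basic absence forms that start with "(?~|".
function classifyAbsenceOpener(pattern, index) {
  // Check the longer prefix first; "(?~|" also starts with "(?~".
  if (pattern.startsWith('(?~|', index)) {
    throw new Error(`Unsupported absence operator form "(?~|" at index ${index}`);
  }
  // Basic form (?~…); anything else is not an absence operator.
  return pattern.startsWith('(?~', index) ? 'basic-absence' : null;
}

console.log(classifyAbsenceOpener('(?~abc)', 0));
```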