How are rule identifiers matched to one another? #572

Open
bert-github opened this issue Oct 17, 2024 · 2 comments
Labels
i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on.

Comments

@bert-github

(This is part of the review by the Internationalization WG. Sorry for being late – it's entirely my fault.)

4.1. Rule Identifier
https://www.w3.org/TR/2024/WD-act-rules-format-1.1-20240618/#rule-identifier

This identifier must be unique when the rule is part of a ruleset. The identifier can be any text [...]

To know if an identifier is unique (and to be able to use it in one rule to point to another), you need to know when two identifiers are the same. E.g., are capital letters (ABC) the same as lowercase letters (abc)? If a letter can be encoded in Unicode in two ways (e.g., ‘é’ as single character vs separate ‘e’ + acute accent) are those the same?
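To make the two questions above concrete, here is a minimal Python sketch of three possible definitions of "the same identifier". The function names are hypothetical and nothing here is taken from the ACT Rules Format; it only shows that the answer changes depending on which comparison is chosen.

```python
import unicodedata

# The two encodings of 'é' mentioned above.
COMPOSED = "\u00e9"      # single code point: LATIN SMALL LETTER E WITH ACUTE
DECOMPOSED = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT


def same_code_points(a: str, b: str) -> bool:
    """Strictest option: identical sequences of code points."""
    return a == b


def same_after_nfc(a: str, b: str) -> bool:
    """Compare after Unicode canonical normalization (NFC), case-sensitive."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def same_caseless(a: str, b: str) -> bool:
    """Compare after NFC normalization and case folding (a simplified sketch)."""
    def fold(s: str) -> str:
        return unicodedata.normalize("NFC", unicodedata.normalize("NFC", s).casefold())
    return fold(a) == fold(b)


if __name__ == "__main__":
    for left, right in [("ABC", "abc"), (COMPOSED, DECOMPOSED)]:
        print(
            repr(left), repr(right),
            same_code_points(left, right),
            same_after_nfc(left, right),
            same_caseless(left, right),
        )
```

Under code point comparison both pairs are different identifiers; under NFC the two forms of 'é' match but ABC and abc do not; under the caseless comparison both pairs match. A ruleset that does not say which comparison applies cannot say whether such identifiers are unique.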

‘Character Model for the World Wide Web: String Matching’ explains the issues with comparing two strings of text and has recommendations for choosing an algorithm, including for text strings used as identifiers.

bert-github added the i18n-needs-resolution label on Oct 17, 2024
@daniel-montalvo
Contributor

Hi @bert-github

Sorry, didn't mean to close before.

The Format does require that the identifiers be unique but we never wanted to prescribe how "unique" must be measured. Different rule writers may have different mechanisms to ensure their identifiers are unique.

For example, the CG always picks identifiers made up of lowercase ASCII characters, to avoid the situations you describe above.
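As an illustration of that kind of convention, a rule writer could check it mechanically. The sketch below is an assumption about how such a check might look, not the CG's actual tooling; the function names and the sample identifiers are hypothetical.

```python
from typing import Iterable, List


def is_lowercase_ascii(identifier: str) -> bool:
    """True if the identifier is non-empty, pure ASCII, and has no uppercase letters."""
    return bool(identifier) and identifier.isascii() and identifier == identifier.lower()


def find_duplicates(identifiers: Iterable[str]) -> List[str]:
    """Return identifiers that occur more than once, using plain code point equality."""
    seen = set()
    duplicates = []
    for identifier in identifiers:
        if identifier in seen:
            duplicates.append(identifier)
        seen.add(identifier)
    return duplicates


if __name__ == "__main__":
    ruleset = ["abc123", "def456", "abc123"]  # hypothetical identifiers
    print([i for i in ruleset if not is_lowercase_ascii(i)])
    print(find_duplicates(ruleset))
```

Once every identifier passes `is_lowercase_ascii`, case and normalization differences cannot occur, so plain code point equality is enough to test uniqueness within that ruleset.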

@daniel-montalvo
Contributor

Hi @bert-github
Has the group had a chance to review my comment above? Would this explanation be sufficient to mark this as resolved?
