Using a different attribute name than "locator" in CDXJ #41
Comments
It's a difficult situation. It could be a name, an identifier, or a location. The role of this field in this context might change in a broader perspective, or even as this system evolves to other models. Finding a term that is generic enough while still being accurate is challenging. I will give it more thought.
Any further thoughts since October about a better name, @ibnesayeed?
We can perhaps call it
@ibnesayeed Those seem fitting albeit not nearly as "user-friendly" as "locator", which might be a moot point if the intention of ipwb CDXJs is to be machine readable. Any other recommendations beyond
I've lost the thread -- where is "locator" used as an attribute?
@phonedude In the CDXJ (index) files we store references to the hashes of the header and payload blocks of responses in the following manner: - - {"..": "..", "locator": "urn:ipfs/{header_digest}/{payload_digest}", "..": ".."} @weiglemc questioned whether the term "locator" really says anything about the location of the resources. That's why we were looking for better alternatives.
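To make the format described above concrete, here is a minimal Python sketch of reading such an entry and splitting the `locator` value into its two digests. It assumes each index line has the shape `key timestamp {json}`; the function names, the sample key, the timestamp, and the digest strings are made up for illustration and are not taken from the ipwb code base.

```python
import json


def parse_cdxj_line(line):
    """Split one CDXJ line into (key, timestamp, attributes).

    Assumes the layout "key timestamp {json}", where the first two
    fields contain no spaces and the rest of the line is a JSON object.
    """
    key, timestamp, raw_json = line.strip().split(" ", 2)
    return key, timestamp, json.loads(raw_json)


def split_locator(locator):
    """Return (header_digest, payload_digest) from a value shaped like
    "urn:ipfs/{header_digest}/{payload_digest}"."""
    prefix = "urn:ipfs/"
    if not locator.startswith(prefix):
        raise ValueError("unexpected locator format: " + locator)
    # Unpacking raises ValueError if the second digest is missing.
    header_digest, payload_digest = locator[len(prefix):].split("/", 1)
    return header_digest, payload_digest


# Hypothetical sample line; the digests are placeholders, not real hashes.
sample = 'com,example)/ 20160305192247 {"locator": "urn:ipfs/QmHeader/QmPayload"}'
key, timestamp, attrs = parse_cdxj_line(sample)
header_digest, payload_digest = split_locator(attrs["locator"])
```

Keeping both digests inside one string is what leaves room for other schemes in the same field, which is the generalization discussed later in the thread.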
Definitely should not be called a "locator", since that would suggest URL, which it clearly is not. URI or URN would be more accurate, but repetitive and not nearly as descriptive as something like "header-payload-digests".
I would stay away from something like
Any further thoughts on this naming, @ibnesayeed? Could the field value ever be a Once we change this name, should we have some adaptation considerations for older versions of ipwb that used
I don't have a good name right now.
Yes! The reason we used this style in the first place, rather than keeping the header and payload hashes under separate attributes, was so that we can generalize it. If a record is stored at an HTTP URL we can use that directly, or if content is to be fetched from a WARC file we can have something like
Changing this name is about standardizing the terminology used in CDXJ files for archival indexing purposes, irrespective of the tool they are used in. Once such a change is made, we will have a few choices: 1) have a fallback keyword in the replay that looks for the old name for a while, 2) provide a migration script/command that converts old CDXJ files to the new style, or 3) if the user base of the tool is small, we can just introduce this breaking change and announce it in the release notes and the README file.
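Options 1 and 2 above could be sketched roughly as follows in Python. This is only an illustration under assumptions: no replacement name has been agreed on in this thread, so `new_name` is a parameter and `record_ref` in the usage lines is a made-up placeholder. The function names are hypothetical, and treating lines that start with `!` as metadata to be skipped is also an assumption.

```python
import json

OLD_NAME = "locator"


def get_record_reference(attrs, new_name, old_name=OLD_NAME):
    """Option 1: prefer the new attribute name, fall back to the old one.

    Raises KeyError if neither name is present.
    """
    if new_name in attrs:
        return attrs[new_name]
    return attrs[old_name]


def migrate_cdxj_line(line, new_name, old_name=OLD_NAME):
    """Option 2: rewrite one CDXJ line so it uses the new attribute name.

    Lines that do not look like "key timestamp {json}" are returned
    unchanged (minus the trailing newline).
    """
    text = line.rstrip("\n")
    parts = text.split(" ", 2)
    # Assumption: lines starting with "!" are metadata, not index entries.
    if text.startswith("!") or len(parts) < 3 or not parts[2].startswith("{"):
        return text
    attrs = json.loads(parts[2])
    if old_name in attrs and new_name not in attrs:
        # Rename the key in place so the attribute order is preserved.
        attrs = {(new_name if k == old_name else k): v for k, v in attrs.items()}
    return " ".join([parts[0], parts[1], json.dumps(attrs)])


# "record_ref" is a placeholder, not a name proposed in this discussion.
old_line = 'com,example)/ 20160305192247 {"locator": "urn:ipfs/a/b"}'
new_line = migrate_cdxj_line(old_line, "record_ref")
```

A replay that uses the fallback lookup keeps working on both old and migrated indexes, so the two options can be combined during a transition period.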
The value for this field is a URN, not a "locator" per se. @ibnesayeed Do you have a suggestion for a better name? @phonedude noted this at one point.