[Question] Non-normative Section 5.6 of AS2-Vocab inspires problematic usage of tag for microsyntaxes instead of taxonomies #623

Open
trwnh opened this issue Nov 15, 2024 · 7 comments
Labels
Needs primer page: Need to add a page at https://www.w3.org/wiki/Activity_Streams/Primer on this topic
Next version: Things that should probably be resolved in a next version of AS2

Comments

@trwnh

trwnh commented Nov 15, 2024

Description

this issue is perhaps not entirely with AS2 as a spec and more with how it is used by some fediverse applications, but it's worth raising here.

tag is defined by AS2-Vocab as: https://www.w3.org/TR/activitystreams-vocabulary/#dfn-tag

One or more "tags" that have been associated with an objects. A tag can be any kind of Object. The key difference between attachment and tag is that the former implies association by inclusion, while the latter implies associated by reference.

however, section 5.6 "mentions, tags, and other microsyntaxes" seems to conflate some things. https://www.w3.org/TR/activitystreams-vocabulary/#microsyntaxes makes the overall general recommendation (normatively within a non-normative section? see #622 for more) that

While such microsyntaxes MAY be used within the values of the content, name, and summary properties on an Activity Streams Object, implementations SHOULD NOT be required to parse the values of those properties in order to determine the appropriate routing of notifications, categorization or linking between objects. Instead, publishers SHOULD make appropriate use of the vocabulary terms provided specifically for these purposes.

this advice is generally sound, especially when the default mediaType for all content is text/html, which is generally pre-rendered and does not use microsyntax.

HOWEVER: this has not stopped implementers from trying to do microsyntax-y things on top of HTML, mostly citing this non-normative section as inspiration.

i'm fairly sure the intent here is for tag to be used for taxonomical purposes (marking up related things so that you can draw associations later), not for microsyntax purposes (marking up content/summary/name to be rendered into rich entities or trigger certain processing requirements). in fact, the normative language (again, non-normative section) seems to demonstrate this intent quite clearly by saying that microsyntaxes SHOULD NOT be required for parsing; that you SHOULD use properties like to or cc for notifications, tag for categorization, or whatever dedicated property makes semantic sense.
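A minimal sketch of that separation, assuming a consumer that decides notifications purely from the addressing properties. The function names and the sample Note are hypothetical, not taken from the spec or from any implementation; only the property names (to, cc, bto, bcc, audience) come from AS2.

```python
# Hypothetical helper names; only the AS2 property names are from the vocabulary.
ADDRESSING_PROPERTIES = ("to", "cc", "bto", "bcc", "audience")


def as_list(value):
    """AS2 properties may hold a single value or an array; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def node_id(value):
    """Return the identifier of a value that is an IRI string or an embedded object."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("id")
    return None


def notification_targets(obj):
    """Collect recipients from the addressing properties only.

    Neither content nor tag is consulted, so "@sally" in the text has no effect.
    """
    targets = []
    for prop in ADDRESSING_PROPERTIES:
        for value in as_list(obj.get(prop)):
            ident = node_id(value)
            if ident and ident not in targets:
                targets.append(ident)
    return targets


# Illustrative Note: the Mention in tag is present but never inspected.
note = {
    "type": "Note",
    "content": "Thank you @sally for all your hard work!",
    "to": "https://social.example/users/72559",
    "tag": {
        "type": "Mention",
        "href": "https://social.example/~sally",
        "name": "@sally",
    },
}

print(notification_targets(note))
```

Under this reading, removing the tag or rewriting the content changes nothing about who gets notified.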

Outcome

maybe a Primer page? maybe Needs FEP? maybe Errata?

Discussion

the point of confusion is specifically that the non-normative examples are placed next to normative text in an overall non-normative section. this has led some implementers to treat the examples as perhaps more illustrative than they ought to be. also, the examples themselves are somewhat misleading, since they never actually demonstrate microsyntaxes properly except in Example 158, which uses a Mention in tag as an example of marking up @sally in content. there seems to be an implication (at best?) that the name of the tag corresponds or correlates to a substring of content (hence, microsyntax). this implication doesn't really make sense in a world where the claim is "you SHOULD NOT have to parse microsyntaxes".

in fact, Example 158 in particular perhaps makes more sense if viewed as a continuation of Example 157, where you tag the Person rather than using a Mention. the point that could have been made is that to is for generating notifications, but tag does not generate notifications. even this much is further complicated by current fediverse implementations that seemingly require Mention tags in addition to the addressing properties, possibly even using the presence of a Mention in tag to generate notifications, despite being warned that they SHOULD NOT do this and despite Example 158 being intended to demonstrate how to tag a Person/Mention without generating a notification.

@nightpool
Collaborator

I'm not sure I understand the distinction you're making here in terms of "taxonomical" parsing vs "microsyntax" parsing.

this implication doesn't really make sense in a world where the claim is "you SHOULD NOT have to parse microsyntaxes".

The language you quote is saying that you can't use a Mention to replace a to property. I think that's pretty straightforward. I don't think it's saying you can't use microsyntaxes for "marking up content/summary/name to be rendered into rich entities". That's literally the whole point of the section—it would be completely contradictory to include the section if you didn't want people to use it at all.

Regardless of the original intent, microsyntaxes are clearly very useful and an important part of the specification, so I don't understand why we would remove them.

@trwnh
Author

trwnh commented Nov 15, 2024

taxonomical meaning "here are some related things". if i tag a Person then i can later query/filter for all objects that have a tag of that Person. tagging a Mention (subtype of Link) carries problems because you don't actually have the id of that Person anymore, you have an href to them, which has different implications.

in other words,

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://social.example/some-object",
  "type": "Image",
  "summary": "Picture of Sally",
  "tag": {
    "id": "https://social.example/users/72559",
    "type": "Person",
    "name": "Sally"
  }
}

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://social.example/some-object",
  "type": "Image",
  "summary": "Picture of Sally",
  "tag": {
    "href": "https://social.example/~sally",
    "type": "Mention",
    "name": "Sally"
  }
}

because Mention is a subtype of Link, you can't actually say that https://social.example/some-object is in any way related to https://social.example/users/72559. at best, you can say that https://social.example/some-object is related to... the reference to https://social.example/~sally?
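A rough sketch of the taxonomic query described above, assuming a simple in-memory list of objects. The helper name tagged_with and the two distinct object ids (image-1, image-2) are made up for illustration so the two cases can be told apart.

```python
def as_list(value):
    """AS2 properties may hold a single value or an array; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def tagged_with(objects, target_id):
    """Return the objects whose tag contains something identified by target_id.

    A Link (including Mention) has an href rather than an id, so it never matches.
    """
    matches = []
    for obj in objects:
        for tag in as_list(obj.get("tag")):
            tag_id = tag if isinstance(tag, str) else tag.get("id")
            if tag_id == target_id:
                matches.append(obj)
                break
    return matches


# Hypothetical ids; the tag values mirror the two examples above.
objects = [
    {
        "id": "https://social.example/image-1",
        "type": "Image",
        "tag": {
            "id": "https://social.example/users/72559",
            "type": "Person",
            "name": "Sally",
        },
    },
    {
        "id": "https://social.example/image-2",
        "type": "Image",
        "tag": {
            "href": "https://social.example/~sally",
            "type": "Mention",
            "name": "Sally",
        },
    },
]

found = tagged_with(objects, "https://social.example/users/72559")
print([obj["id"] for obj in found])
```

Only the object that tags the Person is returned. Matching the Mention would require resolving its href to an actor id first, which is an extra step with its own assumptions.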

which is not very useful unless you intend to process it as a Link, e.g. taking the natural language summary and treating the name of Sally as a microsyntax, by which you end up rendering the final output as something like <span class="summary">Picture of <a href="https://social.example/~sally">Sally</a></span>. this is using tag as a vehicle for microsyntax, whereas i am of the opinion that tag should be used for "related things" (taxonomy) instead. microsyntax implies a certain processing that possibly justifies a dedicated property for microsyntaxes.

taking another example from elsewhere, we can say that syntactically the following:

  • Go to [this site](https://example.com) is Markdown syntax for an inline link
  • Go to <a href="https://example.com">this site</a> is the equivalent rendered HTML
  • Go to this site can similarly/equivalently be marked up with microsyntax by using a property for microsyntax, which we can call richText to differentiate it from tag:
{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {
      "richText": {
        "@id": "https://w3id.org/fep/xxxx/richText",
        "@type": "@id",
        "@container": "@set"
      }
    }
  ],
  "content": "Go to this site",
  "mediaType": "text/plain",
  "richText": [
    {
      "type": "Link",
      "href": "https://example.com",
      "name": "this site"
    }
  ]
}

which again renders to Go to <a href="https://example.com">this site</a>, if we interpret name as being the search-and-replace target string.
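A minimal sketch of that search-and-replace interpretation, assuming the hypothetical richText property from the example above and plain-text content. The function name render_by_name is an illustration, not part of any spec or library.

```python
from html import escape


def render_by_name(obj):
    """Replace the first occurrence of each Link's name in content with an <a> element.

    Assumes content is text/plain and richText is the hypothetical extension
    property used in the example above.
    """
    html = escape(obj["content"])
    for link in obj.get("richText", []):
        target = escape(link["name"])
        anchor = '<a href="{}">{}</a>'.format(escape(link["href"]), target)
        # Only the first match is replaced; a repeated name is ambiguous.
        html = html.replace(target, anchor, 1)
    return html


example = {
    "content": "Go to this site",
    "mediaType": "text/plain",
    "richText": [
        {"type": "Link", "href": "https://example.com", "name": "this site"}
    ],
}

print(render_by_name(example))
```

The weakness of this approach is visible in the code: if the name occurs more than once, or overlaps with markup inserted by an earlier replacement, the result depends on processing order.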

or perhaps you interpret name as meaning something else for a Link, such as not the a.innerText but instead something like a.title? in which case, name is no longer appropriate for search-and-replace, and you can alternatively use some extension for indices instead of search-and-replace:

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {
      "richText": {
        "@id": "https://w3id.org/fep/xxxx/richText",
        "@type": "@id",
        "@container": "@set"
      },
      "applyToProperty": {
        "@id": "https://w3id.org/fep/xxxx/applyToProperty",
        "@type": "@vocab",
        "@container": "@set"
      },
      "applyFromByte": "https://w3id.org/fep/xxxx/applyFromByte",
      "applyUntilByte": "https://w3id.org/fep/xxxx/applyUntilByte",
      "applyEntity": {
        "@id": "https://w3id.org/fep/xxxx/applyEntity",
        "@type": "@id"
      },
      "LinkMicrosyntax": "https://w3id.org/fep/xxxx/LinkMicrosyntax"
    }
  ],
  "content": "Go to this site",
  "mediaType": "text/plain",
  "richText": [
    {
      "type": "LinkMicrosyntax",
      "applyToProperty": ["content"],
      "applyFromByte": 6,
      "applyUntilByte": 15,
      "applyEntity": {
        "type": "Link",
        "href": "https://example.com",
        "name": "some title"
      }
    }
  ]
}

which renders to Go to <a href="https://example.com" title="some title">this site</a>
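A sketch of how a consumer might apply such byte ranges, under several assumptions that the example above does not settle: offsets count UTF-8 bytes, applyUntilByte is exclusive, ranges do not overlap, and the Link's name maps to the title attribute. All property names are the hypothetical extension terms from the example.

```python
from html import escape


def render_by_byte_range(obj, prop="content"):
    """Wrap each [applyFromByte, applyUntilByte) range of a property in an <a> element."""
    raw = obj[prop].encode("utf-8")
    entries = [
        entry
        for entry in obj.get("richText", [])
        if prop in entry.get("applyToProperty", [])
    ]
    entries.sort(key=lambda entry: entry["applyFromByte"])

    parts = []
    cursor = 0
    for entry in entries:
        start, end = entry["applyFromByte"], entry["applyUntilByte"]
        link = entry["applyEntity"]
        # Plain text between the previous range and this one.
        parts.append(escape(raw[cursor:start].decode("utf-8")))
        attrs = 'href="{}"'.format(escape(link["href"]))
        if "name" in link:
            attrs += ' title="{}"'.format(escape(link["name"]))
        inner = escape(raw[start:end].decode("utf-8"))
        parts.append("<a {}>{}</a>".format(attrs, inner))
        cursor = end
    # Remaining plain text after the last range.
    parts.append(escape(raw[cursor:].decode("utf-8")))
    return "".join(parts)


example = {
    "content": "Go to this site",
    "mediaType": "text/plain",
    "richText": [
        {
            "type": "LinkMicrosyntax",
            "applyToProperty": ["content"],
            "applyFromByte": 6,
            "applyUntilByte": 15,
            "applyEntity": {
                "type": "Link",
                "href": "https://example.com",
                "name": "some title",
            },
        }
    ],
}

print(render_by_byte_range(example))
```

Indices avoid the ambiguity of search-and-replace, at the cost of breaking whenever the text is edited without updating the offsets.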

in pretty much every case where you have pre-rendered HTML you don't actually need to know what microsyntax was used to generate that HTML. it's frankly irrelevant whether you used Markdown or BBCode or AsciiDoc or reStructuredText or org-mode or whatever else.

the only unique arguments and use-cases i can think of for having metadata on a Link used in content generally boil down to wanting to use a unique property like preview, e.g. in a social system where publishers are responsible for including link preview metadata. everything else, like rel or mediaType, can be directly represented in HTML:

Go to <a
  href="https://example.com"
  title="some title"
  rel="external"
  type="application/activity+json"
  hreflang="en"
>this site</a>

given this HTML, it's not super useful to have an equivalent linked data node representing the same information, unless you need to parse microsyntax, which you SHOULD NOT need to do.
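To illustrate that the HTML already carries this information, here is a small sketch using Python's standard html.parser module to read the link attributes straight out of the markup. The class and function names are hypothetical, and the sketch assumes links are not nested.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the attributes and text of every <a> element in a fragment."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current = {"attrs": dict(attrs), "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def links_in(html):
    """Return one dict per <a> element, with its attributes and inner text."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


fragment = """Go to <a
  href="https://example.com"
  title="some title"
  rel="external"
  type="application/activity+json"
  hreflang="en"
>this site</a>"""

for link in links_in(fragment):
    print(link["text"], link["attrs"])
```

Everything a Link node in tag could say about href, title, rel, type, and hreflang is recoverable this way, without any microsyntax processing.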


microsyntaxes are clearly very useful and an important part of the specification, so I don't understand why we would remove them.

that's the thing though -- they're not actually! the spec says you shouldn't need them, because dedicated properties exist. it's almost purely conventional that some software will tag a Mention instead of letting you tag a Person, and likewise, it's purely conventional to attach processing rules and considerations to that. i'd argue that for something like:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://social.example/some-object",
  "type": "Image",
  "summary": "Picture of Sally",
  "tag": {
    "href": "https://social.example/~sally",
    "type": "Mention",
    "name": "Sally"
  }
}

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://social.example/some-object",
  "type": "Image",
  "summary": "<p>Picture of <a href=\"https://social.example/~sally\">Sally</a></p>"
}

in the latter of these two examples, you don't actually need the tag from the former. it doesn't provide any particularly useful information that isn't already given to you. the only thing it tells you is that the Link is specifically a Mention, which... what does that even mean? what are the semantic implications or consequences of saying that your Link is specifically a Mention? i'd expect it to at least do something... can we say that it indicates an intention to generate a Webmention for that link href?

this isn't to say that we should altogether remove Mention, but i do think that it is worth reconsidering for what, why, and how it's supposed to be used. there is a growing trend (which i would call an anti-pattern) for tag to be semantically overloaded in this way -- the expectation is that items in tag are somehow related to transforming or processing the content, when this again SHOULD NOT be the case. for example, inline images should be represented as inline images in the HTML content, not stripped out and then expected to be put back in by consumers and parsers (as, i repeat, you SHOULD NOT be expected to do). an HTML <a> link in content is given a corresponding Link in tag, which is unnecessary unless you plan to do something with it... and it's not clear what you should be doing with it. at the very least, if there are processing expectations or considerations implied by something's inclusion in tag, then that's a pretty good sign that there's an extension property missing that's just waiting to be defined and used.

@trwnh
Author

trwnh commented Nov 15, 2024

doing a little archaeology it looks like the following information might be interesting or relevant

january - february 2015

october 2015

  • [Proposal] Remove 'Mention' object type #228: the issue came up again when it was proposed to drop Mention from the vocabulary, but this was opposed on the grounds that, because the microsyntax differs (@ or no @, preferredUsername vs name, etc?), it could be useful to explicitly mark up Mention links. unfortunately it wasn't explored or explained any further than that. the definition remains "A specialized Link that represents an @mention.", despite not always actually representing an @mention

@nightpool
Collaborator

nightpool commented Nov 16, 2024

unfortunately it wasn't explored or explained any further than that

Actually, it was: it was explicitly linked to the Mention example as given in the microsyntaxes section at that time:

    <p>
      In the case a publisher wishes to indicate a mention without an associated
      notification, the publisher can use the
      <code><a href="../activitystreams-vocabulary/#dfn-mention">Mention</a></code>
      object type as a value of the <code><a href="../activitystreams-vocabulary/#dfn-tag">tag</a></code>
      property. The <code>Mention</code> object is a subclass of
      <code><a href="../activitystreams-vocabulary/#dfn-link">Link</a></code>.
    </p>

    <figure><figcaption>Mentions and Tags within an Activity Streams Note</figcaption>
<div class="nanotabs">
  <ul>
    <li><a href="#ex26-jsonld" class="selected">JSON-LD</a></li>
    <li><a href="#ex26-rdfa" class="selected">RDFa</a></li>
    <li><a href="#ex26-turtle" class="selected">Turtle</a></li>
  </ul>
  <div id="ex26-jsonld" style="display: block;">
<pre class="example highlight json"
>{
  "@context": "http://www.w3.org/ns/activitystreams",
  "@type": "Note",
  "content": "Thank you @sally for all your hard work! #givingthanks",
  "tag": [
    {
      "@type": "Mention",
      "href": "http://example.org/people/sally",
      "displayName": "@sally"
    },
    {
      "@id": "http://example.org/tags/givingthanks",
      "alias": "#givingthanks"
    }
  ]
}</pre></div>

which seems substantially similar to the microsyntaxes example as it exists in the spec today. So it was specifically left in because of use-cases like this. But that was only one example of how Mention could be used, and the resolution was to leave it in since it was "useful to implementors". Having a Mention object in tag to trigger rendering seems to me to be the definition of "useful to implementors" based on how it's implemented today.

@trwnh
Author

trwnh commented Nov 16, 2024

i may have unintentionally sidetracked the conversation by talking about Mention (which i maintain is generally less useful than tagging the actor directly) but

  • the main part of this issue is on a non-normative section containing normative language (see AS2-Vocab section 5.6 "microsyntaxes" contains normative text in non-normative section #622 mainly)
  • the secondary part is that the examples have seemingly led to what (at least to me) looks like a misunderstanding and subsequent disregard for the normative language (which is, again, in a non-normative section)
    • a Mention tag is not supposed to generate a notification (to/cc) or mark up microsyntax (name/summary/content). i don't have a good mental model for what tagging a Mention is supposed to mean, but that's not the point of this issue
    • something like Hashtag is generally fine because it's being used taxonomically.
    • http://joinmastodon.org/ns#Emoji is not part of AS2 but it is an example of exactly the kind of "extra processing rules" consideration where the contents of tag are expected to possibly transform or mutate the content. this is in opposition to the guidance AS2 gives, and i am saying that perhaps it deserves its own extension property to avoid overloading tag semantically.
    • Link with some metadata that is present on the Link in tag but not present on the actual <a> link in content -- this is the other exemplary (negative) usage that leads to having to process the Link in some way to transform or mutate the content. which is, again, in opposition to the guidance and maybe deserves its own property (to the extent that it deserves to be included in any property)
    • there's some discussion of building on this (likely antipattern) to handle taking links in content and rendering them as rich embeds/previews or "quote posts".

the end result is that

  • some items in tag carry extra processing implications
  • some items in tag do not carry these processing implications
  • it is unclear which items carry or do not carry these processing implications

again i'm finding it hard to talk about this issue in a precise manner because it doesn't strictly deal with the model of AS2 but rather the metamodel of certain fediverse implementations.

the proposed outcomes, but in more detail

which is to say: don't just assume that tag is the appropriate place to stuff any and all microsyntaxes just because some things that can go in tag might correspond to microsyntaxes.

by which i mean: clarify that just because you can tag a Mention doesn't imply anything about the content.

in short: the assumption to break is that tag implies anything about the value of content, instead of tag being about the current object.

@evanp
Collaborator

evanp commented Nov 22, 2024

So, I think the core concern here is that the "Microsyntax" section of the vocabulary document shows two examples of connecting microsyntax in the text (@mention and #hashtag) with the tag property. It does not show any other ways of implementing microsyntaxes, for example, transforming ASCII emoticons like :-) into Unicode emoji. So, since the only examples of microsyntax involve the tag property, implementers have assumed that all microsyntax must use the tag property. Not the case!
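As a rough illustration of a microsyntax that needs no tag at all, the sketch below expands ASCII emoticons at publish time and emits the result directly in content. The emoticon table, the function names, and the choice to wrap the text in a paragraph element are assumptions made for the example.

```python
from html import escape

# Hypothetical mapping; a real publisher would choose its own set.
EMOTICONS = {
    ":-)": "\U0001F642",  # slightly smiling face
    ":-(": "\U0001F641",  # slightly frowning face
    ";-)": "\U0001F609",  # winking face
}


def expand_emoticons(text):
    """Replace each known ASCII emoticon with its Unicode emoji."""
    for ascii_form, emoji in EMOTICONS.items():
        text = text.replace(ascii_form, emoji)
    return text


def publish_note(source_text):
    """Build a Note whose content is already rendered; no tag entry is produced."""
    rendered = "<p>{}</p>".format(escape(expand_emoticons(source_text)))
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Note",
        "content": rendered,
    }


note = publish_note("Thanks for the help :-)")
print(note["content"])
```

The consumer receives finished HTML and never needs to know that an emoticon microsyntax was involved.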

I think there are several steps that we can take here to help with this issue.

  1. Include an example of a microsyntax that does not require a tag element. This could give better guidance.
  2. Explicitly note that microsyntax does not have to use the tag property.
  3. In a FEP, SocialCG report, RFC, or something similar, define a set of microsyntax properties and a media type like text/social-microsyntax, including a mechanism for extensions.

I think that it would make sense to include this kind of guidance in a next version of AS2 Vocabulary. I'm not sure that the lack of this guidance rises to the level of an erratum, so I don't think we should do that now.

evanp added the "Next version" and "Waiting for Commenter" labels Nov 22, 2024
@evanp
Collaborator

evanp commented Dec 6, 2024

I also think a Primer page on best practices for defining microsyntaxes as well as documenting known microsyntaxes can be useful. And the next version of AS2 could include not only an example that doesn't create a tag element, but also other guidance. A good reminder of the definition of the tag property would be helpful also.

evanp added the "Needs primer page" label and removed the "Waiting for Commenter" label Dec 6, 2024

3 participants