Make Sec-CH-UA-Form-Factor a list, add meanings #343

Merged Sep 6, 2023 (4 commits)

djmitche
Contributor

@djmitche djmitche commented Jul 18, 2023

This

  • Makes the hint a set (in the form of a sorted list)
  • enumerates the allowed values with non-normative descriptions
  • adds some non-normative language about adding values
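
A rough Python illustration of the "set in the form of a sorted list" idea. It assumes the header value is a comma-separated list of double-quoted strings with no embedded commas or escapes. The function names are hypothetical, and this is a sketch rather than a full Structured Fields parser.

```python
from typing import FrozenSet, Iterable


def parse_form_factors(header_value: str) -> FrozenSet[str]:
    """Parse a simplified list of quoted strings into a set.

    Assumption: values never contain commas or escaped quotes, so a plain
    split is enough here. Items that are not double-quoted are skipped.
    """
    values = set()
    for item in header_value.split(","):
        item = item.strip()
        if len(item) >= 2 and item.startswith('"') and item.endswith('"'):
            values.add(item[1:-1])
    return frozenset(values)


def serialize_form_factors(values: Iterable[str]) -> str:
    """Serialize the values as a sorted list of quoted strings."""
    return ", ".join(f'"{v}"' for v in sorted(values))
```

Because the parsed result is a set, two header values that differ only in order compare equal, while sorting on output keeps the serialized form stable.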

"Automotive", "Mobile", "Tablet", "TV", "VR", or "XR". Order of the values
in the list is not significant.

<div class="note" heading="">
Contributor Author

This heading="" seems to be required in order to get the green "NOTE:"?

Collaborator

That's weird! You can also just do:

Note: .... and it'll generate the markdown for you.

https://speced.github.io/bikeshed/#notes-etc does reference the heading attribute. But if what you have works, I'm not particularly worried about it.

Collaborator

Should this be a note at all? It seems like important normative content

Contributor Author

My understanding was that the language is too vague to be normative, and making it more precise would be counter-productive. But I definitely can't afford an editor's hat, so I'll defer to those with better headware.

Collaborator

I don't have strong opinions one way or the other here, and it's easy to swap between the two.

index.bs Outdated
Comment on lines 575 to 577
* "VR" refers to an immersive, gesture-oriented device.
* "XR" is similar to "VR" but includes devices that integrate with the
environment around the user.
Contributor Author

I'm not really sure how to distinguish these two. Maybe this is an example where the list would be useful ("XR" for XR things that aren't VR goggles and "XR, VR" for things that are VR goggles).

Collaborator

I think that's one way we could add it in if we hear of use cases for "just VR". We could start conservative like @nielsbasjes suggests at first, WDYT?

index.bs Outdated
typically carried on a user's person.
* "TV" refers to a large, multi-user device designed primarily for viewing videos.
* "VR" refers to an immersive, gesture-oriented device.
* "XR" is similar to "VR" but includes devices that integrate with the
Collaborator

It's not called AR?

Contributor Author

I don't know! A brief search found a page explaining that AR, VR, and a few other R's are sub-categories of XR. I'm happy to be educated here :)

When looking for some understanding on the difference I found this page which says

Extended Reality includes all its descriptive forms like the Augmented Reality (AR), Virtual Reality (VR), Mixed Reality (MR). In other words, XR can be defined as an umbrella, which brings all three Reality (AR, VR, MR) together under one term, leading to less public confusion.

So my current understanding is that XR is intended to include AR, VR and MR.

So either we should have (XR) OR (AR, VR, MR).

Collaborator

That sounds pretty good to me @nielsbasjes. Why don't we start with XR, and if someone claims to have a different use case for VR (or MR)... we can consider adding it.

Got linked to this by a colleague. XR is generally used as a catchall for the spectrum of devices from Augmented Reality to Mixed Reality to Virtual reality, hence WebXR and OpenXR are named as such. If you are trying to differentiate those kinds of devices from other kinds, that makes sense. If you are trying to differentiate head mounted displays (HMDs) from non HMDs, this gets a bit trickier. XR typically includes phone AR and other non-head mounted spatial computing form factors.

@djmitche
Contributor Author

@miketaylr I'd be interested in your opinions here, and would appreciate it if you can draw the attention of other interested stakeholders.

@miketaylr miketaylr self-requested a review July 26, 2023 19:44
@miketaylr
Collaborator

Will try to get this reviewed before EOW! (was OOO 😎 🙇 )

@nielsbasjes

Just to introduce myself: I'm an IT Architect at a really large online retailer. I have been doing large data processing for ~25 years and I have been doing large scale webanalytics for about 18 years now.

I have written several opensource libraries to support these kinds of usecases and I have contributed to (just about) all Apache Software Foundation "Big Data" projects. In light of the definitions of these ClientHints; this project of mine is most relevant to mention: Yauaa https://yauaa.basjes.nl/ (https://github.com/nielsbasjes/yauaa). This library tries to parse and analyse the useragent and clienthints to make this available to (usually) analytics systems.

Looking at the usecases of the output of Yauaa at the place where I work I see some things that would really benefit from having good values in the Sec-CH-UA-Form-Factor header.

Disclaimer: I have NOT discussed any of this below with any of my colleagues; This is all on me.

Essentially the key question is about what kinds of devices should the website work on and which variant of the website should we send to the visitor that just arrived.

Looking at the Client Hints the Sec-Ch-Ua-Mobile is the only one that can be used for this purpose and it is really vague. It only indicates if it is a Phone or not.
It does not indicate if the touch interface is needed because a Tablet (with a touch screen) returns a false here.
So in my library I try to improve on it by (for example) also looking at the tag that is put before Safari and at the operating system tags: Safari on Mac OS is Desktop, Safari on iOS is Tablet, and Mobile Safari on iOS is Phone.
I consider this to be quite messy.
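
To make the heuristic concrete, here is a minimal Python sketch of that kind of lookup. It assumes the user agent has already been split into an agent name and an operating system name; the table and function name are hypothetical and cover only the three cases named above, so this is not how Yauaa itself is implemented.

```python
from typing import Dict, Tuple

# Hypothetical lookup keyed on (agent name, operating system name),
# mirroring only the three cases mentioned in the comment.
DEVICE_CLASS_BY_AGENT_AND_OS: Dict[Tuple[str, str], str] = {
    ("Safari", "Mac OS"): "Desktop",
    ("Safari", "iOS"): "Tablet",
    ("Mobile Safari", "iOS"): "Phone",
}


def guess_device_class(agent_name: str, os_name: str) -> str:
    """Return a device class guess, or "Unknown" for unlisted pairs."""
    return DEVICE_CLASS_BY_AGENT_AND_OS.get((agent_name, os_name), "Unknown")
```

Every new browser or operating system needs another row in a table like this, which is part of why a dedicated form-factor hint is attractive.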

As I mentioned here there is great diversity you could include in a header like this. But having too much detail would make it a fingerprinting feature.

So just let me propose how I would approach this and hear what you think.

I need a clear indicator for:

  • The screensize and aspect ratio.
  • Type of interaction.
  • Level of attention people can/should have.

I'm in doubt if "How mobile" the device is (like wall mounted, handheld, vehicle mounted) would add any value to this.

Note that some of these dimensions are not independent: So a watch, phone and tablet are almost always touch screens for example.

I did some more digging about what VR, AR and MR really mean and on this (Dutch, sorry) site they explained with some images https://cadcompany.nl/blog/vr-ar-en-mr-verschillen/

My summary:

  • VR: You only see the virtual world.
  • AR: The real world is annotated with some things
  • MR: You can have a look at 3d models and walk around them

Now all of this does NOT mean you are using a headset. To give an example the kids app created for my work is a clear example of MR that is intended to work on phones and tablets as demonstrated in this video https://youtu.be/fZZCAs4G9Zg?t=36

So when the key question is "which content should we show" the key question becomes: Can the device mix the real world and the virtual world?

My initial proposal for this header where I'm trying to stay at useful detail without going into fingerprinting detail:

ScreenType indicator: s="ScreenType"

With allowed values:

  • "None": No screen, Headless, Server-to-Server, etc.
  • "Watch": A (usually handheld, usually touch) screen < 2"
  • "Phone": A (usually handheld, usually touch) screen between 2" and 7"
  • "Tablet": A (usually handheld, usually touch) screen between 7" and 14"
  • "Desktop": A (usually movable but not handheld, usually no touch) screen between 15" and 30"
  • "TV": A fixed (usually wall mounted, no touch) large screen > 32"
  • "VR": A VR Headset that CANNOT mix the images from the outside world in view. So only suitable for VR content
  • "XR": A VR Headset that CAN mix the images from the outside world in view. So suitable for VR/AR/MR content.

Interaction indicator: i="InteractionType"

With allowed values:

  • "None": No screen, Headless, Server-to-Server ... so no human interactions.
  • "Keyboard": A keyboard/mouse interaction
  • "Touch": A touch screen
  • "Game": A gamepad type controller (mini joysticks) as used on Playstation, Xbox, Nintendo switch, etc.
  • "Remote": A controller with only arrow keys, Ok and Cancel buttons: as used with many TVs and Set top boxes (like the "Google Chromecast with Google TV" and "Apple TV").
  • "Gesture": A device that looks at gestures and motion of the user.
  • "Voice": A voice controlled device.

Discussion: Perhaps "Game" and "Remote" should be merged to "Joystick"?

Attention indicator: a="AttentionType"

With allowed values:

  • "None": No screen, Headless, Server-to-Server ... so no human attention at all.
  • "Low": You cannot expect the user to respond fast on something you show. Common usage: Car
  • "Medium": The user should be able to respond within 1 minute. Common usage: TV
  • "High": The user should be able to respond within 5 seconds. Common usage: Normal websites

Examples:

Watch: s="Watch";i="Touch";a="High"
Phone: s="Phone";i="Touch";a="High"
Tablet: s="Tablet";i="Touch";a="High"
Amazon Echo: s="Tablet";i="Touch";a="Medium"
PS5: s="TV";i="Game";a="High"
Nintendo Switch: s="Phone";i="Game";a="High"
Tesla: s="Tablet";i="Touch";a="Low"
Google TV: s="TV";i="Remote";a="Medium"
Apple Vision Pro: s="XR";i="Gesture";a="High"
PS4 VR Headset: s="VR";i="Gesture";a="High"
PS5 VR Headset: s="XR";i="Gesture";a="High"

I would have the standard require all three always. Given the intended use case ("What website version should I show the visitor?"), I would like to see this as one of the low-entropy headers sent on the first request.
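
A server-side reading of this proposed format could look roughly like the Python sketch below. It assumes the `s=...;i=...;a=...` syntax exactly as written in the examples above, with one value per field; the function name is hypothetical and nothing here is part of any specification.

```python
from typing import Dict

# Allowed values copied from the proposal above.
ALLOWED = {
    "s": {"None", "Watch", "Phone", "Tablet", "Desktop", "TV", "VR", "XR"},
    "i": {"None", "Keyboard", "Touch", "Game", "Remote", "Gesture", "Voice"},
    "a": {"None", "Low", "Medium", "High"},
}


def parse_proposed_hint(value: str) -> Dict[str, str]:
    """Parse 's="...";i="...";a="..."' and require all three fields."""
    fields: Dict[str, str] = {}
    for part in value.split(";"):
        key, sep, raw = part.strip().partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {part!r}")
        token = raw.strip().strip('"')
        if key not in ALLOWED:
            raise ValueError(f"unknown field {key!r}")
        if token not in ALLOWED[key]:
            raise ValueError(f"unknown value {token!r} for field {key!r}")
        fields[key] = token
    missing = set(ALLOWED) - set(fields)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return fields
```

Rejecting incomplete values reflects the suggestion that all three fields always be present.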

@nielsbasjes

Additional thought:
In some cases you can have multiple screens and multiple interaction types.
A phone with a watch attached: s="Phone|Watch" ?
A laptop with a touch screen: i="Touch|Keyboard" ?

@nielsbasjes

Additional: eInk screens should be clear. No colors, no animations, ...

Collaborator

@miketaylr miketaylr left a comment

LGTM! Let's chat about removing VR then land this.

@miketaylr
Collaborator

I have written several opensource libraries to support these kinds of usecases and I have contributed to (just about) all Apache Software Foundation "Big Data" projects.

@nielsbasjes really appreciate your input!

@nielsbasjes

@miketaylr

Why don't we start with XR, and if someone claims to have a different use case for VR (or MR)... we can consider adding it.

My current view of this header is that it is an indication of the kind of content the user's device/agent can handle.
Based on that, my point about keeping VR in addition to XR is the question of whether the user's device/agent can or cannot mix the provided content with the outside world (i.e. it has the sensors/cameras/compute power/... to support this).

@nielsbasjes

The goal for this header (the way I see it) is that it tells the website what the device/agent the user has can handle.

This is also what my (bit too long) response earlier was about:

  • What output/content does the device support (screensize, VR/XR, eInk, ...)
  • What kind of interaction does it support (include sensors groups?, Orientation? GPS?).
  • What kind of attention from the user can you expect.

Do you guys have a similar goal/mindset for this header or do you have something completely different in mind?

@djmitche
Contributor Author

@nielsbasjes that's great input -- thank you! I appreciate that you've broken things into a lower-level ontology, and that feels like the right way to go here: a few well-defined bits of data that a site can use to make its own decisions about what kind of content to serve, in a manner likely to continue to work in the future when new forms of user-agents are introduced.

That said, it's quite different from what I've proposed in this PR. Would you mind creating an issue to continue discussion, perhaps copy/pasting some of the text you've written above? I'll work to pull in some more feedback, and as that develops decide whether to merge or abandon this PR, and whether to create a PR implementing something closer to your suggestions.

@djmitche
Contributor Author

OK, I've updated this based on the conversation in #344:

  • Remove "VR" (as it is encompassed by XR)
  • Remove "TV" (as there is no current use-case for this value)
  • Include non-normative language around what a form-factor is and what
    the allowed values mean. This specifically refers to user
    interaction, as distinguished from screen-size or other physical
    aspects of the device.

Please let me know what you think.

@arichiv
Collaborator

arichiv commented Aug 24, 2023

This adds non-normative meanings for the suggested values. Consider this a very early draft:

  • Let's talk about the definitions of the form factors and what should or should not be included.
  • Let's talk about where I've used overly-normative language.
  • Let's talk about when a multiple-valued list might make sense.

If this is close to being committed can you update the PR description?

@nielsbasjes

This looks very good to me.

Given that a new value is valid if it describes something "that users interact with in a meaningfully different way", I propose two additions, because to me they are "meaningfully different":

  1. "eInk": terribly slow screen (no animations) and very limited color (if any, usually only greyscale) capabilities.
  2. "Watch": so small you really have to design for it and multitouch does not make sense because you cannot even fit 2 fingers on such a screen.

Also I think it would be a good idea to explicitly define approximate screen sizes in the documentation (similar to what I have done here https://yauaa.basjes.nl/expect/fieldvalues/#deviceclass )

My personal opinion is that the term "Mobile" is vague; I would use "Phone".
This is not important if the documentation on this is made a bit clearer by adding something like the screensize.

Proposal:

  • Tablet: A mobile device with a rather large screen (common > 7")
  • Mobile (or Phone): A mobile device with a small screen (common < 7")
  • Watch: A mobile device with a tiny screen (common < 2").
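
As a small Python sketch of how these approximate sizes could be applied: the thresholds are the "common" sizes from the proposal, the handling of exactly 2" and 7" is an assumption (the proposal does not say), and the function name is hypothetical.

```python
def classify_mobile_device(diagonal_inches: float) -> str:
    """Bucket a mobile device by approximate screen diagonal.

    Assumption: a diagonal of exactly 2" counts as Phone and exactly 7"
    counts as Tablet; the proposal leaves both boundaries open.
    """
    if diagonal_inches <= 0:
        raise ValueError("diagonal must be positive")
    if diagonal_inches < 2:
        return "Watch"
    if diagonal_inches < 7:
        return "Phone"
    return "Tablet"
```

These buckets only make sense for mobile devices; desktops, TVs, and headsets would need separate handling.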

@djmitche
Contributor Author

Thanks! I had chosen "Mobile" to try to indicate that Mobile: ?1 would correspond to this value, but you're right that the term is vague and shouldn't be perpetuated. Choosing a different name allows implementers to decide how Sec-CH-UA-Mobile and Sec-CH-UA-Form-Factor should be related. I suspect that "Phone" is a bit US-centric, but probably the best choice after "Mobile".

I'll add Watch and EInk as options, and include screen size descriptions. I'll also clarify that values should be given as written (including capitalization) to avoid providing additional fingerprinting entropy.
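
The point about exact spelling can be sketched in Python as follows. The set of allowed values is illustrative only (taken from values mentioned in this thread, and the final specification may differ), and the function name is hypothetical.

```python
# Illustrative set only: values mentioned in this thread, not the final spec.
ALLOWED_FORM_FACTORS = frozenset(
    {"Automotive", "Mobile", "Tablet", "XR", "Watch", "EInk"}
)


def is_allowed_form_factor(value: str) -> bool:
    """Exact, case-sensitive match; no trimming or case folding."""
    return value in ALLOWED_FORM_FACTORS
```

If variants such as "tablet" or "TABLET" were accepted, the choice of spelling would itself become a signal that distinguishes clients, which is the extra fingerprinting entropy the comment aims to avoid.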

@nielsbasjes

I'm Dutch and here the terms "Telefoon" and "Mobieltje" are used interchangeably by many people. The online shop I work for shows them as Smartphone to the customers. So calling it "Phone" is not a US term from where I'm standing.

@miketaylr
Collaborator

Still LGTM :)

@djmitche
Contributor Author

djmitche commented Sep 6, 2023

I've had some positive private reactions, and nothing suggesting a change, so I'm going to merge this as-is.

@miketaylr miketaylr merged commit 6bcecb6 into WICG:main Sep 6, 2023
github-actions bot added a commit that referenced this pull request Sep 6, 2023
SHA: 6bcecb6
Reason: push, by miketaylr

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@nielsbasjes

I'll add Watch and EInk as options, and include screen size descriptions. I'll also clarify that values should be given as written (including capitalization) to avoid providing additional fingerprinting entropy.

Seems like this was missing from the actual commit? @djmitche

@lukewarlow

Should https://wicg.github.io/ua-client-hints/#dom-uadatavalues-formfactor be a sequence<DOMString> now?

@djmitche
Contributor Author

djmitche commented Sep 7, 2023

Both good points. I'll make a new PR.

6 participants