-
Notifications
You must be signed in to change notification settings - Fork 255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Priority vector re-prioritization proposal #329
Conversation
FLEDGE.md
Outdated
|
||
The browser-generated `prioritySignals` object contains the following values: | ||
* `browserSignals.one`: This is always 1. It's useful for adding a constant to the dot product. | ||
* `browserSignals.age`: How long ago the user was added to the interest group, in milliseconds. This is the most recent time the use was added to the injterest group, so re-joining an interest group resets the value. This value is always non-negative. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Age in msec seems like way more granularity than is warranted here.
- I think we might need a better way to handle age, actually. The use case we've heard about is owners who want their IG priority to drop off over time. The linear nature of a dot product means that if your priority on the second day is half of what it was on the first day, then your priority on day 3 is 0 and on day 4 is negative. Maybe it would be more useful for us to offer multiple capped values, like browserSignals.ageInHoursMax24 and browserSignals.ageInDaysMax30?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
browserSignals.ageinHoursMax24, ageInDaysMax30 sound very useful. If possible, I'd also add ageInMinutesMax60. Adding browserSignals.logAge, as proposed by Matt in #305 could also be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree milliseconds is more granularity than people need, but I'm reluctant to declare only days are useful, or only hours are, or whatnot, though I think sub minute granulatity does seem very unlikely to ever be useful.
It wouldn't be hard to support [Log]AgeIn[Hours|Days|Minutes][Max#], though that might be making things too complicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would also mean it's no longer a simple vector multiplication - we'd have to check for certain patterns, which would make things a bit more complicated on implementers, I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would also mean it's no longer a simple vector multiplication - we'd have to check for certain patterns (...)
Not sure if I got that right, could you explain what patterns do you have in mind?
// In my mind, it seemed it would be sufficient if there's a couple new age-related variables supplied by the browser in browserSignals
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[jonasz]: I was referring to my specific suggestion of adding more flexible options.
Wonder how useful the various resolutions are (minutes, hours, days), but I guess providing those three combined gets you a full clock, up to the expiration time.
So we're thinking: ageInMinutes, ageInMinutesMax60, ageInHoursMax24, ageInDaysMax30?
Sorry for the slow follow up. Been busy implementing the rest of this proposal, and only now getting back to figuring out what to do about age.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the PR to include the requested fields. Feedback welcome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experience from DSPs is that recency needs a lot more resolution than expected right after a signal has been given. For example of all users that add something to a shopping cart and click on an ad before completing the purchase, 80% of them will have done so within 3-4 days, and of those the majority within the first few hours and day.
So while milliseconds may seem too much, these pipelines are usually tuned to the second/minute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaelkleber, @jonasz: Thoughts? Note that the values here are not really accessible by scripts or auctions . You could theoretically keep on modifying the filter and running auctions to try and get the time an IG was added, but there are easier and less finicky ways to probe IGs by using large numbers of auctions, as long as we expose whether any IG won the auction and if so, how long it took. We could just always provide second/minute resolution, if folks find it useful, and I don't think there would be a significant loss in privacy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, there's no significant privacy implications here, it's just about providing values that are useful for the dot-product mechanism. If ageInSecondsMax600 or whatever is also useful, then sure go for it, I'd say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addessing comments on typos - I'll leave the design issue for when I'm back in the second half of next week.
'https://www.another-buyer.com': 1000, | ||
'*': 15, | ||
...}, | ||
'perBuyerPrioritySignals': {'https://www.example-dsp.com': {'signal1': 2.5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering, perhaps perBuyerPrioritySignals
could be a field within perBuyerSignals['https://www.example-dsp.com']
? That way, the protocol between the seller and the buyer wouldn't have to be updated to handle the priority signals. We'd also lose the natural support for the default (wildcard) priority signals (but currently, unless I'm missing something, I don't see a good use for them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit concerned about overloading fields (field A is both data sent directly to buyer scripts, and data used to run these filters). Seems like there's a lot of room for unexpected behaviors there. If someone, for instance, is using a field we suddenly decide has another meaning, we'd be breaking consumers.
If we want to include with in perBuyerSignals, I'd be more comfortable structuring perBuyerSignals as:
{
arbitraryJsonData: <...>
prioritySignals: {...}
}
And dropping everything that's not in one of those two values, to minimize the chance of confusion. That would, of course, be a breaking change, but so would trying to extract priority signals from perBuyerSignals without the format change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was some related discussion on this here:
#266 (comment)
The direction discussed was leaving perBuyerSignals
alone, letting it contain arbitrary buyer json, but introducing a new perBuyerConfigurations
field that would contain the various buyer-specific fields that FLEDGE needs to handle, e.g. prioritySignal
, experimentGroupId
, perBuyerGroupLimits
, and potentially interestGroupBuyers
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very different proposal - at least I assume jonasz's suggestion was so that the signals would be passed to worklets, as opposed to be a suggestion on how to better organize fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my read of
That way, the protocol between the seller and the buyer wouldn't have to be updated to handle the priority signals
was
"sellers can avoid redeploying code every time FLEDGE adds a new per-buyer field if the per-buyer fields are consolidated"
I don't see that a perBuyerConfigurations
map clashes with making the signals available to worklets -- actually I'd think it desirable to make perBuyerConfigurations
available to the worklet so that things like experimentGroupId
and prioritySignals
don't have to be redundantly passed through perBuyerSignals
if they're needed by the worklet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very different proposal - at least I assume jonasz's suggestion was so that the signals would be passed to worklets, as opposed to be a suggestion on how to better organize fields.
My proposal was actually purely technical, as @jrmooring summarized.
I agree that if there are other use cases where the buyer needs to pass some configuration to the auction, something like perBuyerConfigurations
would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect we're going to switch to something along the lines of perBuyerConfigurations
when we end the Fledge field trial and feel we can more safely break backwards compatibility. We could also switch to passing in the entire configuration to bidder worklets at the same time.
Add a new parser of the optional per-interest-group priority vector that can be returned as part of priority signals. This CL doesn't do anything with them, it just introduces a parser. github PR: WICG/turtledove#329 Bug: 1343389 Change-Id: I9f849432970ca35b322620a54bc8808e495bd71d Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3835069 Reviewed-by: Maks Orlovich <[email protected]> Commit-Queue: Matt Menke <[email protected]> Cr-Commit-Position: refs/heads/main@{#1036236}
Add new field to AuctionConfig in FLEDGE auctions: perBuyerPrioritySignals, a dictionary about string to double mappings. This new field will be used to allow InterestGroups to be filtered and/or reprioritized based on information in the InterestGroups themselves or fetched from the trusted bidding server. This CL only adds support for parsing the new field and passing it over Mojo. github PR: WICG/turtledove#329 Bug: 1343389 Change-Id: Ic3187944981e74ade6166959ce1cda44c255b49e Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3738502 Reviewed-by: Russ Hamilton <[email protected]> Reviewed-by: Sam McNally <[email protected]> Commit-Queue: Matt Menke <[email protected]> Cr-Commit-Position: refs/heads/main@{#1037080}
Add new fields to InterestGroups related to reprioritizing/filtering InterestGroups before they generate bids. The fields are: * useBiddingSignalsPrioritization: When true, bids may have their priorities adjusted based on bidding signals received from the JSON server. False by default, as when try, the signals for significantly more interest groups may need to be fetched. * priorityVector: Vector multiplied (dot producy) by the AuctionConfig's perBuyerPrioritySignals for the InterestGroup, to produce a new priority. * prioritySignalsOverrides: Values that override the AuctionConfig's perBuyerPrioritySignals. This CL only adds support for parsing the new fields and passing them over Mojo. github PR: WICG/turtledove#329 Bug: 1343389 Change-Id: I45f2ea3d844f92ab2b51982d3ffe9475a8d3e6a1 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3747284 Reviewed-by: Russ Hamilton <[email protected]> Reviewed-by: Sam McNally <[email protected]> Commit-Queue: Matt Menke <[email protected]> Cr-Commit-Position: refs/heads/main@{#1037113}
This CL adds support for storing / retrieving / updating the 3 new InterestGroup fields added in https://chromium-review.googlesource.com/c/chromium/src/+/3747284 `prioritySignalsOverrides` has a new behavior in that updates merge the values from the update's JSON dictionary with the old values of the InterestGroup. Other fields are completely overwritten. Also adds the join time to the database, for age calculations. The fields still have no actual effect when running auctions. github PR: WICG/turtledove#329 Bug: 1343389 Change-Id: I5185e0066f6518c9bd56734cd4066ca8d6f21564 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3755604 Reviewed-by: Russ Hamilton <[email protected]> Commit-Queue: Matt Menke <[email protected]> Cr-Commit-Position: refs/heads/main@{#1037134}
This CL defers Javascript generateBid() calls in the bidder worklet process until `priorityVectors` included in the response from the trusted bidding signals server have been passed to the browser process, so the browser process can potentially filter out the interest group based on those signals. This extra call happens even when priority vector is received, since the browser may also limit the number of interest groups generateBid() is called on, depending on whether any interest group owned by that bidder is configured to delay apply the number of interest group limit after the `priorityVector` has been received from the trusted server. To do this, an extra method has been added to the BidderWorklet's GenerateBidClient to inform the caller that the signals have been received, along with providing the new priorityVector. This takes a callback to tell the BidderWorklet to proceed to generate the bid. Deleting the GenerateBidClient instead of invoking the callback will cancel calling the Javascript generateBid() method. associated github PR: WICG/turtledove#329 Bug: 1343389 Change-Id: Ica1e7ec27063265f596298f98fef22131bb32080 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3816982 Reviewed-by: Maks Orlovich <[email protected]> Commit-Queue: Matt Menke <[email protected]> Reviewed-by: Daniel Cheng <[email protected]> Cr-Commit-Position: refs/heads/main@{#1040663}
Adds additional fields to the browserSignals map used in Fledge priority vector multiplications. In particular, add support for: * browserSignals.ageInMinutes * browserSignals.ageInMinutesMax60 * browserSignals.ageInHoursMax24 * browserSignals.ageInDaysMax30 When one or more interest groups of an owner indicate that their priority should be set based on data received from the trusted bidding signals fetch, the Auction will now defer filtering out interest groups based on priority until after the trusted bidding signals have been fetched. associated github PR: WICG/turtledove#329 Bug: 1343389 Change-Id: Ic079484e0bb32d54da881f624d1a9eabfb4f5567 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3852425 Reviewed-by: Maks Orlovich <[email protected]> Commit-Queue: Matt Menke <[email protected]> Cr-Commit-Position: refs/heads/main@{#1040749}
This allows bidder worklets a greater ability to adjust prioritization in future auctions. See WICG/turtledove#329 for PR explaining the details. Bug: 1343389 Change-Id: I27c6c7d08de6f00ea39bd4d3f0b6828e65df120e Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3892899 Commit-Queue: Matt Menke <[email protected]> Reviewed-by: Daniel Cheng <[email protected]> Reviewed-by: Maks Orlovich <[email protected]> Cr-Commit-Position: refs/heads/main@{#1047643}
Co-authored-by: Paul Jensen <[email protected]>
Co-authored-by: Paul Jensen <[email protected]>
Co-authored-by: Paul Jensen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Maks!
Co-authored-by: Paul Jensen <[email protected]>
FLEDGE.md
Outdated
@@ -302,23 +324,34 @@ Buyers have three basic jobs in the on-device ad auction: | |||
|
|||
Buyers may want to make on-device decisions that take into account real-time data (for example, the remaining budget of an ad campaign). This need can be met using the interest group's `trustedBiddingSignalsUrl` and `trustedBiddingSignalsKeys` fields. Once a seller initiates an on-device auction on a publisher page, the browser checks each participating interest group for these fields, and makes an uncredentialed (cookieless) HTTP fetch to a URL of the form: | |||
|
|||
https://www.kv-server.example/getvalues?hostname=publisher.com&keys=key1,key2 | |||
https://www.kv-server.example/getvalues?hostname=publisher.com&keys=key1,key2&interestGroups=name1,name2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way of sending data loses the association between keys and Interest Group (IG) names. For example, we might have IG 1 with key 'key1' and IG2 with keys ['key2', 'key3'].
This may become awkward and limiting for usage on the trusted server in the future. For example, it may often be the case that computing the bidding signals corresponding to some key may overlap with computing the perInterestGroupData for the corresponding IG.
Can we have an API that preserves the relationship between keys and IG names? For example, we could require keys and to be in the same order as IG names, with keys from different IGs separated by semicolons instead of commas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The does potentially make requests bigger, since we need to duplicate keys from multiple IGs. It would also require significant changes to the trusted server code. Are you thinking in terms of things that might be useful down the line, or that would be useful now, with the single perInterestGroupData field?
I'd hope that if something depends on key, it would be something that could be applied to all IGs with the same key, though that may be wishful thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least with our usage, preserving the association means that we could make the request significantly smaller. If we cannot rely on the IG name->key association being available, the IG name potentially needs to be passed twice once as part of the key and once in the interestGroups
field. There are ways one could imagine using the API where one would have many duplicate keys across IGs, but I am not sure if anyone uses the API in that way. (As far as I can tell those designs typically involve moving more computation onto the client, with corresponding latency and data integrity concerns.)
Another related issue, supposing we wish to filter out the IG with negative priority with this API, we would also prefer to return empty bidding signal responses for the corresponding key(s). This either involves looking up a key to IG mapping on the server, duplicating data in the key, or doing redundant computations on the trusted server.
Whether or not all of this will be possible likely depends on details of the trusted server model, but I think it is desirable to keep the options open by preserving the information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least with our usage, preserving the association means that we could make the request significantly smaller. If we cannot rely on the IG name->key association being available, the IG name potentially needs to be passed twice once as part of the key and once in the
interestGroups
field. There are ways one could imagine using the API where one would have many duplicate keys across IGs, but I am not sure if anyone uses the API in that way. (As far as I can tell those designs typically involve moving more computation onto the client, with corresponding latency and data integrity concerns.)Another related issue, supposing we wish to filter out the IG with negative priority with this API, we would also prefer to return empty bidding signal responses for the corresponding key(s). This either involves looking up a key to IG mapping on the server, duplicating data in the key, or doing redundant computations on the trusted server.
Whether or not all of this will be possible likely depends on details of the trusted server model, but I think it is desirable to keep the options open by preserving the information.
Seems like for your use case, it would make more sense to make unique keys for each IG, rather than shared keys between IGs. One of the reasons to use keys, instead of just IG name, is so that IGs can share data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like for your use case, it would make more sense to make unique keys for each IG, rather than shared keys between IGs. One of the reasons to use keys, instead of just IG name, is so that IGs can share data.
Yes, indeed that is how we are using the API. Please see my original concern about losing the association:
This may become awkward and limiting for usage on the trusted server in the future. For example, it may often be the case that computing the bidding signals corresponding to some key may overlap with computing the perInterestGroupData for the corresponding IG.
If we lose the association then this effectively limits the future ability of the trusted bidding server to use the interestGroupNames in computing the key response, or sharing computation between the two different outputs.
Co-authored-by: Paul Jensen <[email protected]>
SHA: dad16ed Reason: push, by JensenPaul Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR introduces a way to dynamically set an interest group's priority and filter them by way of a sparse vector multiplication. The multiplication can either be between a vector in the interest group and a vector in the auctionConfig, or a vector fetched from trusted bidding server and a vector in the auctionConfig. This is aimed at addressing the issues raised in #302 and #305.
This proposal avoids using Javascript, trading off flexibility for performance, since setting up a Javascript environment is very heavy weight, and needs its own process - the process that generateBid() is called in could be used for any bidder-controller prioritization scripts, but in some multi-buyer cases, we're hoping filtering could avoid the need to create an extra process entirely.
This CL mentions setPriority() a couple times, which hasn't yet been added to the explainer.