Priority vector re-prioritization proposal #329
I was wondering, perhaps perBuyerPrioritySignals could be a field within perBuyerSignals['https://www.example-dsp.com']? That way, the protocol between the seller and the buyer wouldn't have to be updated to handle the priority signals. We'd also lose the natural support for the default (wildcard) priority signals (but currently, unless I'm missing something, I don't see a good use for them).
I'm a bit concerned about overloading fields (field A is both data sent directly to buyer scripts, and data used to run these filters). Seems like there's a lot of room for unexpected behaviors there. If someone, for instance, is using a field we suddenly decide has another meaning, we'd be breaking consumers.
If we want to include this in perBuyerSignals, I'd be more comfortable structuring perBuyerSignals as:
And dropping everything that's not in one of those two values, to minimize the chance of confusion. That would, of course, be a breaking change, but so would trying to extract priority signals from perBuyerSignals without the format change.
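The snippet showing the proposed structure did not survive extraction from the page; below is a minimal sketch of what the comment describes (two recognized values, everything else dropped). The field names `signals` and `prioritySignals` are illustrative assumptions, not the names from the lost snippet.

```javascript
// Hypothetical two-bucket layout for an entry in perBuyerSignals.
// Field names are illustrative; the original comment's snippet was lost.
const auctionConfig = {
  perBuyerSignals: {
    'https://www.example-dsp.com': {
      // Opaque data forwarded to the buyer's bidding script, as today.
      signals: { campaign: 'spring-sale' },
      // Data the browser itself reads to re-prioritize interest groups.
      prioritySignals: { 'deviceType.desktop': 1.0 },
    },
  },
};

// "Dropping everything that's not in one of those two values":
// keep only the two recognized keys, discard anything else.
function sanitizePerBuyerEntry(entry) {
  return { signals: entry.signals, prioritySignals: entry.prioritySignals };
}

const cleaned = sanitizePerBuyerEntry({
  signals: { a: 1 },
  prioritySignals: { b: 2 },
  stray: true, // would be silently dropped
});
```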
There was some related discussion on this here:
#266 (comment)
The direction discussed was leaving perBuyerSignals alone, letting it contain arbitrary buyer JSON, but introducing a new perBuyerConfigurations field that would contain the various buyer-specific fields that FLEDGE needs to handle, e.g. prioritySignal, experimentGroupId, perBuyerGroupLimits, and potentially interestGroupBuyers.
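A consolidated per-buyer map along those lines might look like the sketch below. The overall shape and the per-entry field names (`groupLimit` in particular) are assumptions for illustration; only the field names listed in the comment come from the discussion, and none of this is a shipped API.

```javascript
// Sketch of the perBuyerConfigurations idea from #266:
// FLEDGE-interpreted per-buyer fields consolidated in one map,
// with perBuyerSignals left as arbitrary buyer JSON.
const auctionConfig = {
  interestGroupBuyers: ['https://www.example-dsp.com'],
  perBuyerSignals: {
    // Still opaque to the browser; forwarded to the buyer's script.
    'https://www.example-dsp.com': { anything: 'the buyer wants' },
  },
  perBuyerConfigurations: {
    'https://www.example-dsp.com': {
      prioritySignals: { 'browserSignals.isNew': 1.5 }, // illustrative
      experimentGroupId: 42,                            // illustrative
      groupLimit: 30,                                   // assumed name
    },
  },
};

const dspConfig =
  auctionConfig.perBuyerConfigurations['https://www.example-dsp.com'];
```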
That's a very different proposal - at least I assume jonasz's suggestion was that the signals would be passed to worklets, as opposed to being a suggestion on how to better organize fields.
Ah, my read of the suggestion was:
"sellers can avoid redeploying code every time FLEDGE adds a new per-buyer field if the per-buyer fields are consolidated"
I don't see that a perBuyerConfigurations map clashes with making the signals available to worklets -- actually I'd think it desirable to make perBuyerConfigurations available to the worklet, so that things like experimentGroupId and prioritySignals don't have to be redundantly passed through perBuyerSignals if they're needed by the worklet.
My proposal was actually purely technical, as @jrmooring summarized.
I agree that if there are other use cases where the buyer needs to pass some configuration to the auction, something like perBuyerConfigurations would be nice.
I suspect we're going to switch to something along the lines of perBuyerConfigurations when we end the FLEDGE field trial and feel we can more safely break backwards compatibility. We could also switch to passing the entire configuration to bidder worklets at the same time.
This way of sending data loses the association between keys and interest group (IG) names. For example, we might have IG1 with key 'key1' and IG2 with keys ['key2', 'key3'].
This may become awkward and limiting for usage on the trusted server in the future. For example, it may often be the case that computing the bidding signals corresponding to some key overlaps with computing the perInterestGroupData for the corresponding IG.
Can we have an API that preserves the relationship between keys and IG names? For example, we could require keys to be in the same order as IG names, with keys from different IGs separated by semicolons instead of commas.
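To make the suggested encoding concrete, here is a small sketch of building the request parameters with keys grouped per interest group (';' between groups, ',' within a group, same order as the IG names). This is purely an illustration of the proposed wire format, not an existing API.

```javascript
// Build trusted-bidding-signals request parameters that preserve the
// key <-> interest-group association. Groups are ';'-separated, keys
// within a group ','-separated, in the same order as the IG names.
function buildKeysParam(igToKeys) {
  // igToKeys: ordered array of { name, keys } objects.
  return {
    interestGroupNames: igToKeys.map((ig) => ig.name).join(','),
    keys: igToKeys.map((ig) => ig.keys.join(',')).join(';'),
  };
}

// The example from the comment: IG1 with 'key1', IG2 with ['key2', 'key3'].
const params = buildKeysParam([
  { name: 'IG1', keys: ['key1'] },
  { name: 'IG2', keys: ['key2', 'key3'] },
]);
// params.keys === 'key1;key2,key3'
// params.interestGroupNames === 'IG1,IG2'
```

The server can recover the association by splitting on ';' first, then ',', and zipping the groups with the IG names.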
That does potentially make requests bigger, since we need to duplicate keys from multiple IGs. It would also require significant changes to the trusted server code. Are you thinking in terms of things that might be useful down the line, or that would be useful now, with the single perInterestGroupData field?
I'd hope that if something depends on a key, it would be something that could be applied to all IGs with the same key, though that may be wishful thinking.
At least with our usage, preserving the association means that we could make the request significantly smaller. If we cannot rely on the IG name -> key association being available, the IG name potentially needs to be passed twice: once as part of the key and once in the interestGroups field. There are ways one could imagine using the API where one would have many duplicate keys across IGs, but I am not sure if anyone uses the API in that way. (As far as I can tell, those designs typically involve moving more computation onto the client, with corresponding latency and data integrity concerns.)
Another related issue: supposing we wish to filter out the IGs with negative priority with this API, we would also prefer to return empty bidding signal responses for the corresponding key(s). This either involves looking up a key-to-IG mapping on the server, duplicating data in the key, or doing redundant computations on the trusted server.
Whether or not all of this will be possible likely depends on details of the trusted server model, but I think it is desirable to keep the options open by preserving the information.
Seems like for your use case, it would make more sense to make unique keys for each IG, rather than shared keys between IGs. One of the reasons to use keys, instead of just IG name, is so that IGs can share data.
Yes, indeed that is how we are using the API. Please see my original concern about losing the association:
If we lose the association then this effectively limits the future ability of the trusted bidding server to use the interestGroupNames in computing the key response, or sharing computation between the two different outputs.
browserSignals.ageInHoursMax24 and ageInDaysMax30 sound very useful. If possible, I'd also add ageInMinutesMax60. Adding browserSignals.logAge, as proposed by Matt in #305, could also be useful.
I agree milliseconds is more granularity than people need, but I'm reluctant to declare only days are useful, or only hours, or whatnot, though I think sub-minute granularity does seem very unlikely to ever be useful.
It wouldn't be hard to support [Log]AgeIn[Hours|Days|Minutes][Max#], though that might be making things too complicated?
That would also mean it's no longer a simple vector multiplication - we'd have to check for certain patterns, which would make things a bit more complicated for implementers, I suppose.
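For context, here is a minimal sketch of the dot-product mechanism being discussed: priority is the sum, over keys present in both maps, of the priority vector's weight times the browser-provided signal. With fixed signal names this is a plain sparse dot product; a generic [Log]AgeIn[Hours|Days|Minutes][Max#] scheme would instead require matching key-name patterns. Signal names and values below are illustrative.

```javascript
// Sparse dot product between a buyer's priority vector and the
// browser-provided priority signals. Keys missing from either side
// contribute nothing.
function computePriority(priorityVector, prioritySignals) {
  let priority = 0;
  for (const [key, weight] of Object.entries(priorityVector)) {
    if (key in prioritySignals) {
      priority += weight * prioritySignals[key];
    }
  }
  return priority;
}

// Illustrative values: an IG joined 5 hours ago, with the conventional
// constant-term signal 'one'.
const signals = { 'browserSignals.ageInHoursMax24': 5, one: 1 };
const vector = { 'browserSignals.ageInHoursMax24': -0.1, one: 2 };
// computePriority(vector, signals) === -0.1 * 5 + 2 * 1 === 1.5
```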
Not sure if I got that right - could you explain what patterns you have in mind?
In my mind, it seemed it would be sufficient if there were a couple of new age-related variables supplied by the browser in browserSignals.
[jonasz]: I was referring to my specific suggestion of adding more flexible options.
Wonder how useful the various resolutions are (minutes, hours, days), but I guess providing those three combined gets you a full clock, up to the expiration time.
So we're thinking: ageInMinutes, ageInMinutesMax60, ageInHoursMax24, ageInDaysMax30?
Sorry for the slow follow-up. I've been busy implementing the rest of this proposal, and am only now getting back to figuring out what to do about age.
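The four signals proposed above could be derived from an interest group's join time roughly as follows. The names match the comment; the flooring and clamping behavior is an assumption for illustration, not the implemented semantics.

```javascript
// Derive the proposed age signals from an IG's join timestamp.
// Clamping ("Max" variants) is assumed to saturate at the stated bound.
function ageSignals(joinTimeMs, nowMs) {
  const ageMinutes = Math.floor((nowMs - joinTimeMs) / 60000);
  return {
    'browserSignals.ageInMinutes': ageMinutes,
    'browserSignals.ageInMinutesMax60': Math.min(ageMinutes, 60),
    'browserSignals.ageInHoursMax24':
        Math.min(Math.floor(ageMinutes / 60), 24),
    'browserSignals.ageInDaysMax30':
        Math.min(Math.floor(ageMinutes / (60 * 24)), 30),
  };
}

// Example: an IG joined 90 minutes ago.
const now = Date.now();
const s = ageSignals(now - 90 * 60000, now);
// ageInMinutes: 90, ageInMinutesMax60: 60,
// ageInHoursMax24: 1, ageInDaysMax30: 0
```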
I've updated the PR to include the requested fields. Feedback welcome!
Experience from DSPs is that recency needs a lot more resolution than expected right after a signal has been given. For example, of all users that add something to a shopping cart and click on an ad before completing the purchase, 80% will have done so within 3-4 days, and of those, the majority within the first day or even the first few hours.
So while milliseconds may seem too much, these pipelines are usually tuned to the second/minute.
@michaelkleber, @jonasz: Thoughts? Note that the values here are not really accessible by scripts or auctions. You could theoretically keep modifying the filter and running auctions to try to recover the time an IG was added, but there are easier and less finicky ways to probe IGs using large numbers of auctions, as long as we expose whether any IG won the auction and, if so, how long it took. We could just always provide second/minute resolution if folks find it useful, and I don't think there would be a significant loss in privacy.
I agree, there are no significant privacy implications here; it's just about providing values that are useful for the dot-product mechanism. If ageInSecondsMax600 or whatever is also useful, then sure, go for it, I'd say.