FLEDGE API Support for Interest Group and Ad Filtering #305
Comments
Getting the buyer's logic from the trusted bidder signals fetch as opposed to having it as part of the interest group raises some issues:
If the decision logic was part of the IG itself, that would fix those issues, though it would make filtering less powerful (e.g., couldn't filter out based on running out of budget, unless that were learned in the context of the page, and passed in as a parameter to the filter from there, as opposed to modifying the filter rules based on remaining budget). Edit: And learning remaining budget in the context of the page probably doesn't work (potentially leaks too much data, and too many potential budgets to learn to make sharing them there reasonable, anyways)
Here's an alternative proposal. Feedback would be much appreciated. Apologies for any formatting issues, this is copied from a Google doc.
Alternative proposal
TLDR: Do basically the same thing, but:
Details
New data types
There are two new data types: priority trees and priority tree inputs. A priority tree is applied to a priority tree input, which results in a number or an error. In the case of an error, the priority tree has no effect (could throw out the bid instead?). {
Priority trees
The format of a priority tree is a JSON value, with the possibilities being:
In the case of a list, the operation is one of a fixed set of strings, and is used to determine how many other arguments are allowed, and how they are interpreted. Additional or insufficient arguments are considered errors, and will result in ignoring the output of evaluating the tree:
Buyer priority trees and input
For buyers, the priority tree input is specified as a new AuctionConfig field. Inputs are specified on a per-buyer basis: perBuyerPriorityTreeInputs = {
The special “*” value applies to all buyers. If there are both “Buyer” and “*” keys present, and Buyer’s priority tree references a key not found in Buyer’s specific priority tree input, then “*” will be checked for a matching key.
Interest group priority trees are provided as part of an interest group itself, using the new “priorityTree” field, so they can be evaluated before requesting any resources over a network. If the result of evaluating an interest group’s priority tree is negative, the interest group is not given a chance to bid in an auction. Otherwise, the result of evaluating the priority tree is used as the interest group’s priority when running the auction.
Priority trees may also be fetched as part of fetching trusted bidding signals, but in that case, they are only used to skip bidding if the result is negative, instead of setting the priority (TODO: We should make this adjust the priority as well, though that does require some major refactoring).
Having them in both locations lets consumers pick between the two options, which results in a performance/flexibility tradeoff (e.g. fetching trees as part of the trusted bidding signals means any filtering decisions must be delayed until the fetch completes, but allows more flexibility to update the trees). It also allows each tree to be used for different things (e.g., the fetched JSON “tree” could just be -1 if an ad campaign has run out of budget, while the tree in the IG could be based on more static preferences based on content of the publisher page).
In order for JSON fetches to provide this data, bidding fetches need to be updated.
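To make the lookup and skip rules above concrete, here is a minimal sketch. It only illustrates the “*” fallback and the "negative means no bid" rule; the function names, the result shape, and the use of null to model an evaluation error are assumptions for illustration and are not part of the proposal.

```javascript
// Sketch only: function names and the result shape are hypothetical.

// Resolve one key of a buyer's priority tree input, falling back to "*".
function lookupPriorityInput(perBuyerPriorityTreeInputs, buyer, key) {
  const has = (obj, k) =>
    obj !== undefined && Object.prototype.hasOwnProperty.call(obj, k);
  const specific = perBuyerPriorityTreeInputs[buyer];
  if (has(specific, key)) return specific[key];
  const wildcard = perBuyerPriorityTreeInputs["*"];
  if (has(wildcard, key)) return wildcard[key];
  return undefined; // key not found in either dictionary
}

// Apply the result of evaluating the priority tree stored in the interest group.
// `result` is a number, or null to model an evaluation error (assumption).
function applyTreeResult(result, currentPriority) {
  if (result === null) return { bids: true, priority: currentPriority }; // error: no effect
  if (result < 0) return { bids: false }; // negative: no chance to bid
  return { bids: true, priority: result }; // otherwise: result becomes the priority
}

const inputs = {
  "https://buyer.example": { sports: 2 },
  "*": { sports: 5, news: 1 },
};
console.log(lookupPriorityInput(inputs, "https://buyer.example", "sports")); // 2
console.log(lookupPriorityInput(inputs, "https://buyer.example", "news")); // 1
console.log(applyTreeResult(-1, 0)); // { bids: false }
```

For a tree fetched with the trusted bidding signals, only the `bids` part would apply under the proposal as written, since that tree does not yet adjust the priority.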
An additional parameter is added to the JSON fetches, “&interestGroups=groupName1,groupName2,...”, for all the interest groups the fetch is for, and the response is now of the format: {
In addition, the server must send a “fledge-bidding-signals-format-version: 2” header for the response to be interpreted as using the new format, though eventually support for the old format will be removed.
Seller priority trees and input
Sellers may specify their priority trees as part of their auctionConfig, via the new field: sellerPriorityTree = priorityTree
They specify their priority tree input via new fields in their trusted selling signals responses: {
There is no change to the requested URL, nor introduction of a versioning scheme, since the seller signals JSON format already uses a top-level dictionary that can be expanded by adding new keys.
For a given bid, the AuctionConfig’s priority tree is run against all matching URLs (the renderUrl and componentRenderUrls), and if any of the results is negative, the bid is rejected. If all are either positive, or have no URL-specific priority tree input (or there’s no priority tree specified in the auction config), the bid is passed to the seller’s scoreAd() method. The magnitude of the output of evaluating the priority tree has no impact.
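The seller-side rule can be sketched as follows. Since the list of tree operations is not reproduced here, the evaluator is passed in as a parameter; `sellerAcceptsBid`, the shape of `inputsByUrl`, and the stand-in evaluator in the usage lines are all hypothetical. Treating a result of exactly zero, or an evaluation error, as "not rejected" is also an assumption, since the text only says negative results reject the bid.

```javascript
// Sketch only: names and data shapes are hypothetical.
// evaluateTree(tree, input) is assumed to return a number, or a non-number on error.
function sellerAcceptsBid(evaluateTree, sellerPriorityTree, inputsByUrl, bid) {
  if (sellerPriorityTree === undefined) return true; // no tree in the auction config
  const urls = [bid.renderUrl, ...(bid.componentRenderUrls || [])];
  for (const url of urls) {
    const input = inputsByUrl[url];
    if (input === undefined) continue; // no URL-specific priority tree input
    const result = evaluateTree(sellerPriorityTree, input);
    // Only the sign matters; the magnitude has no impact.
    if (typeof result === "number" && result < 0) return false;
  }
  return true; // the bid goes on to scoreAd()
}

// Stand-in evaluator for the demo: the "tree" is just the name of one input key.
const lookupEvaluator = (tree, input) => input[tree];
const inputsByUrl = {
  "https://ads.example/a": { allowed: 1 },
  "https://ads.example/b": { allowed: -1 },
};
console.log(sellerAcceptsBid(lookupEvaluator, "allowed", inputsByUrl,
  { renderUrl: "https://ads.example/a" })); // true
console.log(sellerAcceptsBid(lookupEvaluator, "allowed", inputsByUrl,
  { renderUrl: "https://ads.example/a",
    componentRenderUrls: ["https://ads.example/b"] })); // false
```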
@jonasz: My last post is a proposal to both provide a filtering API, and allow IGs to adjust priority based on a sparse dot product, as I believe you said you were interested in. The API is a little clunky, to accommodate both needs. Feedback would be welcome. Don't want to start implementation until we know if it meets folks' needs, and give people a chance to provide alternatives.
Hi Matt, Thanks for the proposal, this definitely sounds useful. I think the ability to adjust the priority in this way will open up a great area for optimization. Some thoughts / comments:
Best regards,
Hi Jonasz, Thanks so much for the suggestions! Here are some responses to your ideas:
Good to hear you're interested in this! Implementing this will likely be a major investment (and not have quite the same performance gains when it's done), so nice to get some feedback on the idea before we start implementing.
Is this sort of slow change already supported by the existing interest group priority and setPriority() features?
I wasn't sure how much interest there would be in boolean and conditional support, in particular, but if folks are interested, it should not be difficult to add these operations.
This seems like it could be really useful. This would let the filter have access to information potentially from multiple sites (the site calling joinAdInterestGroup(), and the site(s) modifying the tree or inputs). As long as that extra information isn’t accessible to Javascript, only determines whether or not generateBid() is called, and multiple Javascript calls do not share a Javascript context, this is probably OK, though we do need to be very careful when allowing interest groups to combine data from multiple sites.
With my proposal, each IG would have a filter/priority that comes from the bidder/IG before calling generateBid(). If there is any bidder filtering to apply to ads, the assumption is that the bidder would do that in generateBid() itself. Then the seller gets to apply filters to the render URL, but only after the bidder generates the bid. We can't send the ad URLs to the seller before then because it's likely way too much data to get a full set of scoring signals for, and we can't send an IG's ads to the seller without the IG telling us it's OK to do so. IGs currently have no way to express they're OK with a particular seller except by generating a bid. Giving IGs a way to declare what sellers they're OK with getting this information up front would solve the latter problem (and is something we’re thinking about), but it would not solve the potential data size problem if we sent all ads from all IGs to the seller and got all signals immediately upon auction start. It’s possible that, if we have IG opt-in to sharing data, we could potentially send the names of all IGs participating in the auction to the seller up front, but that would require either entirely reworking the JSON format, or two JSON requests to the seller.
Paul Jensen deserves full credit for this addition
Not really - note that the campaign config may actually change quite rapidly, and these changes could sometimes increase and sometimes decrease the priority. Once we decrease the priority via
I see. Just to confirm - in my view it should be totally fine if the priority-related fields are write-only from the perspective of |
Yes, or at least it currently seems to us that adding more write-only fields which are only accessible to filters / reprioritization logic should be fine. The resulting priority also shouldn't be exposed to generateBid(), which is already the case for setPriority(), though that's not spelled out in the explainer, currently, which is something we need to fix.
We’ve been having second thoughts about adding a new language to the web platform, and are now wondering if a sparse vector multiply more along the lines of RTB House’s proposal might be good enough for most consumers to work with. It would satisfy issue #302 also. If this is insufficient, we’re thinking that the best script-based option is to use Javascript scripts instead, which we can run in a single frozen global context along the lines of issue #310, so will hopefully be fairly fast to execute.
This proposal is going to focus on an added buy-side API, which we could potentially extend to sell-side, if that turns out to be useful. Any filtering out of bids on the buy-side will also reduce the number of scoreAd() invocations, so a buy-side API alone can reduce both buy-side and sell-side latency and compute resource usage. We’re hoping we can get Javascript per-function-call overhead down enough that if a seller needs filters, it’s performant for them to be embedded in the seller script and the JSON data it takes as input. Since buyers may take some time to generate each bid, and don’t know all the interest groups a user is in, buy-side filtering/reprioritization makes sense, even with greatly reduced overhead for each generateBid() call.
Details
The new filter works by taking the dot product of two sparse vectors, represented as JSON dictionaries (e.g. { “cars”:1, “politics”:0, “42”:-10 }). If the result is less than or equal to 0, the interest group is dropped from the auction. If it’s greater than 0, it replaces the interest group’s priority. Filtering / reprioritization can be done either at auction start or when receiving JSON from the real-time trusted bidding signals server, or both. If it’s only done at auction start for all interest groups owned by a particular buyer, then The perBuyerPrioritySignals : {
Where each entry is a per-buyer dictionary of keys to JSON numbers used in the sparse vector multiplication.
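As a rough illustration of the sparse multiplication described above, here is a minimal sketch. The function names and the returned object shape are hypothetical; only the dot product itself and the "less than or equal to 0 drops the group, greater than 0 replaces the priority" rule come from the proposal. Keys missing from either dictionary are treated as contributing nothing, which is what a sparse vector implies.

```javascript
// Sketch only: function names and the result shape are hypothetical.

// Dot product of two sparse vectors stored as { key: number } dictionaries.
function sparseDot(a, b) {
  let sum = 0;
  for (const key of Object.keys(a)) {
    if (Object.prototype.hasOwnProperty.call(b, key)) {
      sum += a[key] * b[key];
    }
  }
  return sum;
}

// Result <= 0 drops the interest group; result > 0 becomes its new priority.
function reprioritize(interestGroupVector, prioritySignals) {
  const result = sparseDot(interestGroupVector, prioritySignals);
  return result > 0 ? { dropped: false, priority: result } : { dropped: true };
}

const igVector = { cars: 1, politics: 0, 42: -10 };
console.log(reprioritize(igVector, { cars: 3, politics: 5 })); // { dropped: false, priority: 3 }
console.log(reprioritize(igVector, { cars: 3, 42: 1 })); // { dropped: true }
```

A single large negative weight, like the -10 on key "42" here, effectively acts as a veto, which is how one dot product can cover both filtering and reprioritization.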
The “*” field is used for all buyers, with identically keyed buyer-specific fields taking precedence. Keys starting with “ There are also new optional fields in interest group definitions. They are: useBiddingSignalsPrioritization : [true | false], If If a Values in In order for JSON fetches to provide { In addition, the server must send a “X-fledge-bidding-signals-format-version: 2” header, for the response to be interpreted as using the new format, though eventually support for the old format will be removed. The new format is supported even when If a For all sparse multiplications, the browser appends a number of values to the
When fetching interest group updates, the interest group’s new |
This is a proposal to update the format of trusted bidding signals, to allow for different types of data to be provided. The immediate goal is to allow reprioritization/filtering information to be provided, along the lines discussed in issue WICG#305. Maintains backwards compatibility with the old format (for now) so as not to break early experiments, but longer term, we'll want to remove the old format - could perhaps break compatibility once we start requiring JSON come from a validated trusted server. Also adds the list of interest groups to the bidder JSON fetches (only), as those will also be needed for the filter logic, and may be generally useful. This PS does not add anything specific to the filtering/reprioritization logic, or anything specific to one of the proposals in issue WICG#305 - I think we'll need this new format for any filter API that we let use bidding signals. Seller signals, where we may or may not also implement filtering, already use a format that allows new top-level dictionaries to be added, so do not need to have their format updated.
Of course a JS function would be more flexible than the sparse multiplication, and I was wondering, what would be the drawbacks of the JS-based approach? Would it be much more complex implementation-wise?
There are a couple concerns:
So the concerns are basically around performance, rather than implementation (adding two sets of cross-process scoring calls - one before downloading JSON, and one after, is also more complicated to implement than just the after-JSON ones, but that's not the real concern here). V8 is not really designed or optimized for scripts that are loaded, run once, and then immediately discarded - I assume this is the case for Javascript engines in general, though that's not an area I have any expertise in.
Thanks, this makes sense. The sparse dot product sounds like a promising direction, we would use it if it was supported. (I think it is likely that during the development we would also come across some iterative improvement ideas, like additional operations, special variables, etc.)
Adding a new language for filtering will further increase the technical complexity of FLEDGE. The easiest approach would be to allow custom implementations in some way and give buyers and sellers access to all the information, so that each implementer could choose its own implementation. Maybe this could be solved server-side now, with the proposal to allow user-defined functions in the trusted server? Doing this server-side would have many advantages, because one could scale the servers depending on the computations that are done. It potentially requires being able to inject more signals into the trusted server, as discussed in the last meeting on 31/08/2022.
Today's Intent to Ship references this issue:
What are you specifying?
There is significant latency impact from the overhead for starting worklets and setting up V8 contexts for generateBid and scoreAd. On the other hand, a significant amount of DSP logic in generateBid and SSP logic in scoreAd is in enforcing various eligibility conditions: ensuring that ads meet publisher and policy requirements for the page, and that the publisher page meets ads' requirements. This means that frequently the API overhead is paid only to drop the interest group from the auction.
If the browser could provide an API for directly filtering out ads and interest groups without incurring the expensive overhead, we could realize significant latency improvements. A limited API that does not execute arbitrary JS code should not require the same sandboxing and separate contexts. Note that suggestions in #302 have some overlap since they would provide a limited way of filtering interest groups from the trusted server response.
A fairly powerful way to specify the kinds of eligibility conditions mentioned above is in terms of logical operations on sets of tokens. These could be expressed in a small DSL or directly in terms of a JSON tree representation, e.g. this might represent a publisher requirement not to have ads with 'shoes' or 'sports' tokens:
Each ad would come with some classification into sets of relevant tokens, for which the tree could be evaluated to determine its eligibility. Similarly each ad may have some tree to be applied to the classification of the page. In practice, we expect ad techs to use opaque tokens (possibly numbers) to avoid unnecessary leaking of sensitive data. We suggest having 'AND', 'OR', 'ANDNOT', and 'ORNOT' as possible operators. The nodes field could contain subtrees following the same schema. We expect typical usage to be on the scale of 10s of tokens per ad.
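A minimal sketch of how such a tree could be evaluated against a set of tokens is shown below. Only the four operator names and the nodes field come from the proposal; the operator and tokens field names are hypothetical. The exact meaning of 'ANDNOT' and 'ORNOT' is not spelled out, so this sketch assumes 'ANDNOT' means "every operand is false" and 'ORNOT' means "at least one operand is false".

```javascript
// Sketch only: the `operator` and `tokens` field names are hypothetical, and the
// ANDNOT / ORNOT semantics are an assumed reading of the proposal.
function evaluateCondition(node, presentTokens) {
  const operands = [
    // A token operand is true when the token is in the classification set.
    ...(node.tokens || []).map((token) => presentTokens.has(token)),
    // Subtrees in `nodes` follow the same schema and are evaluated recursively.
    ...(node.nodes || []).map((child) => evaluateCondition(child, presentTokens)),
  ];
  switch (node.operator) {
    case "AND":
      return operands.every((v) => v);
    case "OR":
      return operands.some((v) => v);
    case "ANDNOT": // assumed: AND of the negated operands
      return operands.every((v) => !v);
    case "ORNOT": // assumed: OR of the negated operands
      return operands.some((v) => !v);
    default:
      throw new Error(`unknown operator: ${node.operator}`);
  }
}

// Publisher requirement: no ads carrying the 'shoes' or 'sports' tokens.
const publisherRule = { operator: "ANDNOT", tokens: ["shoes", "sports"] };
console.log(evaluateCondition(publisherRule, new Set(["cars"]))); // true
console.log(evaluateCondition(publisherRule, new Set(["cars", "sports"]))); // false
```

With tens of tokens per ad, a set lookup per token keeps evaluation cheap, and no script context is needed to run it.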
We propose that the API provides a way of setting a filtering condition for each ad on the trusted server response and/or in the interest group object that will be applied to perBuyerEligibilityTokens provided in the auctionConfig. The ads that are filtered out will be (temporarily) removed from the interestGroup input to generateBid at runAdAuction time. generateBid will not be run for any interest group that has no eligible ads.
Similarly, we suggest that SSPs have the same capability. The auctionConfig would have a sellerEligibilityCondition that would be applied to tokens provided by the trusted scoring server.
Buyer Filtering Example
Let's illustrate how we expect this to work with an example. Here is a possible filtering tree for one ad:
which can be serialized to JSON as:
This tree would be returned from the trusted server response. A possible API: the filtering tree is returned from the trusted server with a special field that gives a map from renderUrl (as a way to identify the ad) to filtering tree:
If the responses for different trusted bidding keys contain conflicting conditions for the same renderUrl, then the browser is free to select any one.
Then in the call to runAdAuction, the buyer can provide tokens describing the page to be passed in the auctionConfig. For example, suppose the publisher page is in the US, discusses politics, and has to do with cars; a DSP might encode these observations via tokens 2, 7, and 9:
In the above example, the ad would be eligible since token 7 is present, even though the left part of the tree does not evaluate to true since 4 is not present.
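The buyer-side flow could look roughly like the sketch below. The original tree is not reproduced here, so the condition for the first ad is a hypothetical one that merely matches the description (a left branch needing token 4, and an alternative satisfied by token 7). Conditions are written as plain predicates to keep the sketch short; the function name, the URLs, and the choice to keep ads that have no condition are all assumptions.

```javascript
// Sketch only: names, URLs, and the example conditions are hypothetical.
// conditionByRenderUrl maps a renderUrl to a predicate over the page's token set.
function filterInterestGroups(interestGroups, conditionByRenderUrl, pageTokens) {
  const tokens = new Set(pageTokens);
  const eligible = [];
  for (const ig of interestGroups) {
    const ads = ig.ads.filter((ad) => {
      const condition = conditionByRenderUrl[ad.renderUrl];
      // Assumption: an ad without a filtering condition stays eligible.
      return condition === undefined || condition(tokens);
    });
    // generateBid is not run for an interest group with no eligible ads.
    // The group is copied, so ads are only removed for this auction.
    if (ads.length > 0) eligible.push({ ...ig, ads });
  }
  return eligible;
}

const groups = [
  { name: "ig1", ads: [{ renderUrl: "https://dsp.example/ad1" },
                       { renderUrl: "https://dsp.example/ad2" }] },
  { name: "ig2", ads: [{ renderUrl: "https://dsp.example/ad3" }] },
];
const conditions = {
  "https://dsp.example/ad1": (t) => (t.has(2) && t.has(4)) || t.has(7),
  "https://dsp.example/ad2": (t) => t.has(4),
  "https://dsp.example/ad3": (t) => t.has(5),
};
const result = filterInterestGroups(groups, conditions, [2, 7, 9]);
console.log(result.map((ig) => ig.name)); // [ 'ig1' ]
console.log(result[0].ads.length); // 1
```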
Seller Filtering Example
Seller filtering is similar, but the conditions are provided in the auctionConfig and the classification of renderUrls into tokens is provided by the trusted seller server. Publisher requirements are passed in a tree in auctionConfig:
Here, we have a simpler tree, since we gave a more complicated example above.
Then, the trustedScoringSignals could include a specific field with SSP tokens describing the renderUrl:
In this case the ad would be filtered out since token 14 is present.