Consider routing reports through an (untrusted) ingestion server #2

csharrison · 2023-10-09T18:29:37Z

Rather than sending reports directly to the MPC system, I want to consider routing them through an ingestion server, e.g. operated by the advertiser / publisher / ad tech. I believe this does not regress the security or privacy stance of this API, and it comes with a number of benefits:

- The MPC nodes need to keep much less state. The ingestion server is responsible for long-term storage of the raw reports prior to issuing an aggregate query, so the only state the MPC nodes need to keep is the minimum required to track anti-replay / privacy budgets.
- Minimizing the responsibility of the MPC will make deploying MPC nodes much easier.
- This allows us to support some more flexible queries without changing the MPC protocol at all. These include:
  - Queries that group together many conversion IDs
  - Queries that filter out records (e.g. based on a context ID), for instance if they are marked as spammy
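
To make the flexible-query point concrete, here is a minimal sketch of how an ingestion server might assemble a batch before handing it to the MPC. It is only an illustration: `StoredReport`, `select_reports`, and the idea that the conversion ID and context ID sit in cleartext next to an opaque ciphertext are all assumptions, not part of any agreed design.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class StoredReport:
    # Hypothetical cleartext metadata visible to the ingestion server.
    conversion_id: str
    context_id: str
    # Opaque encrypted payload; assumed readable only by the MPC nodes.
    ciphertext: bytes


def select_reports(
    reports: Iterable[StoredReport],
    conversion_ids: Set[str],
    is_spammy: Callable[[str], bool],
) -> List[StoredReport]:
    """Group several conversion IDs into one batch and drop spammy records."""
    return [
        r
        for r in reports
        if r.conversion_id in conversion_ids and not is_spammy(r.context_id)
    ]


# Example: one query covering two conversion IDs, with one context flagged as spam.
stored = [
    StoredReport("purchase", "ctx-1", b"\x01"),
    StoredReport("signup", "ctx-2", b"\x02"),
    StoredReport("purchase", "ctx-3", b"\x03"),
    StoredReport("pageview", "ctx-4", b"\x04"),
]
spam_contexts = {"ctx-3"}
batch = select_reports(
    stored, {"purchase", "signup"}, lambda ctx: ctx in spam_contexts
)
```

The selection happens entirely outside the MPC, which only ever sees the resulting batch of ciphertexts, so the protocol itself does not need to know about grouping or filtering.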

We should figure out what anti-replay state management would look like in this system, but I think it will end up overall simplifying the system and making it more flexible.

winstrom · 2023-10-09T20:02:02Z

I think this is a good idea for an enhancement. There is definitely an interesting question of who should host the mailbox, since conversion measurements from the advertiser may give timing information to the ad tech or publisher if they were to receive that information.

As an initial straw man for the anti-replay logic: the MPC nodes should be able to keep a small state regarding e.g. the last time they ran an aggregation and the reports consumed since then. Each report would then have a timestamp in the encrypted information for the MPC so that the MPC could ignore events created on the device before the closed aggregation period. I imagine more nuance may be needed...
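
A rough sketch of that straw man, under several assumptions: each report carries a unique `report_id` and a device-side `created_at` timestamp inside the encrypted payload, and the state shown here stands in for whatever the MPC nodes would actually maintain jointly. All names (`DecryptedReport`, `AntiReplayState`, `admit`, `close_period`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class DecryptedReport:
    report_id: str   # hypothetical unique identifier inside the payload
    created_at: int  # device-side creation time, seconds since epoch
    value: int


@dataclass
class AntiReplayState:
    # Start of the currently open period; anything older is ignored.
    cutoff: int = 0
    # Reports consumed since the cutoff: report_id -> created_at.
    consumed: Dict[str, int] = field(default_factory=dict)

    def admit(self, reports: Iterable[DecryptedReport]) -> List[DecryptedReport]:
        """Return only reports that are neither stale nor already consumed."""
        accepted = []
        for r in reports:
            if r.created_at < self.cutoff:
                continue  # created before the closed aggregation period
            if r.report_id in self.consumed:
                continue  # already consumed in the open period (replay)
            self.consumed[r.report_id] = r.created_at
            accepted.append(r)
        return accepted

    def close_period(self, new_cutoff: int) -> None:
        """Advance the cutoff and forget IDs the cutoff now rejects anyway."""
        self.cutoff = max(self.cutoff, new_cutoff)
        self.consumed = {
            rid: ts for rid, ts in self.consumed.items() if ts >= self.cutoff
        }


state = AntiReplayState()
batch = [DecryptedReport("a", 100, 1), DecryptedReport("b", 150, 2)]
first = state.admit(batch)      # both reports are new
replayed = state.admit(batch)   # same batch again: nothing is admitted
state.close_period(200)         # close the period; IDs "a" and "b" are dropped
late = state.admit(
    [DecryptedReport("a", 100, 1), DecryptedReport("c", 250, 3)]
)
```

Closing a period lets the nodes discard consumed IDs older than the cutoff, which keeps the state small, but it also drops honest reports that arrive late. That trade-off is one of the nuances this sketch does not resolve.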

winstrom · 2023-10-10T16:28:28Z

I would also want to discuss data retention time guarantees for such an ingestion system. One of the things that browsers will be able to specify is their trust in a particular set of nodes conducting an aggregation. Having (even encrypted) user data journaled somewhere for long periods of time becomes another attack surface.
