This repository has been archived by the owner on Nov 5, 2024. It is now read-only.

Consider routing reports through an (untrusted) ingestion server #2

Open
csharrison opened this issue Oct 9, 2023 · 2 comments

Comments

@csharrison

Rather than sending reports directly to the MPC system, I want to consider routing them through an ingestion server, e.g. operated by the advertiser / publisher / ad tech. I believe this does not regress the security or privacy stance of this API, and it comes with a number of benefits:

  • The MPC nodes need to keep much less state. The ingestion server is responsible for long-term storage of the raw reports prior to issuing an aggregate query, and so the state the MPC nodes need to keep is only the minimum required for keeping track of anti-replay / privacy budgets.
  • Minimizing the responsibility of the MPC will make deploying MPC nodes much easier.
  • This allows us to support some more flexible queries without changing the MPC protocol at all. These include:
    • Queries that group together many conversion IDs
    • Queries that filter out records (e.g. based on a context ID), for instance when they are marked as spammy
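
To make the flexible-query idea concrete, here is a minimal sketch of how an ingestion server might assemble a query batch. It assumes each stored report carries a cleartext conversion ID and context ID alongside a ciphertext that only the MPC nodes can open; the issue does not specify which metadata is visible to the server, and `StoredReport`, `IngestionStore`, and `build_query` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredReport:
    conversion_id: str  # assumed to be visible to the ingestion server
    context_id: str     # assumed to be visible to the ingestion server
    ciphertext: bytes   # opaque payload, decryptable only by the MPC nodes


class IngestionStore:
    """Hypothetical long-term store for raw reports, held outside the MPC."""

    def __init__(self):
        self._reports = []

    def ingest(self, report):
        self._reports.append(report)

    def build_query(self, conversion_ids, spammy_context_ids=()):
        """Select ciphertexts for one aggregate query.

        Groups together many conversion IDs and drops any record whose
        context ID has been marked as spammy. The MPC only ever receives
        the resulting list of ciphertexts.
        """
        wanted = set(conversion_ids)
        blocked = set(spammy_context_ids)
        return [
            r.ciphertext
            for r in self._reports
            if r.conversion_id in wanted and r.context_id not in blocked
        ]
```

Under these assumptions both query types reduce to selecting which ciphertexts are submitted, so the MPC protocol itself would not need to change.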

We should figure out what anti-replay state management would look like in this system, but I think it will end up overall simplifying the system and making it more flexible.

@winstrom
Collaborator

winstrom commented Oct 9, 2023

I think this is a good idea for an enhancement. There is definitely an interesting question of who should host the mailbox, since conversion measurements from the advertiser may give timing information to the ad-tech or publisher if they were to receive that information.

As an initial straw man for the anti-replay logic: the MPC nodes should be able to keep a small amount of state, e.g. the last time they ran an aggregation and the reports consumed since then. Each report would then carry a timestamp in the information encrypted for the MPC, so that the MPC could ignore events created on the device before the closed aggregation period. I imagine more nuance may be needed...
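
A minimal sketch of that straw man, under loud simplifications: it models a single node working on already-decrypted reports, whereas a real deployment would operate on secret shares across several nodes. `DecryptedReport`, `AntiReplayState`, and `aggregate` are hypothetical names, and treating a timestamp equal to the closing time as stale is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DecryptedReport:
    report_id: str   # hypothetical unique identifier inside the payload
    created_at: int  # device timestamp (seconds), carried inside the payload
    value: int


@dataclass
class AntiReplayState:
    """The only state the node keeps between queries in this sketch."""

    last_aggregation_time: int = 0
    consumed_ids: set = field(default_factory=set)

    def accept(self, report):
        # Ignore events created before the last aggregation period closed.
        if report.created_at <= self.last_aggregation_time:
            return False
        # Ignore reports already consumed in the current open period.
        if report.report_id in self.consumed_ids:
            return False
        self.consumed_ids.add(report.report_id)
        return True

    def close_period(self, now):
        # Older reports are now rejected by timestamp alone,
        # so the set of consumed IDs can be discarded.
        self.last_aggregation_time = now
        self.consumed_ids.clear()


def aggregate(state, reports):
    """Sum the values of all reports that pass the anti-replay check."""
    return sum(r.value for r in reports if state.accept(r))
```

One consequence visible in the sketch: a report created before a period closes but delivered afterwards is dropped, which may be part of the nuance still to be worked out.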

@winstrom
Collaborator

I would also want to discuss data retention time guarantees for such an ingestion system. One of the things that browsers will be able to specify is their trust in a particular set of nodes conducting an aggregation. Having (even encrypted) user data journaled somewhere for long periods of time becomes another attack surface.


2 participants