Define active and passive modes #15
Adding some more thoughts to this discussion:
This implies
In this setup, the presence of
Hi, nara. I don't think nodes overriding rules is specific to either mode. In either mode, if a node decides to ignore its configuration, it can. Exactly what you said about passive is possible in active if you stretch out the service graph and choose different nodes. Right now, there are tracing systems which trigger based on data generated upstream. For example, the first use of this was the "debug" flag sent by the Chrome plugin to zipkin. It is state that pre-empts the tracing, and it needs to be carried somehow. To relate this to something else user-originated, you can consider haystack-blobs, which has a UI where users decide to start attaching request/response data to sampled requests. It is nice that you don't feel the need for these upstream use cases, but at some point, just like in b3 today, secondary decisions will need metadata passed somehow, and that is precisely the point of the sampling header.
In other words, it would be extremely dangerous to just look at a sampling key and assume that means it is sampled. Do so at your own risk; that behavior definitely won't be done here.
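To make the point about the sampling header concrete, here is a minimal sketch (plain Java, no dependencies) of how per-key metadata could be carried and read downstream. The field value format, key names, and parameters below are assumptions for illustration, not a finalized format; note that a key merely being present is not treated as it being sampled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SamplingHeaderSketch {
  public static void main(String[] args) {
    // Hypothetical field value: comma-separated sampling keys, each with optional
    // semicolon-separated parameters. Key and parameter names are illustrative.
    String value = "authcache;ttl=1;spanId=19f84f102048e047,gatewayplay;ttl=1";

    Map<String, Map<String, String>> keys = new LinkedHashMap<>();
    for (String entry : value.split(",")) {
      String[] parts = entry.split(";");
      Map<String, String> params = new LinkedHashMap<>();
      for (int i = 1; i < parts.length; i++) {
        String[] kv = parts[i].split("=", 2);
        params.put(kv[0], kv.length > 1 ? kv[1] : "");
      }
      keys.put(parts[0], params);
    }

    // The metadata rides along with the key, so a downstream trigger can act on
    // upstream state (ttl, an upstream sampling decision, etc.) it could not
    // reconstruct locally.
    keys.forEach((key, params) -> {
      boolean sampledUpstream = params.containsKey("spanId"); // presence, not assumption
      System.out.println(key + " sampledUpstream=" + sampledUpstream + " params=" + params);
    });
  }
}
```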
Actually you are right. Because
That said, for our use I have implemented provisioning of the
Basically, the request matcher and the sampler are always coupled together and will provision the
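To make that coupling concrete, here is a rough sketch of what I mean. `RequestMatcher`, `Trigger`, and `provision` are hypothetical names for illustration, not an existing brave API; only `brave.sampler.Sampler` is real.

```java
import brave.sampler.Sampler;

interface HttpRequestView {
  String path();
}

interface RequestMatcher {
  boolean matches(HttpRequestView request);
}

/**
 * Sketch: a trigger couples a request matcher with a sampler and, when both
 * agree, provisions the sampling key onto the outgoing propagation fields.
 */
final class Trigger {
  final String samplingKey;
  final RequestMatcher matcher;
  final Sampler sampler;

  Trigger(String samplingKey, RequestMatcher matcher, Sampler sampler) {
    this.samplingKey = samplingKey;
    this.matcher = matcher;
    this.sampler = sampler;
  }

  /** Returns the key to provision downstream, or null when this trigger does not fire. */
  String provision(HttpRequestView request) {
    if (!matcher.matches(request)) return null;
    // A real implementation would pass the trace ID; 0L is a placeholder here.
    return sampler.isSampled(0L) ? samplingKey : null;
  }
}
```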
OK, please also read some things on https://gitter.im/openzipkin/secondary-sampling, as there's some context to cover. It isn't easy to jump into propagation design, but I'd like to help you understand the corner cases.
Using this PR I learned my understanding of
Let me review these modes with some sample real-world use cases. Consider the following call paths:
Use case 1:
Sample 100 req/s for the endpoint
Solution: the following triggers are set up as a result of dynamic configuration (see the sampler sketch after the use cases):
The trigger setup on
Use case 2:
Sample all requests for the endpoint
Solution: the following triggers are set up:
The trigger setup on
In the above cases, except for the one marked
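Here is a hedged sketch of the samplers behind the two trigger setups above, using brave's built-in samplers. The endpoint matching, key names, and the surrounding trigger shape are assumptions; only `RateLimitingSampler` and `Sampler.ALWAYS_SAMPLE` are real brave APIs.

```java
import brave.sampler.RateLimitingSampler;
import brave.sampler.Sampler;

public class TriggerConfigSketch {
  public static void main(String[] args) {
    // Use case 1: sample at most 100 traces/second for the matched endpoint.
    Sampler useCase1 = RateLimitingSampler.create(100);

    // Use case 2: sample every request for the matched endpoint.
    Sampler useCase2 = Sampler.ALWAYS_SAMPLE;

    // Each sampler would be attached to a trigger keyed by the sampling key,
    // provisioned at the hop named in the corresponding use case.
    System.out.println("use case 1 sampled: " + useCase1.isSampled(0L));
    System.out.println("use case 2 sampled: " + useCase2.isSampled(0L));
  }
}
```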
Details look very good @narayaruna
Opened openzipkin/brave#997 to allow you to specify the rule you mentioned.
While an internal detail to some, it's worth noting that not all sampling keys will be provisioned at the head of the network (the gateway). In some implementations, it will be easier to perform provisioning of the key at the first sampled hop.
Let's define in the doc "active" and "passive" participation.
"active" is when a participant creates instructions for downstream directly or indirectly. For example, they sample the first request on a key, possibly also creating that key, and as a side-effect add the "spanId" parameter.
"passive" refers to a node that only participates if something upstream has (ex via presence of the "spanId" parameter.
Note: there's a difference between "active" sampling and provisioning of the sampling key itself.
Some deployments will want to limit the control of sampling keys to a gateway role. In those cases, the sampling key will be provisioned, but not sampled unless it literally was the policy to also record the gateway itself. In that scenario, a key would be passed unaltered downstream until a participant activates it by sampling (the side-effect being the presence of the "spanId" field).
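As a minimal sketch of that flow: the `SamplingKey` and `Participant` shapes and the `localPolicySamples` flag below are hypothetical; only the idea of the "spanId" parameter comes from the definitions above.

```java
import java.util.HashMap;
import java.util.Map;

final class SamplingKey {
  final String name;
  final Map<String, String> params = new HashMap<>();

  SamplingKey(String name) {
    this.name = name;
  }

  /** Passive participation: something upstream already sampled this key. */
  boolean sampledUpstream() {
    return params.containsKey("spanId");
  }

  /** Active participation: this node samples the key itself and, as a side-effect,
   *  adds the "spanId" parameter so downstream nodes can participate passively. */
  void activate(String localSpanId) {
    params.put("spanId", localSpanId);
  }
}

class Participant {
  static boolean participate(SamplingKey key, boolean localPolicySamples, String localSpanId) {
    if (key.sampledUpstream()) return true; // passive: honor the upstream decision
    if (localPolicySamples) {               // active: decide here and mark the key
      key.activate(localSpanId);
      return true;
    }
    return false;                           // pass the key along unaltered
  }
}
```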
A short-cut of this is where policy is decentralized and all nodes have special knowledge about upstream properties like user-id, or do not need them to make a decision. In this case, a sampling key would be provisioned at the same place in which it is sampled. In other words, in deployments like this, nodes will never see sampling keys unless they were sampled.
I think we can't rely on passive only (e.g. provisioning only at the first point sampled), as that spreads the control-plane logic to all nodes. In b3 today, we already know people do things like controlling IDs upstream for reasons of control and data access.
Also, in many public-facing sites, I would expect data available at the first hop to not necessarily be available later. Rather than propagating, for example, the initial user-id or IP address externally, it could be simpler for those sites to evaluate the sampling key in the gateway and pass it along. This lowers the data dependencies and also centralizes logic. OTOH, this wouldn't prevent doing the same when a site has the ability and interest to also propagate extra fields everywhere. There are pros and cons.
To elaborate, here are two potential deployment options for auth,cache (not TTL):
Where your gateway controls all sampling keys, it (provisions) the key and later the auth service [samples] it.
(gateway) -> api -> [auth] -> [cache] -> authdb
The other option is when all the data needed to make that decision is pushed externally to auth, or auth needs no extra data (like user-id etc). This mode looks like a shortcut, as (provisioning) of the key is delayed to the same hop that [samples] it.
gateway -> api -> [(auth)] -> [cache] -> authdb
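To illustrate, here is roughly what the sampling-key entry might look like at each hop in the two options. The key name "authcache" and the parameter syntax are assumptions for illustration, not a finalized format.

```java
// Illustrative only: key name "authcache" and the parameter format are assumptions.
class DeploymentOptionsSketch {
  // Option 1: the gateway provisions the key; it travels unsampled until auth samples it.
  static final String AT_GATEWAY = "authcache";                         // provisioned, unsampled
  static final String AFTER_AUTH = "authcache;spanId=19f84f102048e047"; // sampled at auth

  // Option 2 (the shortcut): auth both provisions and samples the key, so downstream
  // nodes only ever see the already-sampled form.
  static final String AFTER_AUTH_OPTION_2 = "authcache;spanId=19f84f102048e047";
}
```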
moved from #14 (comment)