Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add device.crash event semantic convention #1576

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

bidetofevil
Copy link

Adding the definition for an Event that models a crash in the mobile app. There are still some open questions to be worked out regarding naming and the specific event attribute definitions, but I want to put the draft changes in a PR to get further feedback from a larger group, having done so already via a doc with some other folks in the mobile community.

The PR is not ready to merging right now. I will make it a non-draft when it is.

Merge requirement checklist

@bidetofevil bidetofevil requested review from a team as code owners November 13, 2024 23:07
@bidetofevil bidetofevil marked this pull request as draft November 13, 2024 23:07
Copy link

@LikeTheSalad LikeTheSalad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you 🙌 I think it's a great start.

The event body fields MUST be used to describe the state of the application at the time of the crash, not when the event was actually
emitted, which could happen at a much later time (e.g. when the app next starts up).
body:
id: device_crash_state

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is it needed to wrap all the body fields within this map? Since it seems like this map is the only field at the root level of the body, I was wondering if we could instead define all of its children as independent fields.

Supplements the `source` field that identifies the specific variation of it [2].
note: >
This version is specifically for the `source` field. It can be a well-defined version of some external format (e.g. Android 15
Application Exit Info), or some custom version number associated with the usage in this event (e.g. some custom JSON schema).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this note it seems like the source_version data might come from different places depending on the case. Maybe there's something I'm missing, though since we're defining source as an enum, I think it would be better (if possible) to define the source_version format for each type of source to avoid potential ambiguity issues.

brief: >
The format of the `data` field as defined by [RFC 2046](https://datatracker.ietf.org/doc/html/rfc2046).
note: >
This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to check this with an example just to make sure I got it properly.

Considering an event with source = jvm_exception. Let's say that this data_content_type field has application/json as its value. The collector in this case will be able to parse it as json, however, how can it know what fields to look for in that json? My question is mostly about ensuring that the UI can later display this data properly.

Copy link

@jzwc jzwc Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current formats for crash report bodies seem insufficiently defined, particularly for crashes originating from various sources or those captured by different tools. Leaving the data opaque while concentrating on defining the common metadata in the other attributes strikes the right balance IMHO.

In many instances, the crash report cannot be effectively presented without backend post-processing and supplementary data from the application provider, which falls outside the scope of OTel.

Copy link

@jzwc jzwc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your initiative.

I apologize for not attending the SIG calls where this may have been discussed, but I am curious if there is a proposal to include an additional attribute that would help identify the crash report format beyond just the MIME type. Specifically, on iOS, there are several well-known libraries that generate crash reports in their own, somewhat distinct formats. For instance, knowing that a crash report was generated using PLCrashReporter could enhance interoperability. While this information could be incorporated into the MIME type, it would require parsing it out, which introduces another potential source of incompatibility.

Never mind, this is indeed covered by the source attribute.

brief: >
Crash in the native layer caught by a signal handler
- id: aei
value: 'aei'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m curious if the full name (application_exit_info) might be a better approach?

type: string
- id: crashed_service_version
stability: experimental
requirement_level: recommended
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The note reads "This is required"...

type: string
- id: crashed_os_version
stability: experimental
requirement_level: recommended
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The note reads "This is required"...

This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field.
examples: [ "application/json" ]
type: string
- id: crashed_service_version
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the intention to emphasize, but the crashed_ prefix appears unnecessary in this context, as the attribute is clearly defined within device.crash, isn't it? Ditto crashed_os_version.

brief: >
The format of the `data` field as defined by [RFC 2046](https://datatracker.ietf.org/doc/html/rfc2046).
note: >
This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field.
Copy link

@jzwc jzwc Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current formats for crash report bodies seem insufficiently defined, particularly for crashes originating from various sources or those captured by different tools. Leaving the data opaque while concentrating on defining the common metadata in the other attributes strikes the right balance IMHO.

In many instances, the crash report cannot be effectively presented without backend post-processing and supplementary data from the application provider, which falls outside the scope of OTel.

@lmolkova
Copy link
Contributor

/cc @open-telemetry/semconv-mobile-approvers

stability: experimental
requirement_level: required
brief: >
An ID that uniquely identifies the crash instance obtained from a specific `source`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this be generated? From say, an Android device crash?

@tonzhan2
Copy link

This looks like a great start. I'm curious as to what some examples of crashes could look like in this format. Such as an Android jvm crash. Obviously most of it will be in the body, so I guess of that format is still tbd. Could be useful nonetheless to have some examples of crashes that use this formatting

Application Exit Info), or some custom version number associated with the usage in this event (e.g. some custom JSON schema).
examples: [ "1.0.0" ]
type: string
- id: data
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this better be in the body (earlier payload) of the Event (https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-body)? This would also implicate the type any rather than string.

@jzwc
Copy link

jzwc commented Dec 7, 2024

Just another quick thought. It would be generally beneficial is app state / resource attributes like device.app.lifecycle would be allowed as the crash event attributes to pass the respective application and device state at the moment of the crash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

5 participants