Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Vertex AI instrumentation #3069

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from
Draft

Conversation

aabmass
Copy link
Member

@aabmass aabmass commented Dec 5, 2024

Description

Part of #3041

Adds vertexai instrumentation with updates to match the latest semantic conventions.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Added some new Cassette tests with a lot of sanitization logic which seems to work.

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@aabmass aabmass changed the title Vertexai Added Vertex AI instrumentation Dec 5, 2024
Copy link
Contributor

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glad to see this! I have some initial comments which are basically an effort to conserve time.

If the first PR has basically exactly the same tests, readme and examples that are in openai, we have less to talk about which means less latency in getting this merged.

Afterwards, we can build on it to additionally add vision model etc. Meanwhile, please keep tests and fixtures exactly the same except for obvious s/openai/vertexai and library invocation differences. Then, if you can self-review highlighting parts unique to vertex for the next person.

Hope this guidance helps!

@pytest.fixture(autouse=True)
def vertexai_init(vcr: VCR) -> None:
# Unfortunately I couldn't find a nice way to globally reset the global_config for each
# test because different vertex submodules reference the global instance directly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind opening a github issue on this in the google project, so we can link it and click there to see the details?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear this isn't a bug or anything. It's just difficult to monkey patch. This more of a note to self during development, I'd prefer to remove this comment as it's a bit distracting



@pytest.mark.vcr
def test_vertexai_generate_content(exporter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe start with plain chat tests from openai (just convert them) I think vision/multi-modal could be better discussed in aggregate across the two?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update tomorrow


vertexai.init(location="us-central1")

model = GenerativeModel("gemini-1.5-flash-002")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar can you use an ENV? even better to basically copy the openai example word for word then adjust the library calls etc. This reduces the amount of attention needed to land your change vs having differences in setup and also library https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/instrumentation-genai/opentelemetry-instrumentation-openai-v2/examples/zero-code/main.py#L9

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually copied this directly from the Vertex quickstart. I can try adapt it to be the same as the OpenAI sample

Copy link
Member Author

@aabmass aabmass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @codefromthecrypt. Before you do a full review (I'll mark the PR as not draft), I know this PR is big with a lot of boilerplate and copied code from openllmetry package. I could split it into multiple PRs so we don't need to bikeshed on code from openllmetry and can rather just review the changes. Wdyt?

All the tests were disabled in openllmetry so I'd like to get test coverage up before doing any serious refactoring.


vertexai.init(location="us-central1")

model = GenerativeModel("gemini-1.5-flash-002")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually copied this directly from the Vertex quickstart. I can try adapt it to be the same as the OpenAI sample

@pytest.fixture(autouse=True)
def vertexai_init(vcr: VCR) -> None:
# Unfortunately I couldn't find a nice way to globally reset the global_config for each
# test because different vertex submodules reference the global instance directly
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear this isn't a bug or anything. It's just difficult to monkey patch. This more of a note to self during development, I'd prefer to remove this comment as it's a bit distracting



@pytest.mark.vcr
def test_vertexai_generate_content(exporter):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update tomorrow

@codefromthecrypt
Copy link
Contributor

@aabmass since the code style, conventions, semver, approach to handling events etc are very different in openllmetry vs how openai is instrumented, still suggest to start with how openai is done here and adapt vertexai vs the other way. That's just my opinion considering how many review comments may result from doing otherwise. You choose!

@gyliu513
Copy link
Member

gyliu513 commented Dec 6, 2024

@lmolkova @lzchen @nirga @karthikscale3 @alizenhom I can see we have different coding pattern and file layout for instrumentation from different repos

Shall we reach to an agreement for the coding pattern and file layout, filing naming style etc. in this early stage? Thanks!

@aabmass
Copy link
Member Author

aabmass commented Dec 6, 2024

  • Now the Vertex AI is from openllmetry, using openllmetry format and file layout.

@gyliu513 I actually think everything in this draft PR is matching the format and file layout of this repo. Is there something in particular that's different?

@aabmass
Copy link
Member Author

aabmass commented Dec 6, 2024

Taking a step back, can we agree on a step by step process of migrating a package into this repo from openllmetry? As a reviewer it is easier to review PRs that are focused. I think it might be easier to have folks first send a PR copying the code in without any significant changes. Then they can send smaller focused PRs (a PR to get CI running with tox, then a PR migrating to events, etc.) to bring them up to repo style.

There are 26 instrumentations in openllmetry to migrate, and it seems daunting to review a full rewrite for each one.

I can write a proposal for this on an issue if folks would be interested.

@gyliu513
Copy link
Member

gyliu513 commented Dec 6, 2024

@aabmass I totally agree with you, based on the comments from yesterday's discussion, I think @nirga is going to share some migration plan soon.

The first instrumentation for OpenAI is from https://github.com/Scale3-Labs/langtrace-python-sdk/tree/main/src/langtrace_python_sdk/instrumentation/openai , and it has different code style compared with existing traceloop/openllmetry instrumentation code, that's my concern.

Screenshot 2024-12-06 at 12 05 51 PM

@aabmass
Copy link
Member Author

aabmass commented Dec 6, 2024

The first instrumentation for OpenAI is from https://github.com/Scale3-Labs/langtrace-python-sdk/tree/main/src/langtrace_python_sdk/instrumentation/openai

Ah thanks I didn't know this came from langtrace.

and it has different code style compared with existing traceloop/openllmetry instrumentation code, that's my concern.

any specific differences you would like to call out? IMO it's OK for the structure to be different

@xrmx
Copy link
Contributor

xrmx commented Dec 6, 2024

Taking a step back, can we agree on a step by step process of migrating a package into this repo from openllmetry? As a reviewer it is easier to review PRs that are focused. I think it might be easier to have folks first send a PR copying the code in without any significant changes. Then they can send smaller focused PRs (a PR to get CI running with tox, then a PR migrating to events, etc.) to bring them up to repo style.

There are 26 instrumentations in openllmetry to migrate, and it seems daunting to review a full rewrite for each one.

It really depends on the library to trace I guess, haven't looked at vertex ai but for openai it was splitted like this:

  • sync chat client
  • streaming chat
  • async chat
  • events
  • metrics (TODO)

So I would do more or less the same for others too. Maybe we can add a first step adding only the instrumentor class with the tox and workflows to remove some boilerplate for later.

@gyliu513
Copy link
Member

gyliu513 commented Dec 6, 2024

any specific differences you would like to call out? IMO it's OK for the structure to be different

@aabmass please take a look at the __init__.py for langtrace and traceloop for instrumention code, traceloop init file contain all functions that needs to be instrumented, and the major instrumentataion logic is also there. Langtrace is using a file named as patch.py to instrument the code, this is not a big deal, but I prefer we should keep same pattern for all instrumentation code here.

Having same code pattern, coding format, style etc. can help future contribution as well.

@aabmass
Copy link
Member Author

aabmass commented Dec 7, 2024

@gyliu513 I see what you mean. I want to call out that there are already 51 instrumentation packages in this repo, with lots of different patterns and idioms. The techniques used for GenAI instrumentation are not really that different and we still don't have a consensus on how to instrument.

So my preference here is more practical–I think we should lean on existing implementations (langtrace, openllmetry, etc.) and not make updating to the openai-v2 idioms a blocker to initial inclusion. If we can agree to "trust" the existing implementations we are starting with, it will cut down on a lot of review. Once the package is in, we can split up the work and send small focused PRs.

@codefromthecrypt
Copy link
Contributor

TL'DR' Ultimately, if the maintainers of this project are happy with intentional anchoring towards external repos, that's ok with me. Just it is more work to maintain change that crossed multiple projects and there will be a lot of that. So, just be conscious and meanwhile glad to see folks joining and I'll help regardless.


FWIW, my main concern is at least the tests are the same as how we do them in openai-v2, as for example, this helps with things openllmetry doesn't yet do, like content capturing into logs.

The openai-v2 code had a lot of diverse feedback, folks from openllemetry, developing alternatives, as well folks not using openai but experts at this repo.

I believe it is helpful to look at that work in openai-v2, but even having many more people involved, it is indeed not a pure consensus (no code is with an unbounded group).

I strongly believe in house rules. This is not the openllmetry project, this is opentelemetry contrib. We should try our best to follow what the maintainers here have approved, and not intentionally deviate in style or structure. This makes maintenance so much easier especially staring down many potential refactors in the future.

If I've not convinced, that's fine, but thanks for reading!

@codefromthecrypt
Copy link
Contributor

Another thing I will mention is that the authors of "openai-v2" were subject to a higher bar in review. They weren't told to skip steps in the process of review for convenience of making it similar to their original source. While we aren't talking about fairness, I think knowing that we are following a different model now is important. Fast and fix later vs get in in a shape we want.

cc @lmolkova as if this is the case, we should probably appoint zookeepers to go back and fix the new packages once merged. I'm ok doing that.

@codefromthecrypt
Copy link
Contributor

codefromthecrypt commented Dec 9, 2024

can we agree on a step by step process of migrating a package into this repo from openllmetry?

With many vendors involved and the first LLM providers PRs not authored by openllmetry, rather by different folks, with openllmetry also contributing, this begs some Q&A:

  1. Can we assume new providers mean copying from openllmetry?
  • Not so far, but if it helps someone can make a gist about when things are inspired by openllmetry, what are the steps? sounds helpful!
  1. Can we assume we can copy paste any vendor's code?
  • Likely, by the time things complete the code is different than whatever you might start with. Resisting copy/paste of anything not merged here. So, starting small from scratch, is a good way to avoid requiring license NOTICE headers forever.

I started a google doc that isn't yet and may never be sanctioned by this group. I want to emphasize that there are several active, code contributing vendors who also have many of the same instrumentation. The process of merging openai-v2 was the first time multiple folks got together and agreed on a style. That is a moving target as more features are added. The main point is, let's please not assume there is one source project that code comes from, as so far that hasn't been the case.

https://docs.google.com/document/d/16KLUHUUXKyNmpE8Vt9VFcJTiCcYd6AckL222b_DujAU/edit?tab=t.0



def should_send_prompts():
# Previously was opt-in by the following check for privacy reasons:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# https://github.com/traceloop/openllmetry/blob/v0.33.12/packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py#L141-L143
# I guess prompt may be in kwargs instead or in addition?
prompt = kwargs.get("prompt")
if prompt:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did we handle prompt twice here? Another one is in Line 165

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment, was wondering the same thing lol hoping to get clarity from TraceLoop folks who originally wrote this code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nirga ^^

Copy link
Contributor

@nirga nirga Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nirga
Copy link
Contributor

nirga commented Dec 13, 2024

@aabmass @codefromthecrypt @gyliu513 @xrmx my 2c for the discussion around structure above -
the structure of code in OpenLLMetry was actually the one we saw commonly used across instrumentations in this repo - where most instrumentation code is just in the __init__.py file (see boto3, Elasticsearch, Kafka, MySQL as some random examples I've picked up). So I think the OpenAI instrumentation we approved is the exception, not the rule.

"gen_ai.system": "vertex_ai",
"gen_ai.operation.name": "text_completion",
"gen_ai.request.model": "gemini-pro-vision",
"gen_ai.response.model": "gemini-pro-vision",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it always the case that vertex request and response model are the same? If so, we should probably clarify semconv and only populate the request model in this case. WDYT?

span, event_logger, llm_model, complete_response, token_usage
)

span.set_status(Status(StatusCode.OK))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't set span status in instrumentations to OK - only to ERROR if there is an error. In general case we don't know what constitutes a success.

https://github.com/open-telemetry/opentelemetry-specification/blob/b9b6d8f698f82cd321fbe4fcc15b76d49902c14e/specification/trace/api.md?plain=1#L611

):
complete_response = ""
token_usage = None
async for item in response:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to know if response terminates due to an error? we should try to catch it and record an error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out. Usually if code can be structured to use a with span: ... block, it also records error on the span. I can dig into it

span, event_logger, llm_model, complete_response, token_usage
)

span.set_status(Status(StatusCode.OK))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
span.set_status(Status(StatusCode.OK))

response.usage_metadata,
)

span.set_status(Status(StatusCode.OK))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
span.set_status(Status(StatusCode.OK))

return await wrapped(*args, **kwargs)

llm_model = "unknown"
if hasattr(instance, "_model_id"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like a slippery slope to rely on internal APIs, is there some way we can avoid it?

**kwargs,
):
tracer = get_tracer(
__name__, __version__, tracer_provider=tracer_provider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add schema_url - that's important for consumers to know which version is being emitted.

event_logger = get_event_logger(
__name__,
version=__version__,
event_logger_provider=event_logger_provider,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add schema_url here too

def user_event(
*,
gen_ai_system: str,
# TODO: should I just leave role out since it's not required if "user"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I don't see why it'd be necessary in the event body given it's already in the name.

},
body={
"role": role,
"content": content,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we record content only if it's enabled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we skip emitting the event altogether if disabled, or emit event without content?

# See the License for the specific language governing permissions and
# limitations under the License.

__version__ = "2.1b0.dev"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not "2.0b0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how all of the packages are versioned not on the release commit, I guess similar to SNAPSHOT in java. See e.g.

@lmolkova
Copy link
Contributor

lmolkova commented Dec 13, 2024

From my perspective what's important (in descending order):

  • Having consistent user-facing parts (APIs, examples). Consistency with the instrumented library examples is more important than consistency between different genai instrumentation libs.
  • Providing coherent telemetry from instrumentations following otel semconv
  • Making it easy for folks to contribute new and maintain old things
  • Test coverage and testing edge cases

All of this things are fixable while libs are not stable. It's definitely easier to maintain any of these libraries if they follow the same patterns and layout, but it not should be a hard requirement.

I think we can move faster if reviewers would point to specific discrepancies if they see them and maybe we can create a template with all the boilerplate someone would copy when they contribute new instrumentation.


Talking about copy-paste:

I'm not aware of any repos in OTel that prohibit legal copy-paste and I don't see why this repo should.

build-backend = "hatchling.build"

[project]
name = "opentelemetry-instrumentation-vertexai-v2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this package will have feature parity with https://github.com/traceloop/openllmetry/blob/main/packages/opentelemetry-instrumentation-vertexai/

I think we should publish it as opentelemetry-instrumentation-vertexai

@nirga what would be your thoughts on this?

In particular, OTel can start publishing to v1 (or v2) train and OpenLLMetry can still continue v0 train if necessary.

@codefromthecrypt
Copy link
Contributor

codefromthecrypt commented Dec 16, 2024

On if we should copy paste, I wrote this

This is also what I meant by writing "So, starting small from scratch, is a good way to avoid requiring license NOTICE headers forever."

To answer that, @lmolkova clarified we don't mind copy/pasting and pointed a document describing what to do about it

For example this sort of NOTICE entry or similar:

# The foo() function was copied from Project XYZ, and is
# provided here subject to the following:
# [copyright notice]
# [license notice]

Since I gather at this point, the genai team are happy with copy/pasting, next step is to attribute copy/pasted parts back to openllmetry either file by file, a NOTICE in the package root or something else. A soft agreement in the PR comments won't be visible to those using this package, so I think a file change is needed.

@lzchen @xrmx since this is ultimately a python repository and you are maintainers, what should be done here? I didn't see any NOTICE files, so it isn't quite clear in this specific repo the way out?

@aabmass
Copy link
Member Author

aabmass commented Dec 20, 2024

Thanks everyone for your feedback. I'm breaking this out into smaller PRs incorporating the comments here. For now #3123 has the boilerplate and the examples/conftest.py are left as similar as possible to openai-v2. PTAL.

To wrap up the discussions here

  • I will do any code copying from Apache 2 OpenLLmetry in a separate PR to make it very clear and have someone from GC double check the attribution.
  • Discussed the points here with other Python maintainers
    • No big concerns with code being copied from OpenLLMetry.
    • We aren't super concerned with code uniformity (at least not enough to make it a hard requirement). As Nir pointed out, many other packages in contrib stick everything in __init__.py.

Any concerns, lets please discuss on an issue. I'll leave this PR open until I get it all sent out in smaller PRs, but planning to close it after that.

@codefromthecrypt
Copy link
Contributor

Good plan @aabmass thanks for the effort put into it and the summary

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants