Python: Streaming content for token usage #8902

Merged

Conversation

TaoChenOSU
Contributor

@TaoChenOSU TaoChenOSU commented Sep 18, 2024

Motivation and Context

OpenAI recently started providing token usage information in its streaming chat completion API. This ADR opens the discussion on how we should consume that information within our StreamingChatMessageContent data structure.
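For context, here is a minimal sketch of what consuming streamed usage can look like. To my knowledge, the OpenAI API reports usage on a streamed response only when the request sets `stream_options={"include_usage": True}`, and it then arrives on a final chunk that carries no text. The `Chunk` and `Usage` classes and the `consume` helper below are hypothetical stand-ins used to simulate that shape; they are not Semantic Kernel or OpenAI types.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Usage:
    """Hypothetical token usage record (not an SDK type)."""
    prompt_tokens: int
    completion_tokens: int


@dataclass
class Chunk:
    """Hypothetical streamed chunk: text, usage, or both may be absent."""
    text: str = ""
    usage: Optional[Usage] = None


def consume(stream: Iterable[Chunk]) -> Tuple[str, Optional[Usage]]:
    """Collect the streamed text and keep the last usage record seen, if any."""
    parts: List[str] = []
    usage: Optional[Usage] = None
    for chunk in stream:
        if chunk.text:
            parts.append(chunk.text)
        if chunk.usage is not None:
            # Assumption: usage shows up on a trailing chunk without text.
            usage = chunk.usage
    return "".join(parts), usage


# Simulated stream: two text chunks followed by a usage-only chunk.
simulated_stream = [
    Chunk(text="Hel"),
    Chunk(text="lo"),
    Chunk(usage=Usage(prompt_tokens=9, completion_tokens=2)),
]
text, usage = consume(simulated_stream)
```

The point of the sketch is that a consumer cannot assume every chunk carries text, which is why the content data structure needs a place for usage.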

Description

  1. An ADR documenting 4 approaches.
  2. Implementation of the selected approach.
  3. Fix issues where data in the streaming contents are modified after concatenation (__add__).
  4. Add streaming to model diagnostics operation names so that we can distinguish streaming and non-streaming operations.
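One way to read item 3 is that `__add__` should build a new object instead of altering either operand. The sketch below illustrates that idea under stated assumptions: `StreamingChunk` and `Usage` are hypothetical names, and this is not the implementation merged in this PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Usage:
    """Hypothetical immutable token usage record."""
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.prompt_tokens + other.prompt_tokens,
            self.completion_tokens + other.completion_tokens,
        )


@dataclass
class StreamingChunk:
    """Hypothetical streaming content holding text items and optional usage."""
    items: List[str] = field(default_factory=list)
    usage: Optional[Usage] = None

    def __add__(self, other: "StreamingChunk") -> "StreamingChunk":
        # Merge usage without touching either operand.
        if self.usage is None:
            merged = other.usage
        elif other.usage is None:
            merged = self.usage
        else:
            merged = self.usage + other.usage
        # Build a fresh list so later changes to the result cannot leak back.
        return StreamingChunk(items=[*self.items, *other.items], usage=merged)


first = StreamingChunk(items=["Hel"])
last = StreamingChunk(items=["lo"], usage=Usage(prompt_tokens=9, completion_tokens=2))
combined = first + last
combined.items.append("!")  # mutating the result leaves the operands intact
```

Because the result owns its own list and `Usage` is frozen, `first` and `last` keep their original contents after concatenation.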

Contribution Checklist

@TaoChenOSU TaoChenOSU added the python Pull requests for the Python Semantic Kernel label Sep 18, 2024
@TaoChenOSU TaoChenOSU self-assigned this Sep 18, 2024
@github-actions github-actions bot changed the title Streaming content for token usage Python: Streaming content for token usage Sep 18, 2024
@markwallace-microsoft markwallace-microsoft added documentation and removed python Pull requests for the Python Semantic Kernel labels Sep 18, 2024
@TaoChenOSU TaoChenOSU linked an issue Sep 18, 2024 that may be closed by this pull request
@TaoChenOSU
Contributor Author

Proposals 3 and 4 came from an offline discussion. Both include a refactoring of the existing content classes, which is not necessary for handling streaming usage information from OAI. However, I do like those two proposals because they further clean up our code base. Of the two, I think proposal 4 is cleaner.

@TaoChenOSU TaoChenOSU added the PR: in progress Under development and/or addressing feedback label Sep 20, 2024
@TaoChenOSU TaoChenOSU requested a review from a team as a code owner September 23, 2024 16:21
@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Sep 23, 2024
@TaoChenOSU
Contributor Author

Upon further deliberation, I think that deprecating StreamingXXXContent may be premature at this time, for the following reasons:

  1. Many AI services yield objects of varying types in their streaming APIs. Implementing the abstraction on our end could lead to confusion.
  2. Deprecating StreamingXXXContent would result in breaking changes to our AI connectors.
  3. It would also cause a misalignment with the .NET SDK.

@markwallace-microsoft markwallace-microsoft removed the python Pull requests for the Python Semantic Kernel label Sep 23, 2024
@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Sep 23, 2024
@TaoChenOSU TaoChenOSU removed the PR: in progress Under development and/or addressing feedback label Sep 23, 2024
@markwallace-microsoft
Member

markwallace-microsoft commented Sep 24, 2024

Python Test Coverage

Python Test Coverage Report
File | Stmts | Miss | Cover | Missing
semantic_kernel
   kernel.py | 199 | 47 | 76% | 148, 159, 163, 313–316, 423, 437–480
semantic_kernel/agents/channels
   open_ai_assistant_channel.py | 49 | 1 | 98% | 42
semantic_kernel/agents/group_chat
   agent_chat.py | 116 | 1 | 99% | 91
   agent_group_chat.py | 67 | 2 | 97% | 113, 135
   broadcast_queue.py | 72 | 1 | 99% | 35
semantic_kernel/agents/open_ai
   assistant_content_generation.py | 68 | 2 | 97% | 81–82
   open_ai_assistant_base.py | 358 | 3 | 99% | 243, 321–322
semantic_kernel/connectors/ai
   chat_completion_client_base.py | 116 | 2 | 98% | 382, 392
   completion_usage.py | 8 | 1 | 88% | 17
semantic_kernel/connectors/ai/anthropic/services
   anthropic_chat_completion.py | 124 | 3 | 98% | 120, 137, 173
semantic_kernel/connectors/ai/azure_ai_inference/services
   azure_ai_inference_chat_completion.py | 119 | 7 | 94% | 120, 146–149, 159, 180, 202
   azure_ai_inference_text_embedding.py | 41 | 1 | 98% | 87
semantic_kernel/connectors/ai/embeddings
   embedding_generator_base.py | 8 | 1 | 88% | 50
semantic_kernel/connectors/ai/google
   shared_utils.py | 26 | 1 | 96% | 56
semantic_kernel/connectors/ai/google/google_ai/services
   google_ai_chat_completion.py | 119 | 4 | 97% | 127, 153, 176, 178
   google_ai_text_completion.py | 63 | 2 | 97% | 98, 121
   utils.py | 65 | 3 | 95% | 140, 160–165
semantic_kernel/connectors/ai/google/vertex_ai/services
   utils.py | 66 | 3 | 95% | 141, 161–166
   vertex_ai_chat_completion.py | 119 | 4 | 97% | 121, 147, 170, 172
   vertex_ai_text_completion.py | 62 | 2 | 97% | 95, 116
semantic_kernel/connectors/ai/hugging_face/services
   hf_text_completion.py | 60 | 3 | 95% | 103, 112, 127
   hf_text_embedding.py | 32 | 5 | 84% | 79–83
semantic_kernel/connectors/ai/mistral_ai/services
   mistral_ai_chat_completion.py | 118 | 7 | 94% | 118–121, 307–310
semantic_kernel/connectors/ai/ollama/services
   ollama_chat_completion.py | 60 | 5 | 92% | 95–98, 108, 143
   ollama_text_completion.py | 55 | 5 | 91% | 87–90, 100, 128
semantic_kernel/connectors/ai/open_ai/prompt_execution_settings
   open_ai_prompt_execution_settings.py | 94 | 1 | 99% | 112
semantic_kernel/connectors/ai/open_ai/services
   azure_chat_completion.py | 107 | 5 | 95% | 118, 123, 157, 166, 169
   azure_text_completion.py | 28 | 2 | 93% | 82, 87
   azure_text_embedding.py | 30 | 2 | 93% | 84, 89
   open_ai_chat_completion_base.py | 127 | 5 | 96% | 71, 121, 141, 177, 287
   open_ai_handler.py | 63 | 3 | 95% | 86, 95–96
   open_ai_text_completion_base.py | 80 | 2 | 98% | 56, 161
semantic_kernel/connectors/ai/open_ai/settings
   azure_open_ai_settings.py | 22 | 4 | 82% | 97–100
semantic_kernel/connectors/memory/azure_ai_search
   azure_ai_search_collection.py | 87 | 2 | 98% | 150, 152
semantic_kernel/connectors/memory/redis
   redis_collection.py | 160 | 2 | 99% | 146, 316
   utils.py | 45 | 11 | 76% | 145–146, 164, 166, 173–188
semantic_kernel/connectors/openapi_plugin
   openapi_manager.py | 58 | 2 | 97% | 110–111
   openapi_parser.py | 88 | 2 | 98% | 71, 128
   openapi_runner.py | 105 | 2 | 98% | 181–182
semantic_kernel/connectors/openapi_plugin/models
   rest_api_operation.py | 129 | 1 | 99% | 242
semantic_kernel/contents
   function_call_content.py | 97 | 1 | 99% | 201
   streaming_content_mixin.py | 38 | 2 | 95% | 37, 63
semantic_kernel/core_plugins/sessions_python_tool
   sessions_python_plugin.py | 134 | 8 | 94% | 69, 82–91, 99
   sessions_python_settings.py | 39 | 4 | 90% | 84–87
semantic_kernel/data
   vector_store_record_collection.py | 249 | 19 | 92% | 410, 470–474, 482–486, 526–529, 536–539
   vector_store_record_utils.py | 26 | 2 | 92% | 50, 52
semantic_kernel/functions
   kernel_function_decorator.py | 98 | 1 | 99% | 102
   kernel_function_from_method.py | 96 | 1 | 99% | 153
   kernel_function_from_prompt.py | 154 | 7 | 95% | 165–166, 180, 201, 219, 239, 322
   kernel_function_log_messages.py | 36 | 6 | 83% | 37–43
   kernel_plugin.py | 187 | 2 | 99% | 472, 475
semantic_kernel/planners
   plan.py | 234 | 45 | 81% | 54, 163–165, 197, 214–227, 264, 269, 277–278, 288–291, 308, 313, 329, 332–337, 355, 360, 363, 365, 372, 386–388, 393–397
semantic_kernel/planners/function_calling_stepwise_planner
   function_calling_stepwise_planner.py | 116 | 4 | 97% | 145, 189–190, 198
semantic_kernel/planners/sequential_planner
   sequential_planner.py | 64 | 6 | 91% | 71, 75, 109, 125, 134–135
   sequential_planner_extensions.py | 50 | 9 | 82% | 31–32, 56, 110–124
   sequential_planner_parser.py | 77 | 12 | 84% | 66–74, 93, 117–120
semantic_kernel/schema
   kernel_json_schema_builder.py | 129 | 9 | 93% | 53, 90, 186, 194, 205, 213, 228, 232–233
semantic_kernel/services
   ai_service_client_base.py | 22 | 1 | 95% | 64
semantic_kernel/template_engine/blocks
   code_block.py | 77 | 1 | 99% | 119
   named_arg_block.py | 43 | 1 | 98% | 98
semantic_kernel/utils/authentication
   entra_id_authentication.py | 15 | 9 | 40% | 25–38
semantic_kernel/utils/telemetry
   user_agent.py | 16 | 2 | 88% | 18–19
semantic_kernel/utils/telemetry/model_diagnostics
   decorators.py | 171 | 4 | 98% | 372–375
TOTAL | 11070 | 318 | 97% |

Python Unit Test Overview

Tests | Skipped | Failures | Errors | Time
2450 | 4 💤 | 0 ❌ | 0 🔥 | 57.659s ⏱️

Contributor

@moonbox3 moonbox3 left a comment

LGTM

@TaoChenOSU TaoChenOSU added this pull request to the merge queue Sep 25, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 25, 2024
@moonbox3 moonbox3 enabled auto-merge September 25, 2024 19:35
@moonbox3 moonbox3 added this pull request to the merge queue Sep 25, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 25, 2024
@TaoChenOSU TaoChenOSU added this pull request to the merge queue Sep 26, 2024
Merged via the queue into microsoft:main with commit 5c5d761 Sep 26, 2024
25 checks passed
@TaoChenOSU TaoChenOSU deleted the taochen/streaming-content-for-token-usage branch September 26, 2024 00:12
@TaoChenOSU TaoChenOSU linked an issue Oct 18, 2024 that may be closed by this pull request
Labels
documentation python Pull requests for the Python Semantic Kernel
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Python: OpenAI streaming token usage
Python: add streaming usage option to OpenAIPromptExecutionSettings
4 participants