
issue when calling llama 405b from amazon bedrock. #15

Open
boranhan opened this issue Jan 19, 2025 · 3 comments

@boranhan

Hi Authors,

I changed the llama405B call to Bedrock, since I don't know which server you used to call llama405B. This is what I added:

    from langchain_aws import ChatBedrock  # import assumed; may differ depending on your langchain version

    llm = ChatBedrock(model_id="meta.llama3-1-405b-instruct-v1:0", model_kwargs=dict(temperature=temperature), region_name="us-west-2")
    response = llm.invoke(messages)

This is the output of one data:

content='<forecast>\n(2012-05-12 22:00:00, 699.544)\n(2012-05-12 23:00:00, 476.461)\n(2012-05-13 00:00:00, 418.433)\n(2012-05-13 01:00:00, 395.977)\n(2012-05-13 02:00:00, 383.784)\n(2012-05-13 03:00:00, 390.644)\n(2012-05-13 04:00:00, 411.351)\n(2012-05-13 05:00:00, 449.242)\n(2012-05-13 06:00:00, 534.425)\n(2012-05-13 07:00:00, 734.475)\n(2012-05-13 08:00:00, 811.731)\n(2012-05-13 09:00:00, 889.385)\n(2012-05-13 10:00:00, 950.421)\n(2012-05-13 11:00:00, 976.422)\n(2012-05-13 12:00:00, 1021.48)\n(2012-05-13 13:00:00, 981.351)\n(2012-05-13 14:00:00, 1020.79)\n(2012-05-13 15:00:00, 987.288)\n(2012-05-13 16:00:00, 1041.27)\n(2012-05-13 17:00:00, 1011.98)\n(2012-05-13 18:00:00, 1016.57)\n(2012-05-13 19:00:00, 974.249)\n(2012-05-13 20:00:00, 981.339)\n(2012-05-13 21:00:00, 961.468)\n</forecast>' additional_kwargs={'usage': {'prompt_tokens': 3258, 'completion_tokens': 468, 'total_tokens': 3726}, 'stop_reason': 'stop', 'model_id': 'meta.llama3-1-405b-instruct-v1:0'} response_metadata={'usage': {'prompt_tokens': 3258, 'completion_tokens': 468, 'total_tokens': 3726}, 'stop_reason': 'stop', 'model_id': 'meta.llama3-1-405b-instruct-v1:0'} id='run-d853a72d-7b87-4660-8a6b-264e4b7bb90d-0' usage_metadata={'input_tokens': 3258, 'output_tokens': 468, 'total_tokens': 3726}

However, I get an error message as follows:

INFO:DirectPrompt:Parsing forecasts from completion.
ERROR:Evaluation:Error evaluating task ElectricityIncreaseInPredictionWithSplitContext - Seed 5: 'str' object has no attribute 'choices'
ERROR:Evaluation:Traceback (most recent call last):
  File "/home/ubuntu/fsx/context-is-key-forecasting/cik_benchmark/evaluation.py", line 161, in evaluate_task
    samples = method_callable(task_instance=task, n_samples=n_samples)
  File "/home/ubuntu/fsx/context-is-key-forecasting/cik_benchmark/utils/cache/__init__.py", line 196, in __call__
    samples = self.method_callable(task_instance, n_samples)
  File "/home/ubuntu/fsx/context-is-key-forecasting/cik_benchmark/baselines/direct_prompt.py", line 500, in __call__
    for choice in chat_completion.choices:
AttributeError: 'str' object has no attribute 'choices'

I'm wondering what is expected to be in chat_completion.choices?

@boranhan boranhan changed the title issue when changing llama 405b to bedrock. issue when calling llama 405b from amazon bedrock. Jan 19, 2025
@ashok-arjun
Collaborator

Hi @boranhan, choices is usually a list with n_samples elements, and HuggingFace returns its output in choice.message.content for each choice in the list. This format is specific to the HuggingFace chat completion API.

I notice that in your case the output format is different (the text is returned directly in content), so you would have to change that code to reflect that. I'm not sure of the exact code, as I haven't used ChatBedrock.
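For reference, here is a rough sketch (not tested against the repo) of how the parsing step could handle both cases: an OpenAI/HuggingFace-style response that exposes .choices, and a langchain AIMessage from ChatBedrock whose text is in .content. The function name extract_completions is just a placeholder.

    def extract_completions(chat_completion):
        # OpenAI / HuggingFace-style response: one entry per requested sample
        if hasattr(chat_completion, "choices"):
            return [choice.message.content for choice in chat_completion.choices]
        # langchain ChatBedrock returns a single AIMessage; its text lives in .content
        return [chat_completion.content]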

@ashok-arjun ashok-arjun self-assigned this Jan 20, 2025
@boranhan
Author

@ashok-arjun Thanks for the reply. My choice.message.content only contains one answer. Does your llama405B return several answers in choices? Is n_samples used for batch inference? If so, ChatBedrock doesn't have a batch inference option. In that case, can I run the model invoke multiple times (sequentially)? Since the temperature is set to 1, each invoke result will be different. Would that be equivalent to having several answers in choices?

@ashok-arjun
Collaborator

ashok-arjun commented Jan 21, 2025

Yes, you should run it several times. You can do that by setting batch_size=1 and batch_size_on_retry=1 when creating the DirectPrompt object; you can also set this in your experiment JSON.

Then the code will automatically run iteratively until we have n_samples. That will be equivalent to having several answers in choices (similar to what I have).
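Concretely, a sequential sampling loop might look like the sketch below. This is only an illustration, not the repo's code; messages and n_samples are placeholders, and the ChatBedrock setup mirrors the snippet posted above.

    from langchain_aws import ChatBedrock  # assumed import; adjust to your langchain version

    llm = ChatBedrock(
        model_id="meta.llama3-1-405b-instruct-v1:0",
        model_kwargs=dict(temperature=1.0),  # temperature 1 so repeated calls differ
        region_name="us-west-2",
    )

    def sample_completions(messages, n_samples):
        # One invoke() per sample, i.e. the batch_size=1 behaviour described above
        return [llm.invoke(messages).content for _ in range(n_samples)]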
