Current Behavior
When running the example abstractive-question-answering.ipynb I get the following error in cell 18, calling generate_answer(query):
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[38], line 1
----> 1 generate_answer(query)
Cell In[37], line 5, in generate_answer(query)
3 inputs = tokenizer([query], max_length=1024, return_tensors="pt")
4 # use generator to predict output ids
----> 5 ids = generator.generate(inputs["input_ids"], num_beams=2, min_length=20, max_length=40)
6 # use tokenizer to decode the output ids
7 answer = tokenizer.batch_decode(ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/transformers/generation/utils.py:1329, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, **kwargs)
1321 logger.warning(
1322 "A decoder-only architecture is being used, but right-padding was detected! For correct "
1323 "generation results, please set `padding_side='left'` when initializing the tokenizer."
1324 )
1326 if self.config.is_encoder_decoder and "encoder_outputs" not in model_kwargs:
1327 # if model is encoder decoder encoder_outputs are created
1328 # and added to `model_kwargs`
-> 1329 model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(
1330 inputs_tensor, model_kwargs, model_input_name
1331 )
1333 # 5. Prepare `input_ids` which will be used for auto-regressive generation
1334 if self.config.is_encoder_decoder:
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/transformers/generation/utils.py:642, in GenerationMixin._prepare_encoder_decoder_kwargs_for_generation(self, inputs_tensor, model_kwargs, model_input_name)
640 encoder_kwargs["return_dict"] = True
641 encoder_kwargs[model_input_name] = inputs_tensor
--> 642 model_kwargs["encoder_outputs"]: ModelOutput = encoder(**encoder_kwargs)
644 return model_kwargs
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/transformers/models/bart/modeling_bart.py:811, in BartEncoder.forward(self, input_ids, attention_mask, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict)
808 raise ValueError("You have to specify either input_ids or inputs_embeds")
810 if inputs_embeds is None:
--> 811 inputs_embeds = self.embed_tokens(input_ids) * self.embed_scale
813 embed_pos = self.embed_positions(input)
814 embed_pos = embed_pos.to(inputs_embeds.device)
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/torch/nn/modules/sparse.py:162, in Embedding.forward(self, input)
161 def forward(self, input: Tensor) -> Tensor:
--> 162 return F.embedding(
163 input, self.weight, self.padding_idx, self.max_norm,
164 self.norm_type, self.scale_grad_by_freq, self.sparse)
File ~/.pyenv/versions/3.11.3/envs/pinecone/lib/python3.11/site-packages/torch/nn/functional.py:2210, in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
2204 # Note [embedding_renorm set_grad_enabled]
2205 # XXX: equivalent to
2206 # with torch.no_grad():
2207 # torch.embedding_renorm_
2208 # remove once script supports set_grad_enabled
2209 _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2210 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
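For context, the error means the generator's weights live on cuda:0 while the tokenized inputs are still CPU tensors, so the embedding lookup at the bottom of the traceback fails. A minimal sketch of how the mismatch arises, assuming the notebook's setup (the vblagoje/bart_lfqa checkpoint and the device variable are taken from the notebook; treat the exact names as assumptions):

import torch
from transformers import BartTokenizer, BartForConditionalGeneration

# Assumed setup: the notebook moves the generator to the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = BartTokenizer.from_pretrained("vblagoje/bart_lfqa")
generator = BartForConditionalGeneration.from_pretrained("vblagoje/bart_lfqa").to(device)

# The tokenizer returns CPU tensors by default
inputs = tokenizer(["example query"], max_length=1024, return_tensors="pt")

# Passing CPU input_ids to a model on cuda:0 raises:
# RuntimeError: Expected all tensors to be on the same device ...
# ids = generator.generate(inputs["input_ids"], num_beams=2, min_length=20, max_length=40)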
Expected Behavior
The cell should run without errors.
Steps To Reproduce
On a system with an NVIDIA GPU supported by PyTorch, run abstractive-question-answering.ipynb up to cell 18, which calls generate_answer(query).

Additional Context
To fix the bug, change the following line in the function generate_answer(query):

inputs = tokenizer([query], max_length=1024, return_tensors="pt")

to:

inputs = tokenizer([query], max_length=1024, return_tensors="pt").to(device)
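With that change, the full function from the traceback would read as follows; this is a sketch assuming the tokenizer, generator, and device objects set up earlier in the notebook:

def generate_answer(query):
    # Tokenize the query and move every tensor in the BatchEncoding onto
    # the generator's device (this .to(device) call is the fix)
    inputs = tokenizer([query], max_length=1024, return_tensors="pt").to(device)
    # use generator to predict output ids
    ids = generator.generate(inputs["input_ids"], num_beams=2, min_length=20, max_length=40)
    # use tokenizer to decode the output ids
    answer = tokenizer.batch_decode(ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
    return answer

Calling .to(device) on the BatchEncoding moves all contained tensors at once, so attention_mask and any other fields stay on the same device as input_ids.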