
evo and stripedhyena crash the server when doing a simple inference #21

Open
sun-qibo opened this issue Aug 28, 2024 · 1 comment

sun-qibo commented Aug 28, 2024

I observed similar behavior in evo and stripedhyena: the model loads successfully, but the server crashes as soon as I try a simple inference.

Compute resource: Databricks Azure cluster with an NVIDIA A100

Packages: flash-fft-conv and flash-attention are installed correctly, so the model loads without problems.

Code for StripedHyena:

import sys

import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name = "togethercomputer/StripedHyena-Hessian-7B"
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    model_max_length=sys.maxsize,
    trust_remote_code=True,
)

tokenizer.pad_token = tokenizer.eos_token

config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
config.use_cache = True

device = torch.device("cuda")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    trust_remote_code=True,
).to(device)

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)

model.generate(input_ids)

Code for Evo:

import torch
from transformers import AutoConfig, AutoModelForCausalLM
from stripedhyena.tokenizer import CharLevelTokenizer

device = torch.device("cuda")
tokenizer = CharLevelTokenizer(512)

hf_model_name = 'togethercomputer/evo-1-131k-base'


model_config = AutoConfig.from_pretrained(
    hf_model_name,
    trust_remote_code=True,
    revision='1.1_fix',
)

model_config.use_cache = True

model = AutoModelForCausalLM.from_pretrained(
        hf_model_name,
        config=model_config,
        trust_remote_code=True,
        revision='1.1_fix',
    )

sequence = 'ACGT'

input_ids = torch.tensor(
    tokenizer.tokenize(sequence),
    dtype=torch.int,
).to(device).unsqueeze(0)

with torch.no_grad():
    logits, _ = model(input_ids) 

Both scripts crash the Python kernel at the last line.

I was not sure whether the issue was caused by the configuration of my Databricks resources, so I also tried other models of the same size (e.g. "HuggingFaceH4/zephyr-7b-beta"), and inference worked without problems. I do not know whether there is some other incompatibility between stripedhyena and Databricks, though.
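One thing worth ruling out (this is my assumption, not something confirmed in the report): a hard kernel crash with no Python traceback is consistent with a GPU out-of-memory abort. A quick back-of-the-envelope check shows a 7B-parameter model loaded at the default float32 already strains a 40 GB A100 before activations and the generation cache are counted:

```python
# Rough weight-memory estimate (assumption: the crash could be GPU OOM).
# A dense model's weights take num_params * bytes_per_param bytes.

def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight memory in GiB for a dense model."""
    return num_params * bytes_per_param / 2**30

seven_b = 7e9
print(f"float32: {weight_memory_gib(seven_b, 4):.1f} GiB")  # ~26.1 GiB
print(f"float16: {weight_memory_gib(seven_b, 2):.1f} GiB")  # ~13.0 GiB
```

If passing `torch_dtype=torch.float16` to `from_pretrained` makes the crash go away, that would point to memory pressure rather than a stripedhyena/Databricks incompatibility.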

Has anyone else encountered this problem?


FumaNet commented Jan 5, 2025

Sorry for the late reply. I get a KeyError: 'stripedhyena' even before inference can start. Could you share which versions of the libraries you are using?
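To make comparing environments easier, here is a small sketch for collecting the versions in question. The package list is an assumption on my part (torch, transformers, and flash_attn seem to be the relevant ones here); imports that fail are reported rather than raised, since either optional package may be absent:

```python
# Report installed versions of the libraries relevant to this issue.
# Packages that cannot be imported are noted instead of raising.
import importlib


def report_versions(packages=("torch", "transformers", "flash_attn")):
    """Return a dict mapping each package name to its version string,
    "unknown" if it exposes no __version__, or the import error text."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError as exc:
            versions[name] = f"not importable: {exc}"
    return versions


print(report_versions())
```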
