
GPU memory leak in SentenceTransformerTrainer.train #3204

Open
captify-isemaniuk opened this issue Jan 30, 2025 · 0 comments
For reference I used the toy example from
https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/other/training_wikipedia_sections.py
with small changes:

  • I added multiple iterations:
    for it in range(5)
  • I used only the first 6 training steps
  • at the end of every iteration I added:
    del train_dataset, eval_dataset, test_dataset, dev_evaluator, train_loss, args, model, trainer

    gc.collect()
    torch.cuda.empty_cache()

    print(f'iter: {it} memory_allocated: {torch.cuda.memory_allocated() / 1024**3}')
    print(f'iter: {it} memory_reserved:  {torch.cuda.memory_reserved() / 1024**3}')  
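If memory still grows after `del` + `gc.collect()` + `torch.cuda.empty_cache()`, something is likely keeping references to the model or trainer alive between iterations. A minimal pure-Python sketch (no GPU needed) of that mechanism, using a hypothetical module-level cache as the lingering reference — the names `Model`, `_hidden_cache`, and `train_once` are stand-ins, not sentence-transformers internals:

```python
import gc
import weakref

class Model:
    """Stand-in for a SentenceTransformer model holding GPU buffers."""
    pass

# Hypothetical lingering reference, e.g. a hook, registry, or trainer cache.
_hidden_cache = []

def train_once(leak: bool) -> weakref.ref:
    model = Model()
    if leak:
        _hidden_cache.append(model)  # something retains the model internally
    ref = weakref.ref(model)
    del model
    gc.collect()
    return ref  # ref() is None only if the model was actually collected

# Without a lingering reference, `del` + gc.collect() frees the model:
assert train_once(leak=False)() is None
# With one, the object survives, so its (GPU) memory can never be reclaimed:
assert train_once(leak=True)() is not None
```

This is why `torch.cuda.empty_cache()` alone cannot help here: it only returns cached-but-unused blocks to the driver, while tensors still referenced from Python stay allocated.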

Results:
iter: 0 memory_allocated: 0.2649421691894531
iter: 0 memory_reserved: 0.3046875

iter: 1 memory_allocated: 0.5132217407226562
iter: 1 memory_reserved: 0.548828125

iter: 2 memory_allocated: 0.7610130310058594
iter: 2 memory_reserved: 0.8125

iter: 3 memory_allocated: 1.0092926025390625
iter: 3 memory_reserved: 1.076171875

iter: 4 memory_allocated: 1.2575721740722656
iter: 4 memory_reserved: 1.33984375
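The growth in the numbers above is almost exactly linear: each iteration pins an additional ~0.25 GiB that the cleanup cannot reclaim, which is consistent with one fixed-size object (plausibly the model weights, though that is an assumption) being retained per run:

```python
# memory_allocated values (GiB) reported after each iteration in this issue
allocated = [
    0.2649421691894531,
    0.5132217407226562,
    0.7610130310058594,
    1.0092926025390625,
    1.2575721740722656,
]
deltas = [b - a for a, b in zip(allocated, allocated[1:])]
# every iteration leaks roughly the same ~0.248 GiB
assert all(abs(d - 0.248) < 0.005 for d in deltas)
```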
