Better FX offload #817

Merged 1 commit on Feb 1, 2024

Giuseppe5 (Collaborator) commented Jan 31, 2024

A more refined mechanism for offloading call_function nodes in an FX graph.
We check the device of every input. If all inputs already sit on the same "main device" (e.g., any accelerator, as opposed to cpu or disk), nothing happens.
Otherwise, we move the inputs to the main device before the call.

After the computation, the inputs are moved back to their original devices to avoid memory leaks.
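
The flow described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code merged in this PR: `FakeTensor`, `find_main_device`, `call_with_offload`, and `OFFLOAD_DEVICES` are hypothetical names, the stand-in tensor moves in place (unlike `torch.Tensor.to`, which returns a new tensor), and picking the first accelerator found as the main device is an assumption.

```python
from dataclasses import dataclass

# Assumption: devices that count as "offload" targets rather than main devices.
OFFLOAD_DEVICES = {"cpu", "disk"}


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor: just a value and a device tag."""
    value: float
    device: str

    def to(self, device):
        # Moves in place to keep the sketch simple; real tensors behave differently.
        self.device = device
        return self


def find_main_device(tensors):
    """Return the first device that is not cpu/disk, or None if there is none."""
    for t in tensors:
        if str(t.device) not in OFFLOAD_DEVICES:
            return str(t.device)
    return None


def call_with_offload(fn, *args):
    """Call fn(*args) with all tensor-like inputs aligned on the main device."""
    tensors = [a for a in args if hasattr(a, "device") and hasattr(a, "to")]
    original = [str(t.device) for t in tensors]
    main = find_main_device(tensors)

    # Devices are already uniform (or no accelerator is involved): nothing to do.
    if main is None or all(d == main for d in original):
        return fn(*args)

    # Align every input on the main device.
    for t in tensors:
        t.to(main)
    try:
        return fn(*args)
    finally:
        # Restore the original placement so offloaded inputs do not stay on the accelerator.
        for t, d in zip(tensors, original):
            t.to(d)


def add(a, b):
    # Like a real kernel, this requires both operands on the same device.
    assert a.device == b.device
    return FakeTensor(a.value + b.value, a.device)


x = FakeTensor(1.0, "cuda:0")   # activation living on the accelerator
w = FakeTensor(2.0, "cpu")      # offloaded weight
out = call_with_offload(add, x, w)
```

Here `w` is on the accelerator only for the duration of `add` and is back on `cpu` afterwards, while the result stays on the main device.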

@Giuseppe5 Giuseppe5 merged commit fe12f19 into Xilinx:optimum Feb 1, 2024
22 checks passed
Giuseppe5 added a commit that referenced this pull request Feb 6, 2024
* optimum: initial optimum integration

* Refined solution for offloading

* Fix (optimum): clean-up (#802)

* Fix (optimum): dataloader and forward cleanup (#807)

* Fix (optimum): forward pass + fx (#808)

* FX forward, GPTQ, Export (#809)

* Forward pass with fx and pkv

* Restore eval

* Restore quantization

* Experimental export

* Fix GPTQ + Export

* Fix 2GB ONNX export error

* Fix gptq + speedup

* Feat (offload/fx): better buffer/params + call_functional (#816)

* Fix: typo to setting weight handlers

* Feat (optimum): better call_function FX offload (#817)

* Refactored per row quantization. JIT not working (#818)

* Better structure for QDQ weights (#822)

* Fix (export): flag for torch qcdq export (#823)

* Setup: remove optimum folder (#825)

* Add/fix comments

* Fix llm example

* Misc: pre-commit fix

* Fix (graph/equalize): new transpose interface

* Fix (examples/llm): no constant folding for group quant

---------

Co-authored-by: Nick Fraser <[email protected]>