Actions: deepspeedai/DeepSpeed

hpu-gaudi2

1,425 workflow runs

hpu-gaudi2
hpu-gaudi2 #1698: Scheduled
February 22, 2025 00:11 In progress master
Improve overflow handling in ZeRO
hpu-gaudi2 #1697: Pull request #6976 synchronize by tjruwase
February 21, 2025 22:51 56m 59s olruwase/ds_5241
Enable ZeRO set/get APIs for NVMe offload
hpu-gaudi2 #1696: Pull request #7046 synchronize by loadams
February 21, 2025 20:59 58m 30s olruwase/update_nvme_offload_states
Enable ZeRO set/get APIs for NVMe offload
hpu-gaudi2 #1695: Pull request #7046 synchronize by tjruwase
February 21, 2025 12:06 57m 2s olruwase/update_nvme_offload_states
Enabled configurable auto Tensor Parallelism (TP) for the inference of diverse models
hpu-gaudi2 #1694: Pull request #6553 synchronize by gyou2021
February 21, 2025 06:01 Action required gyou2021:configurable_autoTP
hpu-gaudi2
hpu-gaudi2 #1693: Scheduled
February 21, 2025 00:11 2h 4m 12s master
Bug Fix for offload_states API
hpu-gaudi2 #1692: Pull request #7050 synchronize by tohtana
February 20, 2025 18:23 1h 52m 17s U-rara:bugfix_reload_states
Fix, pipeline model with moe cause error when send grad
hpu-gaudi2 #1691: Pull request #7055 synchronize by hwchen2017
February 20, 2025 18:12 1h 57m 43s wukong1992:fix-pipe-act-grad-comm
Bug Fix for offload_states API
hpu-gaudi2 #1689: Pull request #7050 synchronize by U-rara
February 20, 2025 15:30 59m 44s U-rara:bugfix_reload_states
Enabled configurable auto Tensor Parallelism (TP) for the inference of diverse models
hpu-gaudi2 #1688: Pull request #6553 synchronize by loadams
February 20, 2025 15:29 Action required gyou2021:configurable_autoTP
Fix, bf16 optimizer remove dup loop
hpu-gaudi2 #1683: Pull request #7054 synchronize by hwchen2017
February 20, 2025 05:49 56m 30s wukong1992:fix-bf16-moe-refresh-params
Fix, bf16 optimizer remove dup loop
hpu-gaudi2 #1681: Pull request #7054 synchronize by wukong1992
February 20, 2025 03:09 Action required wukong1992:fix-bf16-moe-refresh-params
hpu-gaudi2
hpu-gaudi2 #1680: Scheduled
February 20, 2025 00:11 2h 30m 10s master
Training multiple models
hpu-gaudi2 #1679: Pull request #7018 synchronize by loadams
February 19, 2025 23:37 57m 30s olruwase/zero_multi_models
add autoTP training zero2 tests
hpu-gaudi2 #1678: Pull request #7049 synchronize by tjruwase
February 19, 2025 18:50 1h 11m 53s inkcherry:minor_fix_version2
Fix, bf16 optimizer remove dup loop
hpu-gaudi2 #1677: Pull request #7054 synchronize by tjruwase
February 19, 2025 18:44 57m 4s wukong1992:fix-bf16-moe-refresh-params
Enable ZeRO set/get APIs for NVMe offload
hpu-gaudi2 #1676: Pull request #7046 synchronize by loadams
February 19, 2025 17:47 57m 19s olruwase/update_nvme_offload_states
Variable batch size and LR scheduler
hpu-gaudi2 #1674: Pull request #7020 synchronize by bm-synth
February 19, 2025 15:44 Action required bm-synth:variable_batch_size_and_lr
Fix, pipeline model with moe cause error when send grad
hpu-gaudi2 #1673: Pull request #7055 opened by wukong1992
February 19, 2025 11:53 Action required wukong1992:fix-pipe-act-grad-comm
Enabled configurable auto Tensor Parallelism (TP) for the inference of diverse models
hpu-gaudi2 #1671: Pull request #6553 synchronize by delock
February 19, 2025 07:27 Action required gyou2021:configurable_autoTP
Add DeepseekV3 AutoTP.
hpu-gaudi2 #1670: Pull request #7045 synchronize by Yejing-Lai
February 19, 2025 02:05 Action required Yejing-Lai:lyj/deepseekv3