Tags: Aliang-CN/DeepSpeed

v0.15.1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Handle an edge case where `CUDA_HOME` is not defined on ROCm systems (microsoft#6488)

* Handles an edge case when building `gds` where `CUDA_HOME` is not
defined on ROCm systems
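
The kind of guard this commit describes can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `cuda_library_dirs` and the `lib64` subdirectory are assumptions made for the example. The only fact relied on is that a missing CUDA toolkit path is typically represented as `None`, and joining `None` into a path raises a `TypeError`.

```python
import os


def cuda_library_dirs(cuda_home):
    """Hypothetical helper: return CUDA library dirs, or [] if no toolkit path is known."""
    # On a ROCm-only system there may be no CUDA toolkit, so the path is None.
    # os.path.join(None, "lib64") would raise TypeError, hence the early return.
    if cuda_home is None:
        return []
    return [os.path.join(cuda_home, "lib64")]
```

With this shape, a build script can call the helper unconditionally and simply get an empty list on systems without CUDA.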

v0.15.0

Verified

Fix torch check (microsoft#6402)

v0.14.5

Verified

Allow accelerator to instantiate the device (microsoft#5255)

When instantiating torch.device for HPU, the device string cannot carry
an index such as "HPU:1"; only "HPU" is accepted.
Moving this logic into the accelerator solves the issue with a
single-line change.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
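
The idea of letting the accelerator build the device string can be sketched as below. Both accelerator classes and the `device_string` function are hypothetical stand-ins invented for this illustration; they are not DeepSpeed's accelerator interface, and the sketch avoids importing torch so that it runs anywhere.

```python
class IndexedAccelerator:
    """Hypothetical accelerator whose device strings may carry an index."""
    name = "cuda"

    def device_name(self, index=None):
        # Backends like this one accept both "cuda" and "cuda:1".
        return self.name if index is None else f"{self.name}:{index}"


class UnindexedAccelerator:
    """Hypothetical accelerator that accepts only the bare device name."""
    name = "hpu"

    def device_name(self, index=None):
        # The index is deliberately ignored for this backend.
        return self.name


def device_string(accelerator, index=None):
    # Callers never format "name:index" themselves; they ask the accelerator.
    return accelerator.device_name(index)
```

Because the formatting decision lives in one method per backend, call sites stay identical across devices.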

v0.14.4

Verified

[XPU] support op builder from intel_extension_for_pytorch kernel path (microsoft#5425)

# Motivation
Starting with our next release, the XPU kernels for DeepSpeed will live
in intel_extension_for_pytorch. This PR adds new op builders that use
the kernel path from intel_extension_for_pytorch. More ops, such as MOE
and WOQ, will be added.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>

v0.14.3

Verified

Monitor was always enabled causing performance degradation (microsoft#5633)

The Boolean expression that decides whether the monitor is enabled was
incorrect: it used the comet configuration object itself instead of its
`enabled` field, so the expression was always True.

This caused performance degradation (we've observed ~10% drop) as it
erroneously invoked the events logging flow along with the expensive
calculation of `loss.mean().item()`.

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
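
The class of bug described in this commit can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not DeepSpeed's monitor code: `CometConfig`, `MonitorConfig`, and both functions are hypothetical names. It relies only on the Python rule that an object without `__bool__` or `__len__` is always truthy.

```python
from dataclasses import dataclass, field


@dataclass
class CometConfig:
    """Hypothetical stand-in for one monitor backend's configuration."""
    enabled: bool = False


@dataclass
class MonitorConfig:
    """Hypothetical container for per-backend monitor settings."""
    tensorboard_enabled: bool = False
    comet: CometConfig = field(default_factory=CometConfig)


def monitor_enabled_buggy(cfg):
    # Bug: cfg.comet is an object, and plain objects are truthy,
    # so this returns True even when every backend is disabled.
    return bool(cfg.tensorboard_enabled or cfg.comet)


def monitor_enabled_fixed(cfg):
    # Fix: test the backend's `enabled` field, not the object itself.
    return cfg.tensorboard_enabled or cfg.comet.enabled
```

With a default `MonitorConfig()`, the buggy check reports the monitor as enabled while the fixed check does not, which is how an expensive logging path can end up running on every step.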

v0.14.2

Verified

Update PyTest torch version to match PyTorch latest official (2.3.0) (microsoft#5454)

v0.14.1

Verified

Fix the FP6 kernels compilation problem on non-Ampere GPUs. (microsoft#5333)

Refine the guards of FP6 kernel compilation. Fix the `undefined symbol`
problem of FP6 kernels on non-Ampere architectures.

Related issue: microsoft/DeepSpeed-MII#443.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

v0.14.0

Update version.txt

v0.13.5

Verified

fix fused_qkv model accuracy issue (microsoft#5217)

Fused_qkv models cannot correctly choose the fused_qkv type;
module_name_matches needs to be updated.

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>

v0.13.4

Verified

Add script to check for `--extra-index-url` (microsoft#5184)

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>