How do I run this on a 2080 Ti? GPU memory runs out when the T5 model is only half loaded #166

Closed as not planned
@xujin1184104394

Description

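Why does each GPU still have to load the entire T5 language model on its own when sequence parallelism is enabled? Even with 8 GPUs, it runs out of GPU memory just the same.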
