
Read model weights #91

Merged · 6 commits merged into huggingface:main on Dec 14, 2024
Conversation

@shavit (Contributor) commented on Apr 20, 2024

Read model weights from safetensors.

Related #86

@shavit marked this pull request as ready for review on April 20, 2024, 16:41
@shavit marked this pull request as draft on April 20, 2024, 16:48
  * Separate Safetensor from the weights
  * Rename test tensors to include type
  * Rename ModelWeights to Weights
  * Throw error for unsupported data types
  * Remove model weights from LanguageModel.Configurations
@shavit (Contributor, Author) commented on Apr 22, 2024

Weights are now accessible independently of configurations, for flexibility, and can be loaded without an MLModel or any conversion step.
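
For illustration, a minimal sketch of that usage, assuming the Weights type gains a file-based loader and a key-based subscript (the loader name, the tensor key, and the module imports below are assumptions and may differ from the final API):

import CoreML
import Foundation
import Models  // assumption: the module that exposes the Weights type

// Load tensors straight from a local safetensors file; no MLModel or
// conversion step is involved. The loader throws for unsupported data types.
let fileURL = URL(fileURLWithPath: "/path/to/model.safetensors")
let weights = try Weights.from(fileURL: fileURL)  // assumed loader name

// Tensors are keyed by their safetensors names (the key is illustrative,
// assuming an MLMultiArray-backed value).
if let embeddings = weights["bert.embeddings.word_embeddings.weight"] {
    print(embeddings.shape)
}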

@shavit marked this pull request as ready for review on April 22, 2024, 19:03
@pcuenca (Member) commented on Apr 24, 2024

That's very cool @shavit! May I ask what use case you have in mind for this?

Edit: Oh, I see it's for your embeddings work. Any highlights on the general architecture you're considering?

@shavit (Contributor, Author) commented on Apr 24, 2024

Yes

  • In the first commit I added a weights property to Configurations, but since LanguageModel requires a model, I separated the weights initialization.
  • The weights can go into the Embedding module instead, since the language models in this module are tied to Core ML and don't need this at the moment.
  • The Safetensor type (and others) can save updated weights into a file.

The plan is to use models directly from local files or the Hub, and maybe to support more formats like MLX/NumPy.
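
As a rough sketch of that direction (reusing the HubApi.snapshot(from:) call from the tests and the assumed Weights.from(fileURL:) loader above; the repo id and file name are examples only):

import Hub

let hubApi = HubApi()
// Download a snapshot of the repo from the Hub (or point at a local directory instead).
let modelDir = try await hubApi.snapshot(from: "google/bert_uncased_L-2_H-128_A-2")
// The safetensors file name is an assumption about this repo's layout.
let weights = try Weights.from(fileURL: modelDir.appendingPathComponent("model.safetensors"))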

@pcuenca (Member) left a comment

Very cool work, thanks a lot for contributing! I only have one suggestion regarding one of the tests.


func testLoadWeightsFromFileURL() async throws {
let repo = "google/bert_uncased_L-2_H-128_A-2"
let modelDir = try await hubApi.snapshot(from: repo)
@pcuenca (Member) commented:

I'd suggest you use matching globs to only download the two files we need and avoid additional downloads. There's a version that supports multiple patterns.
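
Something along these lines, assuming the multi-pattern snapshot overload mentioned above (the glob strings are placeholders for whichever two files the test actually needs):

let modelDir = try await hubApi.snapshot(from: repo, matching: ["config.json", "model.safetensors"])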

@pcuenca (Member) left a comment

Ready to merge, in my opinion! Are you planning any more tweaks?

@shavit (Contributor, Author) commented on Dec 14, 2024

No, that's it.

@pcuenca merged commit d656ad4 into huggingface:main on Dec 14, 2024
1 check passed
@pcuenca (Member) commented on Dec 14, 2024

Thanks a lot @shavit!
