
Comparing changes

Choose two branches to see what's changed or to start a new pull request. You can also learn more about diff comparisons.

base repository: huggingface/transformers
base: main
head repository: huggingface/transformers
compare: v4.48-release
  • 13 commits
  • 340 files changed
  • 9 contributors

Commits on Jan 10, 2025

  1. v4.48-release

    ArthurZucker committed Jan 10, 2025
    Commit e39c9f7
  2. ModernBert: reuse GemmaRotaryEmbedding via modular + Integration tests (#35459)
    
    * Introduce 5 integration tests for the 4 model classes + torch export
    
    * ModernBert: reuse GemmaRotaryEmbedding via modular
    
    * Revert #35589, keep rope_kwargs; rely on them in modular_modernbert
    
    * Revert "Revert #35589, keep rope_kwargs; rely on them in modular_modernbert"
    
    This reverts commit 11b44b9.
    
    * Don't set rope_kwargs; override 'self.rope_init_fn' call instead
    tomaarsen authored and ArthurZucker committed Jan 10, 2025
    Commit 42b8e79
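The last bullet above ("override 'self.rope_init_fn' call instead" of threading `rope_kwargs` through) can be illustrated with a minimal, hypothetical sketch. The class and function names below are invented for illustration and are not the actual transformers implementation; only the standard rotary inverse-frequency formula is assumed.

```python
class RotaryEmbedding:
    """Base class: the init function is a method, so subclasses can override it
    instead of passing extra keyword arguments through the constructor."""

    def __init__(self, dim, base=10000.0):
        self.dim = dim
        self.base = base
        # Subclasses customize behavior by overriding rope_init_fn.
        self.inv_freq = self.rope_init_fn(dim, base)

    def rope_init_fn(self, dim, base):
        # Standard rotary init: one inverse frequency per pair of channels.
        return [base ** (-(2 * i) / dim) for i in range(dim // 2)]


class ScaledRotaryEmbedding(RotaryEmbedding):
    """Hypothetical variant: linear frequency scaling, done by overriding
    rope_init_fn rather than by adding a rope_kwargs dict."""

    def __init__(self, dim, base=10000.0, factor=2.0):
        self.factor = factor
        super().__init__(dim, base)

    def rope_init_fn(self, dim, base):
        # Divide every inverse frequency by the scaling factor.
        return [f / self.factor for f in super().rope_init_fn(dim, base)]
```

The design point is that overriding a single well-named hook keeps the constructor signature stable, whereas a `rope_kwargs` dict forces every subclass and caller to agree on its contents.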
  3. Add Moonshine (#34784)

    * config draft
    
    * full encoder forward
    
    * full decoder forward
    
    * fix sdpa and FA2
    
    * fix sdpa and FA2
    
    * moonshine model
    
    * moonshine model forward
    
    * fix attention with past_key_values
    
    * add MoonshineForConditionalGeneration
    
    * fix cache handling and causality for cross attention
    
    * no causal attention mask for the encoder
    
    * model addition (imports etc)
    
    * small nit
    
    * nits
    
    * Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py
    
    Co-authored-by: Joshua Lochner <admin@xenova.com>
    
    * add rope_theta
    
    * nits
    
    * model doc
    
    * Update src/transformers/models/auto/configuration_auto.py
    
    Co-authored-by: Joshua Lochner <admin@xenova.com>
    
    * imports
    
    * add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES
    
    * updates modular
    
    * make
    
    * make fix-copies
    
    * ruff check examples fix
    
    * fix check_modular_conversion
    
    * nit
    
    * nits
    
    * nits
    
    * copied from -> imports
    
    * imports fix
    
    * integrate attention refacto
    
    * modular edge case
    
    * remove encoder
    
    * convolutions params in config
    
    * run modular_model_converter
    
    * make
    
    * Update docs/source/en/model_doc/moonshine.md
    
    Co-authored-by: Joshua Lochner <admin@xenova.com>
    
    * MoonshineModelTest
    
    * correct typo
    
    * make style
    
    * integration tests
    
    * make
    
    * modular convert
    
    * name conversion update (up_proj -> fc1 etc)
    
    * update config
    
    * update MLP
    
    * update attention
    
    * update encoder layer
    
    * update decoder layer
    
    * update convolutions parameters
    
    * update encoder
    
    * remove INPUTS_DOCSTRING
    
    * update decoder
    
    * update conditional generation
    
    * update pretrained model
    
    * imports
    
    * modular converted
    
    * update doc
    
    * fix
    
    * typo
    
    * update doc
    
    * update license
    
    * update init
    
    * split config in file
    
    * two classes for MLP
    
    * attention from GLM
    
    * from GlmRotaryEmbedding
    
    * split MLP
    
    * apply arthur's review suggestions
    
    * apply arthur's review suggestions
    
    * apply arthur's review suggestions
    
    * auto feature extractor
    
    * convert modular
    
    * fix + make
    
    * convert modular
    
    * make
    
    * unsplit config
    
    * use correct checkpoint
    
    * wrap generate
    
    * update tests
    
    * typos
    
    * make
    
    * typo
    
    * update doc
    
    ---------
    
    Co-authored-by: Joshua Lochner <admin@xenova.com>
    2 people authored and ArthurZucker committed Jan 10, 2025
    Commit af2d7ca
  4. [test-all]

    ArthurZucker committed Jan 10, 2025
    Commit 8ce1e95
  5. Commit d6f446f
  6. push a fix for now

    ArthurZucker committed Jan 10, 2025
    Commit 7cf6230
  7. Fix flex_attention in training mode (#35605)

    * fix flex
    
    * add test
    
    * style
    Cyrilvallez authored and ArthurZucker committed Jan 10, 2025
    Commit 59e28c3
  8. [WIP] Emu3: add model (#33770)

    * model can convert to HF and be loaded back
    
    * nit
    
    * works in single batch generation but hallucinates
    
    * use the image tokens
    
    * add image generation
    
    * now it works
    
    * add tests
    
    * update
    
    * add modular but it doesn't work for porting docstring :(
    
    * skip some tests
    
    * add slow tests
    
    * modular removed the import?
    
    * guess this works
    
    * update
    
    * update
    
    * fix copies
    
    * fix test
    
    * fix copies
    
    * update
    
    * docs
    
    * fix tests
    
    * last fix tests?
    
    * pls
    
    * repo consistency
    
    * more style
    
    * style
    
    * remove file
    
    * address comments
    
    * tiny bits
    
    * update after the new modular
    
    * fix tests
    
    * add one more cond in check attributes
    
    * decompose down/up/mid blocks
    
    * allow static cache generation in VLMs
    
    * nit
    
    * fix copies
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/emu3.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * fix VAE upsampling
    
    * Update src/transformers/models/emu3/modular_emu3.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * address comments
    
    * state overwritten stuff explicitly
    
    * fix copies
    
    * add the flag for flex attn
    
    ---------
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    3 people committed Jan 10, 2025
    Commit 6bc0fbc

Commits on Jan 20, 2025

  1. [Phi] bias should be True (#35650)

    bias should be True
    ArthurZucker committed Jan 20, 2025
    Commit 612bfd0
  2. Fix condition when GA loss bug fix is not performed (#35651)

    * fix condition when GA loss bug fix is not performed
    
    * max loss diff is 2.29
    
    * fix typo
    
    * add an extra validation that loss should not vary too much
    techkang authored and ArthurZucker committed Jan 20, 2025
    Commit b00807f
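The aggregation issue behind this fix ("loss should not vary too much") can be shown with a minimal, hypothetical sketch — the helper names below are invented and are not the Trainer's actual code. Under gradient accumulation, averaging per-micro-batch mean losses differs from the true token-weighted mean whenever micro-batches contain different numbers of non-padding tokens, so a validation can assert the two stay within a tolerance.

```python
def naive_accumulated_loss(batch_loss_sums, batch_token_counts):
    """Mean of per-micro-batch mean losses (the problematic aggregation)."""
    per_batch_means = [s / n for s, n in zip(batch_loss_sums, batch_token_counts)]
    return sum(per_batch_means) / len(per_batch_means)


def token_weighted_loss(batch_loss_sums, batch_token_counts):
    """Total loss over total tokens: what a single large batch would report."""
    return sum(batch_loss_sums) / sum(batch_token_counts)
```

With equal token counts per micro-batch the two agree exactly; with unequal counts they diverge, which is what an extra "loss does not vary too much" validation in the tests can catch.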
  3. Patch moonshine (#35731)

    * update expected logits for T4 runners
    
    * update doc
    
    * correct order of the args for better readability
    
    * remove generate wrap
    
    * convert modular
    eustlb authored and ArthurZucker committed Jan 20, 2025
    Commit 3b09464
  4. v4.48.1

    ArthurZucker committed Jan 20, 2025
    Commit 785b5cf
  5. revert my changes

    ArthurZucker committed Jan 20, 2025
    Commit 2e752ea