v-prgmr

Highlights

  • Pro

Pinned

  1. creating_Phi2_MoE_using_mergekit.md

    # microsoft/phi-2 for creating Mixture of Experts (MoE)

    [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) is a small language model with 2.7 billion parameters. Because of its small size, its open-source license, and fine-tuning techniques like QLoRA, one can (fairly) quickly fine-tune the base model on a downstream task to create an expert phi-2 model. It would be interesting to combine such individual experts into a Mixture of Experts (MoE) so that the single MoE model can perform the tasks of its individual experts. Follow the steps below to create your own MoE based on phi-2.

    - Special mention to [Maxime Labonne](), [Aratako](https://github.com/Aratako), and [Paul Ilioaica](https://github.com/paulilioaica) for showing the open-source community that mergekit can be tweaked to build a MoE out of phi-2 experts.
    - Big shoutout to [Charles O. Goddard](https://github.com/cg123), the author of mergekit, for creating [mergekit](https://github.com/arcee-ai/mergekit) and letting us play with it.
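To make the idea of combining experts concrete, here is a minimal toy sketch of sparse top-k routing, the general mechanism by which a MoE layer mixes the outputs of a few selected experts for each token. It is an illustration only, not mergekit's code and not the procedure from this gist. The names `route`, `gate_vectors`, and the lambda "experts" are hypothetical stand-ins; in a real model each expert would be the feed-forward block of a fine-tuned phi-2.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden, gate_vectors, experts, top_k=2):
    """Toy sparse-MoE forward pass for a single token.

    hidden       -- the token's hidden state (list of floats)
    gate_vectors -- one router vector per expert (hypothetical values)
    experts      -- one callable per expert, mapping hidden -> output
    """
    # One router logit per expert: dot product of hidden state and gate vector.
    logits = [sum(h * g for h, g in zip(hidden, gate)) for gate in gate_vectors]
    # Keep only the top_k highest-scoring experts.
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalise the selected logits into mixing weights.
    weights = softmax([logits[i] for i in chosen])
    # Run only the chosen experts and take the weighted sum of their outputs.
    outputs = [experts[i](hidden) for i in chosen]
    mixed = [
        sum(w * out[d] for w, out in zip(weights, outputs))
        for d in range(len(hidden))
    ]
    return chosen, weights, mixed


# Stand-in "experts": simple functions instead of fine-tuned networks.
toy_experts = [
    lambda h: [2.0 * x for x in h],
    lambda h: [-x for x in h],
    lambda h: [x + 1.0 for x in h],
]
toy_gates = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

chosen, weights, mixed = route([3.0, 1.0], toy_gates, toy_experts, top_k=2)
print(chosen, weights, mixed)
```

Only the selected experts are evaluated for each token, which is why such a model can hold several experts' worth of parameters while paying roughly the compute cost of `top_k` of them.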
    
                  
  2. mergekit Public

    Forked from arcee-ai/mergekit

    Tools for merging pretrained large language models.

    Python 19

  3. Object-Detection---SSD-in-Tensorflow Public

    Single Shot Multibox Detector implementation in Tensorflow for custom object detection.

    C++ 2 1