- University of Science and Technology of China
- Anhui, Hefei
Stars
Repo for BenTsao [original name: HuaTuo (华驼)]: instruction-tuning large language models with Chinese medical knowledge.
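CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for similarity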
Efficient computing methods developed by Huawei Noah's Ark Lab
This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch"
Generative Representational Instruction Tuning
UniGen: A Unified Framework for Dataset Generation via Large Language Model
A huggingface transformers implementation of "Transformer Memory as a Differentiable Search Index"
Finetune mistral-7b-instruct for sentence embeddings
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
Search Engines with Autoregressive Language models
Codebase for [Paper] Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023
Code repo for the ICML 2024 paper "Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation"
Tevatron - A flexible toolkit for neural retrieval research and development.
Source code of DRAGIN, ACL 2024 main conference Long Paper
MambaOut: Do We Really Need Mamba for Vision?