Follow me on X • 🤗 Hugging Face • 💻 Medium
I am a product-minded machine learning engineer lead, an open-source small language model/LLM researcher, and a blogger.
I lead ML R&D and prototyping for the following products:
- Arbor, which uses AI to tailor a daily update on your professional topics
- SuperAcc, a banking-grade document intelligence SaaS for financial institutions (FIs)
I mostly work on data curation and filtering for LLM pretraining, small language model training, and synthetic data generation.
- AnyClassifier - One Line To Build Zero-Data Classifiers in Minutes, And A Step Towards The First AI ML Engineer
- Textbook-quality classifiers with a throughput of > 2000 docs/s
- Small language model training from scratch
- GoFormer - Language Model That Plays Go
- Can language models plan?
Hugging Face: kenhktsui
Medium: kentsui
Twitter: kenhktsui
LinkedIn: Ken Tsui
GitLab (commits from my job): kenhktsui