Gemini is a multimodal large language model released by Google.
Gemini is the next leap from Google PaLM 2 models, which was release May 2023. Major markers of Gemini's improvements over PaLM 2 include:
- Multi Modality support
- Dramatic improvements in Text benchmarks
(General: MMLU; Reasoning: Big-Bench Hard, DROP, HellaSwag; Math: GSM8K, MATH; Code: HumanEval, Natural2Code;)
- Record setting benchmarks in Multimodality
(Image: MMMU, VQAv2, TextVQA, DocVQA, Infographic VQA, MathVista; Video: VATEX, Perception Test MCQA; Audio: CoVoST 2, FLEURS;)