From 23bceb6c08a45adae41500d93055e44e70051121 Mon Sep 17 00:00:00 2001
From: zR <2448370773@qq.com>
Date: Tue, 19 Dec 2023 17:36:21 +0800
Subject: [PATCH 1/3] Change model License

---
 MODEL_LICENSE          | 21 +++++++++------------
 README_zh.md           | 18 +++++++++---------
 composite_demo/main.py | 11 ++++++++---
 3 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/MODEL_LICENSE b/MODEL_LICENSE
index 77cef3a1..1262d17d 100644
--- a/MODEL_LICENSE
+++ b/MODEL_LICENSE
@@ -8,9 +8,9 @@ The CogVLM License
 
 2. License Grant
 
-Subject to the terms and conditions of this License, the Licensor hereby grants to you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free copyright license to use the Software.
-
-The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+Under the terms and conditions of this license, the Licensor hereby grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free copyright license.
+This license permits you to use all open-source models in this repository for academic research free. Users who wish to use the models for commercial purposes must register [here](https://open.bigmodel.cn/mla/form).
+Registered users may use the models for commercial activities free of charge, but must comply with all terms and conditions of this license.
 
 3. Restriction
 
@@ -32,9 +32,10 @@ This license shall be governed and construed in accordance with the laws of Peop
 
 Note that the license is subject to update to a more comprehensive version. For any questions related to the license and copyright, please contact us at license@zhipuai.cn.
 
-7. Llama2 and EVA-CLIP2 license
+7. Commercialization Application
 
-For CogVLM-17B version, Llama2 license (https://ai.meta.com/llama/license/) and EVA license (MIT, https://github.com/baaivision/EVA/blob/master/LICENSE) are applied.
+For commercial activities involving the models mentioned in this repository, registration and approval must be completed [here](https://open.bigmodel.cn/mla/form).
+Without this, you are not allowed to use this model for commercial purposes.
 
 1. 定义
 
@@ -45,8 +46,8 @@ For CogVLM-17B version, Llama2 license (https://ai.meta.com/llama/license/) and
 2. 许可授予
 
 根据本许可的条款和条件,许可方特此授予您非排他性、全球性、不可转让、不可再许可、可撤销、免版税的版权许可。
-
-上述版权声明和本许可声明应包含在本软件的所有副本或重要部分中。
+本许可允许您免费使用本仓库中的所有开源模型进行学术研究,对于希望将模型用于商业目的的用户,需在[这里](https://open.bigmodel.cn/mla/form)完成登记。
+经过登记的用户可以免费使用本模型进行商业活动,但必须遵守本许可的所有条款和条件。
 
 3.限制
 
@@ -66,8 +67,4 @@ For CogVLM-17B version, Llama2 license (https://ai.meta.com/llama/license/) and
 
 本许可受中华人民共和国法律管辖并按其解释。 因本许可引起的或与本许可有关的任何争议应提交北京市海淀区人民法院。
 
-请注意,许可证可能会更新到更全面的版本。 有关许可和版权的任何问题,请通过 license@zhipuai.cn 与我们联系。
-
-7. Llama2 和 EVA-CLIP2 许可
-
-针对 CogVLM-17B 版本, Llama2 许可条件 (https://ai.meta.com/llama/license/) 和 EVA 许可条件 (MIT, https://github.com/baaivision/EVA/blob/master/LICENSE) 同时适用于模型权重。
\ No newline at end of file
+请注意,许可证可能会更新到更全面的版本。 有关许可和版权的任何问题,请通过 license@zhipuai.cn 与我们联系。
\ No newline at end of file
diff --git a/README_zh.md b/README_zh.md
index 958cf2bd..93dc1c94 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -16,13 +16,13 @@ Agent、Grounding等多种能力。

CogVLM

📖 Paper: CogVLM: Visual Expert for Pretrained Language Models

-CogVLM 是一个强大的开源视觉语言模型(VLM)。CogVLM-17B拥有100亿的视觉参数和70亿的语言参数,支持490*490分辨率的图像理解和多轮对话。.
-CogVLM-17B 17B在10个经典的跨模态基准测试中取得了最先进的性能包括NoCaps, Flicker30k captioning, RefCOCO, RefCOCO+, RefCOCOg, Visual7W, GQA, ScienceQA, VizWiz VQA 和 TDIUC 基准测试.
+CogVLM 是一个强大的开源视觉语言模型(VLM)。CogVLM-17B拥有100亿的视觉参数和70亿的语言参数,支持490*490分辨率的图像理解和多轮对话。
+CogVLM-17B 17B在10个经典的跨模态基准测试中取得了最先进的性能包括NoCaps, Flicker30k captioning, RefCOCO, RefCOCO+, RefCOCOg, Visual7W, GQA, ScienceQA, VizWiz VQA 和 TDIUC 基准测试。

CogAgent

📖 Paper: CogAgent: A Visual Language Model for GUI Agents

-CogAgent 是一个基于CogVLM改进的开源视觉语言模型。CogAgent-18B拥有110亿的视觉参数和70亿的语言参数, 支持1120*1120分辨率的图像理解。. 在CogVLM的能力之上,它进一步拥有了GUI图像Agent的能力。.
+CogAgent 是一个基于CogVLM改进的开源视觉语言模型。CogAgent-18B拥有110亿的视觉参数和70亿的语言参数, 支持1120*1120分辨率的图像理解。在CogVLM的能力之上,它进一步拥有了GUI图像Agent的能力。

CogAgent-18B 在9个经典的跨模态基准测试中实现了最先进的通用性能,包括 VQAv2, OK-VQ, TextVQA, ST-VQA, ChartQA, infoVQA, DocVQA, MM-Vet, 和 POPE 测试基准。它在包括AITW和Mind2Web在内的GUI操作数据集上显著超越了现有的模型。

@@ -69,15 +69,15 @@ Agent、Grounding等多种能力。
 Agent、Grounding等多种能力。
 
 - **News**: ```2023/12/8```:
- 我们已将cogvlm-grounding-generalist的检查点更新为cogvlm-grounding-generalist-v1.1,训练过程中增加了图像增强,因此更加稳健。查看[详情](#introduction-to-cogvlm).
+ 我们已将cogvlm-grounding-generalist的检查点更新为cogvlm-grounding-generalist-v1.1,训练过程中增加了图像增强,因此更加稳健。查看[详情](#introduction-to-cogvlm)。
 
-- **News**: ```2023/12/7``` CogVLM现在支持**4-bit**量化!您只需要11GB的GPU内存就可以进行推理!查看[详情](#CLI).
+- **News**: ```2023/12/7``` CogVLM现在支持**4-bit**量化!您只需要11GB的GPU内存就可以进行推理!查看[详情](#CLI)。
 
-- **News**: ```2023/11/20```我们已将cogvlm-chat的检查点更新为cogvlm-chat-v1.1,统一了聊天和VQA的版本,并刷新了各种数据集上的SOTA。查看[详情](#introduction-to-cogvlm)
+- **News**: ```2023/11/20```我们已将cogvlm-chat的检查点更新为cogvlm-chat-v1.1,统一了聊天和VQA的版本,并刷新了各种数据集上的SOTA,查看[详情](#introduction-to-cogvlm)。
 
-- **News**: ```2023/11/20``` 我们在🤗Huggingface上发布了 **[cogvlm-chat](https://huggingface.co/THUDM/cogvlm-chat-hf)**, **[cogvlm-grounding-generalist](https://huggingface.co/THUDM/cogvlm-grounding-generalist-hf)/[base](https://huggingface.co/THUDM/cogvlm-grounding-base-hf)**, **[cogvlm-base-490](https://huggingface.co/THUDM/cogvlm-base-490-hf)/[224](https://huggingface.co/THUDM/cogvlm-base-224-hf)**. 使用transformers 快速 [推理](#situation-22-cli-huggingface-version)
+- **News**: ```2023/11/20``` 我们在🤗Huggingface上发布了 **[cogvlm-chat](https://huggingface.co/THUDM/cogvlm-chat-hf)**, **[cogvlm-grounding-generalist](https://huggingface.co/THUDM/cogvlm-grounding-generalist-hf)/[base](https://huggingface.co/THUDM/cogvlm-grounding-base-hf)**, **[cogvlm-base-490](https://huggingface.co/THUDM/cogvlm-base-490-hf)/[224](https://huggingface.co/THUDM/cogvlm-base-224-hf)**,使用transformers 快速 [推理](#situation-22-cli-huggingface-version)。
 
-- ```2023/10/27``` CogVLM双语版本已经在线上可用!欢迎[试用](https://chatglm.cn/)
+- ```2023/10/27``` CogVLM双语版本已经在线上可用!欢迎[试用](https://chatglm.cn/)。
 
 - ```2023/10/5``` CogVLM-17B v1.0 发布。
 
@@ -87,7 +87,7 @@ Agent、Grounding等多种能力。
 
 * 点击此处进入 [CogVLM & CogAgent Web Demo](http://36.103.203.44:7861/)。
 
-如果您需要使用代理和接地功能,请参考[Cookbook - Task Prompts](#task-prompts)
+如果您需要使用代理和接地功能,请参考[Cookbook - Task Prompts](#task-prompts)。
 
 ### 选项2:自行部署CogVLM / CogAgent
 
diff --git a/composite_demo/main.py b/composite_demo/main.py
index 784a0bdf..714725f0 100644
--- a/composite_demo/main.py
+++ b/composite_demo/main.py
@@ -29,6 +29,14 @@
 """
 
 import streamlit as st
+
+st.set_page_config(
+    page_title="CogVLM & CogAgent Demo",
+    page_icon=":robot:",
+    layout='centered',
+    initial_sidebar_state='expanded',
+)
+
 from enum import Enum
 from utils import encode_file_to_base64, templates_agent_cogagent, template_grounding_cogvlm
 import demo_chat_cogvlm, demo_agent_cogagent, demo_chat_cogagent
@@ -84,13 +92,10 @@ class Mode(str, Enum):
 if tab == Mode.CogVLM_Chat.value and grounding:
     selected_template_grounding_cogvlm = st.selectbox("Template For Grounding", template_grounding_cogvlm)
 
-
 if tab == Mode.CogAgent_Agent.value:
     with st.sidebar:
         selected_template_agent_cogagent = st.selectbox("Template For Agent", templates_agent_cogagent)
 
-
-
 if clear_history or retry:
     prompt_text = ""
 

From 0fbf8cd5af86e4c3d5ec4a38764c7ad0e86e6935 Mon Sep 17 00:00:00 2001
From: zR <2448370773@qq.com>
Date: Tue, 19 Dec 2023 17:51:34 +0800
Subject: [PATCH 2/3] Update MODEL_LICENSE

---
 MODEL_LICENSE | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MODEL_LICENSE b/MODEL_LICENSE
index 1262d17d..b6f087f2 100644
--- a/MODEL_LICENSE
+++ b/MODEL_LICENSE
@@ -11,6 +11,7 @@ The CogVLM License
 Under the terms and conditions of this license, the Licensor hereby grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free copyright license.
 This license permits you to use all open-source models in this repository for academic research free. Users who wish to use the models for commercial purposes must register [here](https://open.bigmodel.cn/mla/form).
 Registered users may use the models for commercial activities free of charge, but must comply with all terms and conditions of this license.
+The license notice shall be included in all copies or substantial portions of the Software.
 
 3. Restriction
 
@@ -48,6 +49,7 @@ Without this, you are not allowed to use this model for commercial purposes.
 根据本许可的条款和条件,许可方特此授予您非排他性、全球性、不可转让、不可再许可、可撤销、免版税的版权许可。
 本许可允许您免费使用本仓库中的所有开源模型进行学术研究,对于希望将模型用于商业目的的用户,需在[这里](https://open.bigmodel.cn/mla/form)完成登记。
 经过登记的用户可以免费使用本模型进行商业活动,但必须遵守本许可的所有条款和条件。
+上述版权声明和本许可声明应包含在本软件的所有副本或重要部分中。
 
 3.限制
 

From e4e23039591aca6a023660996517da692b9f5309 Mon Sep 17 00:00:00 2001
From: zR <2448370773@qq.com>
Date: Tue, 19 Dec 2023 18:08:24 +0800
Subject: [PATCH 3/3] Update MODEL_LICENSE

---
 MODEL_LICENSE | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/MODEL_LICENSE b/MODEL_LICENSE
index b6f087f2..a61c4b83 100644
--- a/MODEL_LICENSE
+++ b/MODEL_LICENSE
@@ -33,10 +33,10 @@ This license shall be governed and construed in accordance with the laws of Peop
 
 Note that the license is subject to update to a more comprehensive version. For any questions related to the license and copyright, please contact us at license@zhipuai.cn.
 
-7. Commercialization Application
+7. Llama2 and EVA-CLIP2 License
+
+For CogVLM-17B version, Llama2 license conditions (https://ai.meta.com/llama/license/) and EVA license conditions (MIT, https://github.com/baaivision/EVA/blob/master/LICENSE) Also applies to model weights.
 
-For commercial activities involving the models mentioned in this repository, registration and approval must be completed [here](https://open.bigmodel.cn/mla/form).
-Without this, you are not allowed to use this model for commercial purposes.
 
 1. 定义
 
@@ -69,4 +69,8 @@ Without this, you are not allowed to use this model for commercial purposes.
 
 本许可受中华人民共和国法律管辖并按其解释。 因本许可引起的或与本许可有关的任何争议应提交北京市海淀区人民法院。
 
-请注意,许可证可能会更新到更全面的版本。 有关许可和版权的任何问题,请通过 license@zhipuai.cn 与我们联系。
\ No newline at end of file
+请注意,许可证可能会更新到更全面的版本。 有关许可和版权的任何问题,请通过 license@zhipuai.cn 与我们联系。
+
+7. Llama2 和 EVA-CLIP2 许可
+
+针对 CogVLM-17B 版本, Llama2 许可条件 (https://ai.meta.com/llama/license/) 和 EVA 许可条件 (MIT, https://github.com/baaivision/EVA/blob/master/LICENSE) 同时适用于模型权重。
\ No newline at end of file