Posted 2023-03-07 | Updated 2023-03-07 | My Note | 15 minutes read (About 2303 words)

My-Note-3 | Bridging Subword Gaps in Pretrain-Finetune Paradigm for Natural Language Generation

Tags: ACL 2021; NLG; Subword

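Background:

Models based on the pretrain-finetune paradigm are pretrained on a single, unified corpus (with a one-size-fits-all vocabulary), whereas the finetuning corpus varies with the specific task.
The subword distributions of the pretraining corpus and the finetuning corpus often differ, which leads to the following problems:

The subword segmentation learned during pretraining is more fine-grained so that it can cover the larger pretraining corpus, but this makes exposure bias more severe during finetuning and increases the computational cost.
Tokens that are rare in pretraining but common in finetuning may be improperly split into subwords, so their meaning is poorly preserved.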
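The over-segmentation problem can be illustrated with a toy example. The sketch below uses greedy longest-match segmentation, which is a simplification chosen for illustration and not necessarily the tokenizer used in the paper; the function name `segment` and both vocabularies are hypothetical.

```python
def segment(word, vocab):
    """Greedy longest-match segmentation; falls back to single characters."""
    pieces, i = [], 0
    while i < len(word):
        # Try the longest remaining substring first, shrinking until a match.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces


# Hypothetical vocabularies, for illustration only.
general_vocab = {"token", "iz", "ation", "pre", "train"}
task_vocab = general_vocab | {"tokenization"}

# With the general vocabulary the word is split into several subwords;
# with a task-specific vocabulary it stays a single token.
print(segment("tokenization", general_vocab))
print(segment("tokenization", task_vocab))
```

A longer subword sequence means more decoding steps at finetuning time, which is the source of the extra cost and the heavier exposure bias described above.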

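This paper:

Goal: mitigate the problem that some tokens are poorly represented (under-represented) in downstream tasks because the subword distributions of the upstream and downstream tasks differ.
Method: train a separate embedding generator that takes a token as input and produces the token's embedding from the embeddings of its subwords and hyperwords, improving the embeddings of under-represented tokens in downstream tasks.
Models: AVG-EG; ATT-EG; PATT-EG
Contributions: (1) approaches the problem from the angle of overcoming the subword distribution gap and effectively improves the model; (2) is computationally efficient: after a simple separate training step, the generator is plug-and-play; (3) explores several ways to build embeddings for under-represented tokens, including averaging, attention-based methods, and methods based on morpheme information.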
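The averaging idea behind AVG-EG might look roughly like the following. This is a minimal sketch that only shows "average the embeddings of related pieces"; the exact formulation in the paper may differ, and the function name `average_embedding` and the toy embedding values are assumptions made for illustration.

```python
def average_embedding(pieces, embeddings):
    """Average the vectors of the given pieces (subwords and/or hyperwords)."""
    vectors = [embeddings[p] for p in pieces if p in embeddings]
    if not vectors:
        raise ValueError("no known pieces to average")
    dim = len(vectors[0])
    # Element-wise mean over all collected vectors.
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


# Made-up 2-dimensional embeddings for three pretrained subwords.
toy_embeddings = {"token": [1.0, 0.0], "iz": [0.0, 1.0], "ation": [2.0, 2.0]}

# Embedding for a new downstream token built from its subwords.
new_vec = average_embedding(["token", "iz", "ation"], toy_embeddings)
print(new_vec)
```

The attention-based variants would replace the uniform mean with learned weights over the same pieces.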

Posted 2023-02-15 | Updated 2023-02-15 | a few seconds read (About 0 words)

note-2

Posted 2023-02-13 | Updated 2023-03-07 | My Note | 9 minutes read (About 1319 words)

My-Note-1 | Contrastive Learning for Many-to-many Multilingual Neural Machine Translation

Tags: ACL 2021; Contrastive Learning; MNMT; Aligned Augmentation

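Link to the original paper

Background

Multilingual translation models have the following advantages:

They are more efficient and easier to deploy.
Sharing parameters across languages facilitates knowledge transfer.

However, existing multilingual translation models also have the following shortcomings:

They perform worse than the bilingual models for the corresponding language pairs.
Most of them focus on English-centric translation tasks.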

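This paper:

Goals: (1) improve the performance of multilingual translation models; (2) go beyond English-centric translation.
Methods: (1) use contrastive learning to pull sentences with the same meaning in different languages closer together and push sentences with different meanings apart; (2) use Aligned Augmentation (AA), a data augmentation method, to obtain positive and negative examples.
Model: mRASP2
Contributions: (1) uses contrastive learning to pull the representations of same-meaning text in different languages closer, i.e. into a shared space; (2) makes use of monolingual corpora and extends AA to the monolingual pretraining task.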
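The "pull closer, push apart" objective can be sketched with a generic InfoNCE-style contrastive loss over sentence vectors. This is an illustrative formulation, not necessarily the exact loss used by mRASP2; the function names, the temperature value of 0.1, and the toy vectors are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is near the positive
    and far from every negative."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy sentence vectors: a translation pair that is already aligned,
# versus one where the negative is closer to the anchor than the positive.
aligned = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned, misaligned)
```

Minimizing such a loss moves same-meaning sentences from different languages toward each other in the shared representation space, while the negatives are pushed away.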

Posted 2022-12-30 | Updated 1985-10-26 | a minute read (About 123 words)

Hello World

Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.

Quick Start

Create a new post

$ hexo new "My New Post"

More info: Writing

Run server

$ hexo server

More info: Server

Generate static files

$ hexo generate

More info: Generating

Deploy to remote sites

$ hexo deploy

More info: Deployment
