---
title: ""
date:
layout: mainpage
categories:
tags:
- home
published: true
comments:
---
<p>
Since November 2019, I have been a professor at <a href="https://www.espci.fr">ESPCI (École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris)</a>, and
</p>
<ul class="org-ul">
<li>My research affiliation is with the <a href="https://www.lamsade.dauphine.fr/wp/miles/">MILES project</a> of the <a href="https://www.lamsade.dauphine.fr/">LAMSADE (Laboratoire d'Analyse et de Modélisation de Systèmes pour l'Aide à la Décision)</a> at <a href="https://www.dauphine.psl.eu/">Paris-Dauphine University</a> and <a href="https://www.psl.eu/">PSL (Paris Sciences & Lettres)</a>.</li>
<li>I am head of the <a href="https://psl.eu/en/programmes-gradues/programme-data">DATA program of PSL</a>.</li>
<li>I am co-head of the <a href="https://www.lamsade.dauphine.fr/en/research/groups/data-science.html">Data Science group of the LAMSADE</a>, with <a href="https://www.lamsade.dauphine.fr/~elhaddad/">Joyce El Haddad</a>.</li>
</ul>
<div id="outline-container-org6000ff2" class="outline-2">
<h2 id="org6000ff2">News</h2>
<div class="outline-text-2" id="text-org6000ff2">
<ul class="org-ul">
<li><a href="https://allauzen.github.io/research/positions">Research positions</a>: for September 2024, two positions on frugality in machine learning and speech processing:
<ul class="org-ul">
<li>1 PhD position</li>
<li>1 postdoc position</li>
</ul></li>
</ul>
<ul class="org-ul">
<li>LeBenchmark 2.0 is available: more data and two additional tasks. <a href="https://arxiv.org/abs/2309.05472">Check the preprint</a>.</li>
<li>With the same team as the ICML 2023 paper (see below), we published a pre-print paper on <a href="https://arxiv.org/abs/2309.16883">Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing</a>.</li>
<li>Our paper "<a href="https://arxiv.org/abs/2305.16173">Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration</a>" is at ICML 2023! Congratulations to my colleagues: Blaise Delattre, Quentin Barthélemy, and Alexandre Araujo.</li>
<li>Our paper on <a href="https://arxiv.org/pdf/2112.08458.pdf">Curriculum learning for data-driven modeling of dynamical systems</a> is now published in the European Physical Journal.</li>
<li>With Alex Araujo, Aaron Havens, Blaise Delattre, and Bin Hu, we have a paper accepted at ICLR 2023 (in the top 25%): <a href="https://openreview.net/forum?id=k71IGLC8cfc">A Unified Algebraic Perspective on Lipschitz Neural Networks</a></li>
<li>A very nice web page on <a href="https://soda-inria.github.io/ken_embeddings/">Relational Data Embeddings by Alexis Cvetkov</a>.</li>
</ul>
<a href="https://soda-inria.github.io/ken_embeddings/">
<img src="/assets/figs/entity_types_with_names.png" alt="Entity mapping" style="width:300px; margin:0px auto;display:block"/>
</a>
<ul class="org-ul">
<li>With Alexis Cvetkov and Gaël Varoquaux from the DirtyData team, we
have a paper on <a href="https://hal.archives-ouvertes.fr/hal-03647434/file/final.pdf">Analytics on Non-Normalized Data Sources: More
Learning, Rather Than More Cleaning</a>.</li>
</ul>
<a href="https://hal.archives-ouvertes.fr/hal-03647434/file/final.pdf">
<img src="/assets/figs/entity.png" alt="Entity mapping" style="width:300px; margin:0px auto;display:block"/>
</a>
<ul class="org-ul">
<li>The E-SSL ANR project has been accepted and will start soon. E-SSL stands
for <b>"Efficient Self-Supervised Learning for Inclusive and
Innovative Speech Technologies"</b>. Bravo to Titouan Parcollet, the
LIA, and the LIG! <b>A funded PhD position on "Fair and Inclusive
Self-Supervised Learning for Speech"</b> will be co-supervised between
Paris (LAMSADE) and Grenoble (LIG).</li>
<li>With Laurent Meunier, Blaise Delattre, and Alex Araujo, we have a paper at ICML 2022: <a href="https://arxiv.org/abs/2110.12690">A Dynamical System Perspective for Lipschitz Neural Networks</a>.</li>
<li>With the incredible team of LeBenchmark (from Grenoble and Avignon), we have a paper at <b>NeurIPS 2021</b> in the <i>datasets and benchmarks track</i>! Visit <a href="https://github.com/LeBenchmark/github">our repos</a>.</li>
</ul>
<div id="table-of-contents" role="doc-toc">
<h2>Table of Contents</h2>
<div id="text-table-of-contents" role="doc-toc">
<ul>
<li><a href="#org6000ff2">News</a></li>
<li><a href="#org0befa5c">Publication</a></li>
<li><a href="#orgf8d1728">Main research interest</a></li>
<li><a href="#org1e50265">Older news</a></li>
<li><a href="#org3715ab8">Talks on deep-learning for NLP</a></li>
<li><a href="#org58727e3">Contact</a></li>
</ul>
</div>
</div>
</div>
</div>
<div id="outline-container-org0befa5c" class="outline-2">
<h2 id="org0befa5c">Publication</h2>
<div class="outline-text-2" id="text-org0befa5c">
<p>
See <a href="http://scholar.google.com/citations?hl=en&user=B2-gXkkAAAAJ">my Google Scholar page</a>.
</p>
</div>
</div>
<div id="outline-container-orgf8d1728" class="outline-2">
<h2 id="orgf8d1728">Main research interest</h2>
<div class="outline-text-2" id="text-orgf8d1728">
<p>
I am a professor at ESPCI and a researcher in the MILES team of the LAMSADE. My main research topics are natural language processing and deep learning for physical data.
</p>
</div>
</div>
<div id="outline-container-org1e50265" class="outline-2">
<h2 id="org1e50265">Older news</h2>
<div class="outline-text-2" id="text-org1e50265">
<ul class="org-ul">
<li>With <a href="https://smontariol.github.io/">Syrielle Montariol</a>, we have a paper at ACL 2021: <i>Measure and Evaluation of Semantic Divergence across Two Languages</i>. The camera-ready version will be available soon.</li>
<li><a href="https://gdr-tal.ls2n.fr/etal-2021/">The ETAL summer school</a> (from the GDR TAL of CNRS) will take place in Lannion (June 14 to 18, 2021). I'm very happy to teach two courses there. Check out the program.</li>
<li><a href="https://arxiv.org/abs/2104.11462">LeBenchmark</a> is out: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech. Coming soon at Interspeech 2021.</li>
<li><b>FlauBERT</b> and <b>FLUE</b> are now available: <a href="https://github.com/getalp/Flaubert">check the GitHub repository</a> or read <a href="https://arxiv.org/abs/1912.05372">our paper</a> on arXiv.
<ul class="org-ul">
<li><b>FlauBERT</b> is a French BERT trained on a very large and heterogeneous French corpus</li>
<li><b>FLUE</b> is an evaluation setup for French NLP systems, similar to the popular GLUE benchmark</li>
</ul></li>
</ul>
<ul class="org-ul">
<li>Read our paper on the <a href="https://arxiv.org/abs/1906.07672">control of chaotic systems by deep reinforcement learning</a>. The journal version, accepted in the Royal Society's physical sciences research journal, is coming soon; the journal page is <a href="https://royalsocietypublishing.org/doi/10.1098/rspa.2019.0351">this one</a>.</li>
</ul>
<a href="https://arxiv.org/abs/1906.07672">
<img src="/assets/figs/ks_h.png" alt="Kuramoto-Sivashinski" style="width:400px; margin:0px auto;display:block"/>
</a>
<ul class="org-ul">
<li><b><a href="https://allauzen.github.io/research/positions">Research positions are available for 2021</a>: 2 PhD positions</b>. A postdoc position is coming soon (the PhD positions are now closed).</li>
</ul>
<ul class="org-ul">
<li>With Aina Gari Soler and <a href="https://perso.limsi.fr/marianna/">Marianna Apidianaki</a>, we have a paper at
<i>*SEM 2019</i> on <a href="https://www.aclweb.org/anthology/S19-1002">Word Usage Similarity Estimation with Sentence
Representations and Automatic Substitutes</a>.</li>
</ul>
<a href="https://www.aclweb.org/anthology/S19-1002.pdf">
<img src="/assets/figs/word_usage.png" alt="Word usage" style="width:300px; margin:0px auto;display:block"/>
</a>
<ul class="org-ul">
<li>See also the paper at
the Semantic Deep Learning (SemDeep-5) workshop at IJCAI:
<a href="http://www.dfki.de/~declerck/semdeep-5/papers/wic_SemDeep-5_paper_4.pdf">LIMSI-MULTISEM at the IJCAI SemDeep-5 WiC Challenge: Context
Representations for Word Usage Similarity Estimation</a>.</li>
</ul>
<ul class="org-ul">
<li>Syrielle Montariol's paper on the <a href="https://arxiv.org/abs/1909.01863">Empirical Study of Diachronic Word Embeddings for Scarce Data</a> is available.</li>
</ul>
<ul class="org-ul">
<li><a href="http://atala.org/content/apprentissage-profond-pour-le-traitement-automatique-des-langues">The special issue on Deep Learning for NLP</a> of the TAL journal is now
online, with three nice papers.</li>
<li>November 23rd: a talk at <a href="https://www.sciencesmaths-paris.fr/fr/horizon-maths-2018-intelligence-artificielle-957.htm">Horizon Maths 2018 : Intelligence
Artificielle</a>, organized by the FSMP.</li>
<li>With Matthieu Labeau: <a href="https://aclanthology.coli.uni-saarland.de/papers/C18-1261/c18-1261">Learning with Noise-Contrastive Estimation:
Easing training by learning to scale</a>, in COLING 2018. The paper was
one of the <a href="https://coling2018.org/coling-2018-best-papers/">"Area Chair Favorites"</a>. Meet Matthieu in Santa Fe.</li>
<li>Invited Talk at the day "NLP and AI" of the French conference on AI
(PFIA) in Nancy (06/07/2018). My talk was about <i>Language models: large
vocabulary challenges</i>.</li>
<li>With Hinrich Schütze and Sophie Rosset, we are editors of a
special issue on <b>Deep Learning for NLP</b> of the TAL journal. Submit
a paper! See <a href="https://tal-59-2.sciencesconf.org/">the official page</a> for more details.</li>
<li>Our paper <a href="https://arxiv.org/abs/1603.05962">Document Neural Autoregressive Distribution Estimation</a>
with Stanislas Lauly, Yin Zheng and Hugo Larochelle has been
published in <b>JMLR</b>.</li>
<li>With Matthieu Labeau: <b>Best paper award</b> at <a href="https://sites.google.com/view/sclem2017/schedule"> SCLeM 2017</a> with our paper <i>Character and Subword-Based Word Representation for Neural Language Modeling Prediction</i>. If you attend EMNLP workshops, see the talk and the poster, September 7, 2017 in Copenhagen.</li>
<li>Opening talk at <a href="http://seminaire-dga.gforge.inria.fr/2016/DeuxiemeJourneedefistechnologiquesdelacybersecurite_fr.html">the DGA/IRISA seminar on Artificial Intelligence and Security</a>, 25/04/17. The slides are available <a href="https://perso.limsi.fr/Individu/doc/aa_rennes17.pdf">here</a>.</li>
<li>Our paper in the Machine Translation Journal on <b>A comparison of discriminative training criteria for continuous space translation models</b> is now <a href="http://link.springer.com/journal/10590">online</a>.</li>
</ul>
<ul class="org-ul">
<li>Matthieu Labeau presents our paper at EACL: <a href="http://www.aclweb.org/anthology/E/E17/E17-2003.pdf">An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters</a>.</li>
<li>Invited talk at <a href="https://indico.math.cnrs.fr/event/1799/">the workshop "Statistics/Learning at Paris-Saclay"</a>, Institut des Hautes
Études Scientifiques (IHES), 19/01/2017. The slides are available <a href="https://perso.limsi.fr/Individu/doc/v0_aa_ihess.pdf">here</a>.</li>
<li>Our (French) paper in the journal TAL is now online:
<a href="http://www.atala.org/Apprentissage-discriminant-de">Apprentissage discriminant de modèles neuronaux pour la traduction automatique</a>.
You can check the <a href="http://www.atala.org/-Numero-1-">whole issue</a>
which contains other great papers.</li>
<li>The special issue on <b>Deep Learning for Machine Translation</b> in the
journal <b>Computer Speech and Language</b> should be online soon.</li>
<li>LIMSI papers at the conference on statistical machine translation,
aka <a href="http://statmt.org/wmt17/">WMT17</a>.</li>
<li>Check the website of the Digicosme working group on <b><a href="https://gt-deepnet.limsi.fr/">Deep Nets and learning representation</a></b> (in French).</li>
</ul>
<ul class="org-ul">
<li>3 papers at the first conference on statistical machine translation,
aka <a href="http://statmt.org/wmt16/">WMT16</a>.</li>
</ul>
<ul class="org-ul">
<li>Tutorial on deep learning and NLP applications at the
<a href="http://www.irit.fr/cimi-machine-learning/node/9">Workshop on Learning with Structured Data and applications on Natural Language and Biology</a>: the slides are available as a
<a href="https://perso.limsi.fr/allauzen/doc/aa_deep_nlp.pdf">PDF</a> (the file is quite large, about 9 MB).</li>
<li>Two papers at EMNLP in fall 2015
<ul class="org-ul">
<li><a href="http://www.aclweb.org/anthology/D/D15/D15-1025.pdf">Non-lexical neural architecture for fine-grained POS Tagging</a></li>
<li><a href="http://www.aclweb.org/anthology/D/D15/D15-1121.pdf">A Discriminative Training Procedure for Continuous Translation Models</a></li>
</ul></li>
<li>Three papers at WMT (see <a href="http://statmt.org/wmt15/papers.html">this website</a>), including one with Jan Niehues on ListNet-based MT
rescoring.</li>
<li><b>Best paper award (ex aequo) at TALN 2015</b>:
<a href="http://www.atala.org/TALN-RECITAL-2015-22eme-conference">Apprentissage discriminant des modèles continus de traduction</a></li>
<li>Check the website of the Digicosme working group on
<b><a href="https://gt-deepnet.limsi.fr/">Deep Nets and learning representation</a></b> (in French). We will have nice invited speakers for
the upcoming sessions:
<ul class="org-ul">
<li>April 8th: Antoine Bordes (Facebook)</li>
<li>April 16th: Florence D'Alché-Buc and Romain Brault
(Telecom-ParisTech), along with a discussion led by Aurélien
Decelle (LRI) on "Approximate Message Passing with Restricted
Boltzmann Machine Priors"</li>
<li>May 22nd: Edward Grefenstette (Google DeepMind)</li>
<li>June 18th: Stéphane Mallat (CMAP)</li>
</ul></li>
</ul>
<ul class="org-ul">
<li>Think about submitting to the 3rd edition of the workshop on
<b><a href="https://sites.google.com/site/cvscworkshop2015">Continuous Vector Space Models and their Compositionality (CVSC)</a></b>, with a nice set of
keynote speakers:
<ul class="org-ul">
<li>Kyunghyun Cho (Montreal)</li>
<li>Stephen Clark (Cambridge)</li>
<li>Yoav Goldberg (Bar Ilan)</li>
<li>Ray Mooney (Texas)</li>
<li>Jason Weston (Facebook AI Research)</li>
</ul></li>
<li>December 2014: Two papers at IWSLT, one of them is on
<a href="https://perso.limsi.fr/allauzen/doc/dokhanh_iwslt14.pdf">Discriminative Adaptation of Continuous Space Translation Models</a>.</li>
<li>There will be a third edition of our workshop on <b>Continuous Vector
Space Models and their Compositionality (CVSC)</b>, co-located with
<a href="http://acl2015.org/">ACL 2015</a> in Beijing.</li>
<li>October 2014: the <a href="http://amta2014.amtaweb.org/Proceedings.aspx">AMTA paper</a> <i>Combining Techniques from different NN-based Language Models
for Machine Translation</i> with <a href="http://scholar.google.com/citations?hl=fr&user=fO9cszYAAAAJ">Jan Niehues</a> is now published.</li>
<li>July 3rd: two papers at <a href="http://www.taln2014.org/site/programme/programme-detaille-sessions-orales/">TALN</a></li>
<li>June 2014, <a href="http://www.statmt.org/wmt14/papers.html">WMT14</a>: 3 papers!</li>
<li>June 3rd, 2014: talk at <a href="http://www.aftal.fr/jadt2014/?page_id=13">the workshop on multilingual corpora</a>, JADT 2014 (in French)</li>
<li>April 27th 2014: <b><a href="https://sites.google.com/site/cvscworkshop2014/">Workshop on Continuous Vector Space Models and their Compositionality</a></b>, this is the second edition.</li>
<li>March 25th 2014: talk at the joint one day workshop on <a href="http://www.afia.asso.fr/tiki-index.php?page=Journ%C3%A9e+commune+AFIA+-+ATALA+2014">"Langue, apprentissage automatique et fouille de données"</a> (in French)</li>
<li>Habilitation à diriger des recherches
(<a href="http://fr.wikipedia.org/wiki/Habilitation_universitaire">see this page to learn what it means</a>):
<ul class="org-ul">
<li>January the 30th, 2014 at LIMSI</li>
<li>Title: Statistical models for machine translation</li>
<li>my <a href="http://perso.limsi.fr/Individu/allauzen/doc/aa_slides.pdf">slides</a></li>
<li>my <a href="http://perso.limsi.fr/Individu/allauzen/doc/hdr_allauzen.pdf">report</a></li>
</ul></li>
</ul>
</div>
</div>
<div id="outline-container-org3715ab8" class="outline-2">
<h2 id="org3715ab8">Talks on deep-learning for NLP</h2>
<div class="outline-text-2" id="text-org3715ab8">
<ul class="org-ul">
<li>Opening talk at <a href="http://seminaire-dga.gforge.inria.fr/2016/DeuxiemeJourneedefistechnologiquesdelacybersecurite_fr.html">the DGA/IRISA seminar on Artificial Intelligence and Security</a>, 25/04/17. The slides are available <a href="https://perso.limsi.fr/Individu/allauzen/doc/aa_rennes17.pdf">here</a>.</li>
<li>Invited talk at <a href="https://indico.math.cnrs.fr/event/1799/">the workshop "Statistics/Learning at Paris-Saclay"</a>, Institut des Hautes
Études Scientifiques (IHES), 19/01/2017. The slides are available <a href="https://perso.limsi.fr/Individu/allauzen/doc/v0_aa_ihess.pdf">here</a>.</li>
<li>Tutorial on deep learning and NLP applications at the
<a href="http://www.irit.fr/cimi-machine-learning/node/9">Workshop on Learning with Structured Data and applications on Natural Language and Biology</a>: the slides are available as a
<a href="https://perso.limsi.fr/Individu/allauzen/doc/aa_deep_nlp.pdf">PDF</a> (the file is quite large, about 9 MB).</li>
</ul>
</div>
</div>
<div id="outline-container-org58727e3" class="outline-2">
<h2 id="org58727e3">Contact</h2>
<div class="outline-text-2" id="text-org58727e3">
<p>
My postal address (email preferred):
</p>
<p>
Alexandre Allauzen<br/>
Université Dauphine - PSL, Laboratoire LAMSADE,<br/>
Place du Maréchal de Lattre de Tassigny,<br/>
75 775 Paris Cedex 16,<br/>
France.
</p>
</div>
</div>