Jean-Pierre Colson is Professor of Translation Studies and of Linguistics at the University of Louvain (Louvain-la-Neuve, Belgium), where he was the Chairman of the Louvain School of Translation and Interpreting between 2015 and 2021. He has for many years played an active part in Europhras, the European Association for Phraseology, and is a member of the Europhras Board. For three years he was the Managing Editor of the Yearbook of Phraseology. He has published about 120 scientific papers on phraseology, computational linguistics and translation studies. He also serves as an international expert for several institutions: external expert for the COST projects of the European Union, expert in linguistics for the Belgian university foundation, and remote expert for ANEP, the Spanish agency for the evaluation of scientific projects. He is regularly a member of the Programme Committee of international conferences on Phraseology, Computational Linguistics and Translation Studies.
Phraseology Extraction: From Corpora to Deep Learning
AI (Artificial Intelligence) is now present almost everywhere, from weather forecasting to medicine and from financial investment to linguistics. Although some have criticized it as hype whose cycle will soon reach its limits, there is no denying that AI has had major implications for applied linguistics in recent years, in particular for MT (Machine Translation). In several fields of applied linguistics, there is now growing competition between corpus linguistics on the one hand, which relies on traditional statistics and huge corpora, and machine learning on the other, in particular deep learning based on neural networks. Will corpus linguistics gradually give way to deep learning? My point of view is that the automatic extraction of phraseology or multiword expressions may provide a partial answer to this question, as I will try to illustrate with a number of recent experiments. In addition, exploring phraseology from the point of view of machine learning, i.e. by drawing conclusions from its algorithmic representation, may provide fresh insights into the interaction between idiomaticity, fixedness and culture.