Frédéric Blain (University of Wolverhampton)
Quality Estimation for Machine Translation
Quality Estimation (QE) is the task of predicting the output quality of language generation tasks, such as Machine Translation, when no human reference is available. In Machine Translation, QE is, for instance, used to predict whether an automatically generated translation is “good enough” (e.g. for post-editing) with the purpose of increasing the productivity of a human translator; or reliable (e.g. for gisting) as the trustworthiness of machine translations is sometimes questionable.
Traditionally framed as a supervised feature-based machine learning problem, QE, like many other Natural Language Processing tasks, has recently entered the era of Deep Learning. Even though state-of-the-art architectures, which significantly outperform the more traditional models, come at a (computing) cost, they offer many exciting opportunities for research and development.
This tutorial aims to provide a broad introduction to the field of Machine Translation Quality Estimation, from traditional feature-based QE models to the current state of the art, and to introduce the field's current challenges. In the second, more practical, part, we will explore the currently available resources (datasets, toolkits, pre-trained models, shared tasks) that one needs to get started and train one's own system.
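To make the traditional feature-based framing concrete, here is a minimal, hedged sketch of the kind of "black-box" surface features such QE systems extract from a source/translation pair alone (no human reference needed); a regressor would then be trained on these vectors against human quality labels. The feature names and the toy vocabulary below are illustrative, not taken from any particular toolkit.

```python
def qe_surface_features(source: str, translation: str, target_vocab: set) -> dict:
    """Toy 'black-box' QE features: computed from the source sentence,
    the MT output, and target-language resources only -- no reference."""
    src_tokens = source.split()
    tgt_tokens = translation.split()
    return {
        "src_len": len(src_tokens),                        # source length
        "tgt_len": len(tgt_tokens),                        # translation length
        "len_ratio": len(tgt_tokens) / max(len(src_tokens), 1),
        # share of target tokens unseen in a target-language corpus
        "oov_rate": sum(t.lower() not in target_vocab
                        for t in tgt_tokens) / max(len(tgt_tokens), 1),
        "avg_tok_len": sum(map(len, tgt_tokens)) / max(len(tgt_tokens), 1),
    }

# Toy example: a tiny target-language vocabulary stands in for corpus statistics.
vocab = {"the", "cat", "sat", "on", "mat"}
feats = qe_surface_features("le chat est assis", "the cat sat on the mat", vocab)
```

In a full system, dozens of such features (language-model perplexities, alignment statistics, etc.) would feed a supervised regressor predicting, for example, post-editing effort.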
Esther Bond (Slator)
Human-in-the-loop translation in action
Advances in machine translation have led to the widespread deployment of MT in professional translation workflows in the past few years. From post-editing to interactive translation workflows, language service providers (LSPs) of all shapes and sizes have now integrated this foundational technology into their translation and localization operations. New businesses are being founded on the basis of MT and appealing to outside investors through the AI agency thesis.
This tutorial explores the impact of MT-enabled human translation workflows on the professional language services industry, delving into pricing models, language industry investment, and MT-resistant pockets of the industry.
Félix do Carmo (University of Surrey)
Is automatic post-editing the end of the line for the “human-in-the-loop”?
Automatic post-editing is a computational task that aims to learn to correct machine translation errors from examples of post-edited machine translation output. If the purpose of this task could be fully accomplished, post-editing, which some see as the last resort for human translators, could also be wiped from their lists of jobs. The reality is that automatic post-editing’s probably unattainable purpose makes it an especially challenging field for research. The focus on the elusive last mile in approximating optimal translation quality may even make it unattractive for researchers who aim at low risk and high reward. However, it gives us insight into strategies for tackling the devil in the details of translation effort, and it offers hints on how to adapt big-data approaches to real business use cases.
In this interactive tutorial, in which there will be room for open discussion, we will explore several dimensions of automatic post-editing. We will discuss the different methods it uses, including the centrality of edit distances for its evaluation, and we will discuss what we can learn from it regarding what separates computer methods from human activities, and the role these methods and activities play in the world of business. The answer to the question in the title is that there is no end of the line for translators because translation is a human process, with machines in the loop, which will evolve according to how human communication evolves.
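Since the abstract highlights the centrality of edit distances to automatic post-editing evaluation, the sketch below shows a word-level edit distance and the HTER-style ratio built on it (edits needed to turn the MT output into its post-edited version, normalised by post-edit length). This is a simplified illustration: real TER/HTER also counts block shifts, which are omitted here.

```python
def word_edit_distance(hyp: str, ref: str) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    h, r = hyp.split(), ref.split()
    # dp[i][j] = minimum edits to turn h[:i] into r[:j]
    dp = [[0] * (len(r) + 1) for _ in range(len(h) + 1)]
    for i in range(len(h) + 1):
        dp[i][0] = i
    for j in range(len(r) + 1):
        dp[0][j] = j
    for i in range(1, len(h) + 1):
        for j in range(1, len(r) + 1):
            cost = 0 if h[i - 1] == r[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(h)][len(r)]

def hter(mt_output: str, post_edit: str) -> float:
    """HTER-style score: edits relative to the post-edited length."""
    return word_edit_distance(mt_output, post_edit) / max(len(post_edit.split()), 1)
```

For example, turning "the cat sit on mat" into "the cat sat on the mat" needs one substitution and one insertion, giving an HTER of 2/6.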
Clara Ginovart Cid (Datawords)
Drill the three post-editing skills
Editing machine translation is becoming similar to editing fuzzy matches from translation memories. The skills involved, though, are nuanced: you do not spot errors in the same way, you do not decide with the same method, and the principles or guidelines whereby you strategize and apply edits do not correspond to the principles that guide other editing activities.
Once the foundations of translation are in place (a thorough understanding of the source and target languages and cultures, translation techniques, etc.), spotting machine translation errors is the first skill to be mastered by most professional translators today. After an error (or more than one) is spotted, do you decide to erase and retranslate, or to edit? And once that decision is made, in a split second, what instructions guide your editing process?
These three skills (error spotting, decision-making, and the successful application of guidelines) are the object of study in this tutorial.
David Orrego-Carmona (Aston University)
Limits and possibilities of eye tracking in subtitle-reading research
In this tutorial, we will explore the conceptualisation, execution and dissemination of subtitle-reading studies using eye-tracking methods. Eye-tracking methods have been used for almost three decades to study subtitle reading. Eye tracking is particularly useful to study the reading of subtitles and texts on screens because it provides reliable, detailed data of the reading process that can later be triangulated to understand reading behaviour and engagement. It is important, however, to enhance the potential of advanced data collection methods with robust theoretical and methodological frameworks.
This tutorial will provide an interactive space for participants to reflect on these practices, pose their questions, and work collaboratively towards concrete solutions to their issues. We will discuss different eye-tracking measurements and their relevance to the study of subtitle reading. Combining qualitative and quantitative approaches, we will explore the application of mixed methods to reception studies in translation studies. Using hands-on examples, participants will have the opportunity to reflect on different applications, challenges and opportunities offered by eye-tracking methods.
Bianca Prandi (University of Mainz)
InterpretBank represents the state of the art in the field of computer-assisted interpreting tools, i.e. software applications designed to assist interpreters along various phases of their workflow. Due to its sophistication, it has also been employed in a number of research projects dedicated to process-oriented technological support for interpreters.
In this tutorial, I will present the rationale behind the creation of CAI tools in general and InterpretBank in particular and offer an introduction to the area of computer-assisted interpreting tools, with a focus on the potential advantages and risks deriving from their integration into an interpreter’s workflow. The tutorial will provide a hands-on introduction to the working modes offered by InterpretBank and to the key functions available for terminology-related work, such as tools for glossary creation and management, automatic terminology extraction from preparation documents and seamless access to the terminological resources during the assignment.
I will conclude the tutorial with a description and practical demonstration of the automatic speech recognition function offered by this AI-powered CAI tool.
Participants will be able to try out the tool themselves by downloading the free 14-day demo version.
Tharindu Ranasinghe Hettiarachchige (University of Wolverhampton)
Multilingual MT Quality Estimation
Most studies on Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost of maintaining several language-specific models. These are major hurdles to applying QE models in practice.
In this tutorial, participants will explore multilingual models for QE. The tutorial will introduce multilingual QE models, with a hands-on session on building them with transformers, and will conclude by comparing results between language-specific and multilingual QE models.
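The core idea behind multilingual QE can be illustrated with a minimal, self-contained sketch: the source sentence and the MT output are packed into a single input sequence for one shared encoder, so the same model serves any language pair. The stub encoder below is merely a deterministic stand-in for a pretrained multilingual transformer such as XLM-R; all function names and the `[SEP]` convention here are illustrative, not taken from the tutorial's materials.

```python
import hashlib

def stub_encode(text: str, dim: int = 8) -> list:
    """Stand-in for a multilingual sentence encoder (e.g. XLM-R).
    Deterministically hashes the text into a fixed-size vector in [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

def qe_input(source: str, translation: str) -> str:
    """Cross-lingual QE input: source and MT output in one sequence,
    so a single model can score any language pair."""
    return f"{source} [SEP] {translation}"

def predict_quality(source: str, translation: str,
                    weights: list, bias: float) -> float:
    """Linear regression head over the pooled encoding -> quality score.
    In a real system, head and encoder are trained jointly on labelled data."""
    vec = stub_encode(qe_input(source, translation))
    return bias + sum(w * x for w, x in zip(weights, vec))
```

Because the encoder is shared, labelled data from high-resource language pairs can benefit pairs with little or no labelled data, which is precisely the practical advantage over maintaining one model per pair.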
Moritz Schaeffer (University of Mainz)
How to study Error Recognition during Post-editing
This tutorial will present two studies which investigate how humans spot errors in existing machine translations. The error recognition process is central to understanding the interaction between translators and machine-translated text, and it reveals fundamental aspects of how the human translation process unfolds in time. The studies compare novices with professionals and find fundamental differences in how errors are recognised, differences which can be attributed to the participants’ amount of professional experience. In addition, different error types and the time course of the recognition process are central to these studies. The comparison between different error types and their effect on participants’ reading and typing behaviour during post-editing makes it possible to study how the human’s model of translation reacts to that of the MT system.
Error recognition has fast become the default mode in which translators operate as post-editors. This means that translation production, as an object of study, is shifting towards the design of automated systems, while understanding how humans spot and correct errors is likely to be the central focus of studies trying to uncover human cognitive translation mechanisms; it will also be extremely useful for those who design interfaces that connect the machine and the human translator. The tutorial will look at ways in which error recognition during post-editing can be studied empirically on the basis of eye-tracking and keylogging data.