Re-thinking Machine Translation Post-Editing Guidelines

Celia Rico Pérez, Universidad Complutense de Madrid

The Journal of Specialised Translation 41 (2024), 24-46

Creative Commons Attribution 4.0 International


Machine Translation Post-Editing (MTPE) is a challenging task. It frequently creates tension between what the industry expects in terms of quality and what translators are willing to deliver as an end product. Conventional approaches to MTPE take as a point of departure the distinction between light and full MTPE, but the division gets blurred when implemented in an actual MTPE project where translators find difficulties in differentiating between essential and preferential changes. At the time MTPE guidelines were designed, the role of the human translator in the MT process was perceived as ancillary, a view inherited from the first days of MT research aiming at the so-called Fully Automatic High Quality Machine Translation (FAHQMT). My proposal challenges the traditional division of MTPE levels and presents a new way of looking at MTPE guidelines. In view of the latest developments in neural machine translation and the higher quality level of its output, it is my contention that the traditional division of MTPE levels is no longer valid. In this contribution I advance a proposal for redefining MTPE guidelines in the framework of an ecosystem specifically designed for this purpose.


Keywords: post-editing guidelines, machine translation, machine translation quality, MTPE.

1. Introduction

Defined as “the process of improving a machine‐generated translation with a minimum of manual labor” (Massardo et al. 2016: 14), Machine Translation Post-Editing (MTPE) frequently creates a tension between what the industry expects in terms of quality and what translators1 are willing to deliver as an end product. In this respect, one of the crucial aspects in an MTPE project is to decide on guidelines to be followed.

Guidelines for MTPE were first advanced by Allen in what has now become a seminal contribution (Allen 2003). Two different post-editing levels were defined then, according to the final use of the translated text: light post-editing for inbound texts (i.e. those that are not to be published), and full post-editing for outbound texts (i.e. those bound to wider dissemination). For the first type, light post-editing involved minimal intervention from the translator, while for the second, full post-editing aimed at producing human-quality output. Conventional approaches to MTPE take as a point of departure this distinction (Nitzke and Hansen-Schirra 2021) but the division gets blurred when implemented in an actual MTPE project where translators find difficulties in differentiating between essential and preferential changes and engage in full post-editing (O’Brien 2011a: 19). As a result, the dissociation between levels of MTPE seems irrelevant. Somehow this division between full and light MTPE was motivated at a time when MT was almost exclusively used for translating large volumes of technical documentation — as is the case in the automotive and aerospace industries — where the choice between MT for gisting purposes or publication was relevant. At the time these guidelines were designed, the role of the human translator in the MT process was perceived as ancillary, a view inherited from the first days of MT research aiming at the so-called Fully Automatic High Quality Machine Translation (FAHQMT), and MTPE was conceived as an “undesirable final step in MT development” (Vieira 2019: 319).

The paradigm shift experienced by MT in recent years with the advent of neural machine translation (NMT) calls for a different approach to MTPE. With MT engines leaving the research labs and opening up to broader and generalised practice — contrasting with previous implementation in highly specialised technical contexts — MT is now a real alternative to human translation even in commercial contexts where it was not used just a decade ago2. In this context, when MT is broadly used for almost any purpose it is only natural that MTPE strategies should evolve in line with the technology.

This paper presents a proposal for redefining MTPE guidelines. In the sections that follow, I first argue why defining MTPE guidelines constitutes a challenge and why the two traditional categories of light versus full MTPE are no longer valid. After this discussion, I review four main aspects in MTPE that contribute to creating tension in the process and directly affect the way MTPE is approached. These refer to the following: a) translators’ expectations towards MTPE guidelines; b) the blurred nature of MTPE, moving in a certain terminological instability between revision and translation; c) the difficulty in determining quality levels in MTPE and the associated concept of “fit for purpose translation” (Bowker 2020, Way 2018); and d) how the types of errors produced in NMT output directly affect the way MTPE is performed. The central part of the article rests in section four. I first present the MTPE guidelines ecosystem, which is based on three key elements: situated information, the text to be post-edited, and MTPE instructions. This is followed by a consideration of how these elements in the ecosystem contribute to easing MTPE tension factors, fostering translators’ agency in the process and contributing to creating clear and unambiguous MTPE directions3.

2. The challenge in defining MTPE guidelines

Following guidelines is a prerequisite for adequately conducting MTPE, but deciding on MTPE guidelines is not an easy task. When it comes to establishing criteria to be implemented in a real scenario with a decisive impact on costs, turnaround time and quality, directions provided by the relevant literature on the subject seem somewhat insufficient. MTPE specifications are either general recommendations that need further development, or rules specifically tailored for a particular MTPE project, which are difficult to replicate across different scenarios.

The authoritative reference to MTPE guidelines is ISO 18587:2017, where the process is described as “a more complex form of work than revision of human translation” (do Carmo 2020: 41). The standard identifies eight requirements for full MTPE that aim at producing an output indistinguishable from human translation, using as much MT output as possible, a direction that is also indicated for light MTPE. In drawing up MTPE guidelines, the typical approach is to proceed considering a series of aspects such as type of MT engine, description of source text, client’s expectations, volume of documentation to be processed, turnaround time, errors to be corrected, document life expectancy, and use of the final text (Allen 2003; Guerberof-Arenas 2013; O’Brien 2011a). From then on, a distinction is made into rapid, partial or full post-editing, with expectations on translation use playing a key role in the definition of correction strategies. Hence, an inbound translation approach would lead either to MT with no post-editing (when texts are used for information browsing) or rapid MTPE (for perishable texts). On the other hand, an outbound translation approach would compel partial or even full MTPE, depending on the quality of the translated output and the final use of the text. Actual implementations of these principles, both in the translation industry and in experimental settings for research purposes, usually take MTPE guidelines for granted and only mention them in passing (see, for instance, Carl et al. 2011; Koglin and Cunha 2019; Koponen 2016; Lacruz and Shreve 2014; Sakamoto and Yamada 2020).

In a thorough analysis of MTPE guidelines, Hu and Cadwell (2016) reveal that even if the division between full and light is considered a standard, different organizations implement them differently, tailoring them according to their internal requirements. The authors even find some overlaps, especially in the description of light MTPE, while for full MTPE the main differences lie in the specification of style and the expected quality of the target text, depending on the use and text type (Hu and Cadwell 2016: 351). In the academic context, experiments towards testing MTPE guidelines are scarce. Flanagan and Paulsen (2014) examine how three MA translation trainees interpret MTPE guidelines using TAUS (2010) criteria and report that trainees have difficulties interpreting the guidelines, sometimes even causing frustration during the task (Flanagan and Paulsen 2014: 271). This is primarily due to trainee competency gaps, but also to the wording of the guidelines which, at some point, introduced conflicts in the way MTPE was to be performed. On a similar note, Koponen and Salmi (2017) report how different student translators also appear to interpret the task differently due to difficulties in interpreting guidelines. The authors involve five participants in a pilot study with the aim of analysing the edits made in an English-Finnish post-editing task. Guidelines play an essential role in the study and, in some cases, their interpretation is not straightforward, especially when determining the necessity of edits (Koponen and Salmi 2017: 145). The authors suggest that a possible way of mitigating this drawback is by providing more detailed information and training regarding necessary and unnecessary edits, depending on particular language combinations (Koponen and Salmi 2017: 144-145).

Even if these experiments are limited in scope as the number of participants is rather small, they are in line with findings in real industry settings. In this context, the subject of guidelines is also scarcely investigated, but still, the work of Nunziatini and Marg (2020) is revealing in this respect. The authors present an interesting example of how MTPE guidelines are specifically tailored to the needs of an LSP4 when the traditional division between full and light MTPE is considered “too abstract and inflexible both for translation buyers and linguists” (Nunziatini and Marg 2020: 1). The problem is, as the authors indicate, that there are grey areas not covered by this division and that clients are not really familiar with different MTPE levels. Translators, on their part, are often not entirely sure which approach would meet clients’ expectations. The solution implemented involves aligning all stakeholders on what types of errors are acceptable for a given text and target audience. This view from the industry is confirmed by Guerrero and Gene (2021) when they refer to “gaps and pains” when drafting MTPE guidelines. Their work reports findings from the MTPE Training GALA Special Interest Group which gathers representatives from all stakeholders in the translation industry5. According to the authors, the “gaps and pains” refer to the following aspects: inconsistency in standards, lack of transparency in existing guidelines which are usually kept as internal documents by LSPs, overlapping between MTPE guidelines and existing translation assignment instructions, subjectivity, and the blurring of the classical distinction between MTPE levels. As a result, translators miss the real scope of the MTPE project, tend to engage in full MTPE and show lack of agreement on style between the different guidelines available (Guerrero and Gene 2021: 9-10).

In this context, it holds true that translators need specific linguistic and technical directions6 that help them overcome uncertainty and take the appropriate decision when confronted by the task with a “certain degree of tolerance and the ability to draw clear boundaries between purely stylistic improvements and required linguistic corrections” (Krings 2001: 16). After all, as Allen (2003: 306) pointed out, “what most people really want to know is what are the actual post-editing principles that support the post-editing concept.” In my view, it is time to rethink this division of MTPE levels since applying two clear-cut levels of MTPE not only results in subjective decisions but also leaves out a grey zone where the translator is left alone with no guiding principles. In fact, as O’Brien and Conlan (2019: 84) note, the increasing use of NMT moves the boundaries between what is “human translation” and what is “machine translation” in a way that the distinction gets blurred. These moving boundaries also affect MTPE and the way the task should be performed, creating some tension in the process.

3. Tension factors in MTPE

When examining the different voices in translation that refer to MTPE we usually observe some tension among the actors involved. On the one hand, the exponential growth in digital content has forced a change in models of translation with an “increased focus on productivity and pressure on cost, along with further technologisation” (Moorkens 2017: 465-466). Accordingly, the industry tends to place MT at the core of its business model, assigning translators a subsidiary role when post-editing. On the other hand, different expectations from translators foster the acceptance/resistance debate over the MTPE task and a consideration of what it really entails: is it just revising and editing, or is it another form of translation? This controversy gets intensified when translators do not get clear instructions on how to proceed, and clients, who are not experts in quality methodologies, are unable to determine what types of errors are acceptable in an MTPE request (Nunziatini and Marg 2020). The concept of acceptability presents, then, an additional tension in MTPE since determining what is acceptable or not is subordinate to the quality of the MT output and the final use of the translated text. As a consequence, the notion of translation quality is revised under a new paradigm that introduces the idea of “fit for purpose” translation (Bowker 2020; Way 2018), where quality is measured against the relative excellence of the final text in its use for a particular purpose. This represents moving away from idealistic quality assumptions of early developments in MT, and a dissociation from the long-awaited desire of the computer science community for MT to reach human parity (Toral 2020). While it is true that NMT engines do provide better output quality as compared to previous MT systems, the human translator is still key in identifying errors and deciding whether they need to be post-edited.

In the following subsections I specifically refer to the different aspects that, in my opinion, contribute to creating tension in MTPE, namely: a) translators’ expectations towards MTPE guidelines; b) the nature of MTPE; c) the movable nature of MTPE quality; and d) the effect of NMT error typology. The discussion of these four factors will establish the ground I will then use to further address the challenge of defining MTPE.

3.1 Translators’ expectations towards MTPE guidelines

In the extensive literature on the subject of translators’ attitudes towards MTPE (Blagodarna 2018; Cadwell et al. 2018; Ginovart et al. 2020; Ragni and Vieira 2022; Vieira 2018; Teixeira and O’Brien 2018, among others) there is little mention of how translators perceive MTPE guidelines or how these affect the way the task is performed. Factors affecting attitudes refer to price, productivity, effort or cognitive load, but guidelines are seldom mentioned. To my knowledge, Guerberof-Arenas (2013) is the first investigation on the effect MTPE guidelines have on how translators perceive the post-editing task. This study presents an analysis of the opinion of a group of 24 translators and 3 reviewers, who reported an open and flexible attitude towards MT with some exceptions, “mainly because the quality of certain MT segments was poor or the instructions too cumbersome to follow” (Guerberof-Arenas 2013: 92-93). Price is also signalled as a definitive cause for dissatisfaction, especially when combined with the many demands from clients regarding the final quality of the text and the numerous changes to be made if the quality expected was very high (Guerberof-Arenas 2013: 82-86). This suggests that an improvement in MTPE guidelines might contribute to better job satisfaction. More recently, Vieira (2018), investigating translators’ blog and forum postings, found that resistance from translators might be more a question related to business practices than to technology itself. It is my contention that providing a simplified way of performing MTPE, one that is in line with how NMT performs, might contribute to better job satisfaction. In this respect, it is interesting to note that studies of how MTPE is actually conducted in the translation workflow (Silva 2014 or Plaza-Lara 2020, for example) do not make an explicit reference to MTPE guidelines and it seems that they are taken for granted. Ginovart et al. (2020) do mention MTPE instructions in their broad survey of 66 translation stakeholders (including project managers, MT specialists, linguists and academics) based in 19 different countries. The survey thoroughly investigates how MTPE guidelines are defined and implemented, but their effect on job satisfaction or task perception is not explored.

In the usual narratives of resistance to MTPE, production processes are seen as linear, with translators “cleaning up” errors introduced by the machine (Mellinger 2018: 311). When MTPE takes place at the end of the production line, the distinction of light and full post-editing is a natural consequence of this conception. However, as do Carmo and Moorkens (2021) indicate, MTPE is almost always done “using modern CAT tools [where] MT suggestions appear intermingled with translation memory matches as resources for translators to check and edit” (do Carmo and Moorkens 2021: 39). This is seen by the authors as a natural evolution that “makes MT a resource added to TM, and thus, the distinction between editing TM suggestions and post-editing MT suggestions becomes less obtrusive.” Data from practices in the industry confirm this, with a typical project using only about 9% of MT suggestions and companies relying heavily on recycled previously translated content (as reported in TAUS 2020). The most usual workflow includes a combination of MTPE, TM and human translation. If this is the case, why should criteria for MTPE still stick to the division between light and full? MTPE is no longer performed as a separate task from translation and this should have a consequence in the way instructions are designed.

3.2 The nature of MTPE

The connection of MT to TM in a single platform has the immediate effect of blurring the traditional boundaries between both technologies (O’Brien and Conlan 2019: 84). The source of the translation data also gets blurred when the translator is presented with segments that either come from the TM database or originate in the MT system. When the translator is offered two possible alternative translations for a given segment, one coming from the TM database and the other originated by the built-in MT system, why should each segment be treated differently? What is more, should the editing of a TM match be done differently from post-editing an MT suggestion? In this context, the difference between editing and MTPE often gets lost with the integration of MT and TM in CAT systems (Jakobsen 2019; Sánchez-Gijón et al. 2019). In a complementary line of argumentation, do Carmo and Moorkens (2021) understand that for MTPE to be a form of revision we need to assume that the MT system provides a completed full translation, but this is not really the case: “MT text is only an ‘output’ or a set of ‘suggestions’ or ‘hypotheses’ for the translation of a text” (do Carmo and Moorkens 2021: 35-41). Only the translator is responsible for the final text. The authors take a step further and advance that MTPE should be considered as a type of translation. If this is so, why should MTPE guidelines differ from those of translation? We can consider MTPE a dynamic process where the translator constantly interacts with the product of MT, revising the translation as the machine generates it. It is in this interaction that translators take full control of the process. This challenges the contention that MT is central to the translation process and resituates the human element at the centre, thus invalidating the reductionist idea of the machine and the translator competing for quality.

3.3 The movable quality of the post-edited text

As the concept of MTPE progresses and gets closer to translation, I see a need to revisit the concept of quality. Translators need to determine what constitutes quality in order to decide when a segment should be post-edited. However, defining quality is not straightforward since MTPE introduces a grey zone where the threshold for accepted quality is movable, and translation is no longer correct or incorrect but rather acceptable for a given purpose. As Vashee (2021) advances, the business value of a translation in the industry is usually not defined by linguistic perfection but by a combination of factors: its utility to the consumer, basic understandability, availability-on-demand, and the overall impact on customer experience. In general terms, “useable accuracy” matters more than perfect grammar and fluency. In this respect, Moorkens (2017) points to the movable characteristic of quality and defines acceptability thresholds according to text lifespan. At some point, even raw MT output may be considered “a worthwhile risk” (Moorkens 2017: 471) for highly perishable texts. This is well illustrated, for instance, in the model that Nitzke et al. (2019: 246) design for guiding the decision of using MT. According to this model it is advisable to perform MTPE on texts with an expected quality level of 60%-80% when the risk level of the texts is low. For texts that require above 80% up to 100% quality, the model recommends not using MT. In a similar tone, Plaza-Lara (2020: 173) shows how quality in MTPE is subordinate to the quality of the MT output. These approaches to quality reveal that the product of MTPE no longer aspires to be a translation similar to that produced by a human translator, but rather one that responds to the final use of the text. From this point of view, the notion of MTPE levels (be they full or light) may lose relevance and give way to a different concept of MTPE in which the translator focuses on checking the correct use of terminology and approving the translated content. This type of MTPE is in line with a more flexible way of understanding quality, the so-called “fit for purpose” (Bowker 2020; Way 2018).
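The decision rule attributed above to Nitzke et al. (2019) can be written out as a minimal sketch. It encodes only the two cases reported in this paper (60%-80% expected quality with low risk, and above 80%); every other combination is returned as unspecified rather than guessed. The function name, the enumeration and the boolean risk flag are illustrative assumptions, not part of the original model.

```python
from enum import Enum


class Recommendation(Enum):
    """Possible outcomes of the sketch (labels are illustrative)."""
    MTPE = "perform MTPE"
    NO_MT = "do not use MT"
    UNSPECIFIED = "not covered by the summary in this paper"


def recommend(required_quality: float, low_risk: bool) -> Recommendation:
    """Map an expected quality level (0-100) and a risk flag to a recommendation.

    Only the two cases summarised in the text are encoded; everything else
    is reported as UNSPECIFIED instead of being filled in by assumption.
    """
    if not 0 <= required_quality <= 100:
        raise ValueError("required_quality must be between 0 and 100")
    if required_quality > 80:
        # Above 80% and up to 100%: the model recommends not using MT.
        return Recommendation.NO_MT
    if required_quality >= 60 and low_risk:
        # 60%-80% expected quality on low-risk texts: MTPE is advisable.
        return Recommendation.MTPE
    return Recommendation.UNSPECIFIED


if __name__ == "__main__":
    for quality, low_risk in [(70, True), (95, True), (70, False)]:
        print(quality, low_risk, recommend(quality, low_risk).value)
```

Returning an explicit "unspecified" value keeps the sketch faithful to the summary: the grey zone it leaves open is precisely where project-specific guidelines would have to decide.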

Somehow this new conceptualisation of quality is a consequence of the technologisation of translation. As Doherty (2017: 131) indicates “the evolution and widespread adoption of translation technologies — especially MT — have resulted in a plethora of typically implicit and differently operationalised definitions of quality and respective measures thereof”, which affects the decisions taken for evaluating quality and “involve tensions between human subjectivity and machine objectivity.” As the prescribed linear workflow evolves and MT gets mixed with TM, “computer-assisted translation and MT systems further blur the lines of translation quality, insofar as a third agent of text production is introduced” (Mellinger 2018: 319). The new translation workflow has a direct consequence on quality and the way the final text is produced (via MT) adds an extra layer of mediation to translation and revision with “external artefacts” that collaborate with the human translator “toward the end goal of a quality translation” (Mellinger 2018: 321). The translator is no longer the single creator of the target text but shares this responsibility with the machine.

3.4 The effect of NMT error typology

An additional aspect that affects the criteria for defining MTPE guidelines is the type of errors that NMT engines produce. Different studies show that NMT systems outperform statistical MT systems in many language combinations (particularly for morphologically rich languages), and tend to produce more fluent translations, with improvements in grammar but a possible degradation in lexical transfer (Bentivogli et al. 2016; Neubig et al. 2015; Wu et al. 2016). Other improvements involve a reduction of word order errors and fewer morphological mistakes, although these gains are offset by mixed results for perceived adequacy (Moorkens 2017: 471). The research conducted by Castilho et al. (2017) indicates similar results. It explores possible improvements in NMT output in three particular domains: e-commerce product listings, patents and Massive Open Online Courses (MOOCs). Results show that NMT performs well in terms of fluency but is inconsistent for adequacy, with a greater number of errors of omission, addition and mistranslation (Castilho et al. 2017: 118). Obviously, we should take these findings with care as these are lab tests that provide incomplete and limited results, with errors not reproducible when a different MT system or language combination is used (do Carmo and Moorkens 2021: 42). In any case, as Vieira (2019) points out, the fact that NMT provides fluent translations means that errors might be more difficult to spot. In this respect, Koponen and Salmi (2017) investigated the quality of corrections made by a group of post-editors for the pair English-Finnish, questioning the assumption that the changes performed by translators are correct and represent actual errors in MT. They report that a significant number of edits (34%) were unnecessary: even if correct, they did not respond to actual errors in the MT output. Similar findings are reported by de Almeida (2013) and Temizöz (2016). A possible cause for this may be that the practical implementation of guidelines is not necessarily clear to translators, who may have different interpretations, especially when style is concerned (Koponen and Salmi 2017: 140). In dealing with MT errors, translators tend to overedit as they feel the urge to improve “all linguistic aspects because they want to achieve perfect quality, even though the guidelines state otherwise” (Nitzke and Gros 2021: 21), actively looking for mistakes. Preferential changes referred to lexicon (using synonyms or different terms), syntax reordering, changing style according to register preferences, inserting and deleting words, grammatical changes such as verb tense, using different spelling variants, and inserting or deleting commas (Nitzke and Gros 2021: 28). Similar findings are reported by Daems and Macken (2021). The authors explore whether there is a difference between revising human translation and post-editing machine translation, and the results show that a significant number of preferential changes were made in all conditions. In the case of post-editing, translators suggested more changes than when they believed they were revising a text produced by a translator (Daems and Macken 2021: 68-69).

As a consequence, translators need to be aware of which error types they might encounter when post-editing, recognising “the special status of the suggestions presented by MT systems, and their unpredictable quality level” (do Carmo and Moorkens 2021: 40). In this context I see a need for re-thinking MTPE guidelines in a way that is better adapted to the actual situation in MTPE, one that caters for translators’ expectations and needs, the true nature of the task and the actual capabilities of the MT system, and the types of errors that might be expected.

4. MTPE guidelines ecosystem

In designing MTPE guidelines I use the framing of the ecosystem as, in my opinion, it allows for a broad conceptualization of the different aspects involved as discussed so far. The metaphor of the translation ecosystem originates from situational models of translation that conceptualise the translation process as a complex system. This includes not only the translator, but also other people (cooperation partners), their specific social and physical environments as well as their cultural artefacts (Risku 2004: 19). It is in this broad context that I make a proposal for defining MTPE guidelines.

The remainder of this paper presents the different elements in the MTPE ecosystem: 1) situated information; 2) the text to be post-edited; and 3) MTPE instructions. Figure 1 below represents the guidelines ecosystem, with a mapping of its elements to tension factors and an indication of expected positive effects on the MTPE process.

Figure 1. MTPE guidelines ecosystem, tension factors and positive effects

After presenting the different parts of the MTPE guidelines ecosystem, I discuss how these contribute to easing MTPE tension factors, fostering translators’ agency in the process and contributing to creating clear and unambiguous MTPE directions.

4.1 Situated information

Situated information refers to the contextual information needed for defining MTPE guidelines. I borrow the concept of situated information from Krüger’s (2016a, 2016b) situational model of translation technology as it provides a comprehensive view that is suitable for examining MTPE (Rico Pérez and Sánchez Ramos 2023). Inspired by situated translation theory (Risku 2004, 2010), Krüger creates a model applicable to translation technology that is premised on the assumption that the translator is the central agent in the translational ecosystem. The essential components of this ecosystem are: 1) the psychosocial factors that affect the cognitive process of translation; 2) the different types of artefacts (or resources) that facilitate the translation process; and 3) the cooperation partners and users (Krüger 2016a). These three aspects are essential for defining MTPE guidelines and are reviewed below.

4.1.1 Psychosocial aspects

When translators are situated as the central agents of the MTPE process, collecting data on psychosocial aspects is essential as these help them align their expectations with real data. According to Krüger’s situated model, psychosocial aspects refer to the working environment and the professional status, as well as factors such as time pressure and motivation (Krüger 2016a: 318-327). These are integral components that affect the cognitive process of translation and also the post-editing task. Among the aspects to be included in the MTPE guidelines ecosystem, translators should be provided with information on rates, the required scope of the service and the schedule allowed for the project. The question of MTPE rates usually raises controversy among translators since MTPE is often seen as a way of saving money by LSPs which “is very likely the source of resentment expressed amongst translators and PMs [Project Managers]” when referring to MTPE (Sakamoto 2018: 8). This is why information on rates, and how these are calculated, is important to translators. Calculating rates is not an easy task, and different approaches can be taken, either as an ex-ante model (establishing a rate before the completion of the project) or as an ex-post model (calculating the actual work performed according to productivity) (Plaza-Lara 2020: 171). As for the other two aspects, project scope and schedule, the information provided should help the translator understand how the MTPE project fits in the overall service commissioned by the client, how time is to be managed, whether there are any pre-production processes that affect the MTPE project, how the translator is receiving the raw MT output, and the MTPE project workflow.
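The difference between the ex-ante and ex-post approaches to rates can be made concrete with a small sketch. The formulas below are simplifying assumptions made for illustration only: the ex-ante fee is taken as a per-word rate fixed in advance, and the ex-post fee as an hourly rate applied to the time actually recorded. Neither formula, nor the function names or the figures in the usage lines, comes from Plaza-Lara (2020) or from industry practice.

```python
def ex_ante_fee(word_count: int, rate_per_word: float) -> float:
    """Fee agreed before the project is completed (assumed: per-word rate)."""
    if word_count < 0 or rate_per_word < 0:
        raise ValueError("word_count and rate_per_word must be non-negative")
    return word_count * rate_per_word


def ex_post_fee(hours_worked: float, hourly_rate: float) -> float:
    """Fee computed from the work actually performed (assumed: hourly rate)."""
    if hours_worked < 0 or hourly_rate < 0:
        raise ValueError("hours_worked and hourly_rate must be non-negative")
    return hours_worked * hourly_rate


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show that the two models can diverge.
    print("ex-ante:", ex_ante_fee(word_count=1000, rate_per_word=0.05))
    print("ex-post:", ex_post_fee(hours_worked=2.5, hourly_rate=30.0))
```

Under these assumptions the two models produce different fees for the same job whenever actual productivity departs from what the fixed rate anticipated, which is one reason translators need to know in advance which model applies.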

4.1.2 Resources

In Krüger’s situated model, translation technology forms an integral part of the translator’s cognition together with other environmental artefacts (Krüger 2016b: 121). The complete list of resources examined by Krüger (2016a: 320-326) refers to general working aids (office equipment, furniture, communication devices), digital research and communication resources (corpora, blogs, forums), translation technology in a wider sense (text processing software, file manager, checkers, etc.) and translation technology in a narrow sense (TM systems, terminology management, alignment tools, MT systems and project management tools). It is the latter category that, in my opinion, precisely affects the MTPE guidelines ecosystem as it determines the very nature of the project and how it is to be performed. In this respect, the translator specifically needs information on the following aspects:

4.1.3 Cooperation partners and users

Cooperation partners are an essential part of the MTPE ecosystem and, according to Krüger (2016a: 316), take the following roles: translation initiator, commissioner, ST producer, TT user, TT receiver, co-translator, proof-reader and project manager. In order to perform an adequate MTPE task, translators need information on these partners and the different responsibilities and requirements since, in the end, these also determine how the MTPE project is conceived, who is to cooperate with whom, how the work is organised and which are the expectations of the final user of the translation.

4.2 The text

The second component in the MTPE ecosystem is the text (Figure 1). Information about the text to be post-edited is essential to the translator on the following aspects:

4.3 MTPE instructions

This is the last element in the MTPE guidelines ecosystem and one of its fundamental parts. Ideally, criteria for MTPE guidelines should take into account the following aspects:

In order to better organise instructions I suggest preparing two complementary sets: general instructions and language-specific instructions.

4.3.1 General instructions

When outlining general instructions, we need to take into account translators’ expectations of clear and unambiguous guidelines (see section 3.1 above). It is in this respect that I suggest breaking away from the traditional approach of light vs. full MTPE, as already discussed, and advancing a proposal that groups general instructions into two complementary tasks: check and correct7. In the first task (check), the translator examines the MT output against the source text, while in the second (correct), work concentrates on the MT output in order to make the necessary corrections according to the quality threshold determined previously by the preceding elements in the MTPE ecosystem. The different actions involved in each task are shown in Figure 2.

Figure 2. Check and correct: actions involved

4.3.2 Language-specific instructions

Together with general instructions, language-specific guidelines are relevant since, as Sarti et al. (2022) state, NMT post-editing performance is highly language-dependent and influenced by source-target typological relatedness. Language-specific rules for MTPE indicate, for example, the use of a particular language locale, lexical collocations or specific sentence structures (see, for instance, the work of Mah 2020). It is key to provide translators with a set of representative examples for each of the languages so they know what to expect from the MT output, how to deal with the different error types and what MTPE implies in each case.

4.3.3 Matching MTPE guidelines to MTPE tension factors

Framing MTPE guidelines in the ecosystem as described above has, in my opinion, the positive effect of easing tension factors in the MTPE process. I will return now to Figure 1 and show how each element in the ecosystem matches each tension factor. The correspondences are drawn in different lines (see Figure 1).

Line b establishes a more specific relationship of resources and cooperation partners to the movable quality of the post-edited text. As argued above, quality in MTPE is a movable concept determined by a series of factors such as the utility of the text, its impact on customer experience or its lifespan. Different quality thresholds can be defined in a more flexible way of understanding quality. In this connection, translators are able to understand this movable quality when they are equipped with information on which MT engine is used and the expected output quality, what the client’s expectations are and how the final text is to be used. The positive effect of providing translators with all this information is that they secure agency in the MTPE process by fully understanding the nature of the task and approaching quality as dependent on several factors.

5. Concluding remarks

The decision-making process involved in defining MTPE guidelines is not a straightforward task. MTPE levels are subordinate to the quality of the MT output, with several other issues involved: subjectivity in the task, client’s expectations, effort and expected productivity. In this task, translators are traditionally presented with a set of guidelines divided into light vs full MTPE levels. In view of the latest developments in NMT and the higher quality level of the output they provide, my proposal is that this traditional division is no longer valid. NMT systems outperform previous developments in many language combinations and tend to provide highly fluent output with fewer errors. Accordingly, MTPE guidelines should break away from MTPE levels as these get blurred.

This paper challenges the traditional division of MTPE levels and presents a proposal for a new way of looking at MTPE guidelines. After a review of the relevant literature on MTPE guidelines, I have discussed why defining criteria for MTPE is not a straightforward task. I have then presented some factors that create tension in MTPE and influence the way guidelines should be designed. The core of the paper concentrates on the definition of a set of MTPE criteria in what I call the MTPE ecosystem. This ecosystem contributes to creating clear and unambiguous directions, with the translator regaining agency in the MTPE process. This paper is an attempt at designing a model for MTPE guidelines which supports translators in the process. The model is conceived in such a way that tension factors can be eased and negative attitudes towards MTPE prevented. I have argued that guidelines tend to be either too general or too context-specific to be replicated straightaway. In this sense, the model I present is a valuable instrument as it collects in a single source all aspects influencing the translator’s decisions so that MTPE guidelines can be easily drawn, adequately supported with actual examples and, more importantly, shared and replicated across different MTPE projects.

The model still needs to be validated and, in this connection, I see potential avenues for future research in a series of aspects. The first refers to the question of the informational or cognitive load involved in the proposed ecosystem. I pointed out that many current MTPE guidelines are perceived by translators as being overly cumbersome, and, consequently, the MTPE model I present also needs to be tested in this regard. In the translation industry, the rationale for using MTPE relies on the assumption that it requires less effort than translation “from scratch”, but, as different studies show (Koglin and Cunha 2019; Koponen 2016; Moorkens 2018; Vieira 2017, among others), the cognitive load involved in identifying errors and deciding on corrections is high. Therefore, the likelihood of translators being able to access the information required in the ecosystem model, and having the time to process it, needs to be tested. The question would ideally be addressed by participant-oriented empirical studies of the adoption and perception of the MTPE ecosystem along the lines of the works of Cadwell et al. (2016), Rossi and Chevrot (2019) or Schnierer (2019). A second and complementary question is how the MTPE model can actually be implemented in a working environment. In studying working conditions, Liu (2020) reports that translators assign great value to accessing all kinds of information that might help them understand the translation assignment, including the possibility of communicating with clients, authors or end-users, or knowing the intended use of the translation. Even if this study does not explicitly refer to MTPE guidelines, we can infer that having access to them might facilitate the translators’ work. It has been established in several studies (see, for instance, Drugan 2017) that translators can struggle to get even basic information about a translation’s broader context, and at some points they may not be able to get support when they face problems or challenges at work.
Implementing MTPE guidelines can take the form of simple data sets each containing the required information for each of the elements in the ecosystem. We find this type of implementation for post-editing rule-based machine translation output in Rico Pérez and Díez Orzas (2013), where the data sets provide practical information on the PE project in terms of full versus light post-editing. A potential area for research would be how these data sets adapt to MTPE ecosystem requirements, and whether putting them into practice contributes in any way to improving job satisfaction. It is my contention that providing a simplified way of performing MTPE can contribute to a better understanding of the task and, consequently, to reducing resistance from translators, who might feel a greater sense of agency and have a greater confidence in the utility of MT.
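
By way of illustration, such a data set could be sketched as a small structured record with one part per element of the ecosystem. What follows is only a hypothetical layout, not the format used in Rico Pérez and Díez Orzas (2013): class and field names are my own, limited to items mentioned in this article (rates, scope, schedule, MT engine, expected output quality, client’s expectations, intended use, general and language-specific instructions), and the text component is left as a free-form mapping.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Dict, List


@dataclass
class TranslatorInfo:
    rates: str = ""      # how rates are set and calculated
    scope: str = ""      # required scope of the service
    schedule: str = ""   # time allowed for the project


@dataclass
class ResourcesInfo:
    mt_engine: str = ""                 # which MT engine produced the raw output
    expected_output_quality: str = ""   # what the translator can expect from it


@dataclass
class PartnersInfo:
    client_expectations: str = ""
    intended_use: str = ""   # how the final text is to be used


@dataclass
class Instructions:
    general: List[str] = field(default_factory=list)
    language_specific: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class MtpeEcosystem:
    """Hypothetical container: one attribute per element of the ecosystem."""
    translator: TranslatorInfo = field(default_factory=TranslatorInfo)
    resources: ResourcesInfo = field(default_factory=ResourcesInfo)
    partners: PartnersInfo = field(default_factory=PartnersInfo)
    text: Dict[str, str] = field(default_factory=dict)   # free-form text details
    instructions: Instructions = field(default_factory=Instructions)


def missing_items(record, prefix: str = "") -> List[str]:
    """List the dotted names of all fields that are still empty."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        name = prefix + f.name
        if is_dataclass(value):
            missing.extend(missing_items(value, prefix=name + "."))
        elif not value:
            missing.append(name)
    return missing


# Example with placeholder values; nothing here describes a real project.
eco = MtpeEcosystem()
eco.translator.rates = "ex-ante, agreed per word"
eco.instructions.general = ["check", "correct"]
print(missing_items(eco))
```

A check such as `missing_items` makes gaps visible before the project starts, which is one way of testing whether the information the ecosystem requires is actually reaching the translator.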



Celia Rico Pérez is currently a Visiting Professor in Translation Technology at Universidad Complutense de Madrid. Her research focuses on the use of machine translation and other translation technologies, machine translation post-editing and quality evaluation.

ORCID: 0000-0002-5056-8513



  1. In this article I use “translators” instead of “post-editors” as I firmly believe that the complexities of MTPE are such that it can be considered as a special form of translation. In this respect, see, for instance, do Carmo and Moorkens (2021: 40-42). I am aware that this use presents some controversy since the real status of the translator versus the post-editor is far from resolved (Sakamoto 2019).
  2. Consider, for instance, how the EUATC Survey reported in 2013 that only a minority of LSPs actually used machine translation (EUATC 2013: 4), in contrast to findings from the ELIS survey in 2023 indicating that “machine translation continues to be the dominant trend in all segments of the industry” (ELIS 2023: 5).
  3. While recognising that translation quality is notoriously non-deterministic, and that MTPE guidelines are usually the benchmark for quality assessment, the guidelines I present here refer to a list of actions aimed at avoiding subjectivity (or preferential choices) in the PE process. I am also aware that these guidelines are only a proposal which still needs to be validated in real contexts.
  4. The case study reports work at Welocalize in Italy and the United States.
  5. MTPE Training GALA Special Interest Group aims at drafting a common MTPE training protocol made by and for all stakeholders.
  6. In this article “directions” and “guidelines” are used interchangeably as synonyms.
  7. These instructions are inspired by the standard PE guidelines in the industry (mainly TAUS 2010), which I have revised in the light of the PE ecosystem.