Machine Translation post-editing
Human vs. Machine Translation

Humans vs. Machines: Translation Industry Version

In November 2016, Google released a new version of Google Translate and created a buzz all over the world with the “extraordinary quality” of the translations the new engine provided. For months, dozens of articles comparing the old version with the new one in various languages, translating and back-translating literary texts (reputed to be more difficult to translate), filled the columns and blogs of general and specialized media.

What Google had technically released was a neural translation model inside its machine translation engine. This major improvement in the quality of the translations provided (for free!) by Google Translate began to change public opinion about the future of the translation industry and the place of machine translation in it. The development of new tools and access to higher-quality machine translation pushed the industry's big clients to consider cutting costs by buying post-editing services instead of translations. Machine translation post-editing then became more widely offered by translation service providers as a professional service, and big players and TMS (translation management system) editors accelerated the training of their own MT engines. The translation industry started to change profoundly.

In this article, we want to recall what machine translation was, show what it has become, and look at the changes it is creating in the translation industry.

The history of Machine Translation

Machine translation (MT) has a long history that goes back to 1949, when the idea first appeared in Warren Weaver’s Memorandum on Translation. The first dedicated researcher in the field, Yehoshua Bar-Hillel, began his studies at MIT in 1951, and the MT research team at Georgetown followed with a public demonstration of its system in 1954. Machine translation was also one of the first non-numerical applications of computers. Research activity grew quickly: the Association for Machine Translation and Computational Linguistics emerged in the U.S., and the National Academy of Sciences formed a committee (ALPAC) to study MT.

MT was put into practical use in 1970, when the French Textile Institute used it to translate abstracts from and into French, English, German, and Spanish. Various MT companies were then launched, including Trados (1984), which was the first to enhance and market Translation Memory (TM) technology, in 1989.

In recent years, machine translation technology has evolved into a new dimension, and a promising future winks at us as research on Neural Machine Translation continues to build a new age for the industry.

Neural Machine Translation

Using the technological benefits of Deep Learning, Neural Machine Translation nowadays dominates machine translation development efforts across continents. A subfield of computational linguistics, Neural Machine Translation is essentially an application of Deep Learning in which an algorithm is trained on massive datasets of translated texts. It uses a prediction model: the source text is first encoded into numbers, then decoded back into a final text in the requested language. Very simply put, from all the possibilities, the algorithm picks the group of words with the highest probability of appearing together and presents it as the final translation. The more translation data you feed your machine, the closer you get to a natural-sounding translation, as the algorithms also detect stylistic touches and context. But you have to make sure the inputs are very clean to train the engine properly. The quality of your Neural Machine Translation engine therefore depends on the quantity, but also the quality, of the original translation data used to train your algorithms.

So far, even though machine translation engines have improved the quality of their output, there is still a trust issue for content and texts that will be used professionally. The workaround for acceptable-quality translations is a hybrid process: machine translation followed by human verification, a.k.a. post-editing.

The post-editing trend in the translation industry

As digitally available translation data grew exponentially and Deep Learning opened the gates to higher-quality machine translation, the first reflex of professional translation buyers was to see whether they could cut costs by using machine translation in their translation processes. Since MT output is significantly improved but not yet “client-ready,” consumers of translation services started asking for machine translation post-editing: faster and cheaper translations of “acceptable” quality. The post-editor does not rephrase or retranslate, but fixes mistakes in meaning, if any, and makes the text consistent. The tone might be a little robotic, but the meaning remains the same.

As the COVID pandemic accelerated demand for e-commerce content in multiple languages, the need for fast and cheap translations became real for a large number of SMEs. Trade shows and events were canceled, travel was banned for months, and SMEs started to depend more on online sales than on their traditional sales channels. In this context, machine translation post-editing established itself as a valuable solution for these SMEs. The phenomenon was observed by a large number of Language Service Providers, and Florian Faes, founder of Slator, gave a brief explanation of it at SlatorCon, the language-industry event held remotely in July 2020.

Even for big corporations developing their own MT engines, machine translation post-editing seems to be the only service still needed from a Language Service Provider (LSP). For example, IBM’s Watson Language Translator is used internally by IBM as a translation tool, but its output apparently still needs to be verified.

As demand shapes supply, LSPs started regularly offering post-editing as a cheaper but valid translation option to big corporations, to the great dismay of industry veterans and professional translators. The case against this trend is also growing, and it has solid points.

Human Translation, still the best option for high-quality content

In the machines vs. humans race, machine translation is catching up with huge leaps forward, while humans, aided by tech tools, are also delivering faster, though apparently not fast enough. Still, human translation remains the best option for high-quality and creative content today, and there are a few valid points supporting this.

The lack of quality- and risk-management processes for machine translation in critical industries (medical and pharmaceutical, where mistakes affect people’s health; finance and legal, where they have financial consequences for corporations) hinders the implementation of large-scale MT solutions in those fields. Responsibility in case of translation mistakes is difficult to assign, and no one really wants to take the risk. Post-editing is making inroads, since human verification limits the risks, but it still does not settle where the machine translation engine sits in the chain of responsibility.

If your content is not highly technical, if you need a less formal tone, or if you want to be creative, well, machines are not there yet. Machine translation engines are mostly fed with official translations (available for free in large volumes and many languages) and with the different language versions of corporate websites, because that is where the massive datasets are. When it comes to creative content, it is more difficult for a machine to reproduce the style and feel of the text, as there is not enough input in most languages. Notably, in corporations that use machine translation and post-editing to create new language versions of their website, or to translate their posts and blog articles, the writers of the source content (often in English) are advised to refrain from using figures of speech. The quality of the source text is then limited to what a machine can translate correctly, which in turn gives the available content a mediocre overall feel. With professional human translation, you do not have to worry about any of this to get a high-quality text in any language.

One last point about quality, and a concern about the widespread use of machine translation post-editing versus human translation, is the following: as post-edited machine translations create the new massive datasets that feed the engines, the training and “evolution” of the machines become problematic. If we want high-quality results from a machine translation solution, we should feed it high-quality food. Until now, most machine translation solutions were trained on good-quality human translations. What will happen when most newly available translation datasets come from machine translation post-editing? The concern is that you cannot really obtain better results from a machine if you do not use the right data.

In short, if you are looking for fast, cheap, mediocre-quality (but mostly correct) translations, you can go for machine translation post-editing. But if you really care about the way you address people in the target language, and if you are looking for content that will generate real traffic and revenue, go for a double set of professional eyes: professional translation followed by proofreading is your solution, and many translation service providers will be more than happy to serve you.

Published on Apr 6, 2021 by KEREM ONEN
