Machine Translation will not take your job, honest!

It’s a common theme. In [5, 10, 20] years, machine translation (MT) will be so good that there will be no human translators left. And, indeed, there are some trends that make this idea look tempting. The move towards statistical machine translation has allowed machines to learn from the texts they are given, so they can process language at higher levels and produce more convincing results. But this won’t mean that they will replace humans. Let’s see why.

The first reason that human translators will still have work is that human language is slippery. Even if you were to compile a massive database (or “corpus”, to give it its technical name) of all the language used everywhere on the internet today, it would be out of date within 24 hours.

Why? Because as humans we love to play with, subvert and even break our own linguistic rules. Even people who hate languages love to make up new words and repurpose old ones. The biggest corpus in the world can only tell you how people used language yesterday, not how they are using it today and definitely not how they will use it tomorrow.

The basis of statistical machine translation is that the way language has been used on previous occasions is a good guide to how it should be used this time. This is why Google Translate famously translated “le président des Etats-Unis” [the president of the United States] as “George W. Bush” months after President Obama was elected. The logic behind this decision is that if “George W. Bush” appeared in that position often enough in past texts, it must be a valid translation all the time. That is a mistake that no good human translator would ever make!
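To make that logic concrete, here is a deliberately tiny sketch in Python. It is not how Google Translate or any real engine is built (real systems combine many more signals); the counts, the `observed` list and the function names are all invented for illustration. It only shows the core idea: pick whichever translation has been seen most often in the past.

```python
from collections import Counter

PHRASE = "le président des Etats-Unis"

# Hypothetical "corpus" of past translations; the counts are made up.
# Most of the old texts were written while George W. Bush was in office.
observed = (
    [(PHRASE, "George W. Bush")] * 5
    + [(PHRASE, "the president of the United States")] * 3
    + [(PHRASE, "Barack Obama")] * 1
)


def build_phrase_table(pairs):
    """Count how often each source phrase was rendered as each target phrase."""
    table = {}
    for source, target in pairs:
        table.setdefault(source, Counter())[target] += 1
    return table


def translate(phrase, table):
    """Return the most frequently seen translation, or the phrase itself if unseen."""
    candidates = table.get(phrase)
    if not candidates:
        return phrase
    return candidates.most_common(1)[0][0]


table = build_phrase_table(observed)
print(translate(PHRASE, table))
```

With these invented counts, yesterday’s most popular rendering wins, whatever has happened in the world since the texts were collected.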

Add to this the fact that the meanings of words change (something that has been mentioned elsewhere on this blog) and things look much worse for MT. It doesn’t stop there, though: since language is bound so tightly to culture, “literal” translations are often incredibly misleading.

Here is a really simple example. In English, we have a set number of phrases we use to sign off a formal letter. We might use “Yours sincerely” or “Yours faithfully” or maybe “Kind regards”. In French, formal letter sign-offs are much longer and one of them might literally be translated as “Waiting for your response, I ask you to accept, Sir, the expression of my distinguished salutations”.

Now, statistical machine translation experts will rightly tell you that a good, trained package would not translate this literally but would look for an English equivalent. The problem is that the English “equivalent” would be different for different contexts and would involve looking much wider than MT normally looks. The decision here is linked to the context of the letter (specifically whether or not you know the name of the person you are sending it to) and not to language considerations themselves.
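A small Python sketch shows why this is awkward for a system that only looks at the sentence in front of it. It assumes the usual British English convention (“Yours sincerely” when the letter is addressed to a named person, “Yours faithfully” when it is not); the function name and its parameter are made up for illustration.

```python
from typing import Optional


def english_sign_off(recipient_name: Optional[str]) -> str:
    """Pick a formal English sign-off from the letter's context.

    Assumes the British convention: a named recipient gets
    "Yours sincerely", an unnamed one gets "Yours faithfully".
    """
    if recipient_name:
        return "Yours sincerely"
    return "Yours faithfully"


print(english_sign_off("Ms Dupont"))  # letter opens "Dear Ms Dupont"
print(english_sign_off(None))         # letter opens "Dear Sir or Madam"
```

Notice that the French sign-off never appears as an input. The only thing the function needs is a fact from the top of the letter, which is exactly the kind of information a sentence-by-sentence approach does not look at.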

There are lots of translation decisions that are context-based like this one and it is in these kinds of decisions that MT will always flail around helplessly. It is in these kinds of context-based decisions that good human translators will always triumph.

So where might the future lead? Well, just as human translators are becoming more specialised, so will MT engines. Research presented at the recent IPCITI conference showed that there are ways that MT and, more precisely, post-edited MT can work. Perhaps one area where MT will work is in specialised fields that use consistent language. Another view is that human translators will be called upon to make more use of their knowledge of the world, which adds justification to universities like Heriot-Watt, which train their students in areas like international organisations and research skills alongside their technical training in translation and interpreting.

The future is bright, but the future certainly isn’t Machine Translation taking over completely from humans.