Thinking about machine translation

About the webinar

In this post I reflect on a recent webinar on Machine Translation (MT) (view the 57-min recording here), co-hosted by Asia Online and Moravia Worldwide. The webinar was packed with information and insights, and I strongly recommend it to anyone who works with multilingual content and wants to learn more about this up-and-coming translation technology.

The New Model for Partnerships in MT webinar covered a lot of ground: the differences between the two main varieties of MT; trends in content creation and localization, including the synthesis of MT with human translation and editing; the pressure on businesses to localize more content within ever-shortening timeframes; how MT can help solve this problem; and what the bottom-line impact is.

Machine translation: the big picture

The presenters, Kirti Vashee, VP Enterprise Translation Sales at Asia Online (@kvashee on Twitter and blogger at eMpTy Pages) and Bob Myers, COO at Moravia Worldwide, went through a lot of content-packed slides to lay out the basics:

  • The point is not to compare and contrast MT and human translation; it is about the two working together to create a rich, complex model.
  • Two MT approaches have been most influential. Rule-based MT applications (e.g. Systran) have benefited from over 50 years of research and might have reached their limits. The statistical MT approach (e.g. Google Translate, Asia Online) is younger, still growing rapidly, and is the one that the presenters are betting on.
  • It is useful to distinguish between generic and customized MT. Generic MT (e.g. Babelfish, Google Translate) gives a general gist of the source content, is not specialized, and is designed for broad application. Customized MT (Asia Online’s model) creates a tailored offering for an individual client or industry (e.g. IT or travel), fine-tuning the MT to maximize performance in that specific area.
  • To train the MT engine, both bilingual content (large volumes of normalized and cleaned-up TM data, glossaries, “golden” translations) and very high-quality monolingual content are necessary inputs. If you have good examples of how monolingual content is applied to the MT engine, I’d love to hear them!

The business case for Machine Translation

According to Moravia and Asia Online, the problem is that only 0.5% of what needs to be translated actually gets translated. The proposed solution is to boost productivity by delivering translation that is “good enough”: MT plus some (non-perfect) human editing, with the goal of producing translated content that users can read, understand, and use to complete the task they are trying to carry out.

Over the past several years, customer support has been shifting from assisted support (phone, email), to automated support (phone prompts, computer-suggested responses, indexed manuals), to community support (“users are always online and talking”). It is easy to see how tapping into the community and giving users the power to answer queries and solve other users’ problems, combined with an efficient technology for making those solutions available in more languages, can directly and significantly cut the cost of customer support and improve user satisfaction.

(Image: Evolution of customer support)

ROI calculations that Moravia and Asia Online have completed support this claim, showing that it’s much cheaper to shift most of the support burden to the community while saving time with instant (MT-powered) translations. Speed matters because the most recent support articles are the most important ones: they address the issues that users are dealing with right now.

Does MT work, and what makes it work?

Quality is a tricky subject in both human and machine translation, because of varying power and interest levels of key stakeholders in charge of evaluating and validating translations. The proposed solution is to turn over the decision of whether the translation quality is “good enough” to the users: if they are served a page that was translated with the help of an MT engine, does it result in them solving their problem?

The following factors are crucial for success with MT: large volumes of input data (translation memories as well as monolingual content); high quality of that data (something that many clients tend to overlook, believing in quantity over quality); human editing at various stages to clean the corpus and continuously train the translation engine; and extensive, thorough glossaries.

The key takeaway about MT output quality: data must be clean, since what you feed to the engine is what it learns. It’s not enough to have tons of data; the cleaning process that improves the content quality and adapts it for MT use is just as important.
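To make the cleaning step a bit more concrete, here is a minimal sketch of what a first pass over translation memory data might look like. This is my own illustration, not something shown in the webinar and not Asia Online’s actual process: the function name `clean_segment_pairs`, the length-ratio threshold, and the specific filters are all assumptions.

```python
import re


def clean_segment_pairs(pairs, max_length_ratio=3.0):
    """Return cleaned (source, target) pairs from a translation memory export.

    Hypothetical helper for illustration only; the filters and the
    default threshold are assumptions, not a vendor's real pipeline.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        # Normalize runs of whitespace and trim both sides.
        source = re.sub(r"\s+", " ", source).strip()
        target = re.sub(r"\s+", " ", target).strip()

        # Drop pairs where either side is empty.
        if not source or not target:
            continue

        # Drop pairs that look untranslated (assumption: identical text
        # is noise; real data may need exceptions for product names).
        if source == target:
            continue

        # Drop pairs whose lengths differ wildly, a common sign of
        # misaligned segments.
        longer = max(len(source), len(target))
        shorter = min(len(source), len(target))
        if longer / shorter > max_length_ratio:
            continue

        # Drop case-insensitive duplicates, keeping the first occurrence.
        key = (source.lower(), target.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((source, target))
    return cleaned


sample = [
    ("Hello  world", "Bonjour le monde"),
    ("Hello world", "Bonjour le monde"),  # duplicate after normalization
    ("", "Texte sans source"),            # empty source
    ("OK", "OK"),                         # untranslated
    ("Hi", "Bonjour à tous et bienvenue chez nous"),  # likely misaligned
]
print(clean_segment_pairs(sample))
```

Of the five sample pairs, only the first survives. A real cleanup would go much further (terminology checks, human review of the “golden” translations, and so on), but even a crude pass like this shows why volume alone says little about how much usable data you have.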

In closing…

As MT becomes more widespread, I wonder how it will affect the way people communicate as we get more exposure to machine-influenced, or “good enough”, content. Will we start to write, speak and think in more machine-like ways?

I would love to have the PPT version of the webinar, because the slides were tightly packed with text and images. Unfortunately, Moravia did not make the presentation file downloadable, so you will need to watch the webinar to get the insights. Another reason I wanted the PPT is that I absolutely adored the multicultural avatars: little dudes wearing their kimonos, Mexican hats, turbans and so on. I can totally imagine an entire animated short starring the multicultural dudes explaining technology and localization. I would have shared a picture if I’d managed to get hold of the presentation file, which, sadly, I did not.

Update: my plea was heard and the presentation popped up in my mailbox. Introducing the “multicultural dudes”, courtesy of the presenters:

(Image: Multicultural Dudes)

Overall, this webinar is a very useful introduction for anyone who is considering MT for their business, or who is involved in the localization business and wants to stay on top of the trends. Moravia Worldwide also offers a free consultation for interested companies to evaluate the cost and ROI (details and contacts at the end of the webinar). Even though I’m not in the market for machine translation right now, I am definitely sold on its significance for businesses that want to bring more relevant, useful content to their customers across the globe.

Your turn…

  • Is MT changing the way you work, as a translator, a vendor, or a content owner? How do you feel about seeing more human/MT “good enough” content out there? What else is important to know about MT?