Donald A. DePalma 31 August 2005
Filed under (Translation & Localization)
1 pepper rating

Translation memory (TM) software is at a crossroads today, with two distinct approaches represented in the marketplace: 1) “coarse-grained” TM software, developed in the 1990s by firms such as Atril, SDL, Star, and Trados, pivots on matching full sentences or large text segments between two languages as translators build up their translation memories over the course of several years; and 2) “fine-grained” TM from suppliers such as ESTeam, KCSL, Lingua & Machina, and MultiCorpora takes a statistical, context-aware approach to creating a TM in just hours or days. These newer fine-grained products create multilingual repositories of indexed words, phrases, and clauses by analyzing and aligning massive corpora of translated text. The sub-sentence linguistic units they generate are much more likely to be re-used than complete sentences, especially since they capture how those segments are used in their original context. They can also identify full and fuzzy sentence matches à la traditional TM and can import existing TMs.
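To make the contrast concrete, here is a minimal Python sketch of the two lookup styles. It is a toy illustration under stated assumptions, not how any of the vendors named above implement their products: the two-sentence corpus, the function names, the 0.75 threshold, and the use of the standard library's `difflib` for fuzzy scoring are all invented for the example. The fine-grained half only indexes source-side word sequences and returns the aligned sentence pairs in which they occur; real products would also align the sub-sentence units across languages.

```python
from difflib import SequenceMatcher

# Toy aligned corpus (English, French), invented for illustration only.
CORPUS = [
    ("The committee approved the annual budget.",
     "Le comité a approuvé le budget annuel."),
    ("The annual budget was submitted on Monday.",
     "Le budget annuel a été soumis lundi."),
]


def sentence_match(query, corpus, threshold=0.75):
    """Coarse-grained lookup: best whole-sentence fuzzy match, or None."""
    best_pair, best_score = None, 0.0
    for src, tgt in corpus:
        score = SequenceMatcher(None, query.lower(), src.lower()).ratio()
        if score > best_score:
            best_pair, best_score = (src, tgt), score
    if best_score >= threshold:
        return best_pair, best_score
    return None


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [w.strip(".,;:!?").lower() for w in text.split()]


def build_phrase_index(corpus, max_n=3):
    """Fine-grained index: every source n-gram maps to the pairs containing it."""
    index = {}
    for pair in corpus:
        words = tokenize(pair[0])
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                phrase = " ".join(words[i:i + n])
                index.setdefault(phrase, []).append(pair)
    return index


def phrase_matches(query, index, max_n=3):
    """Return indexed multi-word phrases found in the query, with their context."""
    words = tokenize(query)
    hits = {}
    for n in range(max_n, 1, -1):  # skip single words to limit noise
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in index:
                hits[phrase] = index[phrase]
    return hits


if __name__ == "__main__":
    query = "Parliament will debate the annual budget next week."
    print("sentence-level:", sentence_match(query, CORPUS))
    index = build_phrase_index(CORPUS)
    for phrase, pairs in phrase_matches(query, index).items():
        print(f"phrase {phrase!r} found in {len(pairs)} aligned pair(s)")
```

The query sentence resembles neither stored sentence closely enough to pass the sentence-level threshold, yet a phrase it shares with both of them is still found along with the translated sentences it appeared in. That is the re-use argument for sub-sentence units in miniature.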

With a few hundred installations, predominantly in government and multilateral agencies, MultiCorpora is a visible example of the fine-grained approach. We spoke with CEO Pierre Blais about the latest version of MultiCorpora, which makes the obligatory evolutionary changes, adds a new alignment model, and modifies the licensing scheme.


  • Blais told us that the field test showed that MultiCorpora’s new, automated alignment model is “near perfection.” Deployment with current users will validate that assertion.
  • The company has made its licensing model more flexible, packaging tiered versions of the product by function and by the number of languages supported (for example, two languages might be enough for a Canadian government agency). The tiers let companies provision users with only the feature set their jobs demand: power users who create TMs need all functions, translators might want only TM operations, and content authors might require only terminology look-up. The translator version, for example, is a subset of the full product. The company also enhanced web access to its TM for distributed teams of translators.

In our discussion about marketing this version, Blais said that MultiCorpora introduced it at the hometown AILIA conference and will show it at ATA, TAUS, and TEKOM. Blais echoed our report on “Best and Worst Language Conferences” in noting that ATA, LISA, and Localization World will be held within a few weeks of each other (and ATA and LocWorld will even be in the same city but not simultaneously). This congested conference schedule poses cost and timing problems for vendors and buyers alike.

Conference complaints aside, MultiCorpora’s new version addresses some critical issues we have flagged in our language tool research: cost, usability, flexibility, time to effective use, and quality. Because they are based on mathematical algorithms, we expect fine-grained TM products to parallel the quick advance of statistical machine translation solutions over older rule-based technologies.
