Translation memory (TM) software is at a crossroads, with two distinct approaches represented in the marketplace: 1) “coarse-grained” TM software, developed in the 1990s by firms such as Atril, SDL, Star, and Trados, pivots on matching full sentences or large text segments between two languages as translators build up their translation memories over the course of several years; and 2) “fine-grained” TM software from suppliers such as ESTeam, KCSL, Lingua & Machina, and MultiCorpora takes a statistical, context-aware approach to creating a TM in just hours or days. These newer fine-grained products create multilingual repositories of indexed words, phrases, and clauses by analyzing and aligning massive corpora of translated text. The sub-sentence linguistic units they generate are much more likely to be re-used than complete sentences, especially since they capture how those segments are used in their original context. They can also identify full and fuzzy sentence matches à la traditional TM and can import existing TMs.
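To make the contrast between the two approaches concrete, here is a minimal Python sketch. It is a toy illustration only, not the algorithm used by MultiCorpora or any other vendor named here. The sample sentences, the function names, and the similarity threshold are all hypothetical, and the word n-gram index is a simple stand-in for the statistical alignment that real fine-grained products perform.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical toy memory of (source, target) sentence pairs.
TM = [
    ("The committee approved the annual budget.",
     "Le comité a approuvé le budget annuel."),
    ("The minister will present the annual report.",
     "Le ministre présentera le rapport annuel."),
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def fuzzy_sentence_match(query, memory, threshold=0.75):
    """Coarse-grained lookup: best whole-sentence match at or above threshold."""
    best, best_score = None, threshold
    for source, target in memory:
        score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
        if score >= best_score:
            best, best_score = (source, target, score), score
    return best

def ngrams(tokens, min_len=2, max_len=4):
    """Yield every word n-gram of min_len to max_len words as a string."""
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def build_phrase_index(memory, min_len=2, max_len=4):
    """Fine-grained index: map each source phrase to the pairs that contain it."""
    index = defaultdict(list)
    for pair in memory:
        # set() avoids listing a pair twice when a phrase repeats in a sentence.
        for phrase in set(ngrams(tokenize(pair[0]), min_len, max_len)):
            index[phrase].append(pair)
    return index

def phrase_lookup(query, index, min_len=2, max_len=4):
    """Return the query's phrases found in the index, each with its contexts."""
    return {p: index[p]
            for p in ngrams(tokenize(query), min_len, max_len)
            if p in index}

index = build_phrase_index(TM)
query = "Please review the annual budget before Friday."
# A strict whole-sentence lookup finds nothing usable for this new sentence...
sentence_hit = fuzzy_sentence_match(query, TM, threshold=0.95)
# ...while the phrase index still surfaces reusable sub-sentence units in context.
phrase_hits = phrase_lookup(query, index)
```

In this sketch the new sentence has no near-identical counterpart in the memory, yet phrases such as "the annual budget" are still retrieved together with the sentence pairs in which they originally appeared. A real product would additionally align each phrase to its target-language equivalent rather than returning the whole sentence pair.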
With a few hundred installations, predominantly in government and multilateral agencies, MultiCorpora is a visible example of the fine-grained approach. We spoke with CEO Pierre Blais about the latest version of MultiCorpora, which makes the obligatory evolutionary changes, adds a new alignment model, and modifies the licensing scheme.
In our discussion about marketing this version, Blais said that MultiCorpora introduced it at the hometown AILIA conference and will show it at ATA, TAUS, and TEKOM. Blais echoed our report on “Best and Worst Language Conferences” in noting that ATA, LISA, and Localization World will be held within a few weeks of each other (ATA and LocWorld will even be in the same city, though not simultaneously). This congested conference schedule poses cost and timing problems for vendors and buyers alike. Conference complaints aside, MultiCorpora’s new version addresses some critical issues we have flagged in our language tool research: cost, usability, flexibility, time to effective use, and quality. Because they are built on mathematical algorithms, we expect fine-grained TM products to parallel the quick advance of statistical machine translation solutions over older rules-based technologies.