Donald A. DePalma 14 July 2005
Filed under (Translation Technologies)
2 pepper rating

For the last 10 years we’ve asked web search site marketers a basic existential question: if they don’t find something, does it exist in the chaos of the internet? That led to “more pages crawled” press releases in which each search vendor claimed an ever-increasing number of pages crawled, thus increasing the likelihood of finding a document if it’s out there. Then they offered more sophisticated “advanced” search in which you could construct arbitrarily complex Boolean expressions. Claims of comprehensive search strategies escalated.

But one question none of them could answer was: “What if there is an answer out there but it’s not in English? Could you do a cross-language search? Could you redirect my English-language query about transmogrifying rectabular excrusions into the rich body of Albanian scientific literature on the subject? And while you’re at it, would you be so kind as to translate the Albanian documents into English on the return trip to my computer screen?” Okay, that’s four questions. A cross-language search capability could dramatically increase the chance of finding something relevant to my needs.

It looks like Yahoo! has taken a first step in that direction. Two months after Google said that it would deploy a massive MT solution for translating from any language into any language, Yahoo! announced that it will help German speakers search in three languages (German, French, and English) when they phrase their query in German. Results will come back in German. This undertaking combines on-the-fly machine translation of the original query into French and English, the specification of search terms for each language, and machine translation of the search results back into German. All told, it’s a straightforward application of existing technology, but a clever and long overdue one. We expect this solution to be cloned quite quickly by other search engines.
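To make the three steps concrete, here is a minimal sketch of that kind of pipeline in Python. It is not Yahoo!’s implementation, which has not been described in any detail. The tiny dictionary, the three-document index, and every function name are hypothetical stand-ins for a real MT engine and a real search index.

```python
from dataclasses import dataclass

# Hypothetical toy data: a real system would call an MT engine and a search index.
TOY_DICTIONARY = {
    ("de", "en"): {"wetter": "weather"},
    ("de", "fr"): {"wetter": "météo"},
    ("en", "de"): {"weather": "wetter", "report": "bericht"},
    ("fr", "de"): {"météo": "wetter", "bulletin": "bericht"},
}

TOY_INDEX = {
    "de": ["wetter bericht"],
    "en": ["weather report"],
    "fr": ["bulletin météo"],
}


@dataclass
class Hit:
    language: str    # language of the document that matched
    original: str    # text as indexed
    translated: str  # text rendered in the user's language


def translate(text, source, target):
    """Word-by-word lookup; unknown words pass through unchanged."""
    if source == target:
        return text
    table = TOY_DICTIONARY.get((source, target), {})
    return " ".join(table.get(word, word) for word in text.lower().split())


def search(query, language):
    """Return indexed documents that contain every query term."""
    terms = query.lower().split()
    return [doc for doc in TOY_INDEX.get(language, [])
            if all(term in doc.split() for term in terms)]


def cross_language_search(query, user_language="de", targets=("de", "fr", "en")):
    """Translate the query, search each language, translate the hits back."""
    hits = []
    for language in targets:
        localized_query = translate(query, user_language, language)
        for doc in search(localized_query, language):
            hits.append(Hit(language, doc, translate(doc, language, user_language)))
    return hits


results = cross_language_search("Wetter")
for hit in results:
    print(hit.language, "|", hit.original, "->", hit.translated)
```

In this toy run the French hit comes back as “bericht wetter”, with the word order scrambled by the word-by-word lookup, a small reminder that machine-translated results deserve a second look.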

This takes us a bit closer to the sci-fi promise of universal search and retrieval. Once this approach is applied across a wider range of languages, information consumers will be able to use the worldwide part of the web in a relatively transparent fashion. While the machine-translated results will sometimes be suspect, this cross-lingual search will at least inform us and other researchers that someone in the Universiteti i Tiranës’s physics department has already written just the article on rectabular excrusions that we need. With the information’s existence proved, we can determine its relevance to our needs. Bravo!