
This week in Luxembourg nearly 200 conference delegates temporarily forgot about “Pop Idol,” “American Idol,” and “Deutschland sucht den Superstar” as the European Commission (EC) heard and evaluated proposals for its newest translation technology idols. Representatives from universities, research centers, and commercial organizations were looking for a share of the 100 million euros that the EC budgeted for new language technology initiatives. Common Sense Advisory was on hand to keynote the Language Technology Days conference with a presentation on the emerging practice of “enterprise language processing” and its reliance on centralized operations, process definition, content triage, and advanced technology.

The conference, organized by the EC’s Information Society and Media Directorate-General, provided an opportunity for already funded projects to report on their progress, for those with technology-related initiatives to get a hearing on their proposals, and for future proposers to learn the process and best practices for getting a proposal funded. The conference showcased technologies that could lower barriers to communication in corporate, government, and non-governmental organizations and enable the multilingual society envisioned by the European Union. These themes were emphasized by Patricia Manson and Roberto Cencioni, representing Digital Content and Cognitive Systems work at the European Commission.

Some projects presented at the conference have been around for quite some time, such as the European Language Resources Association (ELRA), which came into being in 1995 to promote language resources and which has focused on ways to evaluate language engineering technologies. At the other end of the spectrum were recently initiated projects such as Clarin, with its goal of networking existing language archives and tools for the humanities and social sciences, and Meta-Net and Meta-Share, a proposed digital marketplace where language data and tools will be documented, shared, combined, and otherwise leveraged “to support a data economy.” In concept and structure, Meta-Net could become an EC-funded superset of the TAUS Data Architecture, also presented at the conference. Another notable effort was PANACEA, a long-overdue project to integrate language resources for better exploitation by machine translation systems.

At a very basic level, each of the groups calls for an increase in the collection and leverage of data and the metadata that describes it. In one sense, we saw this conference as an information age analog to the origins of the European Union in the European Coal and Steel Community, which was created to deal with the basic infrastructural commodities required to reconstruct Europe following World War II. As we pointed out in our report on The European Translation Market, the European community’s multilingual, cross-border economy depends greatly on the free and easy flow of information among the constituent nations.

Finally, we found the absence of most commercial software vendors and language service providers noteworthy. An investment of 100 million euros by the European Commission, with matching funds from the winners, could have some influence on the language technology marketplace, but nothing near the potential impact of any technology or service model adopted by large providers such as the EC’s Directorate for Translation.