from the trumping-the-fish dept.
compuglot writes "Google gave journalists a glimpse of its next-generation machine translation system at the May 19th Google Factory Tour. 'Google Blogoscoped' offers an excellent overview of the presentation.
The system has been trained using United Nations documents as a corpus, some 20 billion words' worth of content. It uses existing pairs of source- and target-language texts (translated by human translators at the U.N.) to find patterns, which it then uses to build rules for translating between those languages. Apparently it succeeded in translating certain phrases where the current version had failed.
If anyone is capable of making a serious go of MT, it would have to be Google."