
Coming Soon, The Google Translator (418 comments)

Posted by CmdrTaco on Tuesday May 31, 2005 @10:11AM from the trumping-the-fish dept.

compuglot writes "Google gave journalists a glimpse of its next generation machine translation system at a May 19th Google Factory Tour. "Google Blogoscoped" offers an excellent overview of the presentation. The system has been trained using the United Nations Documents as a corpus. This corpus is some 20 billion words worth of content. It uses existing source and target language translations (done by human translators at the U.N.) to find patterns it then uses to build rules for translating between those languages. Apparently it was successful where the current version had failed in translating certain phrases. If anyone were capable of making a serious go of MT, that would have to be Google."

This discussion has been archived. No new comments can be posted.

fascinating (Score:5, Informative)

by professorhojo ( 686761 ) * writes: on Tuesday May 31, 2005 @10:12AM (#12683644)

since the RTFAs lacked any kind of crunchiness, i sourced some great stuff here [jhu.edu] that does a wonderful job explaining how this system works, and gives the advantages the statistical translation method has over the rules-based approach. as well as the disadvantages.

fascinating stuff:

"Currently, most machine translation technology, including consumer-oriented programs such as Systran's Babel Fish, have been "taught" the rules of language, such as verb tenses and when to use parts of speech. Programmers painstakingly hand-build systems based on such rules. "The computer is told, if you see this thing in Russian, replace it with this thing in English," explains Yarowsky.

"While somewhat effective, such systems are time-consuming to build (consider how long it takes most humans to learn a language and all its rules), and resulting translations are still marred by grammatical and other errors. Those that do work fairly well usually tackle popular Western languages, such as French, German, and Spanish; there are few translation programs developed for other important tongues, such as Chinese, Turkish, or Arabic, let alone for more obscure languages like Tajik.

"To tackle a broader range of the world's languages, and to improve on the quality of machine translation, Yarowsky and his Hopkins colleagues are developing computer programs that can be trained to figure out any language using statistical analysis, i.e., looking at the probabilities of language patterns. In what's known as automatic knowledge acquisition, the computer could "learn" Serbian well enough to translate future documents or conversation, or at the least pick out pertinent words like "bomb."

"As Yarowsky explains: "Say you want to teach a computer how to translate Chinese: You give the computer 100,000 sentences in English and the same 100,000 sentences in Chinese and run a program that can figure out which words go to which words. If in 2,000 sentences you have the word Washington, and in about the same number of sentences you have the word Huashengdun, and they occur in the same place in the sentence, these words are likely translations.

"It's all just observation," Yarowsky adds. "Children do the same thing, but they also do it through visual stimulation and feedback. They see a book and hear the word 'book,' and eventually they learn that it's a book. They see a bird with its wings flapping around and learn that is called a bird. It's the same with machines, only they have much better memories. Computers could remember exactly when and where they saw the words bird and book."

"So, instead of telling a computer how to do something -- conjugate the verb 'to be' in Spanish, for example (I am = soy) -- researchers give it tens of thousands of examples and program the computer to find repeated patterns that the computer can use to conjugate new verbs. Trained this way, the program could potentially "learn" phrase structure and the rules of translation.

"As Yarowsky notes in his 100,000-sentence example, one way to accomplish automatic knowledge acquisition is to use bilingual or parallel text. The program "reads" a document in English and then a version in a second language. Such texts used by Hopkins researchers include the Bible, which is available on the Web in more than 60 languages, the Book of Mormon (over 60 languages), and the United Nations Declaration of Human Rights (240 languages).

"Aiding the computer is the fact that the English version of such texts can be annotated by hand or using another computer program -- essentially marked up to show, for example, that Jesus is a noun and pray is a verb. The translation program-in-training needs such information because it cannot translate future text just by substituting individual words in each language; it must also be able to analyze how sentences work. To do so, the computer program uses pattern recognition templates and other tools to understand sentences on a syntactic level. Simply put, the program is essentially given clues to know what to look for, notes Yarowsky: "It should figure out the subject, figure out the object, and other elements of sentence structure."
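
The co-occurrence idea behind the Washington/Huashengdun example quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Hopkins or Google system: the four-sentence corpus is invented, the names `PARALLEL`, `dice_table` and `best_translation` are hypothetical, and the Dice coefficient is swapped in as a simple stand-in for whatever statistic a real system uses. It also ignores word position, which the quote mentions as an additional clue.

```python
from collections import Counter
from itertools import product

# Toy parallel corpus, invented for illustration: English / romanized Chinese.
PARALLEL = [
    ("washington is cold", "huashengdun hen leng"),
    ("i like washington", "wo xihuan huashengdun"),
    ("beijing is cold", "beijing hen leng"),
    ("i like beijing", "wo xihuan beijing"),
]


def dice_table(pairs):
    """Score every (source word, target word) pair by sentence co-occurrence.

    Dice = 2 * (sentence pairs containing both) / (count(src) + count(tgt)).
    """
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }


def best_translation(word, scores):
    """Return the highest-scoring target word for `word`, or None if unseen."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get) if candidates else None


if __name__ == "__main__":
    scores = dice_table(PARALLEL)
    print(best_translation("washington", scores))
```

On this toy data "washington" pairs cleanly with "huashengdun" because the two words always appear in the same sentence pairs. Words such as "is" and "cold" tie between "hen" and "leng", since they never occur apart here, which hints at why the approach needs very large corpora before it becomes reliable.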
- Re:fascinating (Score:2, Funny)
  
  by Anonymous Coward writes:
  
  This is great and all, but I won't be impressed until it translates the gibberish that comes from the Iranian gas station attendant every time I stop for gas.
  
  For now, I just nod my head in ignorance, and count my change.
  - Re:fascinating (Score:5, Funny)
    
    by Anonymous Coward writes: on Tuesday May 31, 2005 @10:54AM (#12684037)
    
    You go all the way to Iran to get gasoline? Who are you, George W Bush?
    
- Re:fascinating (Score:5, Interesting)
  
  by NoMoreNicksLeft ( 516230 ) writes: <john DOT oyler AT comcast DOT net> on Tuesday May 31, 2005 @10:34AM (#12683867) Journal
  
  Some questions:
  
  Why can't a dictionary be made of nouns, of verbs? Why can't we have it statistically analyze the grammar for ambiguous words?
  
  Does it only recognize exact matches? Especially with verb conjugation, I'd think any words 80% similar or so should be considered matches. Not all languages are as conjugation-happy as Latin or Spanish or even English, and you often lose some nuanced conjugations when translating from one to the other.
  
  What will be done about idioms? Translating these word for word often makes no sense at all, and for me at least (no idea what the official stance is), I'd rather they substitute in idioms with the same general meaning, but for the culture being translated to.
  
  Does it work on alternate character systems, is it word boundary dependent?
  
  Does it understand punctuation rules? Will this post, translated to Spanish, have the upside-down question marks where they're supposed to be?
  
  How many of the world's existing languages have enough text for this to even be feasible?
  
  - Re:fascinating (Score:4, Interesting)
    
    by Simonetta ( 207550 ) writes: on Tuesday May 31, 2005 @11:13AM (#12684217)
    
    What will be done about idioms? Translating these word for word often makes no sense at all...
    
    The often-quoted examples are: "Out of sight, out of mind" becomes "invisible idiot" and "the spirit is willing, but the flesh is weak" comes out as "The meat is rotten, but the wine's great".
    
    How many of the world's existing languages have enough text for this to even be feasible?
    
    Ah yes, that's the tricky part. Translating, for preservation, near-extinct languages that exist in spoken or recorded form only. A true programming challenge.
    
    I find the Babel-Fish translator to be nearly useless and the Systran box at www.systransoft.com very helpful when selling things on eBay to people in non-English-speaking countries. When I get a question about an auction item that has little grammatical cohesion and comes from an offshore domain, like "How many cost you Italia he transport?", I'll run my response through Systran's translator and add the original English afterwards. More often than not the sales and PayPal transactions are successful.
    
    I believe that machine translation will be the 'killer application' for 64-bit home PCs. ..along with DRM busting..
    
    There are five levels of machine translation:
    
    1) word substitution.
    2) phrase substitution.
    3) cohesive paragraphs and idioms.
    4) light literature, magazine articles, and business.
    5) classical literature, law, and diplomacy.
    
    Each level requires at least an order of magnitude more computing power than the previous one. Babel Fish is on level two and Systran is on level three. Google is positioning itself to be between levels four and five.
    
    I wish them the best of luck. Without sarcasm or irony. This is important work.
    
    "Give me a one sentence definition of 'irony'."
    "Yeah, it's where the Iranians come from."
    
  - Re:fascinating (Score:5, Interesting)
    
    by kebes ( 861706 ) writes: on Tuesday May 31, 2005 @11:24AM (#12684299) Journal
    
    What will be done about idioms? Translating these word for word often makes no sense at all, and for me at least (no idea what the official stance is), I'd rather they substitute in idioms with the same general meaning, but for the culture being translated to.
    
    I think this is precisely where statistical approaches can really shine. A purely dictionary-based conversion will translate an idiom word-for-word, which will make no sense at all. However, a statistical approach could be constructed to look for the "longest reliable match." So if the idiom "cat got your tongue" re-appears over and over, and is correlated to a different idiom in other languages (that may not use the word "cat"!), then the algorithm could tokenize "cat got your tongue" as a single entry that would map to something different in each language.
    
    How many of the world's existing languages have enough text for this to even be feasible?
    
    You're right... that's the killer. Translating using statistics (especially idioms) properly will require a huge database of samples. Even what's been suggested so far is not enough. If we want to translate technical documents, we need a new database. If we want to translate "free form writing" we need yet more data.
    
    However, there's lots of data out there (already in digital format) that could be used... we just need people to see the potential and start using these datasets (or making these datasets available). For instance, for technical stuff there are thousands of abstracts for papers and for theses that are translated into various languages (for instance, many articles published in German are then also released in English... I live in Quebec, and every thesis abstract has to be translated into French also... etc.). Many legal documents (many of which are already available to the public) are also translated for various reasons. It would also be interesting if translators all around the world uploaded documents they had translated into some database (assuming it's nothing sensitive of course!). As this database grew, it would become more and more reliable. Let's face it, there's tons of human-based translation going on, forming a massive dataset... but by and large it's just scattered and not usable.
    
    - Re:fascinating (Score:4, Insightful)
      
      by Bigman ( 12384 ) writes: on Tuesday May 31, 2005 @11:35AM (#12684407) Homepage Journal
      
      Don't forget that many works of fiction are translated into several languages. The only problem with that is persuading the copyright holders to permit their use in training computer translation systems. I'm not sure where you would stand with this legally (after all, IANAL!), so I suspect this is why Google has been using the UN documents. I would imagine these are effectively public domain; and if not, I would imagine the UN would see a reliable machine translation project as worth supporting. The only downside I can see is that the UN texts are unlikely to have many idioms or colloquialisms, which would limit the resulting translator's usefulness in a more general context.
      
      - Re:fascinating (Score:3, Interesting)
        
        by bogado ( 25959 ) writes:
        
        Laws apart (you could use public material like Project Gutenberg), I think that a translated book is, or at least seems like, a bad input for this, since the text says it expects whole sentences translated 1 - 1.

        A novel or book is not translated like this; the best translations aren't word for word or sentence to sentence. Good translators almost rewrite the whole thing, sometimes with a different style.

        Language has a lot of cultural meaning in it, and even the same language sometimes needs to be adapted t
      - Re:fascinating (Score:3, Interesting)
        
        by MoralHazard ( 447833 ) writes:
        
        The only problem with that is persuading the copyright holders to permit their use in training computer translation systems.
        
        As long as the translations have been created in advance, and you can obtain copies of the works in question, it should be fine, legally. I cannot see a way that a court could find the machine-state of a translation machine to be a "derived work" in the copyright sense, and it's certainly not making any literal copies.
        
        Now, someone could distribute a text under a license agreement t
- Comment removed (Score:5, Interesting)
  
  by account_deleted ( 4530225 ) writes: on Tuesday May 31, 2005 @10:39AM (#12683903)
  
  Comment removed based on user account deletion
  
  - Re:fascinating (Score:4, Insightful)
    
    by MindStalker ( 22827 ) writes: <mindstalker.gmail@com> on Tuesday May 31, 2005 @10:59AM (#12684089) Journal
    
    Well, the Bible is Hebrew, Greek and Latin. There are no outdated English phrases in the Bible. Now if you're referring to the King James translation of the Bible, obviously that would be good for teaching Google Old English but not modern English. You would need a much newer translation that doesn't use old phrases. Such translations do exist, btw.
    
    - Re:fascinating (Score:2)
      
      by magefile ( 776388 ) writes:
      
      Er ... you mean that it'd be good for teaching Google old-fashioned English. Old English is not merely archaic English - it's much closer to the original Germanic ancestry, to the point that it's as similar to German as English.
  - Re:fascinating (Score:2, Interesting)
    
    by Temposs ( 787432 ) writes:
    
    Computational Linguistics is my field, so I can tell you that the problem with the current state of corpora is a lack of massive cross-language corpora over many languages.
    
    The two sources used by Google are basically the only sources available for the kind of task we're talking about. Obviously the thing to do is work on creating more cross-language corpora, and I'm sure this is being done, but it takes much time to create a cross-language corpus on the scale that the UN documents or translations of the Bi
  - Re:fascinating (Score:5, Funny)
    
    by should_be_linear ( 779431 ) writes: on Tuesday May 31, 2005 @11:49AM (#12684551)
    
    but the Bible uses many outdated or non-standard phrases and sentence structures, as does most legal text I've ever seen. I'm not a linguist or a statistician, but from my uneducated viewpoint it sounds like problems might arise in the texts that are available for training the system. Anyone know how they're planning to overcome this?
    
    Harry Potter is the answer. It is several "normal language" books and is translated to all major languages. Also, program would finally figure out how to translate words like "Quidditch".
    
  - - translations of translations (Score:2)
      
      by grahamsz ( 150076 ) writes:
      
      Many bible translations aren't made from the original languages but from other modern language versions.
      
      Thus, I'd expect you'd find a French translation of the NIV, which is quite a modern translation in the first place.
      - Re:translations of translations (Score:4, Insightful)
        
        by grahamsz ( 150076 ) writes: on Tuesday May 31, 2005 @01:08PM (#12685323) Homepage Journal
        
        I was wrong about the French. However, the Spanish NVI appears to parallel the NIV, and I'd imagine the two would be pretty good candidates for this sort of analysis.
        
        http://www.booksofthebible.com/p2390.html [booksofthebible.com]
        
        I believe it's key that in the situation of
        
        Ancient Lang A -> Modern Lang B -> Modern Lang C
        
        that B and C will be far closer than
        
        Ancient Lang A -> Modern Lang B
        Ancient Lang A -> Modern Lang C
        
    - Re:fascinating (Score:3, Informative)
      
      by aldoman ( 670791 ) writes:
      
      You input them all, and let the statistics do their magic.
      
      Just like your email spam filter can handle you pressing junk on stuff that isn't junk, or not junk on stuff that is, it's just all numbers and there is an inherent tolerance for small errors that will be created with this sort of system.
- Re:fascinating (Score:5, Insightful)
  
  by elrous0 ( 869638 ) writes: on Tuesday May 31, 2005 @10:42AM (#12683935)
  
  or at the least pick out pertinent words like "bomb."
  Why do I have a funny feeling that this research isn't being funded by philanthropic foundations?
  
  -Eric
  
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re:fascinating (Score:2)
    
    by The Desert Palooka ( 311888 ) writes:
    
    Until it replaces the word book with DCS00001
    
    *grin*
- except, no. (Score:4, Insightful)
  
  by mattdm ( 1931 ) writes: on Tuesday May 31, 2005 @11:20AM (#12684272) Homepage
  
  "It's all just observation," Yarowsky adds. "Children do the same thing, but they also do it through visual stimulation and feedback. They see a book and hear the word 'book,' and eventually they learn that it's a book. They see a bird with its wings flapping around and learn that is called a bird. It's the same with machines, only they have much better memories. Computers could remember exactly when and where they saw the words bird and book."
  
  Except, no. Humans are basically generalization machines. Babies are able to grasp very quickly that words apply to categories of things -- not just that a *specific* item is a bird or a book, but to learn "I know a bird when I see it", even without necessarily being able to provide a scientific definition. Computers can be built to emulate this ability, but learning word-to-word mappings isn't *nearly* the same as learning abstract concepts and which words apply to them.
  
  - Re:except, no. (Score:3, Interesting)
    
    by rreyelts ( 470154 ) writes:
    
    Babies are able to grasp very quickly that words apply to categories of things
    
    This is so true. I remember being utterly amazed when my toddler was able to immediately spot a bird in real life based off a cartoonish caricature in one of his children's books. It just flabbergasts me how a mind so young can perform recognition that we can't achieve with a beowulf cluster of supercomputers.
- - Re:fascinating (Score:5, Funny)
    
    by fizban ( 58094 ) writes: <fizban@umich.edu> on Tuesday May 31, 2005 @10:48AM (#12683987) Homepage
    
    "Open the pod bay doors, HAL."
    
    "STFU, Dave. LOL!"
    
  - Re:fascinating (Score:2)
    
    by tomhudson ( 43916 ) writes:
    
    Oh God, it's going to learn languages from examples? I hope they don't try this over the net, otherwise we'll have computers writing LOL, IC, and other nonsense.
    
    ... and they'll be replying to half the "articles" posted on slashdot: "I AM A SCRIPT YHBT YFI HAND"
Needs a *bit* more work... (Score:4, Interesting)

by TripMaster Monkey ( 862126 ) * writes: on Tuesday May 31, 2005 @10:12AM (#12683650)

Just to illustrate, here's the summary of this story, translated to German and back to English using Google's current version [google.com]:

Google gave a Glimpse of its machine Uebersetzungsystems the following production at the factory route of the A May 19 to journalists. Google. "Google Blogoscoped" offers an excellent overview of the representation. The system was trained with the nation documents as korpus. This korpus is something 20 billion word value of contents. It uses the existing target language translations (takes place via human translators at the U.N.) Samples find, which use it then to establish guidelines for translating between those languages. Apparent it was successful, where the present version had failed, if it translated certain cliches. If everyone of forming a serious were capable, of the M.Ue., those would go to have having to Google.

- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re:Needs a *bit* more work... (Score:2)
    
    by TripMaster Monkey ( 862126 ) * writes:
    
    That's why I said 'just to illustrate'.
- Bork bork bork! (Score:5, Funny)
  
  by AtariAmarok ( 451306 ) writes: on Tuesday May 31, 2005 @10:15AM (#12683673)
  
  Here is the result as interpreted by the Swedish Chef:
  "Guugle-a gefe-a a Gleempse-a ooff its mecheene-a Uebersetzoongsystems zee fullooeeng prudoocshun et zee fectury ruoote-a ooff zee A Mey 19 tu juoorneleests. Guugle-a. "Guugle-a Bluguscuped" ooffffers un ixcellent ooferfeeoo ooff zee representeshun. Zee system ves treeened veet zee neshun ducooments es kurpoos. Thees kurpoos is sumetheeng 20 beelliun vurd felooe-a ooff cuntents. It uses zee ixeesting terget lungooege-a trunsleshuns (tekes plece-a feea hoomun trunsleturs et zee U.N.) Semples feend, vheech use-a it zeen tu istebleesh gooeedelines fur trunsleteeng betveee thuse-a lungooeges. Epperent it ves sooccessffool, vhere-a zee present ferseeun hed feeeled, iff it trunsleted certeeen cleeches. Iff iferyune-a ooff furmeeng a sereeuoos vere-a cepeble-a, ooff zee M.Ue-a., thuse-a vuoold gu tu hefe-a hefeeng tu Guugle-a."
  
  Looking forward to a www.borkle.com which returns all its results in such a format.
  
- Re:Needs a *bit* more work... (Score:2)
  
  by zoney_ie ( 740061 ) writes:
  
  To be honest, I don't regard that as all that bad for current machine translation. The fact they think they have something that will be at least some bit better than the current version is great.
  
  I mean in fairness - it's nearly good enough as it is to grasp the story (reading that translated from German version). For any other purpose, hand translation is required any ways (for any presentation purpose, even ordinary English needs tweaking, much less translated text). As long as this improvement increases
Google's translator (Score:3, Interesting)

by bcmm ( 768152 ) writes: on Tuesday May 31, 2005 @10:13AM (#12683653)

So what powers Google's current translator? I have seen it give word-for-word the same as Babel on some occasions (but with better handling of non-ASCII characters).

- Re:Google's translator (Score:5, Informative)
  
  by iantri ( 687643 ) writes: <iantri AT gmx DOT net> on Tuesday May 31, 2005 @10:22AM (#12683736) Homepage
  
  SystranSoft's Systran [systransoft.com] is behind almost all of the machine translation services on the Internet, including Google's.
  
  - Re:Google's translator (Score:2)
    
    by metlin ( 258108 ) writes:
    
    Wow, that's just fantastic!
    
    Thanks, I was looking for some of the less common languages, and it turned out that Systran has those.
    
    Owe you one, mate.
  - Re:Google's translator (Score:2)
    
    by Potor ( 658520 ) writes:
    
    Systran is usually a joke. Although, the last couple of days it has been giving me some surprisingly decent translations. But anyway, now I am pissed. Google is becoming my competitor! I am a translator, and just finished a 400 page book (Dutch -> English). cheers, potor
    - Re:Google's translator (Score:2, Insightful)
      
      by mOdQuArK! ( 87332 ) writes:
      
      I am a translator,
      
      Well, if their service is free and works well (not necessarily perfectly), you now have a tool which should let you translate that entire book in about a week (assuming most of the week will be spent checking the translation & preserving the "flavor" of the source).
      - Google sets itself up for success (Score:2)
        
        by Potor ( 658520 ) writes:
        
        Ah, the promise of all translation software!
        Of course, the issue would be for me to show that I add value to what may freely (presumably) be gotten from the web. And luckily enough, no translation software has come close to providing literature-quality work.
        
        In my mind, Google's choice of the UN indicates a confidence that they will reach a high level of accurate technical translation. This makes great business sense, as the UN is typical of markets that will require a quick turnaround on translation, a
- Re:Google's translator (Score:2)
  
  by Nytewynd ( 829901 ) writes:
  
  From the sounds of things, Google learns with a neural network. It has the ability to learn new mappings based on pattern matching. Babel Fish sounds like a distinct mapping of phrases that have been hand coded.

  Theoretically, Google can get better at translating over time, as its neural network learns better connections. It might even get better than a human translator if it goes long enough. There will always be small discrepancies, but if the bulk of the text is correctly translated, that would be
  - Re:Google's translator (Score:2)
    
    by JohnFluxx ( 413620 ) writes:
    
    It can't get better unless it asks the user to say whether the translation was good or not, and perhaps even ask the human to give the correct version.
Integrate with GMAIL! (Score:5, Interesting)

by RubberDogBone ( 851604 ) writes: on Tuesday May 31, 2005 @10:16AM (#12683688)

Make this work with Gmail and I'd even pay money for it!

Tired of getting email from Amazon.DE on my Gmail account and having to copy and paste it over to Babelfish.

That would be very useful for me.

- Re:Integrate with GMAIL! (Score:2, Funny)
  
  by Anonymous Coward writes:
  
  Why are you subscribed to Amazon.de mailing list if you don't speak German?!?!? How are you gonna read those German books?!
- Re:Integrate with GMAIL! (Score:2)
  
  by yppiz ( 574466 ) writes:
  
  Finally, I'll be able to understand all the Chinese and Russian spam in my Inbox!
  
  --Pat
Anyone care to make a bet? (Score:5, Funny)

by Weaselmancer ( 533834 ) writes: on Tuesday May 31, 2005 @10:17AM (#12683695)

That Microsoft will announce a new revolutionary language translation service sometime in the next two weeks or so?

- Re:Anyone care to make a bet? (Score:2, Informative)
  
  by Anonymous Coward writes:
  
  Well, it's not like they don't have the technology...
  
  http://research.microsoft.com/nlp/Projects/MTproj.aspx [microsoft.com]
- Re:Anyone care to make a bet? (Score:2)
  
  by snarkh ( 118018 ) writes:
  
  Microsoft has a large group, which has been working on machine translation since early 90's.
  - Re:Anyone care to make a bet? (Score:2)
    
    by xtracto ( 837672 ) writes:
    
    Microsoft has a large group, which has been working on machine translation since early 90's.
    
    Sure, but given the way MS works, they will wait until Google or someone else uses the translation technology as direct competition before they really start using that "large group".
    
    This is why MS has not entered the translation market, since only Systrans and other minor dictionaries are available and they have this small market, but if Google or Yahoo or even Apple started using this technology to give an Added Valu
Unsupported assertions (Score:2, Insightful)

by gowen ( 141411 ) writes:

If anyone were capable of making a serious go of MT, that would have to be Google.

Erm... why is that? Is it because machine translation is in some sense search technology? Because they've hired renowned experts in natural language processing? Because they've got a lot of money sloshing around and employ a lot of generally smart people?

Oh, no. It's because geeks like Google. Therefore, Google are capable of superhuman feats that mere scientists -- those with years of experience in relevant fields -- ar
- Re:Unsupported assertions (Score:5, Insightful)
  
  by stevejsmith ( 614145 ) writes: on Tuesday May 31, 2005 @10:26AM (#12683778) Homepage
  
  No, it's because Google has tons of talent, money, already-archived text to work with, computers, respect in the industry, and consumer base. I can't think of a company that possesses these characteristics more so than Google.
  
  - Re:Unsupported assertions (Score:2)
    
    by Stonehand ( 71085 ) writes:
    
    I can think of a certain TLA that would be extremely interested in machine translation and probably has access to ludicrous amounts of computing power, archived text in a variety of languages of interest, and top-notch scientists.
  - - Re:Unsupported assertions (Score:4, Interesting)
      
      by stevejsmith ( 614145 ) writes: on Tuesday May 31, 2005 @10:52AM (#12684018) Homepage
      
      Dell and Cisco are not in this business. IBM is not hemorrhaging with cash in the way Google is. Microsoft is not in the business of providing free Internet accessories. In any case, Google has a track record of innovative ideas ("innovative ideas" meaning that not only did they come up with it and implement it partially, but they invested full-on into it, bet money on it, and made it better than the competition) and is most likely of any company who would announce this to actually pull through with it. If some little start-up announced this (as I'm sure a few have), people would take it with a grain of salt. But that Google announces it, I'm sure most people believe fully that Google will deliver on its promise.
      
      And you're right, people have thought of this exact idea (I'm sure every other computer major and linguist has, in fact, since the birth of ENIAC--I know the idea's crossed my mind tons of times, not that I'd have the slightest clue how to do it), however actually attempting to do it with a reasonable chance of success? I'm going to say Google is the first.
      
      Plus, I got the impression from the article that the service is operational, just not available to the public. If you'll read the article, you'll find that the translator properly translated a fairly complicated phrase from Arabic to English. I'd guess that this service is, from a technical standpoint, at least 95% done - it's just the packaging and touching-up that needs to be done.
      
      - Re:Unsupported assertions (Score:2, Insightful)
        
        by rca66 ( 818002 ) writes:
        
        If you'll read the article, you'll find that the translator properly translated a fairly complicated phrase from Arabic to English.
        
        For each existing MT system you can find fairly complicated sentences which translate ok.
        
        I'd guess that this service is, from a technical standpoint, at least 95% done; it's just the packaging and touching-up that needs to be done.
        
        "Technical standpoint": you mean the system is able to translate arbitrary text? Maybe. Or do you mean the system is able to translate a
    - - Re:Unsupported assertions (Score:2)
        
        by Timesprout ( 579035 ) writes:
        
        Clearly someone with mod has no concept of what flamebait really is or what moderation actually means. Gowen's points are perfectly valid: Google has done nothing, let me repeat that, nothing groundbreaking. What they have done is take some old ideas and implement them very well. The double standards round here, though, are amazing.
        
        Just imagine MS produced a web accelerator which recorded personally identifiable information about you and made unrequested downloads to your machine. The poor slashbots w
- Re:Unsupported assertions (Score:5, Interesting)
  
  by KagatoLNX ( 141673 ) writes: <kagatoNO@SPAMsouja.net> on Tuesday May 31, 2005 @10:31AM (#12683834) Homepage
  
  Ummm, geeks like Google because Google employs scientists. Which mere scientists were you talking about?
  
  Were you talking about the PhDs at universities busy teaching classes, churning out research papers to avoid being fired (an ugly numbers game some departments play), or perhaps burning time generating volumes of grant paperwork?
  
  Oh, maybe you were talking about the scientists employed by the private sector. I'm sure the management teams wherever they work are willing to take the time and care that Google won't.
  
  You do know how many PhDs Google employs, right? Not to mention that they won't be fighting for resources there either. No backstabbing, liquidating MBAs trashing their corporate budget. No football-crazed alumni assassinating their funding proposals either.
  
  Also, I would remind you that "mere scientists" often come up with the needed research (there are volumes in MT alone), but rarely can afford to put in the years that it takes into a good implementation.
  
  Geeks love Google because it is, in many respects, where the best of business meets the best of academia.
  
  - - Re:Unsupported assertions (Score:2)
      
      by l3v1 ( 787564 ) writes:
      
      I love how you believe "churning out research papers" is somehow orthogonal to doing research.
      
      You'd be surprised...
    - Re:Unsupported assertions (Score:2)
      
      by snarkh ( 118018 ) writes:
      
      Often it is...
- Re:Unsupported assertions (Score:2, Insightful)
  
  by benjcurry ( 754899 ) writes:
  
  Oh, come on! It's because in the past, most of what Google has undertaken has been enormously successful and useful. Yeah, they hire a lot of smart people and have lots of money. Gmail (IMO) is the gold standard of free webmail. Google Maps (IMO) is the best map system out there. They also are responsible for AdSense, AdWords, and I think they even have a search engine that gets a good amount of hits per diem. Maybe there is a reason to think this translation thingamabob will be good!
- Re:Unsupported assertions (Score:3, Insightful)
  
  by imroy ( 755 ) writes:
  
  Erm... why is that?
  
  Because Google has shown that it knows how to handle large amounts of human-created content and create useful information from it. The search engine was just the start. Just look at the spell checker they added. It doesn't use a dictionary, just the mass of web pages they spider monthly. It's not always perfect, but it allows it to be more adaptive than other methods. This translator looks like something similar along those lines.
- - Re:Unsupported assertions (Score:3, Funny)
    
    by tobybuk ( 633332 ) writes:
    
    Look pal, you said something about Google that could be taken as a negative. Here on Slashdot that is only slightly better than saying something good about Windows. But thank your lucky fucking stars you didn't decide to disparage the immortal being that is Linux. That's worse than flushing the original Koran down the pan.
so name.. (Score:2)

by Turn-X Alphonse ( 789240 ) writes:

Googlefish or babelgoogle? Maybe we should just change "internet" to google, and every site must have google involved.

Googlesoft.com
Googlenix.com
Opengoogle.org
googlejournal.com
Piffle (Score:5, Funny)

by ear1grey ( 697747 ) writes: on Tuesday May 31, 2005 @10:20AM (#12683720) Homepage

If anyone were capable of making a serious go of MT, that would have to be Google.

An interesting story, but please, for the love of all that's balanced and objective, tell me again how that smudge on your nose really is chocolate.

- Re:Piffle (Score:2)
  
  by Heisenbug ( 122836 ) writes:
  
  Piffle yourself. They have 100,000 servers to throw at statistical analysis, they have enough cash floating around to offer sign-on bonuses that even Microsoft can't beat, they have a history of applying PhDs to practical problems, and they have obvious business interests in making machine translation more useful. Google-worship aside, they're certainly a top contender in my book.
  
  Of course, I don't know anything about this specific field, and that article sure was pretty fluffy. I'd be interested in more i
  - Re:Piffle (Score:3, Insightful)
    
    by gordo3000 ( 785698 ) writes:
    
    neither their computing power nor their cash is anything to be in awe over. Neither are truly top contenders when it comes to the computing industry, unless you take the time to wonder why this is impressive.
    
    Remember, almost all of those servers are needed for what they are currently working on, so they don't really have anywhere near that kind of computing power. I would be willing to bet that if they threw every free cycle at this, they have closer to 20%. Further, most of these servers are for moving
Altavista Babelfish (Score:5, Funny)

by yotto ( 590067 ) writes: on Tuesday May 31, 2005 @10:20AM (#12683721) Homepage

When questioned on the matter, Altavista's Babelfish translator gave this quote:

Google does not have anything on my amazing abilities of the translation!

if anyone... (Score:5, Interesting)

by rdc_uk ( 792215 ) writes: on Tuesday May 31, 2005 @10:20AM (#12683724)

Actually, my bet for most likely to make a real go of machine translation would be...

IBM

Look how far they ran with chess programs, because they felt like it...

If they decided to go the same distance with translation...

- Re:if anyone... (Score:2, Funny)
  
  by LiquidCoooled ( 634315 ) writes:
  
  They won't have any money left to fritter on useless projects after SCO beats them ;)
  - Re:if anyone... (Score:3, Funny)
    
    by rbarreira ( 836272 ) writes:
    
    I believe your thoughts are upside down...
- Re:if anyone... (Score:4, Informative)
  
  by digidave ( 259925 ) writes: on Tuesday May 31, 2005 @10:54AM (#12684039)
  
  Yeah right. Not while they're trying to convince customers to buy their current generation of crap translators. I got sucked into an IBM conference two years ago where they tried to convince me that their Websphere translator was "near perfect" and that it was ready to be deployed on web sites wanting to offer content in multiple languages. They even went so far as to bring in supposed unbiased happy customers who testified that the Websphere translator was as good as human translators.
  
  The conference was mostly IBM platinum partners (development firms who specialize in IBM "solutions" and make IBM enough money to be called platinum partners) and they seemed to buy into it. Of course, platinum partners tend to believe everything IBM tells them.
  
- Re:if anyone... (Score:2, Insightful)
  
  by rca66 ( 818002 ) writes:
  
  Actually, my bet for most likely to make a real go of machine translation would be... IBM
  
  They already did it. Several years ago. You can get it with Websphere and offsprings are sold under different labels.
  
  Look how far they ran with chess programs, because they felt like it...
  
  Chess is trivial compared to the task of translation. You can not compare these two problems.
Only works for translating speeches (Score:5, Insightful)

by Shotgun ( 30919 ) writes: on Tuesday May 31, 2005 @10:23AM (#12683753)

If your blog sounds like a politician giving a speech at the UN, this service will do a wonderful job. Doubtful that it will do any better than Babelfish otherwise.

The biggest problem in artificial intelligence is that the system learns the material that it is trained to, and only that material. Computers don't generalize or extrapolate the known into the unknown worth a damn.

- Wait, why? (Score:5, Interesting)
  
  by Ieshan ( 409693 ) writes: <ieshan@@@gmail...com> on Tuesday May 31, 2005 @10:56AM (#12684071) Homepage Journal
  
  "Computers don't generalize or extrapolate the known into the unknown worth a damn."
  
  Fortunately, that's not all that google has to go on. Google has 8 billion webpages, in many different languages, most of which are written by non-speechwriters. Not only can they analyze words based on translated context, but they can analyze words based on intra-language context, to form associations between words and meanings.
  
  The real trick is getting down two important linguistic concepts: "Sandhi Rules" (for instance, the use of "an" before a vowel and "a" before a consonant, which are totally regular but more complicated than a word-to-word matchup), and the "degree" or "quality" of words, which indicate the type of adjective most appropriate in any given context.
  
  For instance, "erudite", "learned", "educated", "knowledgeable", "skilled", and "cunning" could all be related words, but many of them have positive or negative associations which may only really be conveyed by understanding the meaning, irony, or sarcasm of a particular phrase.
  
  For instance, "John has been skilled in writing beautiful code for most of his adult life" is quite different from "John has been educated in writing beautiful code for most of his adult life", or "John has been erudite...". The first one is probably right if John has had a natural inclination to doing it properly, the second if he has undergone some training (though we don't know the actual state of his ability), the third (though the word doesn't even really make sense here) if he has been arrogant about his ability, shouting RTFM! every time someone asked him a question.
  
  - Don't forget... (Score:3, Funny)
    
    by ballpoint ( 192660 ) writes:
    
    John, the cunning linguist.
- - Re:Only works for translating speeches (Score:2, Funny)
    
    by Dystopian Rebel ( 714995 ) writes:
    
    And if the peeps chin-wagging at Kofi Annan's gig don't interpret 733T 5P3AK, you're in the saddle!*
    
    *Up the river without a paddle.
Good online translators for other languages (Score:3, Insightful)

by metlin ( 258108 ) writes: on Tuesday May 31, 2005 @10:23AM (#12683757) Journal

While Google's existing translator and Altavista's Babelfish are good, they do not help in the translation of several other languages.

That would be a really good benefit - for instance, I wanted something translated to and from Svensk (Swedish), but I really couldn't find any translation service that did.

Good translation of the more common languages would be nice, but even simple translations of a wider variety of languages would be really useful.

Yeah for foreign spam! (Score:2, Funny)

by Anonymous Coward writes:

At last I can translate all those non-English spam emails I get! There'll be no more missed opportunities to buy chinese viagra, woohoo.
- Re:Yeah for foreign spam! (Score:2)
  
  by fuzzybunny ( 112938 ) writes:
  
  This is the best one I have ever received. For you German speakers out there. And note the footer and b1ffsteriffi/
  Date: Mon, 30 May 2005 06:44:20 -0700 (PDT)
  From: harris peters
  To: sassisch@yahoo.com
  Subject: Grüße
  
  HALLO LIEB, WEISS ich, DASS DIESER BUCHSTABE ZU IHNEN, DA eine ÜBERRASCHUNG,
  aber, sich nicht SORGEN, alle KOMMEN MAG IST GUT. Ich BIN Herr HARRIS
  PETERS, GESCHÄFTSSTELLENLEITER FINANZIELLEN VERTRAUENSCBankPlc, der IM
  MAURITIUS GELEGEN Ist. VOR EINIGEN JAHREN, KAM Ein MANN, d
Pre-emptive strike (Score:4, Funny)

by eno2001 ( 527078 ) writes: on Tuesday May 31, 2005 @10:24AM (#12683764) Homepage Journal

Since it's become "hip" to bash Google these days and support either MSN's search technology or Yahoo, I'm making a pre-emptive strike for the IT fashionistas:

"Duh!!! The best machine translator in the world already exists and there can be no improving upon it! Babblefish (thank you Altavista) has been doing this for well nigh a decade. All you Johnny-come-latelys are probably going to rave on with fanboy adoration of Google (the company that can do no wrong)!!! To top it all off, you lot apparently know nothing about Microsoft's language translation project which is slated to be deployed as part of Longhorny in 2010. Online language translation from Google will fail because Microsoft will have it built into the OS itself. Why send your document online for translation when the OS itself will not only translate it, but it will correct the grammar, punctuation and generate a WMA file in one of ten thousand gorgeously rendered synthetic voices. Google has lost. Google has been trolled. Google will have a nice day".

We now return you to your regularly scheduled pos[tt]en.

- Re:Pre-emptive strike (Score:2)
  
  by ConceptJunkie ( 24823 ) writes:
  
  and generate a WMA file in one of ten thousand gorgeously rendered synthetic voices
  
  They might now have 10000 synthetic voices, but I bet they still all sound like GORF.
Old news... (Score:5, Funny)

by jasonmicron ( 807603 ) writes: on Tuesday May 31, 2005 @10:25AM (#12683774)

There is already a tranzilator [gizoogle.com]

T.Q. (Score:5, Insightful)

by moviepig.com ( 745183 ) writes: on Tuesday May 31, 2005 @10:29AM (#12683807)

The system has been trained using the United Nations Documents as a corpus.

Seems one could devise a TQ (translation quotient) measuring the effectiveness of machine (or human) translators. Take any standard reading-comprehension test, send its text material through the translator and back, and then compare the scores of subjects taking the resulting test vs. those taking the original.

(Before such translators make their way into, say, diplomatic circles, I'd sure hope there's some objective demonstration of near-infallibility...)
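
For what it's worth, the arithmetic behind such a TQ could be as simple as a ratio of mean test scores. This is only a rough sketch under my own assumptions: the function name `translation_quotient`, the choice of a ratio of means, and the scores themselves are all made up for illustration, not something from the article.

```python
from statistics import mean

def translation_quotient(original_scores, round_trip_scores):
    """Hypothetical TQ: mean score on the round-tripped test divided by
    the mean score on the original test (1.0 means no measurable loss)."""
    if not original_scores or not round_trip_scores:
        raise ValueError("both groups need at least one score")
    baseline = mean(original_scores)
    if baseline == 0:
        raise ValueError("baseline mean is zero, so TQ is undefined")
    return mean(round_trip_scores) / baseline

# Made-up comprehension scores (out of 100), purely illustrative.
original = [80, 90, 70]      # subjects who read the original text
round_trip = [60, 72, 48]    # subjects who read the text after translating there and back

print(f"TQ = {translation_quotient(original, round_trip):.2f}")  # TQ = 0.75
```

A real study would obviously want larger groups and some significance testing; the ratio above only captures the basic comparison.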

oh no! (Score:5, Interesting)

by danharan ( 714822 ) writes: on Tuesday May 31, 2005 @10:30AM (#12683814) Journal

I don't ever expect such translation to work perfectly, but taking existing phrases should lead to useful first drafts.

This will mean one less possible career for me, and fewer Babelfish-induced laughter moments.

As a fluently bilingual person, I often recognize expressions that were translated in Canadian government documents. "Anglicisme" is the word the French have for it.

There's subtlety to languages we may forever lose. Take for example:

"Je donne ma langue au chat" - "I give up" (answering a riddle) instead of the more picturesque "I give my language to the cat". Well, that should be tongue, but hey, it's just babelfish!

"Bullshit" won't produce "merde de taureau". That is a strange expression you anglos have, don't you realize?

"Il pleut comme vache qui pisse" will give us "it's pouring cats and dogs" rather than "it's pouring like cows' a'pissin". The French also have never heard of cats and dogs falling from the sky.

While an improved Babelfish may improve our mutual comprehension, please pause for a moment to consider all the linguistic hilarity we'll forever lose.

- Re:oh no! (Score:5, Funny)
  
  by fuzzybunny ( 112938 ) writes: on Tuesday May 31, 2005 @10:38AM (#12683897) Homepage Journal
  
  While an improved Babelfish may improve our mutual comprehension, please pause for a moment to consider all the linguistic hilarity we'll forever lose.
  
  Yeah, like me going to work for Bull [bull.com] in 1997, and searching for "comment dit-on, le, fuck, le chose sur lequel on tappe, thingy qui connecte a l'ordinateur, ah yeah, le clavier". French Bull dude: "ah, le keyboard."
  
  Hilarity indeed.
  
- Re:oh no! (Score:2)
  
  by bhima ( 46039 ) writes:
  
  Most of the work I do is in both German & English and you're right "the linguistic hilarity" is delicious! Particularly when you include regional dialects rather than just "proper grammar".
20 Billion? (Score:2)

by Bananatree3 ( 872975 ) writes:

That should be 200 billion words according to the article [outer-court.com]
Time to move the AI bar (Score:4, Interesting)

by TopSpin ( 753 ) * writes: on Tuesday May 31, 2005 @10:36AM (#12683878) Journal

First, this is outstanding; Google, unsatisfied with traditional machine translation techniques, pioneers their own design. I'm certain their advertisers will be pleased to have their ads auto-translated to whatever language is necessary.

Second, I think we'll witness a case of having the AI ante upped once again when another traditional AI challenge is met. Wikipedia puts this best: when viewed with a moderate dose of cynicism, AI can be viewed as 'the set of computer science problems without good solutions at this point.' Once a sub-discipline results in useful work, it is carved out of artificial intelligence and given its own name.

Machine Translation may never get there.. (Score:2)

by acomj ( 20611 ) writes:

A relative worked in an "internationalization" department, creating software/manuals in many languages.

In order for machine translation to be as good as human translation, you first need to determine what the sentence "means". Oftentimes you need to track previous sentences to determine the meaning of things like the word "it". Human language is not very detailed and relies on common knowledge and experiences to infer meaning.

It's very hard. Some languages are easier than others for this stuff. German/french/spa
Lovely translation source... (Score:5, Funny)

by isa-kuruption ( 317695 ) writes: <kuruption&kuruption,net> on Tuesday May 31, 2005 @10:40AM (#12683913) Homepage

So when you go to translate.google.com and translate something, the result will be legalese in the resulting languages.

Spanish: "Que pasa?"
English translation: "With regards to the current situation, how is the day progressing?"

- Re:Lovely translation source... (Score:5, Funny)
  
  by ShadyG ( 197269 ) writes: <bgraymusic@g m a il.com> on Tuesday May 31, 2005 @11:48AM (#12684538) Homepage
  
  No, it actually translates "que pasa" into "We hereby condemn these actions taken by the Israeli government."
  
DVD's subtitle tracks (Score:4, Funny)

by Jotham ( 89116 ) writes: on Tuesday May 31, 2005 @10:42AM (#12683932)

DVD subtitle tracks would be another good addition to help pick up slang too (most have an english track along with a couple others depending on the region)... all time-synced and easy to match up...

(I'm guessing that it'd fall under fair use and Google wouldn't have to struggle to get the movie studios' approval, even though such tech would benefit the studios too.)

- Re:DVD's subtitle tracks (Score:3, Insightful)
  
  by BullfrogJones ( 572383 ) writes:
  
  One serious problem I see with the 'matching source' method is that it's rare to find two sources that truly match. Movies are a great example - as a native English speaker that lived for 5 years in Spain, I can attest to the fact that the translations provided by the movie studios (used for subtitles in the theater and also for DVDs) are problematic on many levels.
  It's not enough to recognize a given word in language A is such and such word in language B, and not even enough to do the same with idiomati
Starting Wars ! (Score:5, Funny)

by justanyone ( 308934 ) writes: on Tuesday May 31, 2005 @10:45AM (#12683959) Homepage Journal

In 'Hitchhiker's Guide to the Galaxy' (the 'trilogy' of books, not the recent movie), it's mentioned that the babelfish has effectively started many, many wars. The reasons seem to be that any being can be rude to any other being without a serious set of translations that explain exactly what the rude terms mean and how they should be regarded.

I'm highly concerned for this warmongering that Google has undertaken.

Reference Here: http://www.bbc.co.uk/cult/hitchhikers/guide/belgium.shtml [bbc.co.uk]

Picture this: I write a blog entry with either bad punctuation or erroneous content. Under the old system (pre-Google translation), I would receive several flames about my idiocy. With Google translations:

* People around the world will be confused and angered about my punctuation;
* Vastly larger numbers of people will complain about my erroneous content;
* Other people will step up to my defense and a massive flame war will ensue;
* Idiots everywhere (who speak other languages) will echo my idiocy by believing the erroneous content I posted;
* The signal to noise ratio of the net will rise markedly;
* I will still be unsure of whether to count on my fingers starting with my thumb or forefinger depending on which European country I'm in.

I believe this pro-war, anti-peace, conflict-ridden idea of making everyone THINK they understand each other is ripe for criticism. God made everyone else speak funny, I think it should stay that way! Only right thinking people speak my language anyway, and everyone else should just shut up and sit down!

(WARNING: above post contains carcinogenic levels of sarcasm, facetiousness, satire, irony, and adjectives. Please unplug brainstem and wipe with a clean, damp cloth before continuing.)

hype (Score:2)

by Lazy Jones ( 8403 ) writes:

If anyone were capable of making a serious go of MT, that would have to be Google.

Oh, come on. I (still) like Google, but that's a bit silly, no?
yeah, but can it translate this? (Score:5, Funny)

by nullset ( 39850 ) writes: on Tuesday May 31, 2005 @10:56AM (#12684056)

Wenn ist das Nunstruck git und Slotermeyer? Ja!... Beiherhund das Oder die Flipperwaldt gersput. be careful! If you translate this you may end up dead.....

Will it support Esperanto? (Score:2)

by dsplat ( 73054 ) writes:

Since Esperanto is mentioned so prominently, I have to wonder whether the tool will support it. There has been at least one previous attempt to use Esperanto as an intermediate language for a machine translation project. The only English translation of the article I could find is now only available in Google's cache [64.233.167.104]. There is an ironic symmetry to that.
Google IM (Score:2)

by loconet ( 415875 ) writes:

As the article suggests, Google could use this if they ever decide to go ahead and launch an instant messenger. Imagine being able to chat with anyone in the world while Google does the translation in real time for you. What are the implications of this?

As an example, on the one hand my family back in Peru, who don't speak English, would be able to chat with my current gf who doesn't speak much Spanish but still likes chatting with them. On the other hand, this would slow both parties' motivation of learning a
words don't really have meanings (Score:5, Interesting)

by mincognito ( 839071 ) writes: on Tuesday May 31, 2005 @11:12AM (#12684211)

Some people here seem to have a false picture of how language works. Individual words do not have meanings. Not to a human interpreter anyway. Sentences used in actual contexts have meanings (unless a single word is uttered as an elliptical sentence). The "meanings" of words, as found in dictionaries, are simply abstractions from occasions of use. The idea that individual words have meanings hasn't been current in philosophy or linguistics for about 50 years. Also, the idea of St. Augustine that children learn the meaning of words by associating sounds that they hear with particular objects that they observe is now also considered rather dubious.

Next step in learning? (Score:2)

by MadCow42 ( 243108 ) writes:

Learning from pre-translated texts is a great start...

Step two should be human corrections to machine-translated documents (learn from your mistakes - like we do), should it not?

MadCow.
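
To make the two steps concrete, here is a toy sketch of how corrections could feed back into a count-based system. Everything in it is an assumption for illustration: the `PhraseTable` class, the idea of simply weighting a human correction more heavily than a corpus observation, and the example counts. It is not how Google's system is described as working.

```python
from collections import Counter, defaultdict

class PhraseTable:
    """Hypothetical toy model: counts how often each target phrase
    has been seen as the translation of a source phrase."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, source, target, weight=1):
        # Step one: aligned, pre-translated text adds evidence.
        # Step two: a human correction is just another observation,
        # optionally with a larger weight.
        self.counts[source][target] += weight

    def translate(self, source):
        candidates = self.counts.get(source)
        if not candidates:
            return None  # phrase never seen in training
        return candidates.most_common(1)[0][0]

table = PhraseTable()
# Made-up "corpus" evidence, skewed toward formal phrasing.
table.observe("que pasa", "how is the day progressing")
table.observe("que pasa", "how is the day progressing")
table.observe("que pasa", "what's up")
print(table.translate("que pasa"))  # how is the day progressing

# A human reviewer corrects the output; the correction counts three times.
table.observe("que pasa", "what's up", weight=3)
print(table.translate("que pasa"))  # what's up
```

How much weight a correction should carry relative to the original corpus is the open design question; the factor of three here is arbitrary.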
How will it translate ambiguous headlines? (Score:3, Insightful)

by Chyeburashka ( 122715 ) writes: on Tuesday May 31, 2005 @12:25PM (#12684904) Homepage

I smiled when I read this recent headline:
Clinton tours devastated Bandeh Aceh.

Of course, I knew what the writer really meant. But the Babel Fish translation into French produces exactly the meaning which I first parsed when reading that headline.

Les excursions de Clinton ont dévasté Bandeh Aceh.

If machine translation becomes more common, perhaps English writers will have to be a little more careful.

- Re:Bubla *Cick BAle (Score:2)
  
  by wootest ( 694923 ) writes:
  
  Hey! My mother was a saint!
- Re:Bubla *Cick BAle (Score:2)
  
  by CyberKnet ( 184349 ) writes:
  
  That's the idea. Given enough examples of this gibberish and its counterpart in English, eventually the system could start to 'translate' it. Personally, I'd like to see this used for the reverse. Feed enough random input and English texts for their 'counterparts' and use the service to create a new language.
- Re:Other uses... (Score:2)
  
  by I confirm I'm not a ( 720413 ) writes:
  
  I can't wait for a Welsh version of firefox =P
  
  According to this Mozilla QA document [mozilla.org], Firefox should have had a Welsh locale since 1.0.2? Not that I've looked, the closest I come to speaking owt other than English is claiming I speak "Lallans" whilst in the West of Scotland... aye richt ;-)
- Re:how do they know? (Score:2)
  
  by pedantic bore ( 740196 ) writes:
  
  Perhaps they just look at it and ask themselves "if I was Chinese, would I say something like that?"
  Seriously, given the profound differences between Chinese and English, and each language's complexity, I'd be very impressed if it did a decent job. Until I see it working, however, I remain skeptical. After all, the field of machine translation has been around longer than most Google employees have been alive, and it's still got a long way to go.
- Re:Two thoughts (Score:2, Insightful)
  
  by Secret Agent 99 ( 855215 ) writes:
  
  If they use UN documents as a guide, the Google MT engine will be excellent at translating bureaucratese between languages. I'm not sure if that's a good thing!
  
  Exactly. And the UN surely has fairly rigorous QA processes for its translations. Now try expanding the corpus with more translated copy.
  
  In addition to feeding the system with translations that haven't been through formal QA (in many but not all cases), you also are now feeding it copy that has not had all the style deliberately squeezed out of i
