Automatic Translation Without Dictionaries


Posted by Soulskill on Saturday September 28, 2013 @05:42PM from the baby-steps-to-the-universal-translator dept.

New submitter physicsphairy writes "Tomas Mikolov and others at Google have developed a simple means of translating between languages using a large corpus of sample texts. Rather than being defined by humans, words are characterized based on their relation to other words. For example, in any language, a word like 'cat' will have a particular relationship to words like 'small,' 'furry,' 'pet,' etc. The set of relationships of words in a language can be described as a vector space, and words from one language can be translated into words in another language by identifying the mapping between their two vector spaces. The technique works even for very dissimilar languages, and is presently being used to refine and identify mistakes in existing translation dictionaries."
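
A minimal sketch of the mapping idea, not the authors' code. It assumes toy 2-D word vectors and a short list of known word pairs (`SEED_PAIRS`, hypothetical) from which to fit the map. A held-out word is then translated by mapping its vector and picking the nearest target-language vector by cosine similarity. All names and numbers below are made up for illustration.

```python
import math

# Toy 2-D "embeddings"; real models learn hundreds of dimensions from text.
EN = {"one": (1.0, 0.0), "two": (0.0, 1.0), "three": (1.0, 1.0), "four": (2.0, 1.0)}
ES = {"uno": (0.0, 1.0), "dos": (-1.0, 0.0), "tres": (-1.0, 1.0), "cuatro": (-1.0, 2.0)}
SEED_PAIRS = [("one", "uno"), ("two", "dos"), ("three", "tres")]  # hypothetical seed list


def fit_linear_map(pairs):
    """Least-squares 2x2 matrix W minimising sum ||W x - z||^2 over the pairs."""
    a = [[0.0, 0.0], [0.0, 0.0]]  # sum of x x^T
    b = [[0.0, 0.0], [0.0, 0.0]]  # sum of z x^T
    for src, dst in pairs:
        x, z = EN[src], ES[dst]
        for i in range(2):
            for j in range(2):
                a[i][j] += x[i] * x[j]
                b[i][j] += z[i] * x[j]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    inv = [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]
    # Normal equations give W = B A^{-1}.
    return [[sum(b[i][k] * inv[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def cosine(u, v):
    dot = sum(p * q for p, q in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def translate(word, w):
    """Map the source vector with W, then return the nearest target word."""
    x = EN[word]
    mapped = tuple(sum(w[i][j] * x[j] for j in range(2)) for i in range(2))
    return max(ES, key=lambda t: cosine(mapped, ES[t]))


W = fit_linear_map(SEED_PAIRS)
print(translate("four", W))  # "four" was not in the seed pairs
```

The closed-form fit is used here only because it is short and deterministic; an iterative optimiser would serve the same purpose on realistic data.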

This discussion has been archived. No new comments can be posted.


My hovercraft is full of eels! (Score:5, Funny)

by Anonymous Coward writes: on Saturday September 28, 2013 @05:45PM (#44981553)

My nipples explode with delight!

- Re: (Score:2)
  
  by Brad1138 ( 590148 ) writes:
  
  LOL, I was just watching my Monty Python DVDs last week and saw that episode. Very Funny.
- Re: (Score:2)
  
  by Alsee ( 515537 ) writes:
  
  This method is mÃfÆ'Ã Â© complÃfÆ'Ã throughly cromulent! I always use Google Translate to exÃfÆ'Ã Â© cuter all my messages through Slashdot franÃfÆ'Ã Ãf Â ais Ãf then English. You can not tell me mÃfÆ'Ã Â at all.
  - Re: (Score:1)
    
    by Anita Hunt (lissnup) ( 2913179 ) writes:
    
    You beat me to it!
- Re: (Score:1)
  
  by kloro2006 ( 1860846 ) writes:
  
  Ditto, ditto, ditto. And because it is able to generate an unlimited amount of close translations, the technology would it seems to me change our whole concept of dictionaries used by those who want to read foreign texts without external translations. Whenever I want to understand a word, what I want also is to learn it for its future appearances, but I can't learn a word without lots of examples. With the dictionary I have in mind, you find the word in a corpus of good literature in the language (if good
And what's the algorithm complexity? (Score:1)

by d33tah ( 2722297 ) writes:

Well, that sounds quite cool, but also makes me wonder how does the algorithm tell wrong associations from the good ones. These things can easily go up to n^2 complexity.
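
On the n^2 concern, a rough sketch, assuming the "associations" are simple windowed co-occurrence counts (an assumption for illustration; the summary doesn't say how they are computed). A dense table needs one cell per word pair, but storing only the pairs that actually occur stays much smaller, because most word pairs never appear near each other. `cooccurrence_counts` and the sample sentence are made up.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count ordered (word, neighbour) pairs within `window` positions."""
    counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(word, tokens[j])] += 1
    return counts


tokens = "the cat sat on the mat and the dog sat on the rug".split()
counts = cooccurrence_counts(tokens)

vocab = sorted(set(tokens))
dense_cells = len(vocab) ** 2  # what a full n x n table would allocate

print(len(vocab), "words;", dense_cells, "dense cells;", len(counts), "pairs stored")
```

On real text the gap between the two numbers tends to widen as the vocabulary grows, since each word co-occurs with only a small fraction of the others.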
- Re: (Score:1)
  
  by d33tah ( 2722297 ) writes:
  
  (I meant O(n^2) memory complexity.)
  - - Linearithmic (Score:2)
      
      by tepples ( 727027 ) writes:
      
      But there's a space between linear and quadratic called linearithmic, or O(n log n). Merge sort uses nested loops and lies in this space.
- Re:And what's the algorithm complexity? (Score:5, Funny)
  
  by SuricouRaven ( 1897204 ) writes: on Saturday September 28, 2013 @07:38PM (#44982187)
  
  Statistical translation is always going to have issues like that, but it can perhaps reach the 'good enough' point to hold a conversation with.
  I can easily see it getting confused by formal vs informal use. If it goes on association, eventually it's going to get 'lawyer' and 'extortionist' confused.
  
  - Re:And what's the algorithm complexity? (Score:4, Funny)
    
    by Anonymous Coward writes: on Saturday September 28, 2013 @08:07PM (#44982359)
    
    I too get lawyer and extortionist confused.
    
- Re: (Score:2)
  
  by FatdogHaiku ( 978357 ) writes:
  
  Awl hour go rhythms spume pizza!
- Re: (Score:2)
  
  by TaoPhoenix ( 980487 ) writes:
  
  This is of course a subset of the big overall AI problem.
  So I think (without spending hours on the Articles!) that somewhere either in this research or the next few sets past it, is a key clue. I think the algorithm is (making up a slightly silly sounding word) "Quadratically modular". In other words, nothing says the comp can only use one algorithm to start working on its meaning. Studies like to chop things down because researchers get nervous at Emergent Complexity in old style science results. But using
Pun + Her attitude arbitrary pleases me too. (Score:1)

by mynamestolen ( 2566945 ) writes:

Neither the article nor the PDF contains the word "pun". We're still a little way off. But hopefully we'll get better than this attempt from Google Translate: = She turned me off with her bossy manner. But Google Translate gets the OPPOSITE meaning, saying "Her attitude arbitrary pleases me too."
- Re: (Score:2)
  
  by mynamestolen ( 2566945 ) writes:
  
  hmmm?? slashdot doesn't easily accommodate unicode.
  - Re: (Score:3)
    
    by Kjella ( 173770 ) writes:
    
    Welcome to /. where we still party like it's 1999. We'll have colonies on Mars before this site gets unicode support.
  - Re: (Score:3)
    
    by tepples ( 727027 ) writes:
    
    Slashdot has a fairly strict code point whitelist because there were problems in the past with trolls using directionality override characters to break Slashdot's layout and big blocks of foreign characters to make not-ASCII ASCII art.
    - Re: (Score:2)
      
      by NonUniqueNickname ( 1459477 ) writes:
      
      ... to make not-ASCII ASCII art.
      So... just art?
      - Re: (Score:2)
        
        by tepples ( 727027 ) writes:
        
        I was referring to Shift JIS or Unicode glyph art, which extend the concept of ASCII art past the ASCII character set.
- - Re: make that the cat wise! (Score:2, Funny)
    
    by Anonymous Coward writes:
    
    Yes exactly. For sayings google translate works not so good now. But perhaps with this technique it will be to plums in the future.
    - Re: (Score:2)
      
      by Anne Thwacks ( 531696 ) writes:
      
      Since the TV subtitles have so many errors that they are impossible for humans to understand, I can't see this working in my lifetime. I suspect the cat is a weasel, rather than wise.
how would (Score:1)

by ozduo ( 2043408 ) writes:

'tight pussy" be translated?
- Re:how would (Score:5, Funny)
  
  by Anonymous Coward writes: on Saturday September 28, 2013 @06:02PM (#44981669)
  
  how would 'tight pussy" be translated?
  "Tight pussy" would be translated automatically, and without dictionaries. This is answered right in the headline.
  
- Re:how would (Score:5, Funny)
  
  by Jane Q. Public ( 1010737 ) writes: on Saturday September 28, 2013 @06:56PM (#44981973)
  
  "tight pussy" be translated?
  "The cat has drunk a saucer of wine."
  
- Re: (Score:3)
  
  by SuricouRaven ( 1897204 ) writes:
  
  Depends on source corpus. If they trained it using one of the usual formal collections of publications, it would only have built up associations based on the slang-free usage and so would translate it as 'Tight cat.' If they have instead fed it a broader selection, perhaps culled from a web spider, it may pick up the other meaning.
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    it may pick up the other meaning
    In a manner of speaking. The actual meaning of the words is completely irrelevant.
- Re: (Score:2)
  
  by smallfries ( 601545 ) writes:
  
  Dat cat was good to roll with, yo?
- Re:Sounds good, but we need a robust plug (Score:5, Funny)
  
  by Finallyjoined!!! ( 1158431 ) writes: on Saturday September 28, 2013 @06:14PM (#44981743)
  
  it gets full of lint
  
  What's it got in its pocketses?
  
- Re: (Score:3)
  
  by caseih ( 160668 ) writes:
  
  Agg. firefox put me on the wrong story... bye bye karma
  - Re:Sounds good, but we need a robust plug (Score:4, Insightful)
    
    by icebike ( 68054 ) writes: on Saturday September 28, 2013 @06:23PM (#44981799)
    
    Firefox had nothing to do with it.
    It was PEBCAK, pure and simple.
    
  - Re: (Score:3)
    
    by plover ( 150551 ) writes:
    
    With this story being about automated translations getting it very wrong, there was a 95% chance people would have thought you were just making a joke about Apple doing language translations!
    If you had posted a follow-up like "That's what Apple translate gets when I wrote 'Orchards of apple trees have fans to spray microscopic poison dust on all trees'", it would have been perfectly believable.
Darmok and Jalad at Tanagra (Score:4, Interesting)

by Vanders ( 110092 ) writes: on Saturday September 28, 2013 @06:03PM (#44981675) Homepage

Finally, the team point out that since the technique makes few assumptions about the languages themselves, it can be used on argots that are entirely unrelated.
Once again, Star Trek is ahead of the curve.

- Re: (Score:3)
  
  by Samantha Wright ( 1324923 ) writes:
  
  Incidentally, real life caught up [lbgale.com]—fortunately there's not much worth translating with such a low-bandwidth form of communication.
  - - Re: (Score:2)
      
      by Samantha Wright ( 1324923 ) writes:
      
      The way they describe their conceptual representation system (they give the example "king - man + woman = queen") makes it pretty clear that figurative language is completely out of the question.
- Re: (Score:2)
  
  by epine ( 68316 ) writes:
  
  Once again, Star Trek is ahead of the curve.
  If you don't count noticing that the gear cogs of the antikythera could be made ever smaller and smaller by ongoing advances in Swiss craftsmen 1600 years later, then Star Trek was indeed ahead of its time in guessing that a large phone might become a small phone with batteries (the Baghdad Battery [wikipedia.org] dates to roughly the same age as the antikythera) and a radio (1887) carried by some exotic flux such as neutrinos (as named by Fermi in 1933).
Hmmm... (Score:1)

by freshlimesoda ( 2497490 ) writes:

Makes me think about hash functions and flash storage and data interoperability..... future..
Hofstadter? Isn't this AI, not translation? (Score:5, Interesting)

by Etcetera ( 14711 ) writes: on Saturday September 28, 2013 @06:11PM (#44981733) Homepage

Reminds me a lot of the Fluid Concepts and Creative Analogies [amazon.com] work that Hofstadter led back in the day.
I don't see this directly working for translation into non-lexographically swappable languages (eg, English -> Japanese) very well, because even if you have the idea space mapped out, you'd still have to build up the proper grammar, and you'll need rules for that.
That being said.... Holy cow, you have the idea space mapped out! That's a big chunk of Natural Language Processing and an important step in AI development. ... Understanding a sentence emergently in terms of fuzzy concepts that are an internal and internally created symbol of what's "going on", not just using a dictionary and CYC [wikipedia.org]-like rules to figure it out, seems like a useful building block, but maybe I'm wrong.
Very cool stuff. Makes me want to go back and finish that CS degree after all.

- Re:Hofstadter? Isn't this AI, not translation? (Score:5, Interesting)
  
  by phantomfive ( 622387 ) writes: on Saturday September 28, 2013 @07:18PM (#44982089) Journal
  
  I don't see this directly working for translation into non-lexographically swappable languages (eg, English -> Japanese) very well, because even if you have the idea space mapped out, you'd still have to build up the proper grammar, and you'll need rules for that.
  According to the paper, this translation technique is only for translating words and short phrases. But it seems to work well for languages as far apart as English and Vietnamese.
  
- Re: (Score:2)
  
  by infinitelink ( 963279 ) writes:
  
  [...] Holy cow, you have the idea space mapped out! That's a big chunk of Natural Language Processing and an important step in AI development. ... Understanding a sentence emergently in terms of fuzzy concepts that are an internal and internally created symbol of what's "going on", not just using a dictionary and CYC [wikipedia.org]-like rules to figure it out, seems [...]
  
  Like not enough given the symbol-grounding problem.
- Re: (Score:2)
  
  by narcc ( 412956 ) writes:
  
  Understanding a sentence emergently in terms of fuzzy concepts that are an internal and internally created symbol of what's "going on", not just using a dictionary and CYC [wikipedia.org]-like rules to figure it out, seems like a useful building block
  Yeah, that's not what's happening at all.
Load of bollocks (Score:2)

by Skiron ( 735617 ) writes:

OK, I am just having a fag. I bet that will bugger it up.
- Re: (Score:2)
  
  by denzacar ( 181829 ) writes:
  
  Cats still dig fags?
Summary wrong (again) (Score:2, Flamebait)

by icebike ( 68054 ) writes:

Simply because you embed your dictionary in something you choose to call a vector doesn't make it any less of a dictionary.
It's still a dictionary, and also a thesaurus. Come to think of it, a thesaurus is simply a meaning-vectored dictionary.
What's old is new again.
Mathematicians, late to the party, still trying to drink all the punch.
- Re:Summary wrong (again) (Score:5, Insightful)
  
  by hey! ( 33014 ) writes: on Saturday September 28, 2013 @07:26PM (#44982131) Homepage Journal
  
  Simply because you embed your dictionary in something you choose to call a vector doesn't make it any less of a dictionary.
  True, but calling a dictionary a vector space doesn't make it so. For example, how "close" are the definitions of "happiness" and "joy"? In a dictionary, the only concept of "closeness" is the lexical ordering of the words themselves, and in that sense "happiness" and "joy" are quite far apart (as far apart as words beginning with h-a are from words beginning with j-o). But in some kind of adjacency matrix which shows how often these words appear in some relation to other words, they might be quite close in vector space; "guilt" and "shame" might likewise be closer to each other than either is to "happiness", and each of the four words ("happiness", "joy", "guilt", "shame") would be closer to any other of those words than they would be to "crankshaft"; and probably closer to "crankshaft" (a noun) than they'd be to "chewy" (an adjective).
  Anyhow, if you'd read the paper, at least as far as the abstract, you'd see that this is about *generating* likely dictionary entries for unknown words using analysis of some corpus of texts.
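  
  The "closeness" above can be made concrete. One common choice, assumed here since the comment names no specific measure, is the cosine of the angle between two words' co-occurrence count vectors u and v:
  
  ```latex
  \cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
             = \frac{\sum_i u_i v_i}{\sqrt{\sum_i u_i^2}\,\sqrt{\sum_i v_i^2}}
  ```
  
  For count vectors the value lies between 0 and 1. Values near 1 mean the two words keep similar company, and values near 0 mean they share almost no contexts. Spelling and alphabetical position play no part.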
  
  - Re: (Score:2)
    
    by abies ( 607076 ) writes:
    
    I think that joy is quite close to chewy (through bubblegum and caramel for example). Of course, I believe some people may get more joy from playing with well oiled crankshaft, but that's a personal preference ;)
- Re:Cat (Score:4, Insightful)
  
  by blue trane ( 110704 ) writes: on Saturday September 28, 2013 @07:21PM (#44982107) Homepage Journal
  
  jazz musician
  
  - Re: (Score:3)
    
    by dkleinsc ( 563838 ) writes:
    
    Rimmer, Lister
Dolphinese Will Now Be Understood (Score:4, Funny)

by MacroSlopp ( 1662147 ) writes: on Saturday September 28, 2013 @06:34PM (#44981867)

With this technology we should be able to understand Dolphin-talk.
It should also allow us to detect future ape rebellions before they happen.

- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  Thanks for all the fish!
  - Re: (Score:2)
    
    by Vanders ( 110092 ) writes:
    
    This has to be done. [youtube.com]
- Re: (Score:2)
  
  by SuricouRaven ( 1897204 ) writes:
  
  It's already been partially decoded.
  Most of the calls are individual identifiers unique to the individual. Makes a lot of sense. A dolphin pod is essentially free-floating a lot of the time in an ocean with no navigational markers and little indication of direction. They need some way to track each other to keep the group from getting split up.
- Re: (Score:2)
  
  by schlachter ( 862210 ) writes:
  
  wrt your first sentence. i don't think this is funny at all. it's an amazing opportunity.
- Re:the spirit is willing but the flesh is weak (Score:4, Interesting)
  
  by icebike ( 68054 ) writes: on Saturday September 28, 2013 @07:17PM (#44982083)
  
  Yes, the pretty vectors (nothing but lists of words) still have to be assembled by humans for the most part. Maybe not EVERY association, but enough of them that you can build relationships and associations indirectly, and achieve a roundabout translation, even if you end up having to go through 2 or 3 related languages to get there.
  After a few words of context are translated you can perhaps deduce the rest. But the idea that you can do so without a dictionary is ridiculous. And putting your dictionary into digital form and calling it a vector doesn't change the fact that you still have a dictionary associating an English word with a French word and a Mandarin word.
  
  - Re: (Score:2)
    
    by sourcerror ( 1718066 ) writes:
    
    It pretty much sounds like a dumbed down Wordnet.
    http://en.wikipedia.org/wiki/WordNet [wikipedia.org]
    http://wordnet.princeton.edu/ [princeton.edu]
Isn't that pretty much how Google Translate works? (Score:1)

by Anonymous Coward writes:

Tomas Mikolov and others at Google have developed a simple means of translating between languages using a large corpus of sample texts. Rather than being defined by humans, words are characterized based on their relation to other words.
Like how Google Translate has noticed that Danish domain names end in "dk" and therefore translates "dk" to "com", with "uk", "gb" and "en" as some of the other suggestions?
Sometimes "a simple means" can be too simple.
Old idea, new implementation? (Score:5, Interesting)

by Theovon ( 109752 ) writes: on Saturday September 28, 2013 @07:00PM (#44982003)

When I was in grad school, studying linguistics, computational linguistics, and automatic speech recognition, I recall it mentioned more than once the idea of using latent semantic analysis and such to do this kind of translation. So am I correct in assuming that this hasn't been done well in the past, and Google finally made it work well because they have larger corpora of translated texts?

- Re: (Score:2)
  
  by schlachter ( 862210 ) writes:
  
  yeah, it's about all these different corpuses coming online and being available to a single group, especially because in order to train, they need a one to one translation of a single doc. like a gov doc that's in both spanish and english is great fodder for the algorithm.
- Re: (Score:2)
  
  by k.a.f. ( 168896 ) writes:
  
  When I was in grad school, studying linguistics, computational linguistics, and automatic speech recognition, I recall it mentioned more than once the idea of using latent semantic analysis and such to do this kind of translation. So am I correct in assuming that this hasn't been done well in the past, and Google finally made it work well because they have larger corpora of translated texts?
  You are utterly correct. The idea of machine translation by looking up each word in a dictionary and shuffling the results around was big in the 1950s, but hasn't been since then. It became all too clear very early that this isn't the way to produce texts that a native speaker would ever say (or even comprehend). The barrier to doing this kind of context-dependent analysis was that the hardware wasn't there for a long time, and later the huge parallel corpora that are needed to make it work were missing. (Just
Old news (Score:4, Informative)

by richwiss ( 876232 ) writes: on Saturday September 28, 2013 @07:03PM (#44982021) Homepage

This is old news, going back to 1975. Yawn. http://en.wikipedia.org/wiki/Vector_space_model [wikipedia.org]
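
For context, a minimal sketch of the classic vector space model that the link describes: each document becomes a vector of term counts, and documents are compared by cosine similarity. The sample sentences and function names are invented for illustration.

```python
import math
from collections import Counter


def doc_vector(text):
    """Represent a document as term -> count (one dimension per word)."""
    return Counter(text.lower().split())


def cosine(u, v):
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


v1 = doc_vector("the cat chased the mouse")
v2 = doc_vector("the mouse feared the cat")
v3 = doc_vector("stock markets fell sharply")

print(round(cosine(v1, v2), 3))  # shares "the", "cat", "mouse"
print(cosine(v1, v3))            # no words in common
```

Here the vectors describe documents and the dimensions are words, which is the reverse of giving each word its own vector.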

- Re: (Score:2)
  
  by smallfries ( 601545 ) writes:
  
  That is something different: each document is a vector and each word is a dimension.
- Re: (Score:2)
  
  by SuricouRaven ( 1897204 ) writes:
  
  Unless the plot calls for a breakdown of communication, in which case the language will be too 'complex' for the universal translator.
  - Re: (Score:2)
    
    by jedidiah ( 1196 ) writes:
    
    Damn you Darmok!
    - Darmok and Jihad at Viagra (Score:2)
      
      by tepples ( 727027 ) writes:
      
      The allusion-heavy Tamarian language [tvtropes.org] has real-world analogs, such as Tropese [tvtropes.org] and the tendency for users of sites closely linked to 4chan to talk in memes.
When and where matters (Score:2)

by gmuslera ( 3436 ) writes:

The meanings of words, and their translations, vary with time and location. Inferring meanings from texts from 20 years ago or from another country, state or even region inside a state, even if the language is the "same", could be risky. There have been a lot of marketing problems thanks to this kind of bad translation [socialnomics.net].
Like so many of these algorithms (Score:4, Interesting)

by holophrastic ( 221104 ) writes: on Saturday September 28, 2013 @08:21PM (#44982427)

They do a great job of improving the precision of what used to be mediocre. And then, as a direct result, they not only make the errors worse, they make the errors undetectable.
CAT: small, furry, pet.
BIG CAT: big, furry, pet.
Um. Both are orange. One's a tabby. One's a tiger.
It's not good enough that your translation system has a 99% accuracy whereas the old one has a 90% accuracy. What matters is that the old one's 10% error rate sounded like an error (e.g. tiger becomes monster), whereas your new one's 1% passes the Turing test and can't be discerned by an intelligent listener (e.g. tiger becomes tabby).
"My friend owns a monster." -- Your friend owns what? I don't think you meant a monster. -- "eh, you know, a very big dangerous jungle cat" -- oh, like a lion -- "not a lion, it has stripes" -- oh, a tiger.
"My friend owns a tabby." -- Ok.

- Re: (Score:3)
  
  by flimflammer ( 956759 ) writes:
  
  "My friend owns a monster." -- You friend owns what? I don't think you meant a monster. -- "eh, you know, a very big dangerous jungle cat" -- oh, like a lion -- "not a lion, it has stripes" -- oh, a tiger.
  Do you frequently converse with machine translators that elaborate the meaning of their mistranslations? Would be interested in knowing which one is capable of that. See when I use them it's what-you-see-is-what-you-get and I have to pick at the original source text with a dictionary to learn monster actually means tiger. That they can nonchalantly narrow the meaning down for you in a Star Trek-esque computer conversation is leaps and bounds ahead of what I'm used to!
  Sarcasm aside for a moment, you're actua
  - Re: (Score:2)
    
    by holophrastic ( 221104 ) writes:
    
    That's almost my complaint. It's not that I won't notice the errors. It's that I won't notice the errors when they are spoken. I'll notice the errors when I get bitten by a tiger after reading a sign that says "beware of cat".
    It's important for miscommunication to be identified during the communication protocol.
Still needs dictionaries (Score:3)

by raju1kabir ( 251972 ) writes: on Saturday September 28, 2013 @09:16PM (#44982597) Homepage

Anyone who regularly uses Google Translate has seen the problems that come with this approach.
It "translates" analogous terms in ways that make no sense. Translate "Amsterdam" from Dutch to English and it often gives you "London". Same with kilometres / miles, and other things that significantly change the meaning of the text.
Without some hand-crafted guidance, the outcome can be much less useful than the more rough-sounding word-by-word machine translations from days of yore.

- Re: (Score:2)
  
  by sourcerror ( 1718066 ) writes:
  
  On the other hand it's much better at translating idioms or expressions where the component words have a lot of different meanings.
Synonyms (Score:2)

by manu0601 ( 2221348 ) writes:

I wonder how they handle synonyms, which may be much more prevalent in one language than in another.
If the destination language is poorer in synonyms than the source language, this is straightforward, and the automatic translation will just miss subtle points that cannot be translated without a periphrasis. In the opposite case, moving from a synonym-poor language to a synonym-rich language, the computer needs to choose the right word, and doing so requires some understanding of the context.
And
- Re: (Score:3)
  
  by Panoptes ( 1041206 ) writes:
  
  Synonyms are only the tip of the iceberg: there are so many other problem areas. Collocations (words that 'go together'): we can say 'a tall boy', but not 'a high boy'; 'a large beer', but not 'a big beer'. Connotations (attitudes, feelings and emotions that a word acquires): compare 'a slim girl' with 'a skinny girl'. Idioms: 'hot potato' and 'red herring' cannot be translated directly into any other language. Add irony and sarcasm to the mix, class and regional usage, dialects, diglossia (for example, d
  - Re: (Score:3)
    
    by manu0601 ( 2221348 ) writes:
    
    I understand that collocations are addressed by their model: they study texts to discover that 'boy' may be preceded by 'tall' but not by 'high', and that in French, 'garçon' may be preceded by 'grand' but not 'haut'. That enables them to translate without a hitch.
    But even adjective handling may come with traps. Adjectives in French may appear before or after a noun. You may say 'un grand garçon' or 'un garçon grand', and the meaning is the same most of the time. But there are exceptions! 'un typ
What I want from a translator (Score:2)

by snadrus ( 930168 ) writes:

1. Rough word-by-word is the beginning
2. Sentence structure reorganization
3. Idiom recognition.
4. Connotation, Tone, Irony
5. Generation / Area / Nature: How a native listener can determine details about the speaker.
The result will always be annotated-looking with warnings for plays-on-words, and will always be longer with maximum detail extraction from the source language.
I'm sure there's more to do after these items are done.
Another way of understanding language translation (Score:2)

by beachdog ( 690633 ) writes:

"Vector spaces" is the heart of the Google proposal. Previous posters have disassembled the weaknesses pretty well.
The thing a "vector spaces" analysis needs is specific vector mapping based on the sounds of speech, the rhythms of a language, the breathing of the speaker and the physical proximity of the parts of the brain associated with hearing and the parts associated with speech.
Multiple languages exist because the growing infant's brain organizes the sounds it hears by passing the neural sensations t
Star Trek Universal Translator anyone? (Score:1)

by Anonymous Coward writes:

Looks like this could be the beginning of a Universal translation scheme. Next all we need is to add voice recognition to this and Star Trek tech comes alive once again!
Computational Linguistics (Score:1)

by Anonymous Coward writes:

Lately, technology magazines regularly run articles about topics in computational linguistics (CL) that are blatantly ignorant of current research. This is just another example.
The time when dictionaries were used for applied machine translation has been history for 10 years already. Statistical machine translation (SMT) and the techniques described here were not developed by Google. In fact, the basic idea of SMT is over 25 years old and distributional semantics is over 50 years old. Ph
Devil is in the details, (Score:2)

by Antony T Curtis ( 89990 ) writes:

The amusing side effect of the effectiveness of statistical machine translation is that more and more people would use machine translation instead of employing fluent humans to do the translation and that is where the fun begins: The machine-translated phrases will reenter the corpus as seed data and as the percentage of human-origin data in the corpus reduces, so does the quality of the translations as subtle errors are magnified over time.
There should be some kind of "fingerprint" added to the machine tra
