The Internet Science

Using The Web For Linguistic Research

Posted by timothy
from the that's-rediculous dept.
prostoalex writes "The Economist says linguists are gradually adopting the World Wide Web as a useful corpus for linguistic research. Google is used, among other resources, to research how the written language evolves and how some non-standard examples of usage become more or less acceptable (The Economist quotes the phrase 'He far from succeeded,' where 'far from' is used as an adverb). LanguageLog is a resource linked in the article, where linguists discuss current peculiarities of the English language."
This discussion has been archived. No new comments can be posted.

  • by Peter Cooper (660482) on Sunday January 23, 2005 @04:56AM (#11446691) Homepage Journal
    It's probably a good thing that they steer away from Slashdot as a corpus of English usage. Or, should I say, in SOVIET RUSSIA it's best Slashdot stays away from THEM! Or is it that only old people use the Internet as a corpus of the English language while pouring hot grits down a naked and petrified Natalie Portman's pants?
    • That's true, but only in Japan.
    • Hopefully, they'll harvest well written webpages for data and not those of 13-year old girls drooling over Orlando Bloom, AOL users, or porn sites.

      Actually, I take that back.

      It could actually be very interesting from a lexical or morphological point of view. The phenomenon of abbreviating words, such as "u" for "you" or "ur" for "you're" or "ru" for "are you." Language teachers in classrooms have been seeing it crop up in actual homework assignments. While reading such language may be like having glass
      • Because the keyboard is still the main way of inputting information into the computer, people take shortcuts

        One thing that's always been at the front of my mind, why aren't these kids learning how to type? Or at least to type with any reasonable amount of skill. The only computer I had as a child was a Commodore 64, and I was still faster than most of today's youth even with their abbreviations. I was somewhat lucky in that our schools somehow foresaw the advent of the home computer and made sure we kne
        • by Anonymous Coward

          One thing that's always been at the front of my mind, why aren't these kids learning how to type?

          Because, unlike the parent's assumption, the phenomenon isn't related to computers. It's related to text messaging. It might be just as fast to type "you" instead of "u" with a keyboard, but it's noticeably slower on mobile phones, especially before predictive text became popular.

          Furthermore, there is a limit on how many characters you can send in a single message. Most service providers automaticall

          • Because, unlike the parent's assumption, the phenomenon isn't related to computers. It's related to text messaging. It might be just as fast to type "you" instead of "u" with a keyboard, but it's noticeably slower on mobile phones, especially before predictive text became popular.

            I'd still advance the argument that extensive use of computers by a larger portion of the population has contributed to the phenomenon. I remember seeing those abbreviations before cell-phone use became almost ubiquitous.

            It's al

      • It's troubling to read so many comments that worry that the linguistic researchers will find "bad language", and worse, that people have moderated such comments up. It reflects a misunderstanding of what linguists do: they want to get a description of the language as it is used, and as it changes, and historically speaking, usages that start in the gossip of teenage girls often become mainstream a couple of generations later. They need it all, and they probably need the crappy stuff most of all, because
        • Looks like someone didn't read my entire comment.

          I specifically stated that I was retracting my initial comments, but I kept them in there as a tongue-in-cheek statement. I never said anything about "bad language." I'm well aware of the difference between how language is considered by linguists and English teachers.

          As to the relation of written language on the web to spoken language, I don't think that's been established. I know that in dialogue systems, which deal with spoken language as opposed to wr
      • Ebonics was never proposed as a valid dialect for use in the classroom. If you've got a bunch of students that can only speak French, and you want to teach them English -- it helps to know enough French to understand what they're trying to say.

        Same concept. Whole thing got hijacked by politics.

        --Dan
    • What do you think Natalie Portman would do if she actually viewed Slashdot some time?
    • omg lol wtf r u tlaking about!!!!111 slshadot englsh is gret i dont know hwo you guys cud nock it!!!!11

      Yeah...

      Linux is a community made up of mostly literate folks who generally understand the language well enough to be understood. Zealots complaining about the use of apostrophe 's' as a possessive, you can do worse than Slashdot in terms of grammar, easily.

      That said, only pedants can claim perfect spelling and grammar at all times, ever.
  • Indeed (Score:4, Funny)

    by Pan T. Hose (707794) on Sunday January 23, 2005 @04:57AM (#11446693) Homepage Journal
    Indeed what their sayin is true. U can learn English very well, especially grammer readin /. frist psots. Teh intarweb seems to certainly kick arse for that sorta research. Very 1337 articel. Thx d00dz.
  • by sandstorming (850026) <[moc.gnimrotsdnas] [ta] [eesnhoj]> on Sunday January 23, 2005 @04:57AM (#11446694)
    When we might actually say words like 'lol' out loud. Imagine a deal going down between two mining companies and the CEO of one company with a straight face, and deadly serious demeanour saying to the cameras: "Despite many thinking we pwned them in the deal, we believe it came out leet for every1"
  • by Anonymous Coward
    more than just web users are adjusting to this shift in language. i countinously question my co-workers (social workers) in telling the youth what is propper and not. if a launguage does not evolve then it dies. using words, moslty slang and rap song lyrics, is becoming more than just the normal, it is becoming the standard.
    • by Kafir (215091)
      i countinously question my co-workers (social workers) in telling the youth what is propper and not.

      I'm glad they're telling the youth what is proper; you're clearly incompetent to do so.

      using words... is becoming more than just the normal, it is becoming the standard.

      Is that right? Using words is "becoming more than just the normal"? I've been using words for years now; I'm glad to hear that's becoming the standard. Your post is a perfect example of why people should learn to write in something a
  • Epiphany (Score:2, Funny)

    by phaln (579585)
    It came to me that the English language was in deep trouble when people started saying "rotfl" and "lol" in person. There seems to be kind of a backlash brewing though, with improved email composition styles dictated by employers, and such.
  • Google does it again (Score:3, Interesting)

    by vladd_rom (809133) on Sunday January 23, 2005 @05:03AM (#11446708) Homepage

    This is not the first time that Google (and search engines in general) has changed how we do things.

    Nowadays copyright holders use Google to search for potential violations of their intellectual property. Plagiarism is easy to detect nowadays thanks to Google as well. Instead of using rather expensive [turnitin.com] systems to search for duplicate work, teachers are now one search away from distinguishing original work from the rest.

  • by Anonymous Coward
    It be now official. Netcraft gots confirmed, dig dis: *BSD be dyin'

    One mo'e cripplin' bombshell hit da damn already beleaguered *BSD community when IDC confirmed dat *BSD market share gots dropped yet again, now waaay down t'less dan some fracshun uh 1 puh'cent uh all servers. Comin' on de heels uh a recent Netcraft survey which plainly states dat *BSD gots lost mo'e market share, dis news serves t'reinfo'ce whut we've knode all along. What it is, Mama! *BSD is collapsin' in complete disarray, as fittin'

  • hammerrevolution.com [hammerrevolution.com] --;
    • person 1: like person 2: like person 1: --; person 2: yea
    • Oh christ, not you whackjobs again. You've infested our forum [dailyjolt.com]. Sure, --; is a neat emoticon, but when could one ever use it? On a separate note, anyone cringe when reading "He far from succeeded."? On a completely separate note, anyone notice how programmers write with slightly different grammar? Extra punctuation always goes outside the ", never inside, as above.
      • Programmer grammar (Score:3, Insightful)

        by cbr2702 (750255)
        Adding or changing characters in a literal string seems like misquoting. Traditionally in handwritten work the comma went almost directly under the quotation mark. When people shifted to typewriters and then computers, an arbitrary choice was made to put the comma first. Most programmers I meet seem to have reversed that choice.
  • by Anonymous Coward on Sunday January 23, 2005 @05:23AM (#11446742)
    There are more non native speakers on the web then
    native speakers.
    In the European community the native English
    speaking persons are by far a minority. That way
    French expressions are poring into the language
    in an unstoppable way. Those expressions are then
    used by native speaking politicians and are
    broadcasted by television. That way they enter the
    mainstream of the English language.

    Regards
    • by Anonymous Coward
      There are more non native speakers on the web then
      native speakers.

      Of course, non-native speakers have generally less trouble distinguishing "then" from "than" than the so-called "native" speakers do. You might speak it natively, but remember, you don't write it natively.

    • by Anonymous Coward
      Ahhh run for the hills the French are coming!!!
    • Who needs to be careful? Hopefully the Internet *will* cause languages to merge. It could be like the Tower of Babel in reverse. Wouldn't it be great if there was a unified global language?

      Now I know some people would be quite upset at the horrible "loss" of cultural diversity implied by a single global language. But we can be just as diverse in many other ways that don't cause us to be unable to communicate with each other on a basic level. And IMHO, being able to communicate is much more important

      • Wouldn't it be great if there was a unified global language?
        Now I know some people would be quite upset at the horrible "loss" of cultural diversity implied by a single global language. But we can be just as diverse in many other ways that don't cause us to be unable to communicate with each other on a basic level. And IMHO, being able to communicate is much more important than some academic's ideal of "cultural identity".


        Okay... how about the complete loss of the ability to read any of the world's litera
        • You're overdramatizing. This is a process that will take hundreds if not thousands of years, even with technology helping to accelerate it. It's not like we'll wake up 10 years from now with a unified language and forget how to read today's literature!

          By the time we have a unified language, we'll have a whole new set of literature to go along with it. Today's literature will be like ancient greek literature, and yes, it will only be readable by people with special training. It will need to be translat

          • by monecky (32097)
            > Academics care about linguistic diversity in an abstract sense, but normal people really don't.

            I think you're a bit wrong on this. There are around 6,800 languages [ethnologue.com]. Most languages have developed their own culture. Do you really think millions of people around the globe would be willing to lose their identity?

            For example, after the collapse of the Soviet Union, Uzbeks started replacing Russian loan-words with the original Uzbek words.

            Paul Rodrigues
            • Those millions of people won't lose their identity, because they will die long before the transition is complete. Remember we're talking about a process that takes a *long* time. The children of those people will grow up using a language that is slightly changed, and so on.
          • While I agree it would be a good thing to have a unified language, I don't see it happening to the fullest extent. This is because of the different alphabets used in some languages. I can easily see the European languages merging (German and Latin based), possibly including the Slavic ones but perhaps they would form another set. And there might be some Middle Eastern languages that remerge. And the Asian languages of China, Japan, Korea, Viet. etc..

            But I can't imagine those super groups ever merging
            Of c
          • Of course, if we all move to speaking some kind of ubermetalanguage, that implies that the languages spoken today would be lost. That would be a sad thing.

            Language is a reflection of culture, and culture, to date, is a deeply regional thing. The standard example is that the Inuit people of Alaska and Canada have dozens of words for snow; while this seems to be not entirely accurate [straightdope.com], the general point stands that different groups have richer or poorer ways of expressing concepts based on their collective

            • Firstly, the languages wouldn't be lost, they simply would be known by far fewer people. I don't see that as a sad thing, because I don't see the point in making people speak different languages when a single rich language with regional variation would be just as culturally rich and much more practically useful. Of course regional variation would endure, and it might even become more prominent than it is today in English, as you suggest. But the base language will be a common language, and people from al
    • by new500 (128819) on Sunday January 23, 2005 @07:40AM (#11447059)
      . . .

      Those expressions are then
      used by native speaking politicians and are
      broadcasted by television.


      Dude, it's worse, the French have already infiltrated as far as the advertising business and are using covert channels to spread some dangerous crack i heard was called La Liberte :

      http://french.about.com/b/a/081281.htm

      Slightly more seriously :

      Apart from pointing out that your use of the word native is rather presumptive of geographic origin in this big wide internet thing, i wonder if this linguistic adoption is more one way towards English since the internet. OK the French got Le Weekend, and tons of anglicised nouns, tried to ban them all and didn't manage. But i read Friday that a British pilot training firm lost a contract to a French one. The reason cited by the Asian airline was that, whilst the training had to be in English, the French trainers spoke better, clearer, more intelligible English than did the English. I can't argue with that. Sadly.
    • Such as carefull, poring, broadcasted? ;)
    • Don't worry, according to the French we're doing far greater damage to their language and culture.
    • There are more non native speakers on the web then native speakers.

      Good. Less to worry about whenever they get restless.

      That way French expressions are poring into the language in an unstoppable way.

      Ah, but when you pore into the language, the language also pores back into you.

  • by Anonymous Coward
    I've used the web for corpus linguistics research. My last big project was to look at a lot of web pages with Mexican and Chilean slang Spanish, and see if there was a difference in vocabulary usage. There was a significant difference; I could, 70% of the time, tell if a given passage was Chilean or Mexican Spanish.

    I could have gotten a higher accuracy rate, but this was just a simple undergraduate project.
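The comment does not say which method was used, so the following is only a rough sketch of one way such a vocabulary-based test could work: a word-count model per country with add-one smoothing and equal priors. The function names and the four training snippets are made up for illustration; a real project would need a far larger sample of pages.

```python
import math
from collections import Counter


def train(labelled_texts):
    """Count word frequencies per label from (label, text) pairs."""
    counts = {}
    for label, text in labelled_texts:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def classify(counts, text):
    """Return the label whose smoothed word model scores the text highest."""
    vocab = set().union(*counts.values())
    scores = {}
    for label, words in counts.items():
        # Add-one smoothing; the extra +1 reserves probability for unseen words.
        total = sum(words.values()) + len(vocab) + 1
        scores[label] = sum(
            math.log((words[w] + 1) / total) for w in text.lower().split()
        )
    return max(scores, key=scores.get)


# Hypothetical training snippets, not data from the original project.
training = [
    ("chile", "cachai que mi pololo llegó tarde po"),
    ("chile", "la fome fiesta po cachai"),
    ("mexico", "órale güey qué chido está tu carro"),
    ("mexico", "no manches güey está padre"),
]
model = train(training)
print(classify(model, "mi pololo es fome po"))
```

Measuring accuracy would then mean holding back some labelled passages and counting how often `classify` agrees with the label.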
    • Though I've done it at a higher level of the educational system (while doing a Ph.D. in Linguistics). The big, big advantage of using search engines is the sheer size and variety of the content available on the web. For a number of things, there is simply no other way to get enough examples, because the phenomenon you're interested in is just too rare. The downsides are repetitiveness (it's often the case that you get the same document a lot of times at many different URLs; for example, song lyrics), typ
  • Without RTFA my first instinct is to say why post anything related to natural language on slashdot? But the truth is, as a sysadmin/webmaster/anything that plugs into an outlet for a small credit union I am appalled at the way people want to write on the web. It's hard to describe, but see (for the moment) this [usalliancecu.org] for a crippled example (yeah, a work site published externally, FSCK'ing horrible - more where that came from). Anyhow, it seems the second people publish shit on the web they give up on grammar/pu
  • There needs to be an annual prize for the highest compression ratio using random pages from the web as the corpus. This would probably do more for real advancement of artificial intelligence than the Turing competitions.
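For anyone wondering how such a contest might be scored, here is a minimal sketch that measures a compression ratio with Python's standard `zlib` module. The general-purpose compressor is only a baseline stand-in (contest entries would presumably be much smarter), and `sample_page` is a made-up placeholder rather than a real web page.

```python
import zlib


def compression_ratio(text: str, level: int = 9) -> float:
    """Return original size divided by compressed size, both in bytes."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("cannot compute a ratio for empty input")
    compressed = zlib.compress(raw, level)
    return len(raw) / len(compressed)


# Hypothetical stand-in for a page sampled at random from the web.
sample_page = "the quick brown fox jumps over the lazy dog. " * 200
print(f"ratio: {compression_ratio(sample_page):.1f}")
```

A higher ratio means the compressor has captured more of the regularity in the text, which is the link to language modelling the comment is hinting at.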
  • Non-official English (Score:2, Informative)

    by Anonymous Coward
    Unlike French and Italian, there is no official institution that defines 'correct' English. Essentially, the English-speaking world just 'makes it up' as it goes. Thus when I see the adverb 'really' butchered into 'real' I must try not to get annoyed. i.e. It's real hard to use your mother tongue. vs. It's really hard to use your mother tongue. Please help me here - is the misuse/non-use of 'really' something that's taught in school?
    • by Kafir (215091)
      From Merriam-Webster Online [m-w.com]:

      real (3, adverb): VERY (he was real cool -- H. M. McLuhan)
      usage Most handbooks consider the adverb real to be informal and more suitable to speech than writing. Our evidence shows these observations to be true in the main, but real is becoming more common in writing of an informal, conversational style. It is used as an intensifier only and is not interchangeable with really except in that use.

      I'd say you're fighting a losing battle on this one. I'm not too bothered by it, e

    • These things can change over time. After all, in German there is in most cases no distinction between adverbs and adjectives, no "ly" suffix (adjectives get suffixes to agree with the gender and case of the nouns that they modify, but in some forms there is no suffix). It is possible that "ly" could disappear over time.

  • Good! Natural language is a moving target. The web is an excellent communication medium and ignoring it would be quite a
    silly move. The example reminds me of "To boldly go", which was not proper, but its elegance is hard to argue against.
    • In fact there are some arguments about the To Boldly go etc.

      Apparently written English Grammar varies so much from how it is often spoken as the rules were written down by a Scholar in latin who firmly believed that English should conform to the same rules - even though it doesn't

      A careful poke at this 17th century book ( thereabouts - which sets the standard for modern grammar ) means that even Shakespeare wrote bad grammar, and he isn't the only one.

      So in fact correct grammar isn't so correct at all an
      • the rules were written down by a Scholar in latin who firmly believed that English should conform to the same rules

        My understanding is that the banning of split infinitives was never a hard and fast rule, even among good writers; Orwell certainly dissented. Infinitives in Latin (and French, German, Italian and Spanish[1]) can't be split anyway, as they are one word.

        So in fact correct grammar isn't so correct at all and should be taken with a pinch of salt.

        There's a middle ground. How many moles of NaCl

  • by adam31 (817930) <.adam31. .at. .gmail.com.> on Sunday January 23, 2005 @05:49AM (#11446804)
    How do you even pronounce 'pwn3d' ? Google is not a tool to study speech patterns, and there's nothing to say that speech even resembles written text.

    The article addresses this in a weird way, where it first draws attention to the distinction, but once it reaches its crux, where google is used as a tool, the distinction is ignored entirely; instead it opts to focus on stranger things.

  • by Frogbert (589961)
    I woulda thght such a thng was unpossible.
  • by KiloByte (825081) on Sunday January 23, 2005 @06:33AM (#11446892)
    Yes, we can record the errors made by the uneducated public (and even those done by, uhm, me). The question is: should we do that or not?

    I was pretty taken aback when a council of linguists in Poland suddenly declared some widely chastised and not even very popular errors to be valid usage. I was brought up in circles of people who not only put a lot of stress on the language you use, but also cruelly point out every incorrect word or phrase you use -- and this made me quite intolerant of bad speech.

    Being but a dirty foreigner, I know that my English can sound bad in the ears of native English speakers -- that's why I sometimes ask people to correct me if they spot errors.

    In other words: some people find careless speech repulsive. Thus, we should do whatever we can to promote correct usage as opposed to legalising incorrect uses.
  • When you are unsure between two spellings of a word, check the search result counts in Google. I've used that trick.

    Then again, my idea of fun is to use Google counts to find the words that get misspelt (Google ratio with misspelled: 5%) the most often.

    I thought compatable was common, but I only get a 1% ratio there. Maybe there should be a category 'non native'.

    Is conneXion considered an error? I like it much better than connection.

    Just now I found out that there are lists, e.g. at most commonly misspelled w [world-english.org]
    • Is conneXion considered an error? I like it much better than connection.

      It's correct, but British. Just like colour/color, or theatre/theater. Or foetus/fetus, though that doesn't seem to come up so often.

      connexion
      Pronunciation: k&-'nek-sh&n
      chiefly British variant of CONNECTION

      Did it never occur to you to check an actual online dictionary [dictionary.com]? I use google to see if my usage of a word or phrase is acceptable (or at least common), but a dictionary is probably a better bet for spelling.
  • by Dracos (107777) on Sunday January 23, 2005 @06:34AM (#11446894)

    I think that for most of the 20th century, English, and most languages in the industrialized world, was largely static, dominated by the written word which was dominated by proper grammar. Since WWII, popular culture and faster communications have increasingly exposed us to local vernaculars, mostly through radio and television. The written word lagged behind in its cultural evolution.

    Thanks to the internet (initially email, BBS's and IRC, but more widely known on the Web), we now have a hybrid of the spoken and written word: the "typed word". This form of language evolves at the same rate as the spoken word, and injects its own vernacular as a side effect of the medium: acronym and abbreviation "words" (rofl, how r u), along with common misspellings (pwned), and mixing letters with numbers or punctuation (133t, n00b). All of these serve at least one purpose, whether as a form of super shorthand, insult, the appearance of being "cool", or are merely the result of laziness on the part of the author. Most typed-word terms don't transfer well when spoken.

    One of my hobbies is studying (European) languages and how they are related. Sometimes I worry about the damage the typed word is causing to the spoken and written word (and any proper linguist should at least be interested in the phenomenon). Luckily, most typed word expressions aren't pronounceable, and the ones that are sound absurd, because they are removed from their original context when spoken, and everyone recognizes gibberish when they hear it. How the typed word affects the written word remains to be seen. Yes both are typed now, but only the written word has a chance of going through an editorial process. I think it will take a very long time for the formal lexicon and rules of grammar to embrace, however reluctantly if ever, the typed vernacular.

    • I think that for most of the 20th century, English, and most languages in the industrialized world, was largely static, dominated by the written word which was dominated by proper grammar. Since WWII, popular culture and faster communications have increasingly exposed us to local vernaculars, mostly through radio and television. The written word lagged behind in its cultural evolution.

      You do realise that most of the 20th century happened after the second world war, don't you? A condition that became false

    • There has always been a distinction between conversational speech and formal speech. Someone talking on the radio or giving a lecture will use different grammar from what someone interactively talking to a few people would use. They will use expressions which are considered slang or inappropriate, and will mangle the sentence structure for purposes such as getting the sentence over with as soon as its meaning has been conveyed. Diction is often traded for speed in set phrases (e.g., saying "how are you?" wi
    • I comma for number one comma fail to see your point period.
  • I've had the chance to use Google as a grammar or style checker in my day job as a glorified copy editor. I type two nearly identical expressions X and Y in the search box. If expression X gets 10,100 hits and expression Y only 500 hits, I use expression X.

    For example, as a non-native speaker, I found myself waffling between the expression (A) "run for mayor of" and the expression (B) "run as mayor of." Letting Google arbitrate, I found 14,900 hits for (A) and only 200 hits for (B). I chose (A).

    I
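The "let the hit counts arbitrate" trick above boils down to a simple comparison, sketched here. The counts are the ones quoted in the comment (taking the second as 200) and are typed in by hand; nothing is fetched from a search engine, and the helper name `pick_more_common` is made up for this example.

```python
def pick_more_common(hit_counts: dict[str, int]) -> str:
    """Return the phrase with the most hits; ties go to the first one listed."""
    if not hit_counts:
        raise ValueError("need at least one phrase")
    return max(hit_counts, key=hit_counts.get)


# Counts copied from the comment, entered manually.
counts = {
    "run for mayor of": 14_900,
    "run as mayor of": 200,
}
best = pick_more_common(counts)
share = counts[best] / sum(counts.values())
print(f"{best!r} accounts for {share:.1%} of the hits")
```

Looking at the share rather than the raw winner is a small safeguard: a 55/45 split says much less about correct usage than a lopsided one.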

  • Linguists are gradually adopting the World Wide Web as a useful corpus for linguistic research.

    I love a bit of cunning linguistics.

  • by Slur (61510) on Sunday January 23, 2005 @06:49AM (#11446935) Homepage Journal
    ...which was this little program I wrote around the nascence of the internet. It took any sentence as input and kept a record of which words preceded each word, and which words followed each unique word. The idea was to build up a simple map of which words could precede or follow others completely without context. From this you could follow paths that made sentences or paths that looped forever, or paths that made no sense, and some interesting paths that made unintended sense.

    Why a tree? Language and genealogy seem to have a common thread. Meaning is like genetics. Language is expressive. Information is a kind of tree whose branches grow as reality elaborates and past events accumulate. New terms need to be invented for the dynamics we perceive in reality, just as new names are given to individuals as they emerge into the world. Patterns, continuity, periodicity. Such things lie at the heart of material existence and provide the hooks for consciousness itself. Information theory is the next great frontier, along with particle physics. Already they have converged and diverged and converged again. And playing with artificial trees turns out to be a lot of fun.

    As for the "Meme Tree" program ... The next iteration built up a more discrete map by scoring proximity of unique words in sentences and inclusion in sentences together. Again, the idea was to build a simple statistical map free of any context, simply to get a sense of pure lexical association.

    The theory is that the internal consistency of these various lexical maps should roughly reflect many aspects of associative meaning. You could think of the statistical map as a Godelian bubble whose "truth" - if you will - is imposed by the laws governing the statistical associations. We don't derive the laws of language and meaning from these exercises, but we create an internally-complete map that reflects something about the nature of meaning.

    There is a practical aim as well. If you can derive the strength of equivalence and the various levels and colors of associative meaning you could in theory build a "Truth Machine" capable of answering any question with a high degree of accuracy. The result of any question could be computed as any other information retrieval problem would be.

    I never got around to having my little Meme Tree programs scrape the internet for random sentences. However, this should be a very simple thing to do. Google has had programming contests in the past - programs that use the Google database in interesting ways. Statistical analysis of language is basically what they do. Research projects on their data could provide stunning insights into the nature of information itself, its relation to language and to reality, and likely into our very nature as linguistic beings.
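The first version of the program described above (a record of which words precede and follow each word, plus paths through that record) could look roughly like this. It is a reconstruction from the description only, not the original code; the function names, the two example sentences, and the "always take the alphabetically first successor" rule are assumptions made to keep the sketch deterministic.

```python
from collections import defaultdict


def build_word_map(sentences):
    """Record, for each unique word, which words precede and follow it."""
    before = defaultdict(set)
    after = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            after[left].add(right)
            before[right].add(left)
    return before, after


def walk(after, start, max_steps=10):
    """Follow one path, always taking the alphabetically first successor.

    max_steps caps the path length so that loops cannot run forever.
    """
    path = [start]
    while len(path) < max_steps and after.get(path[-1]):
        path.append(min(after[path[-1]]))
    return path


# Hypothetical input sentences.
before, after = build_word_map(["the dog chased the cat", "the cat sat"])
print(" ".join(walk(after, "the")))
```

Swapping `min` for a random choice gives the wandering, sometimes looping, sometimes accidentally meaningful paths the comment mentions.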

  • BBC voices (Score:2, Informative)

    by matt me (850665)
    Link on front page of bbc.co.uk - bbc.co.uk/voices/ [bbc.co.uk] - their attempt at tracking accents and dialects across the UK.
  • by Anonymous Coward
    Just a month ago I finished a paper exploring using Google counts in great detail for language analysis and other forms of meaning extraction.
    "Automatic Meaning Discovery Using Google":http://arxiv.org/abs/cs.CL/0412098/ [arxiv.org]

    Comments welcome, -Rudi.
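For readers who have not opened the link: the paper defines a "normalized Google distance" between two terms from page counts. The sketch below writes that formula out as I understand it, so check it against the paper before relying on it. Here `fx` and `fy` are the hit counts for each term alone, `fxy` the count for pages containing both, and `n` the total number of indexed pages; the numbers in the example call are invented round figures, not real search results.

```python
import math


def normalized_google_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Distance from page counts: near 0 for terms that always co-occur,
    larger for terms that rarely appear together."""
    if min(fx, fy, fxy) <= 0:
        raise ValueError("all counts must be positive")
    log_x, log_y, log_xy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_x, log_y) - log_xy) / (math.log(n) - min(log_x, log_y))


# Invented counts, for illustration only.
print(normalized_google_distance(fx=1000, fy=100, fxy=10, n=10**6))
```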
  • by minairia (608427) on Sunday January 23, 2005 @09:04AM (#11447234)
    I am American but have to write in Japanese for work. No matter how much one learns in school, when one writes in a foreign language, you'll hit a point of wondering if what you wrote is how native speakers say something or is even understandable. Whenever I hit a point like that, I put the sentence in question (or key fragments thereof) into a Google search. If nothing comes up, I know I have to rewrite. If only a few links come up, I know what I wrote might be a little weird, but is at least understandable. If I get pages and pages of links, I'm golden.
    • My wife is a linguist (but I am not), she would NEVER use google hits as proof that her translation is correct. In English, especially, there are far too many grammatical and spelling errors that have come into common usage (think "their", "there", and "they're" or "it's" and "its").

      A high number of google hits could mean the translation is correct, but it could also mean there are a lot of idiots using the internet.
  • Linguistics 101 (Score:2, Insightful)

    by DingerX (847589)
    I use search engines all the time for linguistics research: when I'm reading or translating from one language to another, and I run into an odd usage, I just type the phrase in the magic box and *poof*, I get hundreds of contextual examples. Likewise, if I'm writing in a foreign language, and I need to know if a preposition or a construction is correct (and not simply words), again all I have to do is type it in and see what comes out.

    Measuring how the internet changes world languages is only a small part o
  • My personal interest has been in using Google to return pages related to some search query and then data mining the text on the referenced pages (my company develops a product called theConcept [mesadynamics.com] for OS X). For example, doing keyphrase analysis on the first 100 pages returned in the results from the Google search "linus torvalds" returns key pairs such as:

    • operating system
      linux kernel
      free software

    And citations linked to those pairs such as:


    • Linus torvalds as the moving force behind the operating system t
  • There is also http://www.ethnologue.com/ [ethnologue.com], which keeps track of over 6000 human languages.
  • LanguageLog [upenn.edu] is a resource linked in the article, where linguists discuss current peculiarities of the English language.

    This is misleading in suggesting that LanguageLog [upenn.edu] is limited to English. Actually, it deals with all sorts of linguistic topics and languages.

