The Internet Science

Using The Web For Linguistic Research 205

prostoalex writes "The Economist says linguists are gradually adopting the World Wide Web as a useful corpus for linguistic research. Google is used, among other resources, to research how the written language evolves and how some non-standard examples of usage become more or less acceptable (The Economist quotes the phrase 'He far from succeeded,' where 'far from' is used as an adverb). LanguageLog is a resource linked in the article, where linguists discuss current peculiarities of the English language."
This discussion has been archived. No new comments can be posted.

  • Re:I rue the day... (Score:3, Interesting)

    by Peter Cooper ( 660482 ) on Sunday January 23, 2005 @05:00AM (#11446701) Homepage Journal
    > When we might actually say words like 'lol' out loud.

    I've heard it done. I've also heard 'roffle' (an attempt at pronouncing ROTFL I guess). Bizarre, really, since those terms are attempts to turn physical real-life actions into a verbal-only form.
  • Google does it again (Score:3, Interesting)

    by vladd_rom ( 809133 ) on Sunday January 23, 2005 @05:03AM (#11446708) Homepage

    This is not the first time Google (and search engines in general) has changed how we do things.

    Nowadays copyright holders use Google to search for potential violations of their intellectual property. Plagiarism is easy to detect thanks to Google as well. Instead of using rather expensive [turnitin.com] systems to search for duplicate work, teachers are now one search away from distinguishing original work from the rest.

  • by Anonymous Coward on Sunday January 23, 2005 @05:23AM (#11446742)
    There are more non-native speakers on the web than
    native speakers.
    In the European community, native English
    speakers are by far a minority. That way
    French expressions are pouring into the language
    in an unstoppable way. Those expressions are then
    used by native-speaking politicians and are
    broadcast on television. That way they enter the
    mainstream of the English language.

    Regards
  • by mizhi ( 186984 ) on Sunday January 23, 2005 @06:01AM (#11446831)
    Hopefully, they'll harvest well-written webpages for data and not those of 13-year-old girls drooling over Orlando Bloom, AOL users, or porn sites.

    Actually, I take that back.

    It could actually be very interesting from a lexical or morphological point of view. Take the phenomenon of abbreviating words, such as "u" for "you" or "ur" for "you're" or "ru" for "are you." Language teachers in classrooms have been seeing it crop up in actual homework assignments. While reading such language may be like having glass wiped across the eyes of people educated before computers came into widespread use, it's interesting how it's affecting younger people.

    There's a collision between the high-tech world children grow up in today and the way language is taught in schools, much like the gap between how students speak on the street and how they are expected to speak in the classroom or the professional world. Remember when it was proposed that Ebonics be considered a valid dialect for use in the classroom?

    What would be even more interesting to study is how keyboards affect the structure of languages. It seems that people are under the assumption that languages are static and don't change, but this is incorrect.

    Because the keyboard is still the main way of inputting information into the computer, people take shortcuts, and I would be surprised if that didn't start to affect their use of language in other contexts.

    I'm just rambling, but such studies would be akin to sociological studies that look at the influence of technology on social organization.
  • by Joe Tie. ( 567096 ) on Sunday January 23, 2005 @06:32AM (#11446888)
    > Because the keyboard is still the main way of inputting information into the computer, people take shortcuts

    One thing that's always been at the front of my mind: why aren't these kids learning how to type? Or at least to type with any reasonable amount of skill. The only computer I had as a child was a Commodore 64, and I was still faster than most of today's youth even with their abbreviations. I was somewhat lucky in that our schools somehow foresaw the advent of the home computer and made sure we knew how to type, but I'd certainly hope that held even more true in today's schools!
  • by Dracos ( 107777 ) on Sunday January 23, 2005 @06:34AM (#11446894)

    I think that for most of the 20th century, English, like most languages in the industrialized world, was largely static, dominated by the written word, which was in turn dominated by proper grammar. Since WWII, popular culture and faster communications have increasingly exposed us to local vernaculars, mostly through radio and television. The written word lagged behind in its cultural evolution.

    Thanks to the internet (initially email, BBSes and IRC, but more widely known on the Web), we now have a hybrid of the spoken and written word: the "typed word". This form of language evolves at the same rate as the spoken word, and injects its own vernacular as a side effect of the medium: acronym and abbreviation "words" (rofl, how r u), along with common misspellings (pwned), and mixing letters with numbers or punctuation (133t, n00b). All of these serve at least one purpose, whether as a form of super shorthand, as an insult, or to appear "cool", or they are merely the result of laziness on the part of the author. Most typed-word terms don't transfer well when spoken.

    One of my hobbies is studying (European) languages and how they are related. Sometimes I worry about the damage the typed word is causing to the spoken and written word (and any proper linguist should at least be interested in the phenomenon). Luckily, most typed-word expressions aren't pronounceable, and the ones that are sound absurd, because they are removed from their original context when spoken, and everyone recognizes gibberish when they hear it. How the typed word affects the written word remains to be seen. Yes, both are typed now, but only the written word has a chance of going through an editorial process. I think it will take a very long time for the formal lexicon and rules of grammar to embrace, however reluctantly if ever, the typed vernacular.

  • by Hal XP ( 807364 ) on Sunday January 23, 2005 @06:43AM (#11446919) Journal

    I've had the chance to use Google as a grammar or style checker in my day job as a glorified copy editor. I type two nearly identical expressions X and Y in the search box. If expression X gets 10,100 hits and expression Y only 500 hits, I use expression X.

    For example, as a non-native speaker, I found myself waffling between the expression (A) "run for mayor of" and the expression (B) "run as mayor of." Letting Google arbitrate, I found 14,900 hits for (A) and only 200 hits for (B). I chose (A).

    I discovered there's practically a dead heat between the expressions "a new lease on life" (which, if I'm not mistaken, is the expression favored by American usage) and "a new lease of life," with the latter nosing out the former 144,000 hits to 140,000. In this instance I let my own usage arbitrate. Since I'm more exposed to American than to British English, I chose "on."

  • by monecky ( 32097 ) on Sunday January 23, 2005 @11:59AM (#11447891) Homepage
    > Academics care about linguistic diversity in an abstract sense, but normal people really don't.

    I think you're a bit wrong on this. There are around 6,800 languages [ethnologue.com]. Most languages have developed their own culture. Do you really think millions of people around the globe would be willing to lose their identity?

    For example, after the collapse of the Soviet Union, Uzbeks started replacing Russian loan-words with the original Uzbek words.

    Paul Rodrigues
  • Re:inner city teens (Score:3, Interesting)

    by chialea ( 8009 ) <chialea@BLUEgmail.com minus berry> on Sunday January 23, 2005 @12:57PM (#11448228) Homepage
    >His meaning is perfectly intelligible, but some language snobs (very few of whom are actually linguists and know anything much about language) pretend not to be able to understand certain accents/dialects in order to feel superior.

    Incomprehension often has very little to do with that. A friend of mine moved to MA from NC at the same time as I moved from CA. She could not understand most people there, and most people there could not understand her. I could, on the other hand, understand both of them. I've been to at least one conference in which two non-native speakers of English could not understand each other at all, and required a native speaker to translate.

    There are simply certain grammatical patterns that I don't understand well, if at all. It has nothing to do with snobbery; I simply can't understand, most likely because I haven't been exposed to it all that much.

    When using media of international exchange, I would certainly try to make myself comprehensible. I spend quite a lot of time trying to do this in my research papers and communication. Writing in unambiguous, grammatically correct English (or something approaching it) is the first step towards sharing ideas with a wide audience. People limit their communication and opportunities by the language they use.

    Lea
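
The hit-count comparison Hal XP describes earlier in the thread (search two near-identical phrasings, prefer the one with far more hits, and fall back on your own judgment when the counts are close) can be sketched roughly as follows. This is only an illustration, not anything the poster wrote: the function name `pick_by_hits` and the 10% `tie_ratio` cutoff are assumptions made up for the example, and the counts are typed in by hand from the comment rather than fetched from any search engine.

```python
def pick_by_hits(hits, tie_ratio=0.1):
    """Return the phrase with the most hits, or None if the top two are too close.

    hits: dict mapping each candidate phrase to its hit count.
    tie_ratio: hypothetical cutoff; if the runner-up is within this
    fraction of the leader, the result is treated as a dead heat.
    """
    if not hits:
        raise ValueError("need at least one candidate phrase")

    # Rank candidates from most hits to fewest.
    ranked = sorted(hits.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]

    best_phrase, best_hits = ranked[0]
    second_hits = ranked[1][1]

    # No hits at all means there is no evidence either way.
    if best_hits == 0:
        return None

    # A small relative gap is a dead heat: leave the choice to the writer.
    if (best_hits - second_hits) / best_hits < tie_ratio:
        return None
    return best_phrase


# Counts as reported in the comment (2005 figures, entered by hand).
mayor = {"run for mayor of": 14_900, "run as mayor of": 200}
lease = {"a new lease on life": 140_000, "a new lease of life": 144_000}

print(pick_by_hits(mayor))
print(pick_by_hits(lease))
```

With the figures from the comment, the first call picks "run for mayor of" and the second returns `None`, which matches the poster letting his own usage decide the "lease on/of life" case. Raw hit counts are a noisy signal, so the cutoff here is a judgment call, not a tested value.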
