Graphics Software

Words That Speak a Thousand Pictures

venolius writes: "The New York Times (free registration required) has an article on TextArc (created by W. Bradford Paley), a site that "aids in the discovery of patterns and concepts in arbitrary text" (from the detailed overview at TextArc). The site serves an applet that performs the task (texts on which analysis is available include Alice in Wonderland, Hamlet, and thousands of others made available by Project Gutenberg). The NYTimes article reports that Paley found that "Dracula", which relies on a strong storyline, had a few keywords clustered hotly at the center, and that the metaphoric "Frankenstein" generated a circle of 50 words of modest intensity that faded towards the edges. "Portrait of the Artist as a Young Man", with evenly distributed key words, produces tight and round lines, and "Alice in Wonderland" produces loopier lines. Check it out! (The applet was tested on better hardware, but I did well enough with 98/IE6/550MHz/64MB.)"
This discussion has been archived. No new comments can be posted.

  • Free reg. (Score:2, Informative)

    by k98sven ( 324383 ) on Tuesday April 16, 2002 @07:49AM (#3349239) Journal
    As usual, one can change the www.nytimes to
    archive.nytimes to access the article without registration.
  • by proxybyproxy ( 561395 ) on Tuesday April 16, 2002 @08:02AM (#3349259)
    Just remembered this:

    Nupedia and Project Gutenberg Directors Answer [slashdot.org] - a /. interview with Michael Hart
  • by robbway ( 200983 ) on Tuesday April 16, 2002 @09:47AM (#3349706) Journal
    I have to say it: I see no value in this. The mathematical algorithms do more to shape the images than the words themselves. My opinion is that this is rather unartistic, uninspiring, and doesn't reveal anything about language at all.
  • by Anonymous Coward on Tuesday April 16, 2002 @11:20AM (#3350454)
    Ummm... did you know that (in the US) blind people are allowed to receive copyrighted books for free if they are certifiably blind? There is actually an undertaking to convert novels into a form usable by the blind. Ahh, just found a link [infoanarchy.org].
  • Re:Please... (Score:2, Informative)

    by WBPaley ( 574192 ) on Tuesday April 16, 2002 @07:56PM (#3354795)
    Hi people,

    Thanks for all the discussion!
    Here are some notes from the perpetrator (Brad)...

    >by morhoj on Tuesday April 16, @07:29AM (#3349188)
    >Don't ever do that to my browser again...

    Valuable feedback; perhaps more gracefully put by

    >by Paradise Pete on Tuesday April 16, @09:34AM (#3349613)
    >I think his complaint was that it did it unexpectedly.

    I have put in a warning about the screen takeover; others say there are ample warnings about the research & speed issues, so I left that alone. I agree that /. should link to Alice.html and Hamlet.html and Thousands.html, where the warnings are, rather than directly to the page that opens the applets. Can this be changed now, so others don't have morhoj's problem?

    ---

    >by reo_kingu on Tuesday April 16, @07:33AM (#3349201)
    >is this really new? I think maybe some of my teachers have
    >been using this thing to grade papers.

    Don't know if it's new, but I haven't seen it before.

    >by big.ears on Tuesday April 16, @10:41AM (#3350110)
    >...factor analysis on text. It maps every word in a text into about a
    >100-dimensional space, based on how often they co-occur in similar
    >contexts. If you feed those factors into a clustering algorithm or a
    >multi-dimensional scaler in order to present it graphically, you probably
    >get something very close to this trick.

    Flattering, but I was trying to come up with something easier to write and explain. This trick uses arithmetic (each word is drawn at its average position) not math. Net pull of a bunch of rubberbands is easier to explain _and_ conceptualize for a lot of my audience.
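
A minimal sketch of the "average position" idea described above, for readers who want to try it. This is not the TextArc code. It assumes (my assumption, not stated in the comment) that the text is laid out once around a unit circle, so that occurrence i of n tokens sits at angle 2*pi*i/n. The function name `average_positions` and the simple tokenizer are hypothetical choices made only for illustration.

```python
import math
import re
from collections import defaultdict


def average_positions(text):
    """Map each distinct word to (x, y, count).

    (x, y) is the mean of the points where the word occurs, and count is
    the number of occurrences.

    Assumption: token i of n is placed on a unit circle at angle
    2*pi*i/n, starting at the top and running clockwise.
    """
    # Hypothetical tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)

    # word -> [sum of x, sum of y, number of occurrences]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for i, word in enumerate(tokens):
        angle = 2 * math.pi * i / n
        entry = sums[word]
        entry[0] += math.sin(angle)  # x grows to the right
        entry[1] += math.cos(angle)  # y grows upward
        entry[2] += 1

    # The only arithmetic involved: divide each sum by the count.
    return {w: (sx / c, sy / c, c) for w, (sx, sy, c) in sums.items()}


if __name__ == "__main__":
    sample = "alice saw the rabbit and the rabbit saw alice"
    for word, (x, y, count) in sorted(average_positions(sample).items()):
        print(f"{word:8s} x={x:+.2f} y={y:+.2f} count={count}")
```

Under that layout, a word used only in one stretch of the text stays near that stretch on the rim, while a word used evenly throughout gets pulled toward the center, which matches the rubber band picture.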

    ---

    >by proxybyproxy on Tuesday April 16, @07:56AM (#3349250)
    >Once again Project Gutenberg shows its beautiful face. ...

    Hear, hear! Inspiring and generous work.

    ---

    >Just ran Slashdot through it (Score:2, Funny)
    >by Anonymous Coward on Tuesday April 16, @08:43AM (#3349375)

    ;)

    ---

    >by TheCrunch on Tuesday April 16, @09:28AM (#3349577)
    >(User #179188 Info | http://www.slippersandpipe.co.uk/) But a word
    >of warning to anyone else running Win98 on a P133 with 64MB RAM.
    >This thing nuts your machine. I can't get it off my desktop. I'm gonna
    >have to reboot again.. arg.

    Sorry... That warning's now on the intro pages to each applet.

    ---

    >The Emperor Is Naked! (Score:1, Informative)
    >by robbway on Tuesday April 16, @09:47AM (#3349706)
    >I have to say it: I see no value in this. The mathematical algorithms do
    >more to shape the images than the words themselves. My opinion is
    >that this is rather unartistic, uninspiring, and doesn't reveal anything
    >about language at all.

    A damning observation, if it were true. I also have little respect for artsy code that doesn't express the variability in the data. In fact, the only "algorithm" here is the averaging, so any variation _must_ come from the language. The images initially look similar, but so do leaves to people who don't get into the country a lot. For some people, developing a feel for how different texts reveal themselves here might be worth the time. But I expect that will take more than a few minutes.

    As to unartistic--I'll weigh your opinion with Larry at the Whitney, Bruce at Columbia, Matt at the Times, Sara at Banff, and a few dozen others as I decide whether it's art. (I made it as an index/concordance.)

    I agree that it doesn't say anything about language, but leaves don't say anything about biology. _You_ gotta provide the intelligence.

    Actually, it was built to tap into the human brain's pre-attentive processing abilities. (Oh no, do I need to provide a warning now that it'll take over your brain as well as your desktop? ;) You can actually read many more words than you are consciously aware of as your eye scans text. I hoped that as your eye jumps from word to word in a TextArc it wasn't jumping randomly, but to the next most "important" word, where "importance" is some function of brightness (frequency), position (distribution), and recency of concept activation, or level of interest (in your own head). It seems to work especially well in the 32" x 20" printed versions. Different people read different things.

    ---

    >Wishing I could see an example... (Score:1)
    >by BobTheJanitor on Tuesday April 16, @10:33AM (#3350032)

    Some screen shots are on the site, lower right button. (Guess I should make it more prominent.) http://textarc.org/Stills.html

    ---

    >Dark grey text on black background? (Score:1)
    >by an_mo on Tuesday April 16, @11:20AM (#3350462)
    >If textarc.org [textarc.org] continues to publish their stuff
    >with dark grey text on a black background they're not
    >reaching for the masses.

    Oops. Fixed, I think. (Do you?)
