Technology

Why Hal Will Never Exist 325

aengblom writes "Researchers at the University of Maryland's Human-Computer Interaction Lab are suggesting what many of us have already guessed. The future of human-computer interaction won't be through speech--it will remain visual (they explain why). The Washington Post is running a story about the researchers and how they think we will get computers to do what we want. The article is a fascinating read and is joined by a great video clip (real or quicktime) of the researchers and their methods. The Post is holding an online discussion with the researchers tomorrow. Also check out Photomesa, the lab's software program that helps track images on a computer. (Throw a directory with 1,000 high-res files at this thing and you can justify that pricey new computer you bought)."
This discussion has been archived. No new comments can be posted.

  • by danamania ( 540950 ) on Thursday May 09, 2002 @05:16AM (#3489410)
    Besides the possible limit on our own brainpower from speaking while thinking, there's another problem: in an office full of machines - or even a house with a family and a dog - using a computer by speech is going to pollute the people next to you with your thoughts/computer use, and you with theirs - at least in the realm of using the computer as a tool.
    We're pretty well-adapted to using tools with our hands and getting feedback on what they're doing with video/audio/feel coming back from that tool, but not the other way. Speaking works naturally for nattering with friends :)
    There's no way I'd advocate the -stopping- of speech systems research, as there are people who have incredible trouble typing due to various impediments. Besides the direct uses, every piece of research has a dozen uses other than its intended purpose.
  • by 26199 ( 577806 ) on Thursday May 09, 2002 @05:17AM (#3489412) Homepage

    ...have both. I want to be able to give the computer voice commands when I feel like it, visual commands when I feel like it... and just use the darn keyboard an' mouse when I feel like it, too.

    Interesting findings, but they're not going to get out of providing good voice interfaces that easily :-)

  • Thinking out loud? (Score:5, Interesting)

    by galaga79 ( 307346 ) on Thursday May 09, 2002 @05:25AM (#3489440) Homepage
    "It turns out speaking uses auditory memory, which is in the same space as your short-term and working memory," he adds.

    What that means, basically, is that it's hard to speak and think at the same time.


    I don't know about this statement. I always find it easier to write and/or think when I am expressing my thoughts out loud. Wasn't this something we were taught in school, like it's easier to read out loud than silently? Mind you, having done two years of psychology I realise there are a lot of differing opinions about how the brain works, so can any psychology graduates tell me if his statement is true?
  • Re:Single Modality? (Score:4, Interesting)

    by _Quinn ( 44979 ) on Thursday May 09, 2002 @05:29AM (#3489446)
    (Mod the parent up.)

    Aside from this, making a speech interface anyone wants to use isn't about the speech; it's about the natural-language comprehension that most people (naively?) associate with speech recognition; e.g., the Enterprise's computer. Which, you note, the crew interact with on a technical level visually.

    As for the specific example of italicizing text, natural language understanding should give rise to accurate _dictation_ systems, where the computer will insert the appropriate punctuation and emphases as you speak. If you're typing, instead, CTRL+I is your friend. :)

    -_Quinn
  • Bad logic. (Score:2, Interesting)

    by Bowie J. Poag ( 16898 ) on Thursday May 09, 2002 @05:42AM (#3489460) Homepage


    The future of computing holds so much potential in terms of horsepower that something HAL-like will not only be inevitable, but necessary in order to harness and package that horsepower. It may not happen tomorrow, or even 20 years from now, but presenting a thinking machine to the user is the only way to encompass such capability for us humans to enjoy. We've already got a situation where most personal computers spend 99.9% of their lives waiting for us to do something. Machine sentience is not only the best, but the most elegant and efficient way to handle it. What use is having a machine at all, if it spends the vast majority of its time idle?

    The term "operating system" will be deprecated someday, replaced with something akin to "personality engine" or "anthroderm".

    And yes, it irritates me to no end when someone predicts something won't happen in the future, rather than proposing how and when it will.

    Cheers,
  • The real issue (Score:3, Interesting)

    by 00_NOP ( 559413 ) on Thursday May 09, 2002 @05:57AM (#3489478) Homepage
    Is surely whether, in the future, computers will be bothered to talk to us.

    There is no doubt that computers with greater intelligence - ie an ability to learn and adapt - than ourselves will be here, probably in the next 20 - 25 years.

    When these machines get here they may well decide that speaking is a waste of their time.
  • by foniksonik ( 573572 ) on Thursday May 09, 2002 @06:00AM (#3489485) Homepage Journal
    The best part about 3D interfaces is the ability to make vast leaps from one place to another without the need to memorize your environment. (ala CLI).

    Think in terms of the real world where you can inspect your intended target from a distance and decide what the best route is to get there. That can't happen in 2D w/o a lot of cumbersome reference (ala CLI).

    3D allows for XYZ movement and perspective enabling 4D decisions.

    If you knew that you had one workspace set up to your left, a differently set up workspace to your right, one above you and one below, and others 10 units in front and behind, and you could then swap the aforementioned space with any one of the points mentioned... spatial division in 3D, would you not be more productive than having to dig repeatedly into a hole/plane?

  • by The_Shadows ( 255371 ) <thelureofshadows.hotmail@com> on Thursday May 09, 2002 @06:01AM (#3489487) Homepage
    With everything we've seen done in history, the statement "Why HAL will never exist" has to be one of the most asinine things ever said.

    We've put a man on the moon, split the atom, discovered the building blocks of life, cloned life, and created a globe spanning network of information. A hundred years before each of these discoveries were made, people could only imagine such things, and they were really considered Science Fiction.

    Science Fiction has proven many times to be prophecy. Artificial Intelligence is hard SF. It has basis in the real world. It may come to pass. It may not, as well. But to say we will never be able to create "HAL" is ridiculous. It may be 100 years, and "never in our lifetimes" may be accurate. But it may happen. Never rule out science.

    I'm done.

    The_Shadows[LTH], out.
  • Re:Nonsense! (Score:2, Interesting)

    by linzeal ( 197905 ) on Thursday May 09, 2002 @06:19AM (#3489519) Journal
  • Re:I agree (Score:3, Interesting)

    by CyberDruid ( 201684 ) on Thursday May 09, 2002 @06:26AM (#3489529) Homepage
    Voice interface is excellent for communication from a distance. When I'm sitting on my couch, I don't want to go all the way over to my computer to check trivial things like if I have mail, when the Simpsons is on, what I have scheduled for today, playing an mp3-album, etc, etc. I just want to tell my computer to do it from wherever I happen to be. If I ask for information, the computer can use text-to-speech to give it to me.
    I'm actually looking into the possibility of setting up such a system for myself (mostly for hack-value, of course ;). Just need decent open source voice recognition for a few pre-defined commands. I'll probably need a way to place a few (2-3) cheap microphones in my apartment and connect them (in series?) to my computer, as well.
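A minimal sketch of the "few pre-defined commands" idea, assuming some recognizer has already turned the audio into a text transcript (that part is not shown). The phrase table, the action names and the `match_command` helper are hypothetical, made up for illustration; the fuzzy matching uses Python's standard `difflib` so that a slightly misheard phrase can still land on the nearest known command.

```python
import difflib

# Hypothetical command table: spoken phrase -> action name.
COMMANDS = {
    "check mail": "check_mail",
    "when is the simpsons on": "tv_listing",
    "what is scheduled today": "show_schedule",
    "play album": "play_album",
}


def match_command(transcript, cutoff=0.6):
    """Return the action for the closest known phrase, or None if nothing is close."""
    # Lowercase and collapse whitespace so "Check  Mail" and "check mail" compare equal.
    normalized = " ".join(transcript.lower().split())
    hits = difflib.get_close_matches(normalized, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[hits[0]] if hits else None
```

With a small fixed vocabulary like this, rejecting anything below the cutoff is probably more useful than guessing, since a wrong action from across the room is harder to notice than no action.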
  • by PzyCrow ( 560903 ) <john@milsson . n u> on Thursday May 09, 2002 @06:31AM (#3489537)
    Hopefully you wouldn't have to say that many things; the human vocabulary is often larger than the "possible" combinations of a keyboard and mouse.

    A comment like "Insert a five iteration for-loop" would be quicker than typing:
    "for(int i=0;i<5;i++){}"

    As "Move the most recent ten office documents to my folder", would be quicker than clickettyclickettyclickclick-click/home/user/click .
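As a rough illustration of the for-loop example above, here is a sketch that turns that one spoken phrase into the loop text. It assumes the speech has already been transcribed; the phrase pattern, the `NUMBER_WORDS` table and the `for_loop_from_phrase` helper are all hypothetical, and a real system would need far more than a single regular expression.

```python
import re

# Hypothetical lookup for small spoken numbers.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

PHRASE_RE = re.compile(r"insert an? (\w+)[ -]iteration for[ -]loop")


def for_loop_from_phrase(phrase):
    """Return C-style loop text for a recognized phrase, or None if it doesn't match."""
    match = PHRASE_RE.search(phrase.lower())
    if not match:
        return None
    word = match.group(1)
    count = NUMBER_WORDS.get(word)
    if count is None and word.isdigit():
        count = int(word)  # accept "12" as well as "five"
    if count is None:
        return None
    return "for(int i=0;i<%d;i++){}" % count
```

The point of the sketch is only that the mapping from phrase to text is short once the words are known; the hard part remains getting reliable words out of the audio.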
  • by Richard Kirk ( 535523 ) on Thursday May 09, 2002 @06:43AM (#3489562)
    Most people seem to think of speech processing as an untrained computer understanding ordinary human speech complete with all the sub-verbal input such as gestures, pauses, and emphasis. This is an ambitious goal, but it is not everything. We do not expect a computer to read our ordinary handwriting off a piece of paper. So, why do we expect our computer to understand what we say straight away?

    Perhaps it is because speech interpretation is unfamiliar and underdeveloped. It is difficult to use a speech interface in a crowded office without annoying others. Most able-bodied people would choose to use a visual-tactile interface for most tasks. What gets used gets supported, and what gets supported gets used. However, this does not mean that speech interpretation is inherently flawed. For example...

    • Suppose you have found a telephone number in a directory. It is easy to read out the number; it is easy to listen to the number and press the buttons on the phone; but it is tricky to read and type the number. If your visual interface is already busy, then it can be a lot easier to use speech.

    • Suppose you are editing an image. You may be in a darkened room, and making subtle changes to the colors. You don't want to put menus and dialogues on your screen, because that will interfere with your sense of color balance, or block your view of your image. You can do a lot with simple commands like "make it greener" "make it bigger". One of the most useful things was to switch between "foreground" and "background". Remember the image viewer on Blade Runner?

    • I used to sit next to someone with RSI, who used to use MS-Word without the keyboard. He had a little thumbwheel mousy-thing which he could use with his arms folded for pointing and picking, but he could do everything on speech. He did take some time getting up to speed on the system, and he did have to train the computer, but I didn't learn to use a keyboard overnight either.
  • by heideggier ( 548677 ) on Thursday May 09, 2002 @07:50AM (#3489672)
    I think that the bloke is right that speech is a really bad way of communicating with computers, as they are designed today. But I think that it's a bit of a leap of logic to conclude that this will always be the case.

    Case in point: today computers are normally designed around some kind of windowing environment, a WIMP interface, where information is displayed as a metaphor, i.e. scroll bars, OK buttons, etc. This is an environment that was never designed for interaction beyond a mouse and a keyboard. DVDs, however, do not follow this standard, normally being based on some kind of menu system. Clearly, the way you make something determines the way it is used.

    If speech is to be a success on computers then the way that people interact with the computer needs to change. I think a system like the console, where programs aren't very powerful on their own but become so through the way they are linked together, would work very, very well.

    I long for the day when I can say, "dump down everything on slashdot and tell me if any of my posts have been modded up" and have it read as wget somesite | grep index.html | echo $whatever (please excuse this example). All you would need is some kind of AI which is able to manage the interpretation correctly (at least most of the time).
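The pipeline in the comment above is loose shorthand rather than a working command, so the following is only a sketch of the same two steps: download a page, then filter it for posts that were modded up. It assumes the page text carries scores in the "(Score:N, ...)" form seen in this thread's comment headers, and that a score above 1 counts as modded up; both assumptions, the `fetch` and `modded_up` names, and whichever URL is passed in are illustrative only.

```python
import re
from urllib.request import urlopen

# Matches the "(Score:5, Interesting)" style header; an assumed page format.
SCORE_RE = re.compile(r"\(Score:(-?\d+)")


def fetch(url):
    """Download a page as text (the 'wget' step)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def modded_up(page_text, baseline=1):
    """Return the lines whose score exceeds the baseline (the 'grep' step)."""
    hits = []
    for line in page_text.splitlines():
        match = SCORE_RE.search(line)
        if match and int(match.group(1)) > baseline:
            hits.append(line.strip())
    return hits
```

A spoken request would then reduce to something like `modded_up(fetch(url))` for a URL of the user's own comment page, which is the small-programs-linked-together model the comment is arguing for.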

    I think, fundamentally, computers should be designed to do what you tell them to do (which is how I think such a system would work) and not force you to do things in a certain way, which is what current systems do today. One should never have to learn an interface.

    I also think that this guy has limited his imagination somewhat. The main thing about HAL was that he was everywhere, and in the future computers will be everywhere. For example, if you were on the loo and just thought up a really good chess move, you would just say "HAL, queen to bishop 4", not get up, sit at a console, log in and realise you've forgotten what it was you were about to do. Saying that in such a case it's easier to point to some graphic, because you don't have to think too much, seems kinda lame.

  • by yzquxnet ( 133355 ) on Thursday May 09, 2002 @08:21AM (#3489751) Homepage
    At least in the tiniest of forms in my house. I wired two older PII 400MHz boxes up and loaded in some voice recognition software, a text to speech program, and various other programs that control stuff that I have hooked up to the machines. Currently I have a few lights and cable TV running through it. I can get the machines to turn lights on and off, turn the TV on and off, change channels, record programs, play back programs. I can also get limited control over the computers themselves. But the voice interface is really clunky for doing serious work.
  • Re:Wrong (Score:1, Interesting)

    by Anonymous Coward on Thursday May 09, 2002 @08:24AM (#3489762)
    "misguided" is a picture of "misguided". That's what the written word is - pictures.

    Some people (me, for example), _think_ by having a mental picture of "words on a page". I've talked to some people who "think" with a little voice in their head - I don't, I see words writing themselves on a page. Maybe because I learned to read very young, or something. I read at a max of about 3000 words per minute (seriously).

  • by Snard ( 61584 ) <mike.shawaluk@ g m a i l .com> on Thursday May 09, 2002 @08:57AM (#3489875) Homepage
    (attention: MOVIE SPOILER ALERT if you've been living in a cave for the last 30 years)

    ... HAL's most important human-to-computer information exchange (well, one-directional I guess) in the movie was a non-verbal one - where he read Frank and Dave's lips.
  • Speech Recognition (Score:2, Interesting)

    by mbbac ( 568880 ) on Thursday May 09, 2002 @09:05AM (#3489906)
    Speech recognition is like a CLI for people without fingers. It will never take over as the primary interface between humans and computers.

    Most of us here are fairly comfortable with a CLI, because we know the commands to use. However, we're in the vast minority.

    We've already advanced past the CLI, past using command keywords towards using visually intuitive interfaces. Speech recognition would be even worse than going back to using CLIs as the primary interface, because I know most people can type rm ~/foo/blah.js faster than they can speak it to a computer. Probably even more people can just drag the icon for the file to the trash can even faster.

    However, where speech recognition can be useful is in dictation.
  • by Anonymous Coward on Thursday May 09, 2002 @10:44AM (#3490483)
    I met Ben in person. He came to our financial company (unnamed here) last year on a kind of self promo/evangelist tour. Don't get too caught up in his views. What the article doesn't mention: he has a significant financial stake in what he is peddling here. Ah, the dangers of underwriting "higher" education.

    In fact, I had a great argument with him in front of about 500 people. Something to the effect of .. does he really expect most people to learn all his interfaces (quite a few in his toolset)? Most of them have a steep learning curve. Great for tech heads but not for the average AOL user (not that I care to cater to that crowd).

    The average user is not looking to learn some geeky interface. The average user simply wants answers. They want the computer to do the real work and give them the answer they are looking for. When a person has questions, what do they want? They want someone on the phone with the answers. They want someone competent at the help desk. They want to push a Star in their fancy car and feel like there is someone there with them to make things better. They want mom and dad to provide the answer to "What's that?" Voice may be difficult for a computer to master, but it is core to human interactions. Sorry Ben. I just disagree.
  • Re:Finally... (Score:2, Interesting)

    by BlueFashoo ( 463325 ) on Thursday May 09, 2002 @12:33PM (#3491250)
    Agreed. For a reference see this. [slashdot.org]

    The videogame generation is quite adept at using their thumbs for input on small handheld devices while older people still use the other fingers.
  • Re:Single Modality? (Score:3, Interesting)

    by MoneyT ( 548795 ) on Thursday May 09, 2002 @05:18PM (#3493169) Journal
    You do of course realize that the comprehension part is the least of our worries. Try telling your computer to open a temporary file on your computer. Have you seen some of those file names? If we do go to speech commands, we're going to need to get a much better system of naming things (can't name your documents dsfk.txt anymore). As for just getting files or programs to open, Apple's speech recognition does this fairly well. Just place an alias (or the actual file) into the speakable items folder and then tell the computer to open [item name]. They even have a command to make a currently selected item speakable (places an alias in there for you). Admittedly, it isn't the best interface yet, but it's a start. And the voice passwords are just so friggen cool (OS 9 only, when do I get it for X?)
  • Latin??? (Score:1, Interesting)

    by Anonymous Coward on Thursday May 09, 2002 @05:53PM (#3493422)
    Latin is a very complex, structured, unambiguous language. Has anyone ever tried to make an AI think in it? It seems like you could make a simplified version of it, without the irregularities, that would be perfect for computing.
