Microsoft Unveils AI Model That Understands Image Content, Solves Visual Puzzles (arstechnica.com) 46
Researchers from Microsoft have introduced Kosmos-1, a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions. From a report: The researchers believe multimodal AI -- which integrates different modes of input such as text, audio, images, and video -- is a key step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human. "Being a basic part of intelligence, multimodal perception is a necessity to achieve artificial general intelligence, in terms of knowledge acquisition and grounding to the real world," the researchers write in their academic paper, Language Is Not All You Need: Aligning Perception with Language Models.
Visual examples from the Kosmos-1 paper show the model analyzing images and answering questions about them, reading text from an image, writing captions for images, and taking a visual IQ test with 22–26 percent accuracy. [...] In this case, Kosmos-1 appears to be purely a Microsoft project, without OpenAI's involvement. The researchers call their creation a "multimodal large language model" (MLLM) because its roots lie in natural language processing, like a text-only LLM, such as ChatGPT. And it shows: For Kosmos-1 to accept image input, the researchers must first translate the image into a special series of tokens (basically text) that the LLM can understand.
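The "translate the image into tokens" step can be illustrated with a minimal sketch. This is not the paper's actual method (Kosmos-1 uses a trained vision encoder); it is a hypothetical toy version of the general idea: cut the image into patches and linearly project each patch into an embedding vector, producing a sequence of "visual tokens" that can be interleaved with text tokens.

```python
import numpy as np

# Hypothetical sketch of image-to-token conversion: split an image into
# fixed-size patches and project each patch to an embedding vector.
# In a real model the projection is learned, not random.
def image_to_tokens(image, patch=16, dim=64, rng=None):
    rng = rng or np.random.default_rng(0)
    h, w, c = image.shape
    proj = rng.standard_normal((patch * patch * c, dim))
    tokens = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            flat = image[y:y + patch, x:x + patch].reshape(-1)
            tokens.append(flat @ proj)  # one embedding per patch
    return np.stack(tokens)

img = np.zeros((32, 32, 3))
visual_tokens = image_to_tokens(img)
print(visual_tokens.shape)  # (4, 64): four 16x16 patches, each a 64-d token
```

The language model then treats these patch embeddings as just another stretch of its input sequence, which is why the researchers can describe the system as an LLM at heart.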
Re: (Score:2)
No more CAPTCHAS now? Then it's GOOD. Really good.
Re: (Score:2)
You do know CAPTCHAs exist for a purpose and that purpose isn't "just to annoy you", right?
Re: M$ BAD!! +eleventy billion insightful (Score:2)
Re: (Score:2)
You do know CAPTCHAs exist for a purpose and that purpose isn't "just to annoy you", right?
No I don't. CAPTCHAs were implemented as an IT fad that only sort-of-worked early on, when the puzzles were easy for humans to solve and impossible for bots. The bots got better at solving them so fast that desperate attempts to keep the idea viable by blurring the figures and adding more random squiggles just made the worst of them impossible for humans and solvable by bots, which is exactly the opposite of what was intended.
Site operators, get off your lazy butts and give us two-factor authentication.
Re: Nope (Score:5, Insightful)
Re: Nope (Score:5, Informative)
Re: (Score:2)
Re: (Score:2)
It's not just word correlation or image correlation, it's a model for the substantial attributes, functions and rules of operation of the thing being modeled.
Re: (Score:2)
> Prove to me that you understand things and aren't just responding to stimuli based on a huge neural network
Neural networks and psychotic people "hallucinate". Normal people -- chances are the poster is one -- do not. Hence, he almost certainly understands, in a way different from the way neural networks process stimuli.
Our huge neural network does its thing perhaps -- and perhaps not -- unlike the ML neural network, but there is another quality in our minds that is aware of what the neural network is doing.
Re: (Score:2)
Re: (Score:2)
Neural networks don't "hallucinate". That's nonsense intended to make you think that more is happening than is actually happening. Just like the word "neural" is supposed to make you think of biological brains, even though they're not similar in any way.
Again, there is absolutely nothing like "understanding" happening here. It's just statistics and probability.
I don't think that ChatGPT is conscious, but the case for it being not conscious is surprisingly difficult.
We don't need bad philosophy. We have a complete understanding of the system. You might as well be asking if nighttime is conscious. It's a meaningless question.
Re: (Score:2)
We don't need bad philosophy. We have a complete understanding of the system. You might as well be asking if nighttime is conscious. It's a meaningless question.
We have a pretty poor understanding of these systems, as demonstrated by the fact that we have trouble getting them to output what we want. Note how much trouble OpenAI has had preventing the system from giving out answers they deem controversial or biased or otherwise not what they want. Our understanding of what these systems are doing internally is still quite limited.
Re: (Score:2)
Note how much trouble OpenAI has had preventing the system from giving out answers they deem controversial or biased or otherwise not what they want.
If you had even a basic understanding of these systems, that wouldn't surprise you in the least. This magical thinking of yours would come to a quick and decisive end if you'd just take the time to learn something about the technology.
Re: (Score:2)
Re: (Score:2)
The point was that you asserted that we had a "complete understanding" of these systems. That is very much not true as demonstrated by the inability to control their output effectively.
Indeed. One thing my statistics prof back when I got my CS MA said was "statistics is exceptionally unintuitive" and he was right. The problem is that statistics uses massive amounts of exceptionally shallow correlations, while logical reasoning goes deep but very narrow with a comparably small number of steps (i.e. _implications_). That is why automatons like ChatGPT can simulate relatively simple behaviors, but cannot reason one bit. Statistical models are simply unsuitable for reasoning in general.
At the
Re: (Score:2)
That is very much not true as demonstrated by the inability to control their output effectively.
That's completely absurd. One thing has nothing to do with the other. Just because something is completely understood does not mean we can control it at will. We can have a full and complete understanding of some hash function, for example. Given an input and an output, we can give a full and complete accounting of how and why the output was produced. However, we can't go the other way, start with a desired output and work backwards to find all the possible inputs that will produce that hash.
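The hash-function analogy in the comment above can be made concrete with a short sketch (hypothetical illustration, using Python's standard `hashlib`): computing the forward direction is trivial and fully understood, while inverting it offers no better general strategy than guessing inputs.

```python
import hashlib

# Forward direction: deterministic, fully understood, trivial to compute.
msg = b"hello"
digest = hashlib.sha256(msg).hexdigest()

# Reverse direction: there is no known general method better than trying
# candidate inputs, even though we understand every step of the function.
def brute_force(target, candidates):
    for c in candidates:
        if hashlib.sha256(c).hexdigest() == target:
            return c
    return None

found = brute_force(digest, [b"foo", b"bar", b"hello"])
print(found)  # b'hello' -- found only because it was in our tiny candidate list
```

The point of the sketch is exactly the commenter's: complete understanding of a mechanism does not imply the ability to steer it toward a desired output.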
There is currently a massive number of researchers working on trying to understand what LLMs are doing internally.
You seem
Re: (Score:2)
Re: (Score:2)
Ahem, I claimed we do _not_ have a complete understanding? And I argued we may not be able to get one?
Re: (Score:2)
Re: (Score:2)
Again, there is absolutely nothing like "understanding" happening here. It's just statistics and probability.
Indeed. But Physicalists are quasi-religious fanatical idiots that are deep in delusion and do not understand anything. The fact of the matter is that a smart human being realizes he/she/it is intelligent and can have insight, and this is directly tied to consciousness. The "Eureka!" moment, if you will. (No idea whether dumb humans have that, but I know tons of smart people that all report having experienced this element of human existence...)
Machines, according to current Physics, cannot have consciousness.
Re: (Score:2)
It's good to remember we cannot even define life or measure if something is alive, and consciousness is more subtle than life.
The paper is interesting but is missing a reason for its existence: why are we even asking if LLMs are "conscious"? We don't have a definition or a way to measure it, yet, like life, we know it when we see it. I take it that the real meaning of the question is "are LLMs anything like us" in terms of the mind? The answer is: no, definitely not, just like we know that a robot doll is not alive.
Re: (Score:1)
It's good to remember we cannot even define life or measure if something is alive, and consciousness is more subtle than life.
Indeed. Life is still unknown as to how it works. Physicalist idiots will claim a cell is a mechanistic machine, but that is clearly a conjecture. If it were the case, why have all attempts to create life artificially failed? Unless and until we can create life artificially (and no, a virus is _not_ alive), we do not know. As to consciousness, the current Physics standard model does not have a mechanism for it. In fact you can credibly argue it does pretty clearly say "physical objects do not have identity."
Re: (Score:2)
Agreed, nothing I would add, or take away.
FWIW I have, like many, found Nietzsche's thinking to be instructive in tearing to the ground the delusional ideas you mentioned, even if I am not sold on what he suggests building up as a replacement.
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
You're confused, I can only assume because the word "neural" in the term "neural networks" makes you think that these things are anything at all like brains. They are not. That's silly nonsense. Brains can do many things that neural networks can not do.
Oh, and these AI things are doing a lot less than you believe them to be doing.
No, there is absolutely nothing like understanding here. Do you really need me to explain this yet again? I'm happy to, but it's getting really old.
Re: (Score:2)
Indeed. No idea why some people insist on refusing to see the facts. Anti-vaxers, flat-earthers, physicalist, etc. all believing the most demented nonsense in the face of overwhelming evidence to the contrary.
Re: (Score:2)
You obviously do not have effective intelligence (after all, you just claimed to be an automaton, and these provably cannot be intelligent), so proving that to you is impossible.
Re: (Score:2)
Incidentally, Physicalists are idiots pretty much on the same ignorance level as flat-earthers.
Re: (Score:2)
I'm starting to wonder if it's even worth the effort. It's like arguing with creationists. These people really, really, want to believe that their science fiction fantasy has come true. Some of them want their virtual girlfriend to really love them back. Others want to live until the batteries give out in Ray Kurzweil's promised video game afterlife. Countless other things, I'm sure, I can't keep up with the nonsense.
Maybe once the hype dies down and the limitations become too much to ignore, maybe the
Re: (Score:2)
I'm starting to wonder if it's even worth the effort. It's like arguing with creationists.
Yep. It is quasi-religious fanatical behavior. Flat-earthers, anti-vaxxers, creationists, etc. all being exceptionally sure of their stances in the face of overwhelming evidence to the contrary. These people cannot be reached. They are too deep into their delusion. I think the best we can hope for is them becoming one more bizarre sect when the hype has died down.
Of course, there is one, pretty bad, possibility: It is possible these people all only have a limited form of intelligence that is not capable of understanding.
Where can I fork a github project? (Score:3)
Anybody know where the source code for any of this is? I'd like to take the engine, feed it the Project Gutenberg database, and call it "VictorianAI" -- since that's 99% of the text in Project Gutenberg. I think it might be small enough to fit on a desktop device and just crank away at it for a few months, and it'll provide a better signal-to-noise ratio for how to be human than whatever they fed ChatGPT on.
Re: (Score:2)
https://writings.stephenwolfra... [stephenwolfram.com]
So does this mean the end of Captcha? (Score:4, Funny)
If so, are we going to have to have PGP keysigning parties to form a web of trust where we all certify that each others' keys are owned by a flesh and blood human and nothing else so as to retain anonymity while keeping out AI chatbots from discussions?
Re: So does this mean the end of Captcha? (Score:2)
Re: (Score:2)
No, because it will drown out the content from humans in a wall of spam. All content will be from bots run by people trying to keep me from seeing something, or to sell me something. They will be tailored to make my search for information as fruitless as possible. People will leave, and forums will die, populated by bots talking to themselves. If people want forums, they will need some kind of way to prove they are human. But there's no need to prove ID. Let's keep anonymity.
Re: (Score:1)
No, because it will drown out the content from humans in a wall of spam. All content will be from bots run by people trying to keep me from seeing something, or to sell me something.
Yes. It will be a deluge of targeted misinformation and disinformation. It's inevitable.
Re: (Score:2)
Sounds like twitter.
AI (Score:2)
With "22–26 percent accuracy" it's a no-brainer.
Humans are preparing their own demise (Score:1)
is a key step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human.
And then at the level higher than that of a human. (And that will be done by AI itself).
I need a Link (Score:5, Insightful)
To solve all these damn CAPTCHAs that pop up. I seem to need to do at least 6 of them before it verifies me as not a robot; it'd be a lot easier if I could have a robot do them for me.