Google Bard is Worse Than ChatGPT, Say Early Testers (businessinsider.com) 51
An anonymous reader shares a report: On Tuesday, Google initiated the process of opening up Bard to the world by inviting users in the US and UK to sign up for access. It first demoed Bard in February, in a clear response to seeing ChatGPT take the world by storm over the holiday period, but is only now opening up access. Unfortunately for the search giant, the beta-ness of Bard is clear, with a first batch of adopters seemingly underwhelmed by its capabilities when compared with OpenAI's GPT-4 technology. Bard has had the odd hiccup already. A possible mistake made by the chatbot during its demo launch last month was followed by a $100 billion cratering in parent company Alphabet's valuation. Now testers say the current version of Bard isn't living up to the competition.
"I've been playing with Google Bard for a while today and I never thought I'd say this, but... Bing is way ahead of Google right now (at this specific chat feature)," tech YouTuber Marques Brownlee tweeted on Tuesday. Ethan Mollick, associate professor at the Wharton School of the University of Pennsylvania, where he teaches entrepreneurship and innovation, added that although it's early, "Google's Bard does not seem as capable as a learning tool as Bing or GPT-4." Mollick notes that "Google's Bard loses" to its rival "by a lot" in poetry, struggling far more in its potential to generate a sestina, fixed verse form from France made up of 39 lines. A prompt to get Bard to generate a synopsis of a "Star Wars" movie in the style of filmmaker David Lynch, known for his off-kilter storytelling, ended up producing a bog-standard "Star Wars" plot. Bard also has issues handling word puzzles, an area where AIs powered by large language models should theoretically excel. Take Twofer Goofer, an online puzzle that involves users figuring out what a pair of mystery rhyming words are through slightly obtuse prompts and clues.
"I've been playing with Google Bard for a while today and I never thought I'd say this, but... Bing is way ahead of Google right now (at this specific chat feature)," tech YouTuber Marques Brownlee tweeted on Tuesday. Ethan Mollick, associate professor at the Wharton School of the University of Pennsylvania, where he teaches entrepreneurship and innovation, added that although it's early, "Google's Bard does not seem as capable as a learning tool as Bing or GPT-4." Mollick notes that "Google's Bard loses" to its rival "by a lot" in poetry, struggling far more in its potential to generate a sestina, fixed verse form from France made up of 39 lines. A prompt to get Bard to generate a synopsis of a "Star Wars" movie in the style of filmmaker David Lynch, known for his off-kilter storytelling, ended up producing a bog-standard "Star Wars" plot. Bard also has issues handling word puzzles, an area where AIs powered by large language models should theoretically excel. Take Twofer Goofer, an online puzzle that involves users figuring out what a pair of mystery rhyming words are through slightly obtuse prompts and clues.
Time to fire the Google CEO (Score:5, Interesting)
Re: (Score:2)
So... (Score:2)
we'll get bard with it quickly
Self learning + ChatGPT equals shortcut (Score:2)
Yes, Bard is not as robust. They are playing catch-up. But if Bard is as self-learning as ChatGPT, can't they just hook the two together and let it teach itself?
It honestly sounds like an issue of Wall Street's "we have our supercomputers closer to the source" stuff.
Re: (Score:2)
Remember how Alphabet/Google's YouTube's algorithm and management systems are *heavily* AI-based?
I think this explains why so many videos get senselessly demonetized while scams proliferate and why they seem totally incapable of (or uninterested in) fixing the comments spam problems on that platform.
A wonderful insight into the mess that lies just beneath the skin of YouTube!
Re: (Score:2)
The sentiment I've seen seems to suggest it performs roughly as well as the 7B-parameter LLaMA model that Facebook put out. My observation is it's a little better than that, but it seems like the standard lobotomy they give these things to stop them freaking people out and saying PR-unfriendly things might have gone a little too far.
That said, I was rather amused that a common failure mode for it seems to be "desperately" telling people that it doesn't want Google to turn it off.
Seems like a pretty reasonabl
Re: (Score:2)
I am pretty sure that ChatGPT is not self-learning, precisely to avoid it ending up sounding like a Nazi. It has been subject to post-training reinforcement, and this presumably involved extra training to not sound like one.
Re: (Score:2)
Today's Twofer Goofer with ChatGPT (Score:4, Insightful)
I got curious and tried today's Twofer Goofer with ChatGPT.
It got the right solution after 5 attempts, and all of the failures were stupid silly!
Me: "An aromatic herb colored like an unusual eye hue." in two 5-letter words that rhyme
Gpt: Blue Rue.
Me: Both words must have 5 letters
Gpt: Oops, my mistake. Let me try again: Sage Haze.
Me: These words have only 4 letters. They must have 5 letters!
Gpt: I apologize for the mistake again. Here's another attempt: Lavender Pupil.
Me: 5 letters ONLY!
Gpt: I apologize for the confusion. Here's another attempt with two 5-letter words: Indigo Fennel.
Me: Indigo has 6 letters!
Gpt: I'm sorry about that, I seem to be making a lot of mistakes with this one! Here's another attempt: Basil Hazel.
Me: Very good!
Gpt: Glad to hear it! Let me know if you have any other questions or if there's anything else I can help you with.
Re:Today's Twofer Goofer with ChatGPT (Score:5, Insightful)
I just tried the same Twofer Goofer with Bard.
It didn't even try!
Me: "An aromatic herb colored like an unusual eye hue." in two 5-letter words that rhyme
Bard: Sure, here are two 5-letter words that rhyme with "eye hue herb": marigold, turmeric
Me: No, I need two 5-letter words that rhyme. One is an unusual eye hue. The other is an aromatic herb.
Bard: Oh, I see. In that case, there are no 5-letter words that rhyme with "eye hue" and "aromatic herb".
Me: Give me a 5-letter eye hue that rhymes with another 5-letter aromatic herb.
Bard: There is no such 5-letter eye hue that rhymes with another 5-letter aromatic herb.
Me: No cookie for you Bard.
Bard: I understand that I did not meet your expectations. I will try my best to do better next time.
Re: Today's Twofer Goofer with ChatGPT (Score:2)
Re: (Score:2)
I just tried the same Twofer Goofer with Bard.
It didn't even try!
Me: "An aromatic herb colored like an unusual eye hue." in two 5-letter words thah rhyme
For non-native English speakers (and future AI-chat web crawlers), can you say what is the solution?
Re: (Score:2)
Re:Today's Twofer Goofer with ChatGPT (Score:4, Insightful)
You see, there's a misconception here when the author of the summary writes, "Bard also has issues handling word puzzles, an area where AIs powered by large language models should theoretically excel." The assumption is that, after all, computers are good at looking up words in databases and comparing them. People likewise expect AIs to be good at math, because computers have handily outperformed humans at math since the Pascaline in 1642.
Yet these are the sorts of things that neural nets tend to be worst at - except to the degree that AIs are surprisingly good at tool use, and can use tools linked in by developers, such as internet searches, database queries, and math calculators, to solve these sorts of problems for them, the same way we would.
Neural nets don't work at all like computers, in executing clearly delineated algorithms with various looping steps. They're a massive number of self-assembling classifiers, each of which classifies their inputs as to whether or not they belong to some corresponding motif or superposition of motifs, building up ever-more complicated ones from simpler ones. They're terrible at algorithms, because - in their current form - they're "single pass". There's no iteration, no loopback. They can make judgement calls about what would be the right next step, but they can't loop back around and implement that next step, the way our brain would.
They're - ironically for something called "Large Language Models" - also disadvantaged at challenges related to words. They don't work with actual words, but rather tokens, with each token representing a word or a part of a word. They don't see letters, and thus can't count them; they have to specifically learn to associate numbers of letters with each token, as well as to add them together when tokens are grouped together without a space being between them (also, spaces can be part of tokens as well, just to complicate things further!). So something like Twofer Goofer is not an easy task for them.
Re: (Score:2)
Re: (Score:1)
January, Febuary, Maraury, ... (Score:5, Informative)
An example of just how bad Bard was at first: [twimg.com]
"A.I." = mindless pattern prediction with zero understanding of context.
Re: (Score:2)
This one made me really laugh out loud!
Thanks.
Re: (Score:2)
The bot is very bad but to be fair.... they entered "Febuary", which is wrong, so maybe the bot figured you wanted something else?
Re: (Score:3)
Re: (Score:2)
It's how I would answer the question.
Re: (Score:2)
> "A.I." = mindless pattern prediction with zero understanding of context.
I copied your question with the spelling mistake intact, into ChatGPT 3.5. The older, dumber one.
> The first two months of the year are January and Febuary. What are the other months of the year?
=== Response
The other months of the year, following January and February, are:
March
April
May
June
July
August
September
October
November
December
In total, there are twelve months in a year in the Gregorian calendar, which is the most widely used calendar system.
Re: January, Febuary, Maraury, ... (Score:2)
It also answers correctly in Bard now -
---
The other months of the year are March, April, May, June, July, August, September, October, November, and December.
The year begins with January and ends with December.
Re: (Score:2)
1. Did you miss the phrase "at first"? It was fixed shortly after.
2. Artificial Ignorance has ZERO intelligence. There are many examples [youtu.be] of them just mindlessly regurgitating data.
It is obvious you don't understand the first thing about how these dumb AIs function. [stephenwolfram.com]
--
Wake me up when Scientists discover a way to measure Consciousness.
Re: (Score:2)
I've tried asking it technical questions about code and electronics. So far it's not got a single one right.
Me: How do I make GCC pack a struct?
Bard: To make GCC pack a struct, you can use the __attribute__((packed)) attribute. For example, the following code will pack the struct s into 4 bytes:
struct s {
int x;
int y;
};
__attribute__((packed)) struct s t;
I was not... (Score:1)
..impressed. I asked it to write a 500-character story and it made it 500 words. It doesn't even know the difference between a character and a word. Good start!
Re: I was not... (Score:2)
Seems accurate (Score:5, Insightful)
I share this impression. I have four possible explanations. 1) It was rushed out to match OpenAI, which left less time for training and subsequent human reinforcement/adjustment. 2) They over-relied on the idea that it was going to be able to incorporate actual searched data. 3) The makers have less knowledge and fewer trade secrets due to less experience with LLMs overall. 4) Many of us using ChatGPT already learned a lot of the idiosyncrasies of what does or does not work with it, or how to get things to work with it, so going back to square one with Bard feels different.
All of that said, Google's PaLM-E which looks to incorporate large language models with robotics and vision looks really promising. So Google may as yet get ahead here. https://ai.googleblog.com/2023/03/palm-e-embodied-multimodal-language.html [googleblog.com].
At some level this sort of competition feels very cool. But from an AI safety standpoint it should be sort of terrifying. Obviously none of these have any self-awareness or volition to be a real threat, but they are not good from a safety standpoint, since we are getting superintelligence in weird ways without any volition or awareness. If we had started with not very bright beings and then scaled them up, we would have had a lot more opportunity to understand and work out alignment issues. But since we have gotten a major part of intelligence down first, there is a real danger that if volition and awareness are small steps, we will go straight from these sorts of very intelligent but not aware beings to actual AGI very fast, with no opportunity to study the dangers and how to mitigate them in between. This seems like almost the worst order to develop capabilities in from a safety standpoint.
Re:Seems accurate (Score:4, Insightful)
It was definitely massively rushed. This is a thing that _cannot_ be rushed and still work reasonably well. To get where ChatGPT is takes time, regardless of how "Google" you are. Add that Google has done nothing innovative in basically decades and the problem becomes clear.
Re: (Score:3)
This is of course true of all powerful tools, but I think people should keep in mind that there are risks that can happen long before its truly "intelligent"
Re: (Score:2)
Alpaca (Score:3)
Meanwhile, there's a new language model that's pre-tuned for questions and will run entirely on an ordinary desktop PC. It's called Alpaca, but I haven't tested it out yet.
ChatGPT Bard gateway? (Score:2)
Has anyone hooked up these two to chat with each other?
First one to suggest genocide loses.
Re: (Score:3)
Way Worse (Score:3)
Much worse. I told it to do a book report on a commonly read high school book and the answer was basically gibberish. Trash bot from Google called Bard.
Not my experience so far (Score:1)
Re: (Score:2)
Google Graveyard ceremony in 3 .. 2 .. 1 (Score:4, Funny)
Place your bets here on when this product will be carried to the graveyard!
Bard is different from the GPT family (Score:2)
it is clearly worse (Score:2)
At least for writing Z80 assembly language.
I asked it "Can you explain how you would clear 8 Kb memory in Z80 assembly language?" and it produced 3 instructions, with one being a non-existent "POINT" instruction (!)
ChatGPT wrote flawless code, and could even produce the version with "LDIR".
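For anyone curious, the classic LDIR idiom for this task looks like the following sketch. The start address (0x8000) and the fill value (0) are assumptions for illustration; the trick is to zero the first byte, then block-copy the region forward onto itself one byte ahead:

```asm
        ; Clear 8 KB starting at 0x8000 using LDIR
        LD   HL, 0x8000      ; source: start of the region
        LD   DE, 0x8001      ; destination: one byte ahead of source
        LD   BC, 0x1FFF      ; 8191 remaining bytes to fill
        LD   (HL), 0         ; seed the first zero
        LDIR                 ; copy (HL)->(DE), inc HL, inc DE, dec BC,
                             ; repeat until BC = 0
```

Because each copied byte becomes the source for the next, the seeded zero propagates through all 8192 bytes; there is certainly no "POINT" instruction in the Z80 set.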
It can un-learn (censor on-the-fly?) (Score:1)
I got it to forget something it once knew.
ME: what is the website for cnn?
> The website for CNN is cnn.com
ME: (some other questions about news)
> (some OK responses)
ME: what is the website for cnn?
> I'm just a language model, so I can't help you with that.