AI-Generated Answers Temporarily Banned on Coding Site Stack Overflow (theverge.com) 53

Posted by msmash on Monday December 05, 2022 @10:41AM from the how-about-that dept.

Stack Overflow, the go-to question-and-answer site for programmers, has temporarily banned users from sharing responses generated by AI chatbot ChatGPT. From a report: The site's mods said that the ban was temporary and that a final ruling would be made some time in the future after consultation with its community. But, as the mods explained, ChatGPT simply makes it too easy for users to generate responses and flood the site with answers that seem correct at first glance but are often wrong on close examination. Further reading: What is ChatGPT, the AI Chatbot That's Taking The Internet By Storm.

This discussion has been archived. No new comments can be posted.

The fundamental question... (Score:3)

by jeromef ( 2726837 ) writes: on Monday December 05, 2022 @10:49AM (#63103898)

...is how will the moderators tell genuine vs. AI-generated responses apart?

- Since "correctness" doesn't seem to differ much... (Score:1)
  
  by Anonymous Coward writes:
  
  Maybe it's just the speed, resulting in hoarding karma, maybe?
- Re: (Score:2)
  
  by Excelcia ( 906188 ) writes:
  
  ...is how will the moderators tell genuine vs. AI-generated responses apart?
  By how many wrong answers a person posts in quick succession.
- Re:The fundamental question... (Score:5, Insightful)
  
  by geekmux ( 1040042 ) writes: on Monday December 05, 2022 @11:12AM (#63103976)
  
  ...is how will the moderators tell genuine vs. AI-generated responses apart?
  You mean how will moderators tell "answers that seem correct at first glance but are often wrong on close examination"? Probably not too big of an issue to worry about, if the rest of the community has any level of competence.
  Perhaps the actual fundamental question is, why in the hell is anyone using ChatGPT to provide solutions? Sounds like inaccuracy isn't just a minor problem here.
  
  - Re:The fundamental question... (Score:4, Informative)
    
    by Kristoph ( 242780 ) writes: on Monday December 05, 2022 @11:40AM (#63104042)
    
    This is a more complex issue than it seems superficially.
    ChatGPT provides accurate and comprehensive answers to many questions. It is, in many ways, much more useful than stack overflow, especially for junior developers using the most common tools and platforms.
    Where ChatGPT falls down is on less common problems and especially on languages and tools - Swift/XCode for example - that are undergoing tangible revision from version to version with deprecations. In those cases it gives you answers that would have worked 1-2 versions ago, but no longer work today. It also tends to confuse language extensions for the languages and suggests them without referencing the extension.
    But that said, having worked with it for some time, I have no doubt it represents the future. A successful stack overflow will couple their existing platform with LLMs to provide better tailored solutions to common questions with the ability to reference end user examples for new / less common problems.
    
    - Re:The fundamental question... (Score:5, Informative)
      
      by AmiMoJo ( 196126 ) writes: on Monday December 05, 2022 @03:08PM (#63104784) Homepage Journal
      
      Stack Exchange is like that too. If there's a 10 year old question that's now out of date, trying to ask for up to date information just gets you shut down for posting a "duplicate".
      It's pretty funny that they are concerned about this now. If there's one thing that Stack Exchange is known for, it's superficially correct answers that are actually terrible.
      
    - Re: (Score:2)
      
      by radarskiy ( 2874255 ) writes:
      
      "Where ChatGPT falls down is on less common problems and especially on languages and tools"
      This creates a similar problem to almost self-driving cars: it meets its limits without notice, so either you have to spend the resources for human-based oversight all along, or you have a human-based recovery that's more resource intensive than just maintaining human-based oversight all along.
  - Re: (Score:2)
    
    by ceoyoyo ( 59147 ) writes:
    
    Back in the 90s Slashdot used to display how much karma you had. This led to a phenomenon known as "karma whoring" where people would post whatever they thought would get modded up. Slashdot stopped posting the number to remove the incentive.
    People will mod up all sorts of stuff. Stack Overflow is head and shoulders above in that stuff that gets modded up usually *seems correct*.
    - Re: (Score:1)
      
      by wed128 ( 722152 ) writes:
      
      "Seems Correct but is Wrong" is much worse than either "Is Correct" or "Is obviously wrong".
      This is the principal problem with stackoverflow, and also the principal problem with chatGPT
  - Re: (Score:2)
    
    by qazedctgb ( 9372117 ) writes:
    
    probably not too big of an issue to worry about
    
    As a relatively active StackExchange user, I think I can say with some authority that this depends crucially on the *volume* of such posts. Honestly, on StackOverflow, the volume of "low quality posts" that the community and mods already deal with is not a full order of magnitude away from "critical mass" of making the site a total cesspool of garbage. "Plz fix mai code plz" type questions with no useful content are just barely at a manageable level already. If this sort of content were to, say, quadrupl
  - Re: (Score:2)
    
    by AmiMoJo ( 196126 ) writes:
    
    Stack Exchange is an MMORPG where you grind to level up, giving you the power to troll other users.
    Getting first post is key to farming rep points, and there's no faster way of doing it than posting an AI generated answer.
    - Re: The fundamental question... (Score:1)
      
      by dowhileor ( 7796472 ) writes:
      
      Just say social media in my opinion. It's an environment that does not give credit to meaningful interaction as you cannot be social without being a complete privileged fad apparently. Fascism was a joke until mass media made it feasible and leftist was for poor folks.....
  - Re: (Score:2)
    
    by radarskiy ( 2874255 ) writes:
    
    "Probably not too big of an issue to worry about, if the rest of the community has any level of competence."
    The marginal cost of a ChatGPT solution is zero. The marginal cost of a review by a competent human is not. For any level of competence of the community you can just overwhelm it with volume.
- Re: (Score:2)
  
  by Fly Swatter ( 30498 ) writes:
  
  They can't because the mods themselves will eventually be AI.
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re: (Score:2)
    
    by fibonacci8 ( 260615 ) writes:
    
    This makes the false assumption that the training set isn't "the typical responses made by novices hoping to increase their stackoverflow social-media scores."
I think this is some kind of law of nature (Score:3)

by clawsoon ( 748629 ) writes: on Monday December 05, 2022 @10:50AM (#63103902)

Confident bullshit is cheap and easy to generate. Truth is expensive and difficult to generate. AI is replicating human experience in this matter.

- Re: (Score:3)
  
  by nagora ( 177841 ) writes:
  
  Confident bullshit is cheap and easy to generate. Truth is expensive and difficult to generate. AI is replicating human experience in this matter.
  It is essentially a version of the 2nd law of thermodynamics in that it means that disorder always grows at a statistical level.
  There may be 1000 ways to answer a SO question which seem reasonable, but only a few that actually work. Say there's only 1 working answer. Then bullshit is at least 1000 times cheaper than accuracy and we don't expect the latter for free, just as we don't expect to knock a glass bottle off a shelf and find that it breaks neatly into two halves. We expect a cleanly sliced glass bottle to
- Re: (Score:2)
  
  by ceoyoyo ( 59147 ) writes:
  
  That's the definition of AI: creating computer programs that are good at things humans are good at.
Best troll ever (Score:5, Insightful)

by Errol backfiring ( 1280012 ) writes: on Monday December 05, 2022 @11:02AM (#63103938) Journal

In other words, ChatGPT is the best troll ever. It does not look like a troll, but effectively it can render a community site completely worthless. And as a bonus it requires a lot of effort to clean its infestation.

dead internet theory looking good (Score:5, Interesting)

by Big Hairy Gorilla ( 9839972 ) writes: on Monday December 05, 2022 @11:03AM (#63103942)

Dead Internet Theory is the idea that essentially fake/auto/AI generated content eventually will approach 100% of all content on the net. It was called out as "fake" or some kind of conspiracy, but.... it's lookin' pretty good these days.

- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Indeed. At least for "crap level" content. Which, to be fair, is a lot of the net.
Pot, kettle, black (Score:1, Insightful)

by Pienjo ( 10175 ) writes:

"[ChatGPT] simply makes it too easy for users to generate responses and flood the site with answers that seem correct at first glance but are often wrong on close examination."
I would suggest that "seeming correct at first glance but often being wrong" correctly describes *most* answers on StackOverflow.
- Re: Pot, kettle, black (Score:2)
  
  by NagrothAgain ( 4130865 ) writes:
  
  Most answers are along the lines of "I'm not going to answer your question, instead I'm going to tell you how stupid you are for trying to do it that way, and give an answer on how I think you should do it instead."
  - Re: (Score:1, Interesting)
    
    by Pienjo ( 10175 ) writes:
    
    Often, the answer getting a green tick isn't necessarily the most accurate answer (or even: correct). It's the answer that OP liked reading best.
    Given that they asked the question to begin with, that typically makes them less-than-qualified to make that assessment.
Hmm... (Score:5, Funny)

by greytree ( 7124971 ) writes: on Monday December 05, 2022 @11:42AM (#63104048)

Why do you feel that AI generated answers are temporarily banned on coding site stackoverflow ?

So let me get this right .... (Score:2)

by Kelxin ( 3417093 ) writes:

You generate AI answers to questions and you get your content deleted and banned. You put your own answers on there and they won't let you delete it or edit it and get banned for trying to...? Fair warning, any post you make there they take full control of and claim they then own anything that you post, which is why they probably don't want AI generated answers, because that could bring questions of if they own the code....
"correct at first glance but are often wrong..." (Score:2)

by gweihir ( 88907 ) writes:

Yes, sounds like AI (Artificial Ignorance) at work. I would say these answers are a lot more dangerous to have than not to have. Worst case it compiles and seems to work but causes problems later on. The no-clue "Google coders" would put crap like that straight in.
I really hope they make that ban permanent.
So, empirically (Score:2)

by gestalt_n_pepper ( 991155 ) writes:

For little short algorithms or script fragments, what ChatGPT generates isn't bad. When it is bad, it can usually be tweaked to something acceptable. I'm finding it a timesaver for powershell scripts.
This morning, I asked it to generate a winforms app that had a textbox and an OK button. Its first attempt failed, but its second attempt produced working code.
This will only get better, particularly if it's seeded with good code, so, yeah, it's not much now but in 5 years, it'll be acceptable in most situati
- Re: (Score:3)
  
  by Moridineas ( 213502 ) writes:
  
  I asked it to "write a program to calculate pi in sh" and it worked, first try. Ditto even more specific algorithms like "Find the first 5 prime numbers that contain the number 5." Python, Prolog, Pascal, Perl, Sh, Basic, all worked when I tried.
  On another attempt with Brainf*ck code, it generated something, but it didn't seem to work.
  - Re: (Score:2)
    
    by znrt ( 2424692 ) writes:
    
    you actually have pascal and basic interpreters/compilers installed? compliments! :-D
    - Re: (Score:2)
      
      by Moridineas ( 213502 ) writes:
      
      Caveat: I only tried to run a few of the language variants! I don't know enough Cobol to know if what it generated made sense, but on first glance the Python/Pascal/Perl/Basic all SEEMED like they would work. I haven't coded in Pascal since Delphi 20+ years ago. Basic I've used more recently as the Office scripting language. I did try out the sh script for calculating pi and it worked (lots of calls to bc). Prolog is a distant memory, but it again seemed right (or close).
      Interestingly, if you run the same p
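
For reference, the prime-number prompt quoted in this thread ("Find the first 5 prime numbers that contain the number 5") can be sketched in C roughly as follows. This is only an illustration of the task, not the code ChatGPT produced; the function names `is_prime`, `has_digit` and `primes_with_digit` are made up for this sketch, and "contain the number 5" is assumed to mean "has the decimal digit 5".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trial division; adequate for the small values involved here. */
bool is_prime(unsigned n)
{
    if (n < 2)
        return false;
    for (unsigned d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}

/* True if the decimal representation of n contains the given digit. */
bool has_digit(unsigned n, unsigned digit)
{
    do {
        if (n % 10 == digit)
            return true;
        n /= 10;
    } while (n > 0);
    return false;
}

/* Fills out[0..count-1] with the first primes whose decimal form
 * contains digit. The digit must be 0-9, otherwise nothing would match. */
void primes_with_digit(unsigned digit, unsigned out[], size_t count)
{
    assert(digit <= 9);
    size_t found = 0;
    for (unsigned n = 2; found < count; n++)
        if (is_prime(n) && has_digit(n, digit))
            out[found++] = n;
}
```

Calling `primes_with_digit(5, out, 5)` with a five-element array fills it with 5, 53, 59, 151 and 157.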
Is it different? (Score:3)

by rknop ( 240417 ) writes: on Monday December 05, 2022 @12:06PM (#63104136) Homepage

A flood of answers that appear superficially correct but that aren't right on closer examination...
...how will that be different from Stack Overflow with human users?

In a twist of irony (Score:4, Funny)

by sinkskinkshrieks ( 6952954 ) writes: on Monday December 05, 2022 @12:20PM (#63104188)

The Turing test will be administered and decided by AI.

- Re: (Score:2)
  
  by superposed ( 308216 ) writes:
  
  The Turing test will be administered and decided by AI.
  I'm thinking of doing exactly that for my freshman Python programming class. ChatGPT already does considerably better than my EE students on exam questions. So it would be very easy for them to just ask ChatGPT for help during online exams and I couldn't detect it (unlike the public answers they get from Chegg). I already have a webpage where students can work on coding examples during class and I can see what they're doing as they go. I may need to start using that for exams too, and have an AI check wheth
  - Re: (Score:2)
    
    by znrt ( 2424692 ) writes:
    
    So it would be very easy for them to just ask ChatGPT for help during online exams and I couldn't detect it
    you are fighting the wrong battle. no matter what you do (especially on exams) they all will exit your course clueless, no exceptions, with zero experience and will fck up abundantly in their first 3-6 years of professional experience, after which they'll either have learnt, or they will become managers so who cares.
    just teach them the basics of coding, try to inspire the one guy in the class that's actually interested and valuable. you'll have done a good job, you'll sleep better and the industry meatgrinde
- Re: (Score:2)
  
  by Linux Torvalds ( 647197 ) writes:
  
  The Turing test always said more about the person taking the test than it did about the computer. It was a lame idea from day 1.
This thing is crazy (Score:3)

by Moridineas ( 213502 ) writes: on Monday December 05, 2022 @12:34PM (#63104256) Journal

Prompt: "Write a C program to open a file with fixed length records of 800 characters, read every line of the file and for each line, print characters 45-50."
Result:
Here is a simple C program that opens a file with fixed length records of 800 characters, reads every line of the file, and for each line, prints characters 45-50:
#include <stdio.h>
#include <stdlib.h>

#define RECORD_LEN 800 // define the length of each record

int main(int argc, char *argv[]) {
    // check if the user provided a filename
    if (argc < 2) {
        printf("Please specify a filename\n");
        return 1;
    }

    // open the file in read-only mode
    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL) {
        printf("Error opening file\n");
        return 1;
    }

    // read each record
    char record[RECORD_LEN + 1];
    while (fread(record, RECORD_LEN, 1, fp) == 1) {
        // store characters 45-50 in a variable
        char substring[6];
        memcpy(substring, record + 44, 6);
        substring[6] = '\0'; // make sure it's null-terminated
        printf("Characters 45-50: %s\n", substring);
    }

    // close the file
    fclose(fp);
    return 0;
}
To run this program, compile it using a C compiler and then pass the filename as a command-line argument. For example:
$ gcc -o myprogram myprogram.c
$ ./myprogram myfile.txt
This program will read each record from the file and then print the characters 45-50 to the screen. Note that the variable substring is null-terminated so it can be treated as a string in C.

- Re:This thing is crazy (Score:4, Interesting)
  
  by ThePyro ( 645161 ) writes: on Monday December 05, 2022 @01:21PM (#63104426)
  
  Isn't this wrong, though? If you only declare an array of length 6 then you shouldn't be writing to substring[6]. If you want to store 6 characters plus a null-terminating byte then you ought to allocate an array of length 7, not 6.
  
  - Re: (Score:2)
    
    by Moridineas ( 213502 ) writes:
    
    Good catch, I didn't see it!
- Re: This thing is crazy (Score:5, Informative)
  
  by Guy Smiley ( 9219 ) writes: on Monday December 05, 2022 @01:21PM (#63104428)
  
  This is a perfect example of something that looks correct, but is not. The program has a serious stack-smashing bug. "substring[6]" does not exist in C, only "substring[0-5]" are valid, so the "NUL termination" is actually corrupting some other piece of memory. Since chars 45-50 are 6 chars long, the array needs to be declared as "substring[7]" to hold the NUL terminator. That said, going from the text input to the generated program is pretty incredible, but like with GPS routing and any other machine-generated output it needs to be taken with a grain of salt.
  
  - Re: (Score:2)
    
    by Moridineas ( 213502 ) writes:
    
    Absolutely. There are many of these off-by-one and similar kinds of errors popping up in generated code (I've seen a few more examples posted on Slashdot). What I'm impressed by is that it comes up with something functional and very close from a one sentence natural language description.
    You can change "C" to a huge number of other languages and it will work, seemingly with differing degrees of accuracy (I've tried Prolog, Algol, C++, Pascal, Asm, Lisp, sh, etc.)
    I can easily see this program, especially w
  - Re: (Score:2)
    
    by topside420 ( 530370 ) writes:
    
    Just ask it to find and fix the bug! I suspect bugs like this will become less common as the model is tweaked or additional compute time is given to "reflect" on the response and check for logical consistency.
    Question: Would the write to substring[6] cause an error?
    Answer: Yes, the write to substring[6] in the code you have provided would cause an error, because the substring array is only defined to have a length of 6, so attempting to write to the seventh element (substring[6]) would overrun the bounds of
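
The off-by-one flagged in the replies above comes down to buffer sizing: six characters plus a NUL terminator need seven bytes. A minimal sketch of the extraction step with that fix applied might look like the following. The macro names and the `extract_field` helper are hypothetical and appear nowhere in the generated program; they are introduced only to make the sizing explicit. The generated program also calls `memcpy` without including `<string.h>`, which this sketch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>  /* memcpy; missing from the generated program */

#define FIELD_START 45  /* first character to extract, 1-based */
#define FIELD_END   50  /* last character to extract, 1-based, inclusive */
#define FIELD_LEN   (FIELD_END - FIELD_START + 1)  /* 6 characters */

/* Copies characters 45-50 of one record into out and NUL-terminates it.
 * out must hold FIELD_LEN + 1 bytes: six characters plus the NUL. */
void extract_field(const char *record, char out[FIELD_LEN + 1])
{
    assert(record != NULL && out != NULL);
    memcpy(out, record + (FIELD_START - 1), FIELD_LEN);
    out[FIELD_LEN] = '\0';  /* index 6 is valid because out has 7 bytes */
}
```

Inside the read loop of the generated program, this would mean declaring `char substring[FIELD_LEN + 1];` and calling `extract_field(record, substring);` before the `printf`.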
Converse w/AI about the history of modern physics (Score:4, Interesting)

by schwit1 ( 797399 ) writes: on Monday December 05, 2022 @12:56PM (#63104348)

https://twitter.com/pwang/stat... [twitter.com]
"Um... I just had like a 20 minute conversation with ChatGPT about the history of modern physics. If I had this shit as a tutor during high school and college.... OMG.
I think we can basically re-invent the concept of education at scale. College as we know it will cease to exist."

- Re: (Score:2)
  
  by NotEmmanuelGoldstein ( 6423622 ) writes:
  
  ... College as we know it will cease to exist.
  
  College/University/vocational school exist, not because people need knowledge and practice but because employers want that knowledge and practice tested against a known standard.
  For example, training in a military service: For many trades, leave the military and your only civilian qualification is sneaking into houses and killing people. A check of the national training standards lists military certificates in HR, management, team leadership, aviation pilot, military training, military escalation and adv
How does this compare with github's copilot? (Score:2)

by waspleg ( 316038 ) writes:

I am not a programmer, but a work colleague has been very impressed with copilot's powershell and maybe some other things so far (I refuse to register for a trial). Just curious.
Next thing you know... (Score:2)

by Tony Isaac ( 1301187 ) writes:

Slashdot will ban AI-generated articles from the platform. But that will never work, they'll suddenly have no content!
4th side of a triangle (Score:2)

by Dwedit ( 232252 ) writes:

ChatGPT answers are confidently incorrect. Here's ChatGPT telling you how to calculate the 4th side of a triangle: https://i.redd.it/qb2whkrkre3a... [i.redd.it]
How can we be sure... (Score:2)

by LordHighExecutioner ( 4245243 ) writes:

...that chatGPT isn't being used on Slashdot ?
More script-kiddies (Score:2)

by NotEmmanuelGoldstein ( 6423622 ) writes:

... flood the site with answers ...

Now, script-kiddies are on stack overflow.
Procedural (Score:2)

by TranquilVoid ( 2444228 ) writes:

You don't need AI.
ChatGPT, write me a procedural function to generate Stack Overflow replies;
switch( rand( 4 ) ) {
  case 0: return "Where's your code? Provide the sample and context.";
  case 1: return "Why would you even want to do it this way?";
  case 2: return "This looks like homework.";
  case 3: return "Read section 5.2.43b of the standard.";
}
