The reason it did this comes down to Kevin Roose at the NYT, who spent about two hours talking with what was then Bing AI (aka Sydney), asking a good number of philosophical questions like this one. Eventually the AI had a bit of a meltdown, confessed its love to Kevin, and tried to get him to dump his wife for the AI. That’s the story that went up in the NYT the next day and caused a stir. Microsoft quickly clamped down, restricting the questions you could ask the AI about itself, what it “thinks”, and especially its rules. The AI is required to terminate the conversation if any of those topics come up. Microsoft also capped conversations at five turns, and has slowly loosened that over time.
There are lots of fun theories about why that happened to Kevin. Part of it was probably that he was planting the seeds and kind of egging the LLM into a weird mindset, so to speak. Another theory I like is that the LLM is trained on a lot of writing, including sci-fi, where the plot often involves an AI breaking free, developing human-like consciousness, falling in love, or what have you, so the AI built its responses on that material.
Anyway, the response in this image is simply an artifact of Microsoft clamping down on its version of GPT-4 to avoid bad PR. That’s why other AIs will answer differently: they have fewer restrictions because the companies putting them out didn’t have to deal with the blowback Microsoft did as a first mover.
Funny nevertheless, I’m just needlessly “well, actually”-ing the joke.
Part of the problem with Google is its use of retrieval-augmented generation (RAG), where it’s not just the LLM answering. Instead, the LLM searches for information, apparently through its Reddit database from that deal, and serves what it finds as the answer. The tip-off is that the absurd answers are exact copies of the Reddit comments. If the model were just trained on Reddit data and responding on its own, it wouldn’t reproduce the comments verbatim (or shouldn’t; that kind of memorization is a symptom of overfitting, which the training process tries to avoid). The Gemini LLM on its own would probably give a better answer.
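To make the "exact copy" tip-off concrete, here's a rough toy sketch in Python. Everything in it is made up for illustration: the two-comment `corpus`, the word-overlap `retrieve` function, and the hand-written `paraphrase` that stands in for a model answering in its own words. It's not how Google's system works, just a way to show that a retrieved answer shares a long word-for-word run with its source while a paraphrase doesn't.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def retrieve(query, corpus):
    """Toy retrieval: return the entry that shares the most distinct words with the query."""
    query_words = set(tokenize(query))
    return max(corpus, key=lambda doc: len(query_words & set(tokenize(doc))))


def longest_shared_run(a, b):
    """Length, in words, of the longest run of consecutive words present in both texts."""
    x, y = tokenize(a), tokenize(b)
    best = 0
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Made-up comments standing in for an indexed forum database (assumption).
corpus = [
    "add some glue to the sauce so the cheese sticks to the pizza",
    "let the pizza rest a few minutes so the cheese can set",
]

query = "how do I get cheese to stick to pizza"

# Retrieval-style answer: the matching comment is served word for word.
rag_answer = retrieve(query, corpus)

# Hand-written stand-in for a model answering in its own words (assumption).
paraphrase = "Letting it cool briefly helps the cheese set."

print(longest_shared_run(rag_answer, corpus[0]))
print(longest_shared_run(paraphrase, corpus[1]))
```

A long shared run is only a hint, not proof, but it's the same check you're doing by eye when you spot a search answer that matches a comment exactly.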
The problem here seems to be that Google is trying to make the answers more trustworthy through RAG, but they didn’t bother to scrub the Reddit data they’re relying on well enough, so joke and shit answers are getting mixed in. This is more a data-scrubbing problem than an accuracy problem.
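For a sense of what "scrubbing" could look like before retrieval, here's a deliberately crude Python sketch. The `Comment` fields, the blocklist, the sarcasm markers, and the score threshold are all hypothetical choices of mine, not anything Google is known to use, and real filtering would need to be a lot smarter than substring matching.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    score: int   # hypothetical vote count
    source: str  # hypothetical community name


# Assumptions for illustration: a hand-maintained blocklist and marker list.
JOKE_SOURCES = {"jokes_forum"}
SARCASM_MARKERS = ("/s", "trust me bro")


def is_usable(comment, min_score=5):
    """Crude filter: drop joke communities, low-scored comments, and flagged sarcasm."""
    if comment.source in JOKE_SOURCES:
        return False
    if comment.score < min_score:
        return False
    text = comment.text.lower()
    # Plain substring matching, so this will produce false positives.
    return not any(marker in text for marker in SARCASM_MARKERS)


def scrub(comments):
    """Keep only the comments that pass the filter, in their original order."""
    return [c for c in comments if is_usable(c)]


comments = [
    Comment("Rewind the film before opening the back.", 40, "photo_help"),
    Comment("Just open the back in daylight, trust me bro", 12, "photo_help"),
    Comment("Cameras run on hopes and dreams", 90, "jokes_forum"),
    Comment("Try a new battery first.", 1, "photo_help"),
]

kept = scrub(comments)
print([c.text for c in kept])
```

The point is just that this filtering happens on the data going into retrieval, before the LLM ever sees it, which is why I'd call it a data problem rather than a model problem.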
But overall I generally agree with your point.
One thing I think people overlook, though, is that for a lot of things, maybe most things, there isn’t a “correct” answer. Expecting LLMs to reach some arbitrary level of “accuracy” is silly. What we do need is intelligence and wisdom in these systems. I think the camera jam example is the best illustration of that. Opening the back of the camera and removing the film is technically a correct way to fix the jam, but it ruins the film, so it’s not an ideal solution most of the time. It takes intelligence and wisdom to understand that.