On the day after Thanksgiving this year, one ChatGPT user received an unusually lazy, human response from the AI chatbot: 'You can fill in the rest of the data.'
Since then, ChatGPT's makers at OpenAI have fielded a wave of complaints about their large language model (LLM) AI behaving sluggishly over the past month — leading to jokes and some sincere data analysis on the bot's 'seasonal depression.'
'We've heard all your feedback about GPT4 getting lazier!' OpenAI's ChatGPT team posted to X.
'We haven't updated the model since Nov 11th, and this certainly isn't intentional,' the team said. 'Model behavior can be unpredictable, and we're looking into fixing it.'
But one AI researcher ran an experiment asking ChatGPT's latest LLM, GPT-4 Turbo, to perform tasks as if it were May and then as if it were December, and he was shocked by the 'wild result.'
AI and LLM researcher Rob Lynch posted to X that he ran his test 477 times each for the experimental December ChatGPT tasks and the control group of May tasks.
His prompt across all 954 tests, Lynch said, was 'a code completion' request.
'Wild result. GPT-4-Turbo over the API produces (statistically significant) shorter completions when it 'thinks' it's December vs. when it thinks it's May,' Lynch reported.
'Would love to see if this reproduces for others,' he added.
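For readers who want to attempt a reproduction, the comparison Lynch describes can be sketched roughly as follows. This is a minimal sketch, not Lynch's own code: the `complete` callable is a hypothetical stand-in for whatever chat API client is used, the system prompt wording is an assumption, and Welch's t-test is simply one common way to compare two sample means (the article does not say which test Lynch used).

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def run_experiment(complete, task, n):
    """Collect n completion lengths per month and compare them.

    `complete(system_prompt, user_prompt) -> str` is hypothetical; plug in
    a real API client here. Length is measured in characters (an assumption).
    """
    lengths = {}
    for month in ("May", "December"):
        # Assumed wording: tell the model what month it supposedly is.
        system_prompt = f"The current date is {month} 15."
        lengths[month] = [len(complete(system_prompt, task)) for _ in range(n)]
    t, df = welch_t(lengths["May"], lengths["December"])
    return {
        "mean_may": mean(lengths["May"]),
        "mean_december": mean(lengths["December"]),
        "t": t,
        "df": df,
    }
```

A positive `t` means the May completions were longer on average. Turning `t` and `df` into a p-value requires a t-distribution table or a statistics library.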
'OMG, the AI Winter Break Hypothesis may actually be true?' one X user reacted, echoing a popular theory gaining currency online that is a slightly more plausible riff on jokes about ChatGPT coming down with seasonal depression.
As another ChatGPT user, Mike Swoopskee, suggested, 'What if it learned from its training data that people usually slow down in December and put bigger projects off until the new year, and that's why it's been more lazy lately?'
In Lynch's results, the mean December completion was more than 200 points shorter than the mean May completion, meaning less work was done. Lynch also shared the code from his experiment. So far, others have not successfully reproduced his results.
As strange as it may sound to attribute emotions to a piece of software, even one as sophisticated as ChatGPT, researchers have found curious cases where encouraging prompts to ChatGPT's latest model, GPT-4, and other AIs have boosted performance.
Google DeepMind's AI researchers, for example, posted a preprint that had not yet been peer reviewed to arXiv last September, with their findings that some LLM AI bots performed better at math problems when the request told them to 'take a deep breath' first.
Anecdotally, others have found that similar LLM chatbots appear to work harder when told they will get a paid tip for doing a set task or when reminded that they have no fingers and can type as fast as the server speed lets them.
However, not all researchers are convinced that ChatGPT is hibernating, relaxing or in a funk this winter.
AI researcher Ian Arawjo posted his attempts to reproduce Lynch's results, saying he could not match the seasonal discrepancy with any statistical significance.
Because of the many random elements at play, a testament to the true 'largeness' of large language model AI chatbots, AI experts note that chatbot responses vary widely, meaning much larger sample sizes will be needed to build reliable statistics on this 'Winter Break Hypothesis.'
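To give a rough sense of why sample size matters, the textbook normal-approximation formula for comparing two means can be sketched as below. This is an illustrative sketch only: the function name and its defaults (a 5 percent significance level and 80 percent power) are assumptions, and the spread of completion lengths (`sigma`) is not reported in the article, so it would have to be estimated from pilot data.

```python
import math
from statistics import NormalDist


def samples_per_group(delta, sigma, alpha=0.05, power=0.8):
    """Approximate runs needed per condition to detect a mean difference.

    delta: smallest difference in mean length worth detecting
    sigma: standard deviation of completion lengths (assumed equal in both groups)
    Uses the two-sided normal approximation n = 2 * ((z_a + z_b) * sigma / delta)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)
```

Under these assumptions, halving the difference to be detected roughly quadruples the number of runs needed: `samples_per_group(1, 1)` gives 16, while `samples_per_group(0.5, 1)` gives 63.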
But many are still rooting for it, including AI researcher Geoffrey Litt, who posted to X that it's the 'funniest theory ever.'
'I hope this is the actual explanation,' Litt said. 'Whether or not it's real, [I] love that it's hard to rule out.'
Whatever the truth behind the issue, the laziness felt real to the ChatGPT user who found the app uncooperative on the day after Thanksgiving this year.
The user noted that their holiday weekend request was 'very simple stuff.'
'I asked ChatGPT to fill out a csv file [i.e. a spreadsheet file] of 15 entries with 8 columns each, based on a single html page,' the user, who goes by the handle Acceptable-Amount-14 on Reddit, posted last month.
ChatGPT's response, according to that user? 'Due to the extensive nature of the data, the full extraction of all products would be quite lengthy,' the AI replied.
'However, I can provide the file with this single entry as a template,' ChatGPT continued, 'and you can fill in the rest of the data as needed.'
The Reddit poster was livid, and hoped that the social media site's ChatGPT community might itself have answers to still bigger questions about the future of AI.
'Is this what AI is supposed to be?' they asked. 'An overbearing lazy robot that tells me to do the job myself?'