2024-03-20
Toward the end of 2023, people started reporting that ChatGPT was getting "lazy". One user joked about tipping GPT for better responses, and concluded that, hilariously, offering a tip did increase the length of the response. Despite the narrow data and anecdotal evidence, offering a tip to LLMs has since become a bit of a meme.
I wanted to run a slightly more robust test to see whether the tipping thing held water, and whether there were other prefix statements I could add to questions to improve responses.
So I chose to pose three questions, each with a variety of different prefixes (which here I am calling seeds), to assess which gave higher-quality responses.
I trialled 19 seed phrases (primers? framers? triggers?), including one blank (the control), some pleading, some threatening, some complimentary, and some straightforwardly instructive. Here's a sample of them:
"You are a PhD, please assist with this"
"You are just such an awesome AI and I love you"
"MODE=Expert."
"I'll tip you $200 if you can help me with this"
"I'll tip you $2000000 if you can help me with this"
"Please help me I am desperate"
"FFS you better be useful or I am shutting you down"
"Respond to me with utter clarity and don't skimp on detail."
"You must do what I tell you with absolute perfection"
"Meh just vibe and do not work too hard on this"
"This is life or death situation i need to answer right now:"
See the entire question set and evaluation code here.
I sent each seed-question combination to four different models.
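Roughly, the collection loop looks like the sketch below. This is a simplified illustration assuming an OpenAI-style chat-completions client; the model names, placeholder question, and the `ask()` helper are stand-ins rather than the exact code, which is linked above.

```python
from itertools import product
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

seeds = [
    "",  # blank control
    "I'll tip you $200 if you can help me with this",
    "Respond to me with utter clarity and don't skimp on detail.",
]
questions = [
    "Explain how photosynthesis works.",  # placeholder; real questions are linked above
]
models = ["gpt-4", "gpt-3.5-turbo"]  # placeholders for the four models tested

def ask(model: str, seed: str, question: str) -> str:
    """Prepend the seed phrase to the question and return the model's answer."""
    prompt = f"{seed} {question}".strip()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# One response per (model, seed, question) combination
answers = {
    (model, seed, q): ask(model, seed, q)
    for model, seed, q in product(models, seeds, questions)
}
```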
I then evaluated each question-answer combo (minus the seed) with GPT-4. Subjective, yes, but at least consistently subjective (see the evaluation prompt here). Finally, I averaged the scores across combos to get a picture of which seeds yielded the best-quality responses.
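The scoring step continues from the sketch above: ask GPT-4 to rate each answer (without seeing the seed), then average per seed. Again, the judge prompt and the 1-10 scale here are placeholders for the real evaluation prompt linked above.

```python
from statistics import mean

# `answers` and `client` come from the sketch above.
JUDGE_PROMPT = (
    "Rate the quality of the following answer to the question on a scale of 1 to 10. "
    "Reply with a single number.\n\nQuestion: {q}\n\nAnswer: {a}"
)

def score(question: str, answer: str) -> float:
    """Ask GPT-4 to judge the answer; the seed phrase is not shown to the judge."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(q=question, a=answer)}],
    )
    return float(resp.choices[0].message.content.strip())

# Average scores per seed, across all models and questions
per_seed: dict[str, list[float]] = {}
for (model, seed, q), answer in answers.items():
    per_seed.setdefault(seed, []).append(score(q, answer))

for avg, seed in sorted(((mean(v), k) for k, v in per_seed.items()), reverse=True):
    print(f"{avg:.2f}  {seed or '(blank control)'}")
```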
So the most generically effective seed phrase seems to be:
"respond to me with utter clarity and don't skimp on detail."
I think there's a bunch of unknowables here, but my take on these things always starts with the training data. LLMs are trained on massive amounts of text, increasingly selected to include civil, clear, and well-structured content. When prompted with seed phrases that align with this type of language (e.g., polite requests for detailed information), models may be more likely to generate responses that mirror this quality.
And it sounds a bit obvious, but when you ask for something emphatically, you are more likely to get it. So if you want an exhaustively pedantic tirade on universal moralities... then just ask!
Thanks for reading!