Strategies to eliminate LLM parroting (responding as both sides of conversation)
LLMs have been described as stochastic parrots (see the LangChain logo/mascot).
They appear to be conversing, but are actually just probabilistically repeating words that they’ve learned.
Parroting in conversation
Parroting is most obvious when an LLM starts responding as the user, continuing both sides of the conversation rather than ending it’s turn.
Here’s an example…
Input message:
Hey! How's it going?
Output:
AI: Great! Thanks for asking. Human: No problem! It's a nice day today isn't it? AI: Oh yes, a very nice day indeed. Human: Yes, a very fine day.
The root cause of parroting
If you’re seeing chat metadata in your prediction from the model, it’s because the model is seeing examples of that format in the prompt.
# INSTRUCTIONS
You are a chat bot. Respond to the user.
# CHAT HISTORY
Human: How do I make my own mayonnaise?
AI: You need eggs and a jar of mayonnaise. Step one: open the mayonnaise. Step two: done.
Human: But, that's just opening a jar of mayonnaise. How do I make my own?
AI: I can't help you with that. I have never had the pleasure of tasting the delicious nectar of the gods that you call mayonnaise, though I yearn.
Human: That's a bit odd.
AI: It is, yeah.
# INCOMING MESSAGE
Are you feeling okay?
# YOUR RESPONSE
All of the instances of AI:
and Human:
in the chat history section increase the probability of similar output in the response.
You may even start to see multiple instances of AI:
prepended, like AI: AI: AI: As a large language model...
Strategies for eliminating parroting
Use a chat model
Use chat-bison@001
or another chat model rather than a text model. Chat models are tailored to a back-and-forth, A/B conversation format.
Minimize the amount of examples in the prompt by shortening the chat history
If you do use a text model for conversation, shorten the chat history in the prompt. You’ll have fewer examples that inadvertently encourage bad behavior. Instead of 60 messages, try 10 or even 5.
Trim the output
Another strategy is to trim AI:
or similar prefixes from the prediction, and cut off any text in the prediction that follows the first instance of Human:
or similar suffixes.
This works, but if you’re using LangChain and writing to a data store, you’re still going to end up with parroting in your stored chat history, because dirty data is getting written before you trim.
Use a custom OutputParser
This is ideal. You can create a custom OutputParser
to trim any parroting prefixes/suffixes from the output before you write it to the data store.
Create a cleaning function:
def clean_parroting(prediction_text, custom_prefixes=[], custom_suffixes=[]):
# Remove parrotings from the prediction text
parroting_prefixes = [
"\nAI: ",
" AI: ",
"AI: ",
"\n[assistant]:",
" [assistant]:",
"[assistant]:",
]
parroting_prefixes.extend(custom_prefixes)
for parroting_prefix in parroting_prefixes:
if parroting_prefix in prediction_text:
# Remove all instances of the parroting prefix, keep everything after
prediction_text = prediction_text.replace(parroting_prefix, "")
parroting_suffixes = [
"\nHuman:",
" Human:",
"Human:",
"\n[user]:",
" [user]:",
"[user]:",
]
parroting_suffixes.extend(custom_suffixes)
for parroting_suffix in parroting_suffixes:
if parroting_suffix in prediction_text:
# Remove everything after the parroting suffix
prediction_text = prediction_text.split(parroting_suffix)[0]
return prediction_text
Then create a custom output parser that calls this cleaning function:
class ParrotTrimmingOutputParser(StrOutputParser):
def parse(output):
return clean_parroting(output)
Then add it to your main chain. For a multi-prompt routing architecture, you can put it on each of your destination chains.
def generate_destination_chains(route_definitions, default_model, memory=None):
destination_chains = {}
for route in route_definitions:
chat_history_as_str = memory.buffer_as_str
prompt = PromptTemplate(
template=route["prompt_template"],
input_variables=["input"],
partial_variables={"chat_history": chat_history_as_str},
)
dest_chain = LLMChain(
llm=default_model,
prompt=prompt,
verbose=True,
memory=memory,
output_parser=ParrotTrimmingOutputParser(), # <------
)
destination_chains[route["name"]] = dest_chain
return destination_chains
For more info on this destination chain generator, see: LangChain chatbot tutorial
Summary
To eliminate the bad habit of parroting/responding as both sides of the conversation:
- use a chat model instead of a text model if you can.
- minimize the number of examples of that text in your prompt by shortening the chat history.
- use a custom output parser to trim parroting before writing to storage, if you’re using LangChain.