Put context into separate user message before sending to chat model

The document, online search context are now passed as separate user
messages to chat model, instead of being added to the final user message.

This will improve

- Models ability to differentiate data from user query.
  That should improve response quality and reduce prompt injection
  probability

- Make truncation logic simpler and more robust
  When context window hit, can simply pop messages to auto truncate
  context in order of context, user, assistant message for each
  conversation turn in history until reach current user query

  The complex, brittle logic to extract user query from context in
  last user message isn't required.

Marking the context message with assistant role doesn't translate well
across chat models. E.g
- Gemini can't handle consecutive messages by role = model well
- Claude will merge consecutive messages by same role. In current
  message ordering the context message will result get merged into the
  previous assistant response. And if move context message after user
  query. The truncation logic will have to hop and skip while doing
  deletions
- GPT seems to handle consecutive roles of any type fine

Using context role = user generalizes better across chat models for
now and aligns with previous behavior.
This commit is contained in:
Debanjum Singh Solanky
2024-10-22 01:06:00 -07:00
parent 7ac241b766
commit 0c52a1169a
4 changed files with 36 additions and 32 deletions

View File

@@ -143,7 +143,6 @@ def converse(
"""
# Initialize Variables
current_date = datetime.now()
conversation_primer = prompts.query_prompt.format(query=user_query)
compiled_references = "\n\n".join({f"# File: {item['file']}\n## {item['compiled']}\n" for item in references})
if agent and agent.personality:
@@ -175,18 +174,18 @@ def converse(
completion_func(chat_response=prompts.no_online_results_found.format())
return iter([prompts.no_online_results_found.format()])
if not is_none_or_empty(online_results):
conversation_primer = (
f"{prompts.online_search_conversation.format(online_results=str(online_results))}\n{conversation_primer}"
)
context_message = ""
if not is_none_or_empty(compiled_references):
conversation_primer = f"{prompts.notes_conversation.format(query=user_query, references=compiled_references)}\n\n{conversation_primer}"
context_message = f"{prompts.notes_conversation.format(references=compiled_references)}\n\n"
if not is_none_or_empty(online_results):
context_message += f"{prompts.online_search_conversation.format(online_results=str(online_results))}"
# Setup Prompt with Primer or Conversation History
messages = generate_chatml_messages_with_context(
conversation_primer,
user_query,
system_prompt,
conversation_log,
context_message=context_message,
model_name=model,
max_prompt_size=max_prompt_size,
tokenizer_name=tokenizer_name,