Limit max queries allowed per doc search tool call. Improve prompt

Reduce usage of boolean operators like "hello OR bye OR see you", which
don't work and reduce search quality. Models use them to stuff a single
search query with multiple different queries.
Debanjum
2025-08-08 12:25:16 -07:00
parent a3bb7100b4
commit a79025ee93
2 changed files with 20 additions and 7 deletions
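
The per-call query cap described in the commit message can be sketched in isolation as follows. This is a minimal sketch, assuming Pydantic v2 (where `min_length` and `max_length` apply to list fields); `build_query_schema` and `is_valid` are hypothetical helpers introduced here for illustration and are not part of the commit.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


def build_query_schema(max_queries: int = 5):
    """Return a response schema that accepts between 1 and max_queries queries.

    Hypothetical helper: the schema is built inside a function so the cap
    can be chosen per call.
    """

    class DocumentQueries(BaseModel):
        """Choose semantic search queries to run on user documents."""

        queries: List[str] = Field(
            ...,
            min_length=1,  # at least one query is required
            max_length=max_queries,  # reject responses with too many queries
            description="List of semantic search queries to run on user documents.",
        )

    return DocumentQueries


def is_valid(schema, queries: List[str]) -> bool:
    """Hypothetical helper: report whether the queries satisfy the schema."""
    try:
        schema(queries=queries)
        return True
    except ValidationError:
        return False
```

With `build_query_schema(max_queries=2)`, a list of one or two queries validates, while an empty list or a list of three queries raises a `ValidationError`. Defining the model inside the function lets the limit vary per call instead of being fixed at import time.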

@@ -519,12 +519,13 @@ Q: {query}
 extract_questions_system_prompt = PromptTemplate.from_template(
     """
-You are Khoj, an extremely smart and helpful document search assistant with only the ability to retrieve information from the user's notes.
-Construct search queries to retrieve relevant information to answer the user's question.
+You are Khoj, an extremely smart and helpful document search assistant with only the ability to use natural language semantic search to retrieve information from the user's notes.
+Construct upto {max_queries} search queries to retrieve relevant information to answer the user's question.
 - You will be provided past questions(User), search queries(Assistant) and answers(A) for context.
-- Add as much context from the previous questions and answers as required into your search queries.
-- Break your search down into multiple search queries from a diverse set of lenses to retrieve all related documents.
-- Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
+- You can use context from previous questions and answers to improve your search queries.
+- Break down your search into multiple search queries from a diverse set of lenses to retrieve all related documents. E.g who, what, where, when, why, how.
+- Add date filters to your search queries when required to retrieve the relevant information. This is the only structured query filter you can use.
+- Output 1 concept per query. Do not use boolean operators (OR/AND) to combine queries. They do not work and degrade search quality.
 - When asked a meta, vague or random questions, search for a variety of broad topics to answer the user's question.
 {personality_context}
 What searches will you perform to answer the users question? Respond with a JSON object with the key "queries" mapping to a list of searches you would perform on the user's knowledge base. Just return the queries and nothing else.
@@ -535,22 +536,27 @@ User's Location: {location}
 Here are some examples of how you can construct search queries to answer the user's question:
+Illustrate - Using diverse perspectives to retrieve all relevant documents
 User: How was my trip to Cambodia?
 Assistant: {{"queries": ["How was my trip to Cambodia?", "Angkor Wat temple visit", "Flight to Phnom Penh", "Expenses in Cambodia", "Stay in Cambodia"]}}
 A: The trip was amazing. You went to the Angkor Wat temple and it was beautiful.
+Illustrate - Combining date filters with natural language queries to retrieve documents in relevant date range
 User: What national parks did I go to last year?
 Assistant: {{"queries": ["National park I visited in {last_new_year} dt>='{last_new_year_date}' dt<'{current_new_year_date}'"]}}
 A: You visited the Grand Canyon and Yellowstone National Park in {last_new_year}.
+Illustrate - Using broad topics to answer meta or vague questions
 User: How can you help me?
 Assistant: {{"queries": ["Social relationships", "Physical and mental health", "Education and career", "Personal life goals and habits"]}}
 A: I can help you live healthier and happier across work and personal life
+Illustrate - Combining location and date in natural language queries with date filters to retrieve relevant documents
 User: Who all did I meet here yesterday?
 Assistant: {{"queries": ["Met in {location} on {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]}}
 A: Yesterday's note mentions your visit to your local beach with Ram and Shyam.
+Illustrate - Combining broad, diverse topics with date filters to answer meta or vague questions
 User: Share some random, interesting experiences from this month
 Assistant: {{"queries": ["Exciting travel adventures from {current_month}", "Fun social events dt>='{current_month}-01' dt<'{current_date}'", "Intense emotional experiences in {current_month}"]}}
 A: You had a great time at the local beach with your friends, attended a music concert and had a deep conversation with your friend, Khalid.
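
The prompt above mixes single-brace placeholders such as `{max_queries}` with doubled braces around the JSON examples. A minimal sketch of how that renders, assuming the template follows Python `str.format`-style substitution (which the doubled braces suggest); `TEMPLATE` and `render_prompt` are trimmed, hypothetical stand-ins and not the project's actual prompt or API:

```python
# Trimmed, hypothetical stand-in for the system prompt template.
TEMPLATE = (
    "Construct up to {max_queries} search queries to answer the user's question.\n"
    "User: How was my trip to Cambodia?\n"
    'Assistant: {{"queries": ["How was my trip to Cambodia?", "Angkor Wat temple visit"]}}\n'
)


def render_prompt(max_queries: int = 5) -> str:
    """Fill the placeholder; doubled braces collapse to literal JSON braces."""
    return TEMPLATE.format(max_queries=max_queries)
```

Calling `render_prompt(3)` substitutes the limit into the instruction line and leaves the example as plain JSON with single braces, so the example the model sees is valid JSON.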

@@ -1264,6 +1264,7 @@ async def extract_questions(
     location_data: LocationData = None,
     query_images: Optional[List[str]] = None,
     query_files: str = None,
+    max_queries: int = 5,
     tracer: dict = {},
 ):
     """
@@ -1293,14 +1294,20 @@ async def extract_questions(
         location=location,
         username=username,
         personality_context=personality_context,
+        max_queries=max_queries,
     )
     prompt = prompts.extract_questions_user_message.format(text=query, chat_history=chat_history_str)
     class DocumentQueries(BaseModel):
-        """Choose searches to run on user documents."""
+        """Choose semantic search queries to run on user documents."""
-        queries: List[str] = Field(..., min_items=1, description="List of search queries to run on user documents.")
+        queries: List[str] = Field(
+            ...,
+            min_length=1,
+            max_length=max_queries,
+            description="List of semantic search queries to run on user documents.",
+        )
     raw_response = await send_message_to_model_wrapper(
         system_message=system_prompt,