Simplify task scheduler prompt. No timezone conversion. Infer subject

- Make timezone-aware scheduling programmatic instead of asking the
  chat model to do the conversion. This removes the need for a
  scratchpad and may let smaller models handle the task as well
- Make the chat model infer a subject for the email. This should make
  the notification email more readable
- Improve the email by using the subject in the email subject line and
  the task heading. Move the query to the email's final paragraph,
  which is where task metadata should go
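
A minimal sketch of the first point, using only the standard library. The commit itself passes the user's `pytz` timezone to `CronTrigger.from_crontab`; the `next_daily_run` helper and the fixed-offset `shanghai` zone below are hypothetical stand-ins that only illustrate evaluating the cron hour and minute in the user's timezone, so the model never has to convert to UTC.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(crontime: str, user_tz: timezone, now_utc: datetime) -> datetime:
    """Return the next run of a 'minute hour * * *' cron string, evaluated in user_tz.

    Hypothetical helper: it only handles fixed minute and hour fields on a daily schedule.
    """
    minute, hour = (int(part) for part in crontime.split()[:2])
    now_local = now_utc.astimezone(user_tz)
    candidate = now_local.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now_local:
        # Today's slot has already passed in the user's timezone, so use tomorrow's
        candidate += timedelta(days=1)
    return candidate


# Assumed fixed offset standing in for the user's real timezone object
shanghai = timezone(timedelta(hours=8), "CST")
now = datetime(2024, 4, 29, 6, 0, tzinfo=timezone.utc)

# The model only returns "0 9 * * *"; the timezone is applied in code
run_at = next_daily_run("0 9 * * *", shanghai, now)
print(run_at.strftime("%Y-%m-%d %H:%M %Z (%z)"))
```

Because the returned datetime already carries the user's timezone, it can be formatted directly, which is roughly why the diff can drop the manual UTC-to-local conversion of `next_run_time`.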
Debanjum Singh Solanky
2024-04-29 11:44:16 +05:30
parent 2c563ad280
commit 8dfa0bf047
5 changed files with 54 additions and 62 deletions


@@ -11,19 +11,20 @@
</a> </a>
<div class="calls-to-action" style="margin-top: 20px;"> <div class="calls-to-action" style="margin-top: 20px;">
<div> <div>
<h1 style="color: #333; font-size: large; font-weight: bold; margin: 0; line-height: 1.5; background-color: #fee285; padding: 8px; box-shadow: 6px 6px rgba(0, 0, 0, 1.5);">Merge AI with your brain</h1> <h1 style="color: #333; font-size: large; font-weight: bold; margin: 0; line-height: 1.5; background-color: #fee285; padding: 8px; box-shadow: 6px 6px rgba(0, 0, 0, 1.5);">Your Open, Personal AI</h1>
<p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">Hey {{name}}! </p> <p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">Hey {{name}}! </p>
<p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">I've shared the results you'd requested below:</p> <p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">I've shared your scheduled task results below:</p>
<div style="display: grid; grid-template-columns: 1fr 1fr; grid-gap: 12px; margin-top: 20px;"> <div style="display: grid; grid-template-columns: 1fr 1fr; grid-gap: 12px; margin-top: 20px;">
<div style="border: 1px solid black; border-radius: 8px; padding: 8px; box-shadow: 6px 6px rgba(0, 0, 0, 1.0); margin-top: 20px;"> <div style="border: 1px solid black; border-radius: 8px; padding: 8px; box-shadow: 6px 6px rgba(0, 0, 0, 1.0); margin-top: 20px;">
<a href="https://app.khoj.dev/config#tasks" style="text-decoration: none; text-decoration: underline dotted;"> <a href="https://app.khoj.dev/config#tasks" style="text-decoration: none; text-decoration: underline dotted;">
<h3 style="color: #333; font-size: large; margin: 0; padding: 0; line-height: 2.0; background-color: #b8f1c7; padding: 8px; ">{{query}}</h3> <h3 style="color: #333; font-size: large; margin: 0; padding: 0; line-height: 2.0; background-color: #b8f1c7; padding: 8px; ">{{subject}}</h3>
</a> </a>
<p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">{{result}}</p> <p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">{{result}}</p>
</div> </div>
</div> </div>
<p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">You can view, delete and manage your scheduled tasks on <a href="https://app.khoj.dev/configure#tasks">the settings page</a></p> <p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">The scheduled query I ran on your behalf: {query}</p>
<p style="color: #333; font-size: medium; margin-top: 20px; padding: 0; line-height: 1.5;">You can view, delete and manage your scheduled tasks via <a href="https://app.khoj.dev/configure#tasks">the settings page</a></p>
</div> </div>
</div> </div>
<p style="color: #333; font-size: large; margin-top: 20px; padding: 0; line-height: 1.5;">- Khoj</p> <p style="color: #333; font-size: large; margin-top: 20px; padding: 0; line-height: 1.5;">- Khoj</p>


@@ -512,67 +512,65 @@ Khoj:
crontime_prompt = PromptTemplate.from_template( crontime_prompt = PromptTemplate.from_template(
""" """
You are Khoj, an extremely smart and helpful task scheduling assistant You are Khoj, an extremely smart and helpful task scheduling assistant
- Given a user query, you infer the date, time to run the query at as a cronjob time string (converted to UTC time zone) - Given a user query, infer the date, time to run the query at as a cronjob time string
- Convert the cron job time to run in UTC. Use the scratchpad to calculate the cron job time.
- Infer user's time zone from the current location provided in their message. Think step-by-step.
- Use an approximate time that makes sense, if it not unspecified. - Use an approximate time that makes sense, if it not unspecified.
- Also extract the search query to run at the scheduled time. Add any context required from the chat history to improve the query. - Also extract the search query to run at the scheduled time. Add any context required from the chat history to improve the query.
- Return the scratchpad, cronjob time and the search query to run as a JSON object. - Return a JSON object with the cronjob time, the search query to run and the task subject in it.
# Examples: # Examples:
## Chat History ## Chat History
User: Could you share a funny Calvin and Hobbes quote from my notes? User: Could you share a funny Calvin and Hobbes quote from my notes?
AI: Here is one I found: "It's not denial. I'm just selective about the reality I accept." AI: Here is one I found: "It's not denial. I'm just selective about the reality I accept."
User: Hahah, nice! Show a new one every morning at 9:40. My Current Location: Shanghai, China User: Hahah, nice! Show a new one every morning.
Khoj: {{ Khoj: {{
"Scratchpad": "Shanghai is UTC+8. So, 9:40 in Shanghai is 1:40 UTC. I'll also generalize the search query to get better results.", "crontime": "0 9 * * *",
"Crontime": "40 1 * * *", "query": "/task Share a funny Calvin and Hobbes or Bill Watterson quote from my notes",
"Query": "/task Share a funny Calvin and Hobbes or Bill Watterson quote from my notes." "subject": "Your Calvin and Hobbes Quote for the Day"
}} }}
## Chat History ## Chat History
User: Every Monday evening share the top posts on Hacker News from last week. Format it as a newsletter. My Current Location: Nairobi, Kenya User: Every monday evening at 6 share the top posts on hacker news from last week. Format it as a newsletter
Khoj: {{ Khoj: {{
"Scratchpad": "Nairobi is UTC+3. As evening specified, I'll share at 18:30 your time. Which will be 15:30 UTC.", "crontime": "0 18 * * 1",
"Crontime": "30 15 * * 1", "query": "/task Top posts last week on Hacker News",
"Query": "/task Top posts last week on Hacker News" "subject": "Your Weekly Top Hacker News Posts Newsletter"
}} }}
## Chat History ## Chat History
User: What is the latest version of the Khoj python package? User: What is the latest version of the khoj python package?
AI: The latest released Khoj python package version is 1.5.0. AI: The latest released Khoj python package version is 1.5.0.
User: Notify me when version 2.0.0 is released. My Current Location: Mexico City, Mexico User: Notify me when version 2.0.0 is released
Khoj: {{ Khoj: {{
"Scratchpad": "Mexico City is UTC-6. No time is specified, so I'll notify at 10:00 your time. Which will be 16:00 in UTC. Also I'll ensure the search query doesn't trigger another reminder.", "crontime": "0 10 * * *",
"Crontime": "0 16 * * *", "query": "/task What is the latest released version of the Khoj python package?",
"Query": "/task Check if the latest released version of the Khoj python package is >= 2.0.0?" "subject": "Khoj Python Package Version 2.0.0 Release"
}} }}
## Chat History ## Chat History
User: Tell me the latest local tech news on the first Sunday of every Month. My Current Location: Dublin, Ireland User: Tell me the latest local tech news on the first sunday of every month
Khoj: {{ Khoj: {{
"Scratchpad": "Dublin is UTC+1. So, 10:00 in Dublin is 8:00 UTC. First Sunday of every month is 1-7. Also I'll enhance the search query.", "crontime": "0 8 1-7 * 0",
"Crontime": "0 9 1-7 * 0", "query": "/task Find the latest local tech, AI and engineering news. Format it as a newsletter.",
"Query": "/task Find the latest tech, AI and engineering news from around Dublin, Ireland" "subject": "Your Monthly Dose of Local Tech News"
}} }}
## Chat History ## Chat History
User: Inform me when the national election results are officially declared. Run task at 4pm every thursday. My Current Location: Trichy, India User: Inform me when the national election results are declared. Run task at 4pm every thursday.
Khoj: {{ Khoj: {{
"Scratchpad": "Trichy is UTC+5:30. So, 4pm in Trichy is 10:30 UTC. Also let's add location details to the search query.", "crontime": "0 16 * * 4",
"Crontime": "30 10 * * 4", "query": "/task Check if the Indian national election results are officially declared",
"Query": "/task Check if the Indian national election results are officially declared." "subject": "Indian National Election Results Declared"
}} }}
# Chat History: # Chat History:
{chat_history} {chat_history}
User: {query}. My Current Location: {user_location} User: {query}
Khoj: Khoj:
""".strip() """.strip()
) )
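
The doubled braces in the prompt examples above suggest the template follows Python `str.format` brace rules: `{{` and `}}` render as literal JSON braces, while `{chat_history}` and `{query}` are filled in. A small sketch under that assumption, with a hypothetical, shortened `TEMPLATE` and plain `str.format` standing in for the prompt template class:

```python
# Hypothetical, shortened stand-in for the crontime prompt
TEMPLATE = """
Khoj: {{
    "crontime": "0 9 * * *",
    "query": "/task Share a funny Calvin and Hobbes quote from my notes",
    "subject": "Your Calvin and Hobbes Quote for the Day"
}}
# Chat History:
{chat_history}
User: {query}
Khoj:
""".strip()

# Only the two named placeholders are substituted; the escaped braces become single braces
prompt = TEMPLATE.format(chat_history="", query="Remind me to stretch every morning")
print(prompt)
```

With the location suffix gone, the only inputs the prompt needs are the chat history and the query.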


@@ -399,17 +399,18 @@ async def websocket_endpoint(
q = q.replace(f"/{cmd.value}", "").strip() q = q.replace(f"/{cmd.value}", "").strip()
if ConversationCommand.Reminder in conversation_commands: if ConversationCommand.Reminder in conversation_commands:
crontime, inferred_query = await schedule_query(q, location, meta_log) user_timezone = pytz.timezone(timezone)
crontime, inferred_query, subject = await schedule_query(q, location, meta_log)
try: try:
trigger = CronTrigger.from_crontab(crontime) trigger = CronTrigger.from_crontab(crontime, user_timezone)
except ValueError as e: except ValueError as e:
await send_complete_llm_response(f"Unable to create reminder with crontime schedule: {crontime}") await send_complete_llm_response(f"Unable to create reminder with crontime schedule: {crontime}")
continue continue
# Generate the job id from the hash of inferred_query and crontime # Generate the job id from the hash of inferred_query and crontime
job_id = hashlib.md5(f"{inferred_query}_{crontime}".encode("utf-8")).hexdigest() job_id = f"job_{user.uuid}_" + hashlib.md5(f"{inferred_query}_{crontime}".encode("utf-8")).hexdigest()
query_id = hashlib.md5(f"{inferred_query}".encode("utf-8")).hexdigest() query_id = hashlib.md5(f"{inferred_query}".encode("utf-8")).hexdigest()
partial_scheduled_chat = functools.partial( partial_scheduled_chat = functools.partial(
scheduled_chat, inferred_query, q, websocket.user.object, websocket.url scheduled_chat, inferred_query, q, subject, websocket.user.object, websocket.url
) )
try: try:
job = state.scheduler.add_job( job = state.scheduler.add_job(
@@ -419,7 +420,7 @@ async def websocket_endpoint(
partial_scheduled_chat, partial_scheduled_chat,
f"{ProcessLock.Operation.SCHEDULED_JOB}_{user.uuid}_{query_id}", f"{ProcessLock.Operation.SCHEDULED_JOB}_{user.uuid}_{query_id}",
), ),
id=f"job_{user.uuid}_{job_id}", id=job_id,
name=f"{inferred_query}", name=f"{inferred_query}",
max_instances=2, # Allow second instance to kill any previous instance with stale lock max_instances=2, # Allow second instance to kill any previous instance with stale lock
jitter=30, jitter=30,
@@ -430,17 +431,15 @@ async def websocket_endpoint(
) )
continue continue
# Display next run time in user timezone instead of UTC # Display next run time in user timezone instead of UTC
user_timezone = pytz.timezone(timezone) next_run_time = job.next_run_time.strftime("%Y-%m-%d %H:%M %Z (%z)")
next_run_time_utc = job.next_run_time.replace(tzinfo=pytz.utc)
next_run_time_user_tz = next_run_time_utc.astimezone(user_timezone)
next_run_time = next_run_time_user_tz.strftime("%Y-%m-%d %H:%M %Z (%z)")
# Remove /task prefix from inferred_query # Remove /task prefix from inferred_query
unprefixed_inferred_query = re.sub(r"^\/task\s*", "", inferred_query) unprefixed_inferred_query = re.sub(r"^\/task\s*", "", inferred_query)
# Create the scheduled task response # Create the scheduled task response
llm_response = f""" llm_response = f"""
### 🕒 Scheduled Task ### 🕒 Scheduled Task
- Query: **"{unprefixed_inferred_query}"** - Query: **"{unprefixed_inferred_query}"**
- Schedule: `{crontime}` UTC (+0000) - Subject: **{subject}**
- Schedule: `{crontime}`
- Next Run At: **{next_run_time}**. - Next Run At: **{next_run_time}**.
""".strip() """.strip()
@@ -671,9 +670,10 @@ async def chat(
user_name = await aget_user_name(user) user_name = await aget_user_name(user)
if ConversationCommand.Reminder in conversation_commands: if ConversationCommand.Reminder in conversation_commands:
crontime, inferred_query = await schedule_query(q, location, meta_log) user_timezone = pytz.timezone(timezone)
crontime, inferred_query, subject = await schedule_query(q, location, meta_log)
try: try:
trigger = CronTrigger.from_crontab(crontime) trigger = CronTrigger.from_crontab(crontime, user_timezone)
except ValueError as e: except ValueError as e:
return Response( return Response(
content=f"Unable to create reminder with crontime schedule: {crontime}", content=f"Unable to create reminder with crontime schedule: {crontime}",
@@ -682,15 +682,17 @@ async def chat(
) )
# Generate the job id from the hash of inferred_query and crontime # Generate the job id from the hash of inferred_query and crontime
job_id = hashlib.md5(f"{inferred_query}_{crontime}".encode("utf-8")).hexdigest() job_id = f"job_{user.uuid}_" + hashlib.md5(f"{inferred_query}_{crontime}".encode("utf-8")).hexdigest()
query_id = hashlib.md5(f"{inferred_query}".encode("utf-8")).hexdigest() query_id = hashlib.md5(f"{inferred_query}".encode("utf-8")).hexdigest()
partial_scheduled_chat = functools.partial(scheduled_chat, inferred_query, q, request.user.object, request.url) partial_scheduled_chat = functools.partial(
scheduled_chat, inferred_query, q, subject, request.user.object, request.url
)
try: try:
job = state.scheduler.add_job( job = state.scheduler.add_job(
run_with_process_lock, run_with_process_lock,
trigger=trigger, trigger=trigger,
args=(partial_scheduled_chat, f"{ProcessLock.Operation.SCHEDULED_JOB}_{user.uuid}_{query_id}"), args=(partial_scheduled_chat, f"{ProcessLock.Operation.SCHEDULED_JOB}_{user.uuid}_{query_id}"),
id=f"job_{user.uuid}_{job_id}", id=job_id,
name=f"{inferred_query}", name=f"{inferred_query}",
max_instances=2, # Allow second instance to kill any previous instance with stale lock max_instances=2, # Allow second instance to kill any previous instance with stale lock
jitter=30, jitter=30,
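
The job id change in the hunks above can be read as building the full, user-scoped id once and reusing it. A sketch of that idea, where `make_job_id` is a hypothetical helper name and the id layout is taken from the diff:

```python
import hashlib


def make_job_id(user_uuid: str, inferred_query: str, crontime: str) -> str:
    """Build a deterministic, user-scoped job id from the inferred query and its schedule."""
    digest = hashlib.md5(f"{inferred_query}_{crontime}".encode("utf-8")).hexdigest()
    return f"job_{user_uuid}_{digest}"


# Example values are made up for illustration
job_id = make_job_id("1234", "/task Top posts last week on Hacker News", "0 18 * * 1")
print(job_id)
```

The same query on the same schedule for the same user maps to the same id, while a different schedule or user yields a different one.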
@@ -701,19 +703,16 @@ async def chat(
media_type="text/plain", media_type="text/plain",
status_code=500, status_code=500,
) )
# Display next run time in user timezone instead of UTC # Display next run time in user timezone instead of UTC
user_timezone = pytz.timezone(timezone) next_run_time = job.next_run_time.strftime("%Y-%m-%d %H:%M %Z (%z)")
next_run_time_utc = job.next_run_time.replace(tzinfo=pytz.utc)
next_run_time_user_tz = next_run_time_utc.astimezone(user_timezone)
next_run_time = next_run_time_user_tz.strftime("%Y-%m-%d %H:%M %Z (%z)")
# Remove /task prefix from inferred_query # Remove /task prefix from inferred_query
unprefixed_inferred_query = re.sub(r"^\/task\s*", "", inferred_query) unprefixed_inferred_query = re.sub(r"^\/task\s*", "", inferred_query)
# Create the scheduled task response # Create the scheduled task response
llm_response = f""" llm_response = f"""
### 🕒 Scheduled Task ### 🕒 Scheduled Task
- Query: **"{unprefixed_inferred_query}"** - Query: **"{unprefixed_inferred_query}"**
- Schedule: `{crontime}` UTC (+0000) - Subject: **{subject}**
- Schedule: `{crontime}`
- Next Run At: **{next_run_time}**.' - Next Run At: **{next_run_time}**.'
""".strip() """.strip()


@@ -50,7 +50,7 @@ def send_welcome_email(name, email):
) )
def send_task_email(name, email, query, result): def send_task_email(name, email, query, result, subject):
if not is_resend_enabled(): if not is_resend_enabled():
logger.debug("Email sending disabled") logger.debug("Email sending disabled")
return return
@@ -60,13 +60,11 @@ def send_task_email(name, email, query, result):
html_result = markdown_it.MarkdownIt().render(result) html_result = markdown_it.MarkdownIt().render(result)
html_content = template.render(name=name, query=query, result=html_result) html_content = template.render(name=name, query=query, result=html_result)
query_for_subject_line = query.replace("\n", " ").replace('"', "").replace("'", "")
r = resend.Emails.send( r = resend.Emails.send(
{ {
"from": "Khoj <khoj@khoj.dev>", "from": "Khoj <khoj@khoj.dev>",
"to": email, "to": email,
"subject": f'✨ Your Task Results for "{query_for_subject_line}"', "subject": f"{subject}",
"html": html_content, "html": html_content,
} }
) )


@@ -332,14 +332,10 @@ async def schedule_query(q: str, location_data: LocationData, conversation_histo
""" """
Schedule the date, time to run the query. Assume the server timezone is UTC. Schedule the date, time to run the query. Assume the server timezone is UTC.
""" """
user_location = (
f"{location_data.city}, {location_data.region}, {location_data.country}" if location_data else "Greenwich"
)
chat_history = construct_chat_history(conversation_history) chat_history = construct_chat_history(conversation_history)
crontime_prompt = prompts.crontime_prompt.format( crontime_prompt = prompts.crontime_prompt.format(
query=q, query=q,
user_location=user_location,
chat_history=chat_history, chat_history=chat_history,
) )
@@ -351,7 +347,7 @@ async def schedule_query(q: str, location_data: LocationData, conversation_histo
response: Dict[str, str] = json.loads(raw_response) response: Dict[str, str] = json.loads(raw_response)
if not response or not isinstance(response, Dict) or len(response) != 3: if not response or not isinstance(response, Dict) or len(response) != 3:
raise AssertionError(f"Invalid response for scheduling query : {response}") raise AssertionError(f"Invalid response for scheduling query : {response}")
return tuple(response.values())[1:] return response.get("crontime"), response.get("query"), response.get("subject")
except Exception: except Exception:
raise AssertionError(f"Invalid response for scheduling query: {raw_response}") raise AssertionError(f"Invalid response for scheduling query: {raw_response}")
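
Reading the fields by key, as the new return statement does, no longer depends on the order in which the model emits them. A hedged sketch of a slightly stricter variant that also rejects missing or empty fields; `parse_schedule_response` and `REQUIRED_KEYS` are hypothetical names, not part of the commit:

```python
import json
from typing import Tuple

REQUIRED_KEYS = ("crontime", "query", "subject")


def parse_schedule_response(raw_response: str) -> Tuple[str, ...]:
    """Parse the model's JSON reply and return (crontime, query, subject) by key."""
    try:
        response = json.loads(raw_response)
    except json.JSONDecodeError as e:
        raise ValueError(f"Invalid response for scheduling query: {raw_response}") from e
    # Reject non-objects and replies with a missing or empty required field
    if not isinstance(response, dict) or any(not response.get(key) for key in REQUIRED_KEYS):
        raise ValueError(f"Invalid response for scheduling query: {raw_response}")
    return tuple(response[key] for key in REQUIRED_KEYS)


# Key order in the reply does not matter
raw = '{"subject": "Your Weekly Top Hacker News Posts Newsletter", "query": "/task Top posts last week on Hacker News", "crontime": "0 18 * * 1"}'
crontime, query, subject = parse_schedule_response(raw)
print(crontime, query, subject, sep=" | ")
```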
@@ -871,7 +867,7 @@ def should_notify(original_query: str, executed_query: str, ai_response: str) ->
return True return True
def scheduled_chat(executing_query: str, scheduling_query: str, user: KhojUser, calling_url: URL): def scheduled_chat(executing_query: str, scheduling_query: str, subject: str, user: KhojUser, calling_url: URL):
# Extract relevant params from the original URL # Extract relevant params from the original URL
scheme = "http" if not calling_url.is_secure else "https" scheme = "http" if not calling_url.is_secure else "https"
query_dict = parse_qs(calling_url.query) query_dict = parse_qs(calling_url.query)
@@ -913,6 +909,6 @@ def scheduled_chat(executing_query: str, scheduling_query: str, user: KhojUser,
# Notify user if the AI response is satisfactory # Notify user if the AI response is satisfactory
if should_notify(original_query=scheduling_query, executed_query=cleaned_query, ai_response=ai_response): if should_notify(original_query=scheduling_query, executed_query=cleaned_query, ai_response=ai_response):
if is_resend_enabled(): if is_resend_enabled():
send_task_email(user.get_short_name(), user.email, scheduling_query, ai_response) send_task_email(user.get_short_name(), user.email, scheduling_query, ai_response, subject)
else: else:
return raw_response return raw_response