Commit Graph

5096 Commits

Author SHA1 Message Date
Debanjum
d42176fa7e Drop tool call, result without tool id on call to Anthropic, Openai APIs 2025-07-11 00:00:05 -07:00
Debanjum
d27aac7f13 Suppress non-actionable pdf indexing warning from logs 2025-07-11 00:00:05 -07:00
Debanjum
05176cd62b Log dropping messages with invalid content as warnings, not errors
They are expected when conversation got interrupted.
2025-07-11 00:00:05 -07:00
Debanjum
b2952236c4 Log conversation id to help troubleshoot errors faster 2025-07-10 23:56:42 -07:00
Debanjum
25db59e49c Fix to return openai formatted messages in the correct order
We'd reversed the formatting of openai messages to drop invalid
messages without affecting the other messages being appended . But we
need to reverse the final formatted list to return in the right order.
2025-07-10 23:56:22 -07:00
Debanjum
c8ec29551f Drop invalid messages in reverse order to continue interrupted chats
Previously
- message with invalid content were getting dropped in normal order
  which would change the item index being iterated for gemini and
  anthropic models
- messages with empty content weren't getting dropped for openai
  compatible api models. While openai api is resilient to this, it's
  better to drop these invalid messages as other openai compatible
  APIs may not handle this.

We see messages with empty or no content when chat gets interrupted
due to disconnections, interrupt messages or explicit aborts by user.

This changes should now drop invalid messages and not mess formatting
of the other messages in a conversation. It should allow continuing
interrupted conversations with any ai model.
2025-07-10 22:39:52 -07:00
Debanjum
f1a3ddf2ca Release Khoj version 2.0.0-beta.6 2025-07-10 13:41:06 -07:00
Debanjum
7b637d3432 Use document style ux when print conversations to pdf
Inspired by my previous turnstyle ux explorations.

But basically user message becomes section title and khoj message
becomes section body with the timestamp being used a section title,
body divider.
2025-07-10 13:27:04 -07:00
Debanjum
c28e90f388 Revert to use standard 1.0 temperature for gemini models
Using temp of 1.2 didn't help eliminate the repetition loops the
gemini models go into sometimes.
2025-07-09 18:22:05 -07:00
Debanjum
b763dbfb2b Timeout web search and webpage read requests to providers 2025-07-09 18:12:07 -07:00
Debanjum
1988a8d023 Fix to delete agent by slug in DB via API 2025-07-09 18:12:07 -07:00
Debanjum
69336565b1 Do not show research mode tools as slash commands options on clients
These are tools meant for the research agent, not for users to use.
2025-07-09 18:12:07 -07:00
Debanjum
cc6da4c440 Drop unsupported additionalProperties field from gemini tool definitions 2025-07-09 18:06:40 -07:00
Debanjum
3141035f48 Handle unexpected chunks streamed from Openai (compatible) APIs 2025-07-09 17:54:42 -07:00
Debanjum
a601cca79b Handle cases where no organic online search results found
Previous organic results enumerator only handled the scenario where
organic key wasn't present in online search results.

It did not handle the case where there were no organic online search
results.
2025-07-09 00:25:10 -07:00
Debanjum
f2b86aa7c8 Release Khoj version 2.0.0-beta.5 2025-07-08 23:45:29 -07:00
Debanjum
0f0cfba624 Ignore vscode settings.json from pre-commit json check
Vscode settings.json follows jsonc (json with comments) format
2025-07-08 23:27:10 -07:00
Debanjum
f0513cbbb1 Fix to run new automation api tests in ci 2025-07-08 23:27:09 -07:00
Debanjum
c144aa9c90 Handle automation calling url of both url and string type
Calling url can be of url type in production but locally it is of
string type. Unclear why. But this change should mitigate the issue
for now.
2025-07-08 21:10:54 -07:00
Debanjum
fad6a638bd Release Khoj version 2.0.0-beta.4 2025-07-08 19:34:52 -07:00
Debanjum
8d9e75f580 Fix automation url parsing, response handling. Test automations api
- Methods calling send_message_to_model_wrapper_sync handn't been
  update to handle the function returning new ResponseWithThought
- Store, load request.url to DB as, from string to avoid serialization
  issues
2025-07-08 17:40:19 -07:00
Debanjum
254207b010 Make chats print friendly to share via print to PDF etc. from browser
Add print specific styling to hide side panels and chat input footers.
Add heading with khoj logo, conversation title, agent and date.
2025-07-08 12:20:19 -07:00
Debanjum
8fb38d9e1e Make vscode pylint only analyse khoj server directory for efficiency 2025-07-08 12:20:19 -07:00
Debanjum
9a215141f0 Release Khoj version 2.0.0-beta.3 2025-07-06 12:50:27 -07:00
Debanjum
da9a78e79b Make URI field optional for now to handle previously saved documents
For files not synced after the previous release, context uri is unset.
This results in failure to save chat messages that retrieve documents
as the uri field cannot be unset so pre save validation fails.

We'd use a db migration to handle this but this is a quick mitigation
for now.
2025-07-06 12:49:07 -07:00
Debanjum
bc6bbb4c96 Release Khoj version 2.0.0-beta.2 2025-07-06 12:10:32 -07:00
Debanjum
4c33d1a526 Fallback to file based URI when document context URI is unset
For files not synced after the previous release, context uri is unset.
This results in failure to save chat messages that retrieve documents
as the uri field cannot be unset so pre save validation fails
2025-07-06 12:06:02 -07:00
Debanjum
9dc146bb08 Bump rapidocr dependency version 2025-07-06 12:06:02 -07:00
Debanjum
b27ba1d24b Early init chat_history in chat api to avoid unbound in edge case
Monitor disconnect can trigger earlier than chat history is
initialized. This can cause unbound chat history exception.
2025-07-06 12:06:02 -07:00
Debanjum
8cd2a1a961 Release Khoj version 2.0.0-beta.1 2025-07-06 10:39:56 -07:00
Debanjum
6bda8dc20b Fix file upload size limit client test after max upload bump 2025-07-06 10:24:44 -07:00
Debanjum
531ae80212 Allow 50mb knowledge base size on free tier 2025-07-06 09:56:29 -07:00
Debanjum
58f44ad43b Make release/1.x a privileged branch to run workflows, create releases
It'll work similar to the master branch but with pre-1x and latest-1x
tagged series of docker images.

This should ease deployment changes from 1.x vs 2.x series
2025-07-06 09:28:16 -07:00
Debanjum
2daf396cbb Bump pre-release version in SemVer schema via bump_version script 2025-07-06 09:28:16 -07:00
Debanjum
afa810e552 Retry api calls to gemini on network read error 2025-07-06 09:27:54 -07:00
Debanjum
2ec39d295d Add Deeplinks to Improve Context for Document Retrieval (#1206)
## Overview
Show deep link URI and raw document context to provide deeper, richer
context to Khoj. This should allow it better combine semantic search
with other new document retrieval tools like line range based file
viewer and regex tools added in #1205

## Details
- Attach line number based deeplinks to each indexed document entry
Document URI follows URL fragment based schema of form
`file:///path/to/file.txt#line=123`
- Show raw indexed document entries with deep links to LLM when it uses
the semantic search tool
- Reduce structural changes to raw org-mode entries for easier deep
linking.
2025-07-03 20:05:04 -07:00
Debanjum
5010623a0a Deep link to markdown entries by line number in uri
Use url fragment schema for deep link URIs, borrowing from URL/PDF
schemas. E.g file:///path/to/file.txt#line=<line_no>&#page=<page_no>

Compute line number during (recursive) markdown entry chunking.

Test line number in URI maps to line number of chunk in actual md file.

This deeplink URI with line number is passed to llm as context to
better combine with line range based view file tool.

Grep tool already passed matching line number. This change passes
line number in URIs of markdown entries matched by the semantic search
tool.
2025-07-03 19:27:57 -07:00
Debanjum
dcfa4288c4 Deep link to org-mode entries. Deep link by line number in uri
Use url fragment schema for deep link URIs, borrowing from URL/PDF
schemas. E.g file:///path/to/file.txt#line=<line_no>&#page=<page_no>

Compute line number during (recursive) org-mode entry chunking.

Thoroughly test line number in URI maps to line number of chunk in
actual org mode file.

This deeplink URI with line number is passed to llm as context to
better combine with line range based view file tool.

Grep tool already passed matching line number. This change passes
line number in URIs of org entries matched by the semantic search tool
2025-07-03 17:38:34 -07:00
Debanjum
e90ab5341a Add context uri field to deeplink line number in original doc 2025-07-03 17:38:34 -07:00
Debanjum
820b4523fd Show raw rather than compiled entry to llm and users
Only embedding models see, operate on compiled text.

LLMs should see raw entry to improve combining it with other document
traversal tools for better regex and line matching.

Users see raw entry for better matching with their actual notes.
2025-07-03 17:38:34 -07:00
Debanjum
5c4d41d300 Reduce structural changes to indexed raw org mode entries
Reduce structural changes to raw entry allows better deep-linking and
re-annotation. Currently done via line number in new uri field.

Only add properties drawer to raw entry if entry has properties
Previously line and source properties were inserted into raw entries.
This isn't done anymore. Line, source are deprecated for use in khoj.el.
2025-07-03 17:38:31 -07:00
sabaimran
870d9d851a Only handle Stripe webhooks meant for the KHOJ_CLOUD product 2025-07-03 17:02:49 -07:00
Debanjum
fe44cd3c59 Upgrade Retrieval from KB in Research Mode. Use Function Calling for Tool Use (#1205)
## Why
Move to function calling paradigm to give models tool call -> tool
result in formats they're fine-tuned to understand. Previously we were
giving them results in our specific format (as function calling paradigm
wasn't well-established yet).

And improve prompt cache hits by caching tool definitions.

This is a **breaking change**. AI Models and APIs that do not support
function calling will not work with Khoj in research mode. Function
calling is supported by:
- Standard commercial AI Models and APIs like Anthropic, Gemini, OpenAI,
OpenRouter
- Standard open-source AI APIs like llama.cpp server, Ollama
- Standard open source models like Qwen, DeepSeek, Gemma, Llama, Mistral

## What
### Use Function Calling for Tool Use
- Add Function Calling support to Anthropic, Gemini, OpenAI AI Model
APIs
- Move Existing Research Mode Tools to Use Function Calling

### Get More Comprehensive Results from your Knowledge Base (KB)
- Give Research Agent better Document Retrieval Tools
  - Add grep files tool to enable researcher to find documents by regex
  - Add list files tool to enable researcher to find documents by path
  - Add file viewer tool to enable researcher to read documents

### Miscellaneous
- Improve Research Prompt, Truncation, Retry and Caching
- Show reasoning model thoughts in Khoj train of thought for
intermediate steps as well
2025-07-03 00:14:07 -07:00
Debanjum
f343a92b1d Give research tools better, consistent names for balanced usage 2025-07-02 23:32:44 -07:00
Debanjum
aa081913bf Improve truncation with tool use and Anthropic caching
- Cache last anthropic message. Given research mode now uses function
  calling paradigm and not the old research mode structure.
- Cache tool definitions passed to anthropic models
- Stop dropping first message if by assistant as seems like Anthropic
  API doesn't complain about it any more.

- Drop tool result when tool call is truncated as invalid state
- Do not truncate tool use message content, just drop the whole tool
  use message.

  AI model APIs need tool use assistant message content in specific
  form (e.g with thinking etc.). So dropping content items breaks
  expected tool use message content format.

Handle tool use scenarios where iteration query isn't set for retry
2025-07-02 23:32:44 -07:00
Debanjum
786b06bb3f Handle failed llm calls, message idempotency to improve retry success
- Deepcopy messages before formatting message for Anthropic to allow
  idempotency so retry on failure behaves as expected
- Handle failed calls to pick next tools to pass failure warning and
  continue next research iteration. Previously if API call to pick
  next failed, the research run would crash
- Add null response check for when Gemini models fail to respond
2025-07-02 23:32:30 -07:00
Debanjum
30878a2fed Show thoughts and text response in thoughts on anthropic tool use
Previously if anthropic models were using tools, the models text
response accompanying the tool use wouldn't be shown as they were
overwritten in aggregated response with the tool call json.

This changes appends the text response to the thoughts portion on tool
use to still show model's thinking. Thinking and text response are
delineated by italics vs normal text for such cases.
2025-07-02 20:48:24 -07:00
Debanjum
c2ab75efef Track, reuse raw model response for multi-turn conversations
This should avoid the need to reformat the Khoj standardized tool call
for cache hits and satisfying ai model api requirements.

Previously multi-turn tool use calls to anthropic reasoning models
would fail as needed their thoughts to be passed back. Other AI model
providers can have other requirements.

Passing back the raw response as is should satisfy the default case.

Tracking raw response should make it easy to apply any formatting
required before sending previous response back, if any ai model
provider requires that.

Details
---
- Raw response content is passed back in ResponseWithThoughts.
- Research iteration stores this and puts it into model response
  ChatMessageModel when constructing iteration history when it is
  present.
  Fallback to using parsed tool call when raw response isn't present.
- No need to format tool call messages for anthropic models as we're
  passing the raw response as is.
2025-07-02 20:48:24 -07:00
Debanjum
7cd496ac19 Frame research prompt as accomplish task instead of answer question
Researcher is expanding into accomplish task behavior, especially with
tool use from the previous collect information to answer user query
behavior.

Update the researcher's system prompt to reflect the new objective better.
Encourage model to not stop working on task until achieve objective
2025-07-02 20:48:24 -07:00
Debanjum
4e67ba4d6c Support seeing lines around regex match with grep files tool
Let research agent see lines surrounding regex matched lines when
using grep files tool to improve document retrieval quality
2025-07-02 20:48:24 -07:00