Commit Graph

5081 Commits

Author SHA1 Message Date
Debanjum
d57c597245 Refactor count_tokens, get_encoder methods to utils/helper.py
Simplify get_encoder to not rely on global state. The caching
simplification is not necessary for now.
2025-11-16 10:50:30 -08:00
Debanjum
15482c54b5 Fix type of count total tokens system_message argument 2025-11-16 10:50:30 -08:00
Debanjum
761af5f98c Drop spurious results close xml tag in research shared with llm 2025-11-16 10:50:30 -08:00
Debanjum
1f3c1e1221 Remove spurious comments in desktop app chatutils.js 2025-11-16 10:50:30 -08:00
Debanjum
8490f2826b Reduce evaluator llm verbosity during eval 2025-11-16 10:50:30 -08:00
Debanjum
630ce77b5f Improve agent update/create safety check. Make reason field optional
Issue
---
When agent personality/instructions are safe, we do not require the
safety agent to give a reason. The safety check agent was told this in
the prompt but it was not reflected in the json schema being used.

The latest openai library started throwing an error if the response
doesn't match the requested json schema.

This broke creating/updating agents when using openai models as safety
agent.

Fix
---
Make reason field optional.

Also put send_message_to_model_wrapper in try/catch for more readable
error stacktrace.
2025-11-14 06:47:51 -08:00
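The schema change above can be sketched as below. The schema shape and helper are illustrative, not Khoj's actual code; under strict structured outputs, "optional" is expressed by making the field nullable:

```python
# Hypothetical json schema for the safety check response. Making
# `reason` nullable lets the safety model omit a reason for safe
# prompts without the response failing the openai library's schema
# validation.
safety_schema = {
    "type": "object",
    "properties": {
        "safe": {"type": "boolean"},
        "reason": {"type": ["string", "null"]},  # was: {"type": "string"}
    },
    "required": ["safe", "reason"],  # strict mode lists every property
    "additionalProperties": False,
}

def is_valid(response: dict) -> bool:
    """Minimal check of the property that broke: a null reason is fine."""
    reason = response.get("reason")
    return isinstance(response.get("safe"), bool) and (
        reason is None or isinstance(reason, str)
    )

print(is_valid({"safe": True, "reason": None}))  # True
print(is_valid({"safe": False, "reason": "violent instructions"}))  # True
```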
Debanjum
cbeb220f00 Show validation errors in UX if agent creation, update fails
Previously we only showed unsafe prompt errors to the user when
creating/updating an agent. Name collision errors were not shown in
the web app UX.

This change ensures that such validation errors are bubbled up to the
user in the UX. So they can resolve the agent create/update error on
their end.
2025-11-12 10:51:07 -08:00
Debanjum
d7e936678d Release Khoj version 2.0.0-beta.17 2025-11-11 16:38:46 -08:00
Debanjum
4556773f42 Improve support for new kimi k2 thinking model
Recognize thinking by kimi k2 thinking model in <think> xml blocks
2025-11-11 16:20:41 -08:00
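A minimal sketch of separating `<think>` blocks from the final answer, assuming the thinking arrives inline in the response text as Kimi K2 emits it; the function name is illustrative, not Khoj's actual helper:

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Separate <think>...</think> reasoning from the final answer."""
    # DOTALL so multi-line thinking blocks are matched too
    thoughts = "\n".join(re.findall(r"<think>(.*?)</think>", response, re.DOTALL))
    answer = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
    return thoughts.strip(), answer

raw = "<think>2 + 2 is 4</think>The answer is 4."
thoughts, answer = split_thinking(raw)
print(thoughts)  # 2 + 2 is 4
print(answer)    # The answer is 4.
```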
Debanjum
2c54a2cd10 Improve web browsing train of thought status text shown on web app 2025-11-11 16:14:27 -08:00
Debanjum
aab0653025 Add price for grok 4, grok 4 fast for cost estimation 2025-11-11 16:14:19 -08:00
Debanjum
b14e6eb069 Count cache, reasoning tokens to estimate cost for models served over openai api
Count cached tokens, reasoning tokens for better cost estimates for
models served over an openai compatible api. Previously we didn't
include cached token or reasoning tokens in costing.
2025-11-11 16:12:48 -08:00
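The costing idea can be sketched as below. The usage-dict shape follows the OpenAI API's `usage` field, where cached input tokens are billed at a discount; the helper name and prices are illustrative:

```python
def estimate_cost(usage: dict, input_price: float, output_price: float,
                  cached_price: float) -> float:
    """Estimate request cost in USD from an openai-style usage dict.

    Prices are per million tokens. Cached input tokens are billed at a
    discounted rate, so ignoring them overstates cost; on the openai
    api, reasoning tokens are included in completion_tokens, so output
    cost covers them.
    """
    cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    uncached = usage["prompt_tokens"] - cached
    output = usage["completion_tokens"]
    return (uncached * input_price + cached * cached_price
            + output * output_price) / 1e6

usage = {
    "prompt_tokens": 1000,
    "completion_tokens": 200,
    "prompt_tokens_details": {"cached_tokens": 600},
}
# 400 uncached * 1.0 + 600 cached * 0.25 + 200 output * 4.0 = 1350 / 1e6
print(estimate_cost(usage, input_price=1.0, output_price=4.0,
                    cached_price=0.25))  # 0.00135
```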
Debanjum
ce6d75e5a2 Support Exa as web search provider 2025-11-11 16:12:48 -08:00
Debanjum
5760f3b534 Drop support for web search, read using Jina as provider
There are faster, better web search and webpage read providers. Only keep
reasonable quality online context providers.

Jina was good for self-hosting quickstart as it provided a free api
key without login. It does not provide that now. Its latencies are
pretty high vs other online context providers.
2025-11-11 16:12:48 -08:00
Debanjum
c022e7d553 Upgrade Anthropic Operator editor version 2025-11-11 16:12:48 -08:00
Debanjum
88a1fc75cc Track cost of claude haiku 4.5 model 2025-11-11 16:12:48 -08:00
Debanjum
a809de8970 Handle skip indexing of unsupported image files
Previously unsupported image file types would trigger an unbound local
variable error.
2025-11-11 16:09:08 -08:00
Debanjum
749bbed23d Track cost of claude sonnet 4.5 models 2025-11-11 16:09:04 -08:00
Debanjum
69cceda9ab Bump server dependencies 2025-11-11 16:08:38 -08:00
Debanjum
140a3ef943 Avoid unbound chunk variable error in ai api call from completion func 2025-11-11 16:08:38 -08:00
Debanjum
f2e0b62217 Remove unused default source from default tool picker prompt, help msg 2025-11-11 16:08:37 -08:00
Debanjum
5ef3a3f027 Remove unused eval workflow config to auto read webpage in default mode 2025-09-16 14:55:06 +05:30
Debanjum
6ac2280e41 Release Khoj version 2.0.0-beta.16 2025-09-16 14:47:14 +05:30
Debanjum
1179a4c8f8 Update dev docs to suggest using bun instead of yarn for web app
Resolves #1218
2025-09-16 14:13:34 +05:30
Debanjum
51e5c86fcc Bump desktop app dependencies 2025-09-16 14:03:58 +05:30
Debanjum
534ee32664 Bump web app dependencies 2025-09-16 14:03:58 +05:30
Debanjum
e854c1a5a8 Bump django, langchain python server dependencies 2025-09-16 14:03:58 +05:30
Debanjum
2fdb1fcc93 Remove unsupported tool schema fields minimum, maximum for groq api
Groq API has stopped supporting the minimum and maximum items fields
in tool schemas. This unexpectedly broke AI models served via the
Groq API, like Kimi K2 and GPT-OSS, in research mode.

Improve typing of relevant fields
2025-09-16 14:03:29 +05:30
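A minimal recursive scrub of rejected schema fields, assuming the tool schema is a plain json-style dict; the function name and the exact set of stripped fields are illustrative:

```python
def strip_fields(schema, unsupported=("minimum", "maximum",
                                      "minItems", "maxItems")):
    """Recursively drop schema fields a provider's api rejects."""
    if isinstance(schema, dict):
        return {key: strip_fields(value, unsupported)
                for key, value in schema.items() if key not in unsupported}
    if isinstance(schema, list):
        return [strip_fields(item, unsupported) for item in schema]
    return schema

tool = {"type": "array", "minItems": 1,
        "items": {"type": "integer", "minimum": 0}}
print(strip_fields(tool))  # {'type': 'array', 'items': {'type': 'integer'}}
```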
Debanjum
3e699e5476 Fix date time rendering when printing conversation on web app 2025-09-01 07:09:32 -07:00
Debanjum
0bd4bf182c Show shared chats without login popup shown to unauthenticated users
The login popup is an unnecessary distraction as you do not need
to be logged in to view shared chats.
2025-08-31 23:40:09 -07:00
Debanjum
52b1928023 Make gpqa answer evaluator more versatile at extracting mcq answers 2025-08-31 23:40:09 -07:00
Debanjum
703e189979 Deterministically shuffle dataset for consistent data in an eval run
Previously eval run across modes would use different dataset shuffles.

This change enables a strict apples to apples perf comparison of the
different khoj modes across the same (random) subset of questions by
using a dataset seed per workflow run to sample questions
2025-08-31 23:40:08 -07:00
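The seeded sampling idea can be sketched as follows; the function and the `EVAL_SEED` env var are hypothetical names for illustration:

```python
import os
import random

def sample_questions(dataset: list, n: int, seed: int) -> list:
    """Deterministically shuffle and sample, so every eval mode in a
    workflow run sees the same (random) subset of questions."""
    rng = random.Random(seed)  # isolated rng, avoids global random state
    shuffled = dataset[:]      # copy, so the input dataset isn't mutated
    rng.shuffle(shuffled)
    return shuffled[:n]

dataset = [f"q{i}" for i in range(100)]
seed = int(os.getenv("EVAL_SEED", "42"))  # hypothetical per-run seed
run_a = sample_questions(dataset, 10, seed)
run_b = sample_questions(dataset, 10, seed)
assert run_a == run_b  # apples-to-apples subset across modes
```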
Debanjum
edf9ea6312 Release Khoj version 2.0.0-beta.15 2025-08-31 13:21:58 -07:00
Debanjum
d53ede604c Only enable web search with Searxng if KHOJ_SEARXNG_URL env var set
Instead of implicitly defaulting to assuming it is available, as:
- For pip installs, searxng has to be explicitly set up to work
- For docker installs, we already explicitly set it up and set the
  KHOJ_SEARXNG_URL env var

Also check that the Searxng URL is unset before disabling web search
tools, now that explicit enablement is required.
2025-08-31 13:17:05 -07:00
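The gating described above amounts to a plain env check; `is_searxng_enabled` is an illustrative name, while `KHOJ_SEARXNG_URL` is the env var named in the commit:

```python
import os

def is_searxng_enabled() -> bool:
    """Web search via Searxng is enabled only when the admin has
    explicitly set KHOJ_SEARXNG_URL, instead of assuming a default
    instance is available."""
    return bool(os.getenv("KHOJ_SEARXNG_URL"))

os.environ.pop("KHOJ_SEARXNG_URL", None)
print(is_searxng_enabled())  # False: nothing set, web search stays off

os.environ["KHOJ_SEARXNG_URL"] = "http://localhost:8080"
print(is_searxng_enabled())  # True: explicitly enabled
```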
Debanjum
7533e3eecf Use prompt cache key to improve cache hits with openai responses api
Using a prompt cache key enables sticky routing to openai servers.
This increases the probability of a chat actor hitting the same
server and reusing cached prompts.

We use a stable hash of the first N characters to uniquely identify a
chat actor prompt
2025-08-31 12:44:38 -07:00
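The key derivation can be sketched like this; the function name and the choice of N are illustrative, and the key would be sent as the request's prompt cache key field:

```python
import hashlib

def prompt_cache_key(system_prompt: str, n: int = 512) -> str:
    """Stable cache key from the first n chars of a chat actor prompt.

    Identical prefixes hash to the same key, so openai can route the
    request to a server that already has that prompt prefix cached.
    """
    prefix = system_prompt[:n]
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()[:32]

base = "You are Khoj, a helpful assistant..."
key1 = prompt_cache_key(base + "x" * 1000)
key2 = prompt_cache_key(base + "x" * 2000)
assert key1 == key2  # content beyond the first n chars doesn't matter
```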
Debanjum
3c1948e9de Disable code sandbox if no code sandbox configured by admin
Either set the Terrarium sandbox url or the E2B api key to enable code
sandbox
2025-08-31 10:15:14 -07:00
Debanjum
3441783d5b Disable web search tool if no search engine configured by admin
Webpage read is gated behind having a web search engine configured for
now. It can later be decoupled from web search and depend on whether
any web scraper is configured.
2025-08-30 00:22:26 -07:00
Debanjum
3aa6f8ba1f Add gemini cached tokens costs for more accurate cost tracking 2025-08-29 15:55:07 -07:00
Debanjum
0babab580a Avoid null ref error when no organic online search results found 2025-08-29 15:54:06 -07:00
Debanjum
00f0d23224 Fix indexing Github, Notion content by linking embeddings model on init 2025-08-29 15:54:06 -07:00
Debanjum
81c651b5b2 Fix truncation tests to check output chat history for truncation
The new truncation logic returns a new message list.
It does not update message list by reference/in place since 8a16f5a2a.
So truncation tests should run verification on the truncated chat
history returned by the truncation func instead of the original chat
history passed into the truncation func.
2025-08-28 15:50:32 -07:00
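The behavioral change the tests had to follow can be illustrated with a toy truncation func (names and the keep-last-n policy are hypothetical; the real logic is more involved):

```python
def truncate_messages(messages: list, max_messages: int) -> list:
    """Return a new, truncated message list; the input is not mutated."""
    return messages[-max_messages:]

history = ["sys", "u1", "a1", "u2", "a2"]
truncated = truncate_messages(history, 2)

# Verify against the returned list, not the original history...
assert truncated == ["u2", "a2"]
# ...because the original is untouched, so asserting on it would
# wrongly conclude that no truncation happened.
assert len(history) == 5
```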
Debanjum
c0f192b436 Set minimum table width on web app for better readability 2025-08-28 01:58:04 -07:00
Debanjum
dd8e805cfe Add support for Cerebras ai model api
- It does not support strict mode for json schema, tool use
- It likes text content to be a plain string, not nested in a dictionary
- Verified to work with gpt oss models on cerebras
2025-08-28 01:57:39 -07:00
Debanjum
0a5a882e54 Check if openai compatible ai api supports the responses api endpoint
Responses API is starting to get supported by other ai apis as well.
This change does preparatory improvements to ease moving to use
responses api with other ai apis.

Use the new, better named `supports_responses_api` method.
The method currently just maps to `is_openai_api`. It will cover
other AI APIs once support for using the responses api with them is
added.
2025-08-28 01:38:47 -07:00
Debanjum
9395c17f34 Fix openai reasoning model handling
- Fix identifying gpt-oss as openai reasoning model
- Drop unsupported stop param for openai reasoning models
- Drop the Formatting re-enabled logic for openai reasoning-only models
  We use the responses api for openai models, and the latest openai
  models are hybrid models; they don't seem to need this convoluted
  system message to format responses as markdown
2025-08-28 01:38:47 -07:00
Debanjum
be79b8a633 Drop unused arguments to default tool picker, research mode
is_automated_task check isn't required as automation cannot be created
via chat anymore.

conversation specific file_filters are extracted directly in document
search, so they don't need to be passed down from the chat api endpoint
2025-08-27 14:37:28 -07:00
Debanjum
9d7adbcbaa Pass user attached images to default tool picker for informed selection
Previously we were just passing placeholder informing the default mode
tool picker that images were attached.
2025-08-27 14:37:10 -07:00
Debanjum
2091044db5 Prefer agent chat model to extract document search queries
Make chat model preference order for document search consistent with
all other tools.
2025-08-27 13:55:50 -07:00
Debanjum
7a42042488 Share context builder for chat final response across model types
The context building logic was nearly identical across all model
types.

This change extracts that logic into a shared function and calls it
once in the `agenerate_chat_response', the entrypoint to the converse
methods for all 3 model types.

Main differences handled are
- Gemini system prompt had additional verbosity instructions. Keep it
- Pass system message via chatml messages list to anthropic, gemini
  models as well (like openai models) instead of passing it as
  separate arg to chat_completion_* funcs.

  The model specific message formatters for both already extract
  system instruction from the messages list. So system messages will
  be automatically extracted in the chat_completion_* funcs and
  passed as the separate arg required by the anthropic, gemini api
  libraries.
2025-08-27 13:48:33 -07:00
Debanjum
02e220f5f5 Pass args to context builder funcs grouped consistently
Put context params together, followed by model params
Use consistent ordering to improve readability
2025-08-27 13:45:36 -07:00