Commit Graph

5109 Commits

Author SHA1 Message Date
Debanjum
bdf9afa726 Set openai api output tokens high by default to not hit length limits
Explicitly set completion tokens high to avoid early termination
issues, especially when trying to generate structured responses.
2025-12-07 20:23:58 -08:00
Debanjum
5b6dab1627 Use consistent summarizer result for failed research iteration 2025-12-05 10:02:08 -08:00
Debanjum
3941159bd6 Release Khoj version 2.0.0-beta.21 2025-11-29 17:31:14 -08:00
Debanjum
e2340c709f Fix extracting image by URL in chat history when using Nano Banana Pro
Logical error due to the else conditional not being correctly indented.
This would result in an error when using Gemini 3 Pro image generation
with images in an S3 bucket.
2025-11-29 17:08:17 -08:00
Debanjum
856864147b Drop image generation support for Stability AI models
Reduces maintenance burden by dropping support for old AI model
providers.
2025-11-29 17:08:11 -08:00
Debanjum
c41e37d734 Release Khoj version 2.0.0-beta.20 2025-11-29 16:12:38 -08:00
Debanjum
731700ac43 Support fallback deep, fast chat models via server chat settings
Overview
---
This change enables specifying fallback chat models for each task
type (fast, deep, default) and user type (free, paid).

Previously we did not fall back to other chat models if the chat model
assigned to a task failed.

Details
---
You can now specify multiple ServerChatSettings via the Admin Panel
with their usage priority. If the highest priority chat model for the
task and user type fails, the task is assigned to a lower priority chat
model configured for the current user and task type.

This change also reduces the retry attempts for openai chat actor
models from 3 to 2 as:
- multiple fallback server chat settings can now be created. So
  reducing retries with same model reduces latency.
- 2 attempts is in line with the retry attempts for other model
  types (gemini, anthropic)
2025-11-29 15:57:35 -08:00
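The priority-based fallback described above can be sketched roughly as follows. ServerChatSettings rows are modeled as plain dicts, and all function names are illustrative assumptions:

```python
def pick_and_run(settings: list[dict], task: str, user_type: str, run) -> str:
    """Try chat models configured for a task and user type in priority order.

    `settings` mimics ServerChatSettings rows as dicts with keys
    "model", "task", "user_type", "priority"; `run` invokes a model
    and raises on failure.
    """
    candidates = sorted(
        (s for s in settings if s["task"] == task and s["user_type"] == user_type),
        key=lambda s: s["priority"],
    )
    last_error = None
    for s in candidates:
        try:
            return run(s["model"])
        except Exception as e:
            # Highest priority model failed; fall back to the next one.
            last_error = e
    raise RuntimeError(f"All chat models failed for {task}/{user_type}") from last_error
```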
Debanjum
99f16df7e2 Use fast model in default mode and for most chat actors
What
--
- Default to using the fast model for most chat actors. Specifically, in
  this change the doc and web search chat actors default to the fast model
- Only the research chat director uses the deep chat model.
- Make chat actors' use of the fast model configurable via a function argument

Code chat actor continues to use deep chat model and webpage reader
continues to use fast chat model.

Deep, fast chat models can be configured via ServerChatSettings on the
admin panel.

Why
--
Modern models are good enough at instruction following, so defaulting
most chat actors to the fast model should improve chat speed with
acceptable response quality.

The option to fallback to research mode for higher quality
responses or deeper research always exists.
2025-11-29 15:57:35 -08:00
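The configurable fast/deep selection might look like this minimal sketch (names are assumptions, not Khoj's actual API):

```python
from dataclasses import dataclass

@dataclass
class ChatModels:
    """Fast and deep chat models, as configured via ServerChatSettings."""
    fast: str
    deep: str

def resolve_chat_model(models: ChatModels, use_fast_model: bool = True) -> str:
    """Most chat actors default to the fast model; callers like the
    research chat director pass use_fast_model=False for the deep model."""
    return models.fast if use_fast_model else models.deep
```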
Debanjum
da493be417 Support image generation with Gemini Nano Banana 2025-11-29 15:57:35 -08:00
Debanjum
dd4381c25c Do not try to render invalid image paths in messages on web app
Avoids rendering flicker from attempts to render invalid image paths
referenced in messages by Khoj on the web app.

The rendering flicker made it very annoying to interact with
conversations containing such messages on the web app.

The current change does lightweight validation of the image URL before
attempting to render it. If an invalid image URL is detected, the image
is replaced with just its alt text.
2025-11-29 15:23:51 -08:00
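A rough illustration of such lightweight validation, assuming a markdown-style renderer and an invented extension allowlist (neither is Khoj's actual implementation):

```python
from urllib.parse import urlparse

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}  # assumed set

def render_image_or_alt(url: str, alt: str) -> str:
    """Return image markdown only if the URL looks renderable; otherwise
    fall back to the alt text to avoid rendering flicker."""
    parsed = urlparse(url)
    looks_valid = parsed.scheme in {"http", "https", "data"} and (
        parsed.scheme == "data"  # inline data URLs carry their own payload
        or any(parsed.path.lower().endswith(ext) for ext in IMAGE_EXTENSIONS)
    )
    return f"![{alt}]({url})" if looks_valid else alt
```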
Debanjum
51b893d51d Fix to ensure rectangular generated images are not cropped on web app
Previously non-square images would get cropped when being displayed on
web app
2025-11-29 15:23:51 -08:00
Debanjum
32966646e2 Avoid ai hover summaries in vscode dev env for now 2025-11-29 15:23:51 -08:00
Debanjum
043777c1bd Release Khoj version 2.0.0-beta.19 2025-11-18 15:30:50 -08:00
Debanjum
47a55c20a0 Associate folder icon with all doc tool use in thinking UX on web app
The newer grep_files and list_files should also be associated with
document search in train of thought visualization on the web app.
2025-11-18 15:17:38 -08:00
Debanjum
6459150870 Upgrade packages for documentation and desktop app 2025-11-18 15:17:38 -08:00
Debanjum
03dad1348a Support Minimax M2. Extract its thinking from response
- Use qwen style <think> tags to extract Minimax M2 model thoughts
- Use function to mark models that use in-stream thinking (including
  Kimi K2 thinking)
2025-11-18 14:13:28 -08:00
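Extracting qwen-style in-stream thinking can be sketched with a simple regex (a generic illustration, not Khoj's implementation):

```python
import re

# Models like Minimax M2 and Kimi K2 Thinking emit reasoning in <think> tags.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thoughts(response: str) -> tuple[str, str]:
    """Split in-stream thinking from the visible reply.
    Returns (thoughts, message)."""
    thoughts = "\n".join(m.strip() for m in THINK_RE.findall(response))
    message = THINK_RE.sub("", response).strip()
    return thoughts, message
```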
Debanjum
57d6ebb1b8 Support Google Gemini 3
- Use thinking level for gemini 3 models instead of thinking budget.
- Bump google gemini library
- Add default context, pricing
2025-11-18 14:13:24 -08:00
Debanjum
a30c5f245d Skip non-serializable, binary content parts when token counting 2025-11-18 12:42:18 -08:00
Debanjum
ec31df7154 Test khoj.el with more recent emacs versions 2025-11-18 10:29:02 -08:00
Debanjum
895af42039 Fix unbound response var exception in agent safety checker 2025-11-18 10:29:02 -08:00
Debanjum
748a4f9941 Release Khoj version 2.0.0-beta.18 2025-11-16 11:08:44 -08:00
Debanjum
3496189618 Support using MCP tools in research mode
- Server admin can add MCP servers via the admin panel
- Enabled MCP server tools are exposed to the research agent for use
- Use MCP library to standardize interactions with mcp servers
  - Support SSE or Stdio as transport to interact with mcp servers
  - Reuse session established to MCP servers across research iterations
2025-11-16 10:50:30 -08:00
Debanjum
2ac7359092 Simplify webpage read function names and drop unused return args 2025-11-16 10:50:30 -08:00
Debanjum
f1a34f0c2a Prefer Exa for web search over Google, Firecrawl
Google and Firecrawl do not provide good web search descriptions (within
given latency requirements). Exa does better than them.

So prioritize using Exa over Google or Firecrawl when multiple web
search providers are available.
2025-11-16 10:50:30 -08:00
Debanjum
45f4253120 Move Olostep scraping config into its webpage reader for cleaner code 2025-11-16 10:50:30 -08:00
Debanjum
e6a5d3dc3d Deprecate support for using Firecrawl webpage summarizer
Better speed and control by using Khoj webpage summarizer. Reduce code
cruft by clearing unused features.
2025-11-16 10:50:30 -08:00
Debanjum
0415b31a23 Upgrade Firecrawl web provider to use their v2 api 2025-11-16 10:50:30 -08:00
Debanjum
61cb2d5b7e Enable webpage reading with Exa. Remove Jina web page reader
Support using Exa for webpage reading. It seems much faster than
currently available providers.

Remove Jina as a webpage reader and remaining references to Jina from
code and docs. It was slow anyway and its API may shut down soon (as it
was bought by Elastic).

Update docs to mention Exa for web search and webpage reading.
2025-11-16 10:50:30 -08:00
Debanjum
d57c597245 Refactor count_tokens, get_encoder methods to utils/helper.py
Simplify get_encoder to not rely on global state. The caching
simplification is not necessary for now.
2025-11-16 10:50:30 -08:00
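A per-model `get_encoder` without module-level global state might look like this sketch; `tiktoken.encoding_for_model` is the real tiktoken API, while the word-split fallback is purely illustrative:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_encoder(model_name: str):
    """Return a token encoder for the model. lru_cache memoizes per model
    instead of stashing the encoder in module-level global state."""
    try:
        import tiktoken  # treated as an optional dependency in this sketch
        return tiktoken.encoding_for_model(model_name)
    except Exception:
        # Crude illustrative fallback: approximate tokens as words.
        class _Approx:
            def encode(self, text: str) -> list[str]:
                return text.split()
        return _Approx()

def count_tokens(text: str, model_name: str = "gpt-4o") -> int:
    return len(get_encoder(model_name).encode(text))
```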
Debanjum
15482c54b5 Fix type of count total tokens system_message argument 2025-11-16 10:50:30 -08:00
Debanjum
761af5f98c Drop spurious results closing xml tag in research shared with llm 2025-11-16 10:50:30 -08:00
Debanjum
1f3c1e1221 Remove spurious comments in desktop app chatutils.js 2025-11-16 10:50:30 -08:00
Debanjum
8490f2826b Reduce evaluator llm verbosity during eval 2025-11-16 10:50:30 -08:00
Debanjum
630ce77b5f Improve agent update/create safety check. Make reason field optional
Issue
---
When agent personality/instructions are safe, we do not require the
safety agent to give a reason. The safety check agent was told this in
the prompt but it was not reflected in the json schema being used.

Latest openai library started throwing error if response doesn't match
requested json schema.

This broke creating/updating agents when using openai models as safety
agent.

Fix
---
Make reason field optional.

Also put send_message_to_model_wrapper in try/catch for more readable
error stacktrace.
2025-11-14 06:47:51 -08:00
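With OpenAI's strict structured outputs, an optional field is typically expressed as nullable rather than dropped from `required`. A sketch of what the fixed schema could look like (not Khoj's actual schema):

```python
# Sketch of a safety check response schema where "reason" is optional.
# Under strict structured outputs, optionality is expressed as a nullable
# type while the key stays in "required".
SAFETY_CHECK_SCHEMA = {
    "type": "object",
    "properties": {
        "safe": {"type": "boolean"},
        # The model may return null when the agent instructions are safe.
        "reason": {"type": ["string", "null"]},
    },
    "required": ["safe", "reason"],
    "additionalProperties": False,
}
```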
Debanjum
cbeb220f00 Show validation errors in UX if agent creation, update fails
Previously we only showed unsafe prompt errors to the user when
creating/updating an agent. Errors like name collisions were not shown
in the web app UX.

This change ensures such validation errors are bubbled up to the user
in the UX, so they can resolve the agent create/update error on their
end.
2025-11-12 10:51:07 -08:00
Debanjum
d7e936678d Release Khoj version 2.0.0-beta.17 2025-11-11 16:38:46 -08:00
Debanjum
4556773f42 Improve support for new kimi k2 thinking model
Recognize thinking by kimi k2 thinking model in <think> xml blocks
2025-11-11 16:20:41 -08:00
Debanjum
2c54a2cd10 Improve web browsing train of thought status text shown on web app 2025-11-11 16:14:27 -08:00
Debanjum
aab0653025 Add price for grok 4, grok 4 fast for cost estimation 2025-11-11 16:14:19 -08:00
Debanjum
b14e6eb069 Count cache, reasoning tokens to estimate cost for models served over openai api
Count cached tokens, reasoning tokens for better cost estimates for
models served over an openai compatible api. Previously we didn't
include cached token or reasoning tokens in costing.
2025-11-11 16:12:48 -08:00
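A sketch of such a cost estimate from an OpenAI-style `usage` payload. The field paths (`prompt_tokens_details.cached_tokens`) match the OpenAI API; the price arguments are caller-supplied assumptions:

```python
def estimate_cost_usd(
    usage: dict,
    input_price: float,   # $ per 1M uncached input tokens
    output_price: float,  # $ per 1M output tokens
    cached_price: float,  # $ per 1M cached input tokens (usually discounted)
) -> float:
    """Estimate request cost from an OpenAI-style usage payload, billing
    cached prompt tokens at the discounted rate."""
    cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    uncached = usage.get("prompt_tokens", 0) - cached
    # In the OpenAI API, completion_tokens already includes reasoning
    # tokens; they are only broken out under completion_tokens_details.
    output = usage.get("completion_tokens", 0)
    return (uncached * input_price + cached * cached_price + output * output_price) / 1e6
```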
Debanjum
ce6d75e5a2 Support Exa as web search provider 2025-11-11 16:12:48 -08:00
Debanjum
5760f3b534 Drop support for web search, read using Jina as provider
There are faster, better web search and webpage read providers. Only
keep reasonable quality online context providers.

Jina was good for the self-hosting quickstart as it provided a free api
key without login. It no longer provides that, and its latencies are
pretty high vs other online context providers.
2025-11-11 16:12:48 -08:00
Debanjum
c022e7d553 Upgrade Anthropic Operator editor version 2025-11-11 16:12:48 -08:00
Debanjum
88a1fc75cc Track cost of claude haiku 4.5 model 2025-11-11 16:12:48 -08:00
Debanjum
a809de8970 Handle skip indexing of unsupported image files
Previously unsupported image file types would trigger an unbound local
variable error.
2025-11-11 16:09:08 -08:00
Debanjum
749bbed23d Track cost of claude sonnet 4.5 models 2025-11-11 16:09:04 -08:00
Debanjum
69cceda9ab Bump server dependencies 2025-11-11 16:08:38 -08:00
Debanjum
140a3ef943 Avoid unbound chunk variable error in ai api call from completion func 2025-11-11 16:08:38 -08:00
Debanjum
f2e0b62217 Remove unused default source from default tool picker prompt, help msg 2025-11-11 16:08:37 -08:00
Debanjum
5ef3a3f027 Remove unused eval workflow config to auto read webpage in default mode 2025-09-16 14:55:06 +05:30