Commit Graph

5101 Commits

Author SHA1 Message Date
Debanjum
d8b2df4107 Only show 3 recent files as context in obsidian file read, write mode
Related #1209
2025-08-20 20:18:27 -07:00
Debanjum
eb2f0ec6bc Persist open file access mode setting across restarts in obsidian
Adds a lightweight mechanism to persist this user preference.
Improve hover text a bit for readability.

Resolves #1209
2025-08-20 20:18:27 -07:00
Debanjum
2884853c98 Make plugin object accessible to chat, find similar panes in obsidian
Allows accessing and saving settings in a cleaner way
2025-08-20 20:18:27 -07:00
Debanjum
9f6aa922a2 Improve Khoj research tools, gpt-oss support and ai api usage
Better support for GPT OSS
- Tune reasoning effort, temp, top_p for gpt-oss models
- Extract thoughts of openai style models like gpt-oss from api response

Tool use improvements
- Improve view file, code tool prompts. Format other research tool prompts
- Truncate long words in code tool stdout, stderr for context efficiency
- Use instruction instead of query as code tool argument
- Simplify view file tool. Limit viewing up to 50 lines at a time
- Make regex search tool results look more like grep results
- Update khoj personality prompts with better style, capability guide

Web UX improvements
- Wrap long words in train of thought shown on web app
- Do not overwrite charts created in previous code tool use during research
- Update web UX when server side error or hit stop + no task running

Fix AI API Usage
- Use subscriber type specific context window to generate response
- Fix max thinking budget for gemini models to generate final response
- Fix passing temp kwarg to non-streaming openai completion endpoint
- Handle unset reasoning, response chunk from openai api while streaming
- Fix using non-reasoning openai model via responses API
- Fix to calculate usage from openai api streaming completion
2025-08-20 20:06:18 -07:00
Debanjum
13d26ae8b8 Wrap long words in train of thought shown on web app 2025-08-20 19:07:28 -07:00
Debanjum
fb0347a388 Truncate long words in stdout, stderr for context efficiency
Avoid long base64 images etc. in stdout, stderr causing context
limits to be hit.
2025-08-20 19:07:28 -07:00
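A minimal sketch of the word-truncation idea described above. The cap value and function name are assumptions for illustration, not Khoj's actual implementation:

```python
import re

MAX_WORD_LENGTH = 100  # assumed cap; not necessarily the value Khoj uses

def truncate_long_words(text: str, max_len: int = MAX_WORD_LENGTH) -> str:
    """Truncate any run of non-whitespace longer than max_len, so giant
    base64 blobs in tool stdout/stderr don't eat the model's context window."""
    return re.sub(
        rf"\S{{{max_len},}}",
        lambda m: m.group(0)[:max_len] + "...[truncated]",
        text,
    )
```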
Debanjum
dbc3330610 Tune reasoning effort, temp, top_p for gpt-oss models 2025-08-20 19:07:28 -07:00
Debanjum
83d725d2d8 Extract thoughts of openai style models like gpt-oss from api response
They use delta.reasoning instead of delta.reasoning_content to share
model reasoning
2025-08-20 19:07:28 -07:00
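The attribute fallback described above could be sketched as below. The `Delta` dataclass is a stand-in for the OpenAI client's streaming delta object; the preference order is an assumption:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Delta:
    """Stand-in for the streaming delta chunk from an OpenAI-style API."""
    reasoning: Optional[str] = None
    reasoning_content: Optional[str] = None

def extract_reasoning(delta) -> str:
    # Some servers (e.g. ones serving gpt-oss) emit delta.reasoning,
    # others emit delta.reasoning_content; read whichever is populated.
    return (
        getattr(delta, "reasoning_content", None)
        or getattr(delta, "reasoning", None)
        or ""
    )
```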
Debanjum
f483a626b8 Simplify view file tool. Limit viewing up to 50 lines at a time
We were previously truncating by characters. Limiting by max lines
allows the model to control the line ranges it requests
2025-08-20 19:07:28 -07:00
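An illustrative sketch of line-limited viewing. The 50-line cap comes from the commit message; the function operates on a string here (the real tool reads files) and its name and parameters are hypothetical:

```python
from typing import Optional

MAX_VIEW_LINES = 50  # cap stated in the commit message

def view_lines(content: str, start_line: int = 1, end_line: Optional[int] = None) -> str:
    """Return at most MAX_VIEW_LINES lines of content, starting at start_line
    (1-indexed). Clamping by lines, not characters, lets the model pick the
    next range to request."""
    lines = content.splitlines()
    if end_line is None:
        end_line = start_line + MAX_VIEW_LINES - 1
    # Clamp the requested range to the cap and the file length
    end_line = min(end_line, start_line + MAX_VIEW_LINES - 1, len(lines))
    return "\n".join(lines[start_line - 1 : end_line])
```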
Debanjum
f5a4d106d1 Use instruction instead of query as code tool argument 2025-08-20 19:07:28 -07:00
Debanjum
c5a9c81479 Update khoj personality prompts with better style, capability guide
- Add more color to personality and communication style
- Split prompt into capabilities and style sections
- Remove directives in personality meant for older, less smart models.
- Discourage model from unnecessarily sharing code snippets in final
  response unless explicitly requested.
2025-08-20 19:07:28 -07:00
Debanjum
2c91edbb25 Improve view file, code tool prompts. Format other research tool prompts 2025-08-20 19:07:28 -07:00
Debanjum
452c794e93 Make regex search tool results look more like grep results 2025-08-20 19:07:28 -07:00
Debanjum
9a8c707f84 Do not overwrite charts created in previous code tool use during research 2025-08-20 19:07:28 -07:00
Debanjum
e0007a31bb Update web UX when server side error or hit stop + no task running
- Ack websocket interrupt even when no task running
  Otherwise chat UX isn't updated to indicate query has stopped
  processing for this edge case

- Mark chat request as not being processed on server side error
2025-08-20 19:07:28 -07:00
Debanjum
222cc19b7f Use subscriber type specific context window to generate response 2025-08-20 19:07:28 -07:00
Debanjum
ff73d30106 Fix max thinking budget for gemini models to generate final response 2025-08-20 19:07:28 -07:00
Debanjum
34dca8e114 Fix passing temp kwarg to non-streaming openai completion endpoint
It is already being passed in model_kwargs, so it doesn't need to be
passed explicitly as well.

This code path isn't being used currently, but it's better to fix it
for if/when it is used
2025-08-20 19:07:28 -07:00
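The bug class being fixed above can be shown with a tiny sketch: passing `temperature` explicitly while it is also inside `**model_kwargs` raises a `TypeError`. `create_completion` here is a stand-in for the OpenAI client call, not Khoj's code:

```python
def create_completion(model: str, **kwargs) -> dict:
    """Stand-in for the non-streaming OpenAI completion endpoint call."""
    return {"model": model, **kwargs}

model_kwargs = {"temperature": 0.2, "top_p": 0.95}

# Buggy call (what the commit removes):
#   create_completion(model="gpt-oss", temperature=0.2, **model_kwargs)
#   -> TypeError: got multiple values for keyword argument 'temperature'

# Fixed call: rely on model_kwargs alone
response = create_completion(model="gpt-oss", **model_kwargs)
```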
Debanjum
8862394c15 Handle unset reasoning, response chunk from openai api while streaming 2025-08-20 19:07:28 -07:00
Debanjum
14b4d4b663 Fix using non-reasoning openai model via responses API
Pass arg to include encrypted reasoning only for reasoning openai
models. Non-reasoning openai models do not accept this arg
2025-08-20 19:07:28 -07:00
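A hedged sketch of gating a reasoning-only request parameter. The model-name prefixes used to detect reasoning models are an assumption; `reasoning.encrypted_content` is the Responses API `include` value, but Khoj's exact check may differ:

```python
def build_responses_kwargs(model_name: str) -> dict:
    """Only attach the encrypted-reasoning include for reasoning models;
    non-reasoning models reject the arg."""
    kwargs: dict = {"model": model_name}
    # Assumed heuristic for "is this a reasoning model" - illustrative only
    is_reasoning_model = model_name.startswith(("o1", "o3", "o4", "gpt-5"))
    if is_reasoning_model:
        kwargs["include"] = ["reasoning.encrypted_content"]
    return kwargs
```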
Debanjum
e504141c07 Fix to calculate usage from openai api streaming completion
During streaming, chunk.chunk contains usage data. This regression must
have appeared while tuning the openai stream processors
2025-08-20 19:07:28 -07:00
Debanjum
573c6a32e1 Fix to create chat with custom agents from obsidian (#1216)
The function createNewConversation is never called with the agentSlug
specified, so it's always opening a new Conversation with the default Agent
2025-08-20 19:07:16 -07:00
Debanjum
4728098cad Fix to set agent for new chat created from obsidian
- Set the agent of the current conversation in the agent dropdown when a new conversation with a non-default agent is initialized. This was unset previously.
- Pass the current selected agent in the dropdown when creating new chat
- Correctly select the `khoj-header-agent-select` element
2025-08-21 07:33:25 +05:30
Fh26697
a2a3eb8be6 Update chat_view.ts
fixed Typo
2025-08-20 17:01:47 +02:00
Fh26697
b3015f6837 Update chat_view.ts
fixed Typo
2025-08-20 16:58:08 +02:00
Fh26697
916534226a Chats are not using specified Agent
The function createNewConversation is never called with the agentSlug specified, so it's always opening a new Conversation with the Base Agent
2025-08-19 15:49:41 +02:00
Debanjum
fa143d45b9 Fix passing images to official openai models using the responses api 2025-08-17 16:30:43 -07:00
Debanjum
a494a766a4 Fix eval github workflow and show more logs to debug its startup 2025-08-15 16:26:37 -07:00
Debanjum
25e549d683 Show connection lost toast if disconnect while processing chat request 2025-08-15 16:04:26 -07:00
Debanjum
59bfaf9698 Fix to indicate ws disconnect on web app & save interrupted research
- A regression had stopped indicating to the user that the websocket
connection had broken. Now the interrupt has some visual indication.

- Websocket disconnects from client didn't trigger the partial
research to be saved. Now we use an interrupt signal to save partial
research before closing task.
2025-08-15 16:04:26 -07:00
Debanjum
3eb8cce984 Retry if hit gemini rate limit. Return friendly message if retries fail
Although we had handling in place to retry after the gemini suggested
backoff on hitting rate limits, the actual rate limit exception was
getting caught to render a friendly message, so the retry wasn't
actually getting triggered.

This change allows both
- Retry on hitting 429 rate limit exceptions
- Return friendly message if rate limit triggered retry eventually fails

Related:
- Changes to retry with gemini suggested backoff time in 0f953f9
2025-08-15 16:04:25 -07:00
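The ordering this commit fixes can be sketched as below: retry on the rate limit exception first, and only fall back to the friendly message once retries are exhausted. The exception class, retry count, and message are stand-ins, not Khoj's actual values:

```python
import time

class RateLimitError(Exception):
    """Stand-in for the Gemini client's 429 rate limit exception."""

def call_with_retry(call, max_retries: int = 3, backoff_s: float = 0.0):
    """Retry rate-limited calls before rendering a friendly message.
    Catching the exception for the friendly message *before* retrying
    (the bug) meant the retry never ran."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                # Retries exhausted: return the friendly message
                return "I'm a bit overwhelmed right now. Please try again in a bit."
            time.sleep(backoff_s)  # the real code uses the API-suggested backoff
```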
Debanjum
4274f58dbd Show more specific warning to llm on duplicate tool use during research 2025-08-15 16:02:32 -07:00
Debanjum
caf0b994e8 Fix handling failure to select default chat tools
Issue: chosen_io variable was accessed before initialization when
ValueError was raised.

Fix: Set chosen_io to fallback values on failure to select default
chat tools
2025-08-15 16:02:15 -07:00
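The shape of the fix can be sketched as below: initialize the variable with fallback values before the call that may raise, so the failure path never reads it uninitialized. The fallback tool list and selection logic are hypothetical stand-ins:

```python
DEFAULT_TOOLS = ["search", "general"]  # assumed fallback values

def select_default_chat_tools(query: str) -> list[str]:
    """Stand-in for the selection logic; simulate the failure path."""
    raise ValueError("could not infer tools")

def get_chat_tools(query: str) -> list[str]:
    # Fallback set up-front, so a ValueError can't leave chosen_io unbound
    chosen_io = DEFAULT_TOOLS
    try:
        chosen_io = select_default_chat_tools(query)
    except ValueError:
        pass  # keep the fallback
    return chosen_io
```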
Debanjum
7251b25c66 Handle null reference exceptions when rendering files context 2025-08-15 16:00:51 -07:00
Debanjum
20347e21c2 Reduce noisy indexing logs 2025-08-12 12:06:43 -07:00
Debanjum
bd82626084 Release Khoj version 2.0.0-beta.13 2025-08-11 22:29:06 -07:00
Debanjum
cbeefb7f94 Update researcher prompt to handle ambiguous queries. Clear stale text
Make researcher handle ambiguous requests better by working with
reasonable assumptions (clearly stated to the user in the response)
instead of burdening the user with clarification requests.

Fix portions of the researcher prompt that had gone stale since moving
to tool use and making researcher more task (vs q&a) oriented
2025-08-11 22:28:47 -07:00
Debanjum
0a6d87067d Fix to have researcher let the coder tool write code
Previously the researcher was passing the whole code to execute in its
queries to the tool AI instead of asking it to write the code and
limiting its query to a natural language request (with required data).

The division of responsibility should help researcher just worry about
constructing a request with all the required details instead of also
worrying about writing correct code.
2025-08-11 22:28:47 -07:00
Debanjum
0186403891 Limit retry to transient openai API errors. Return non-empty tool output 2025-08-11 21:53:21 -07:00
Debanjum
41f89cf7f3 Handle price, responses of models served via Groq
Their tool call response may not strictly follow the expected response
format. Let the researcher handle incorrect arguments to the code tool
(i.e. ones that trigger a type error)
2025-08-11 19:32:41 -07:00
Debanjum
b2d26088dc Use openai responses api to interact with official openai models
What
- Get reasoning of openai reasoning models from responses api for show
- Improves cache hits and reasoning reuse for iterative agents like
  research mode.

This should improve speed, quality, cost and transparency of using
openai reasoning models.

More cache hits and better reasoning, as reasoning blocks are included
while the model is researching (reasoning interspersed with tool calls)
when using the responses api.
2025-08-09 14:03:24 -07:00
Debanjum
564adb24a7 Add support for GPT 5 model series 2025-08-09 14:03:13 -07:00
Debanjum
0e1615acc8 Fix grep files tool to work with line start, end anchors
Previously line start, end anchors would only work if the whole file
started or ended with the regex pattern rather than matching by line.

Fix it to work like a standard grep tool and match by line start, end.
2025-08-09 12:29:35 -07:00
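The grep-like behavior described above can be sketched by matching each line independently, so `^` and `$` anchor per line instead of to the whole file. The function name and interface are assumptions, not the actual tool's:

```python
import re

def grep_lines(pattern: str, text: str) -> list[str]:
    """Return lines matching the pattern, with ^ and $ anchored per line
    like standard grep. Matching against the whole text at once (the bug)
    would anchor ^ and $ to the start and end of the file instead."""
    return [line for line in text.splitlines() if re.search(pattern, line)]
```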
Debanjum
a79025ee93 Limit max queries allowed per doc search tool call. Improve prompt
Reduce usage of boolean operators like "hello OR bye OR see you",
which don't work and reduce search quality. The model was trying to
stuff the search query with multiple different queries.
2025-08-09 12:29:35 -07:00
Debanjum
a3bb7100b4 Speed up app development using a faster, modern toolchain (#1196)
## Overview
Speed up app install and development using a faster, modern development
toolchain

## Details
### Major
- Use [uv](https://docs.astral.sh/uv/) for faster server install (vs
pip)
- Use [bun](https://bun.sh/) for faster web app install (vs yarn)
- Use [ruff](https://docs.astral.sh/ruff/) for faster formatting of
server code (vs black, isort)
- Fix devcontainer builds. See if uv and bun can speed up server and
client installs

### Minor
- Format web app with prettier and server with ruff. This is most of the
file changes in this PR.
- Simplify copying web app built files in pypi workflow to make it less
flaky.
2025-08-09 12:27:20 -07:00
Debanjum
80cce7b439 Fix server, web app to reuse prebuilt deps on dev container setup 2025-08-01 23:36:13 -07:00
Debanjum
0a0b97446c Avoid `click` v8.2.2 server dependency as it breaks pypi validation
Refer to pallets/click issue 3024 for details
2025-08-01 23:36:13 -07:00
Debanjum
f2bd07044e Speed up github workflows by not installing cuda server dependencies
- CI runners don't have GPUs
- Pytorch related Nvidia cuda packages are not required for testing,
  evals or pre-commit checks.
- Avoiding these massive downloads should speed up workflow run.
2025-08-01 23:35:08 -07:00
Debanjum
8ad38dfe11 Switch to Bun instead of Deno (or Yarn) for faster web app builds 2025-08-01 03:00:43 -07:00
Debanjum
b86430227c Dedupe and move dev dependencies out from web app production builds 2025-08-01 00:28:39 -07:00