Overview
- Khoj references files it used in its response as markdown links.
For example [1](file://path/to/file.txt#line=121)
- Previously these file links were just shown as raw text.
- This change renders khoj's inline file references as proper links
and shows a file content preview (around the specified line if the
link is a deeplink) on hover or click in the web app.
Details
- Render inline file references as links in chat message on web app.
Previously references like [1](file://path/to/file.txt#line=120)
would be shown as plain text. Now they are rendered as links.
- Preview file content of referenced files on click or hover.
If the reference uses a deeplink with a line number, the file content
around that line is shown on hover or click. Click allows viewing the
file preview on mobile, where hover is unavailable. Hover is easier
with a mouse.
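The reference format above can be sketched as follows. This is an illustrative Python sketch, not the web app's actual implementation: `parse_file_references`, `preview_window` and the `context` parameter are hypothetical names, and the size of the preview window is an assumption.

```python
import re
from typing import List, Optional
from urllib.parse import unquote, urlparse

# Matches markdown links with a file:// target, e.g. [1](file://path/to/file.txt#line=121)
FILE_LINK = re.compile(r"\[([^\]]+)\]\((file://[^)\s]+)\)")


def parse_file_references(message: str) -> List[dict]:
    """Extract label, file path and optional line number from inline file links."""
    refs = []
    for label, url in FILE_LINK.findall(message):
        parsed = urlparse(url)
        # For file://path/to/file.txt, urlparse puts "path" in netloc, so rejoin it.
        path = unquote(parsed.netloc + parsed.path)
        line_match = re.fullmatch(r"line=(\d+)", parsed.fragment)
        line = int(line_match.group(1)) if line_match else None
        refs.append({"label": label, "path": path, "line": line})
    return refs


def preview_window(lines: List[str], line: Optional[int], context: int = 3) -> List[str]:
    """Return the lines around a 1-based line number, or the file head if no line is given."""
    if line is None:
        return lines[: 2 * context + 1]
    start = max(0, line - 1 - context)
    end = min(len(lines), line + context)
    return lines[start:end]
```

A link without a `#line=` fragment yields `line=None`, in which case the sketch falls back to previewing the start of the file.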
Fixes
- Fix to allow khoj to delete content in obsidian write mode
- Do not throw error when no edit blocks in write mode on obsidian
- Limit retries to fix invalid edit blocks in obsidian write mode
Improvements
- Only show 3 recent files as context in obsidian file read, write mode
- Persist open file access mode setting across restarts in obsidian
- Make khoj obsidian keyboard shortcuts toggle voice chat, chat history
- Do not show <SYSTEM> instructions in chat session title on obsidian
Closes #1209
In obsidian we have a hacky system instruction being passed in read,
write file access modes. This shouldn't be shown in chat sessions list
during view or edit. It is an internal implementation detail.
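One way to hide the instruction when deriving a session title is sketched below. It assumes the instruction is wrapped in `<SYSTEM>...</SYSTEM>` tags, which may not match the actual delimiter used by the plugin; `session_title` and `max_len` are hypothetical names.

```python
import re

# Assumption: the internal instruction is wrapped in <SYSTEM>...</SYSTEM> tags.
SYSTEM_BLOCK = re.compile(r"<SYSTEM>.*?</SYSTEM>", re.DOTALL)


def session_title(first_message: str, max_len: int = 40) -> str:
    """Build a chat session title from the user-visible part of the first message."""
    visible = SYSTEM_BLOCK.sub("", first_message)
    # Collapse whitespace left behind after removing the instruction block.
    visible = " ".join(visible.split())
    if len(visible) <= max_len:
        return visible
    return visible[: max_len - 3].rstrip() + "..."
```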
Previously, hitting the voice chat keybinding would only start voice
chat, not end it, and hitting the chat history keybinding would only
open chat history, not close it.
This is unintuitive and differs from the equivalent button click
behaviors.
The fix toggles voice chat on/off and shows/hides chat history when the
Ctrl+Alt+V, Ctrl+Alt+O keybindings are hit in the khoj obsidian chat view.
Better support for GPT OSS
- Tune reasoning effort, temp, top_p for gpt-oss models
- Extract thoughts of openai style models like gpt-oss from api response
Tool use improvements
- Improve view file, code tool prompts. Format other research tool prompts
- Truncate long words in code tool stdout, stderr for context efficiency
- Use instruction instead of query as code tool argument
- Simplify view file tool. Limit viewing to up to 50 lines at a time
- Make regex search tool results look more like grep results
- Update khoj personality prompts with better style, capability guide
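The long-word truncation for code tool output could look roughly like this. It is a sketch under assumptions: `truncate_long_words`, the 100 character threshold and the `...` marker are illustrative choices, not values taken from the change.

```python
import re


def truncate_long_words(text: str, max_word_len: int = 100, marker: str = "...") -> str:
    """Shorten any whitespace-delimited word longer than max_word_len, keeping whitespace intact."""

    def shorten(word: str) -> str:
        if len(word) <= max_word_len:
            return word
        return word[:max_word_len] + marker

    # Splitting with a capture group keeps the whitespace runs, so line
    # breaks and indentation in stdout/stderr are preserved as-is.
    parts = re.split(r"(\s+)", text)
    return "".join(part if part.isspace() else shorten(part) for part in parts)
```

Long unbroken tokens such as base64 blobs or minified data are the typical cases that waste context without helping the model.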
Web UX improvements
- Wrap long words in train of thought shown on web app
- Do not overwrite charts created in previous code tool use during research
- Update web UX when server side error or hit stop + no task running
Fix AI API Usage
- Use subscriber type specific context window to generate response
- Fix max thinking budget for gemini models to generate final response
- Fix passing temp kwarg to non-streaming openai completion endpoint
- Handle unset reasoning, response chunk from openai api while streaming
- Fix using non-reasoning openai model via responses API
- Fix to calculate usage from openai api streaming completion
- Add more color to personality and communication style
- Split prompt into capabilities and style sections
- Remove directives in personality meant for older, less smart models.
- Discourage model from unnecessarily sharing code snippets in final
response unless explicitly requested.
- Ack websocket interrupt even when no task running
Otherwise chat UX isn't updated to indicate query has stopped
processing for this edge case
- Mark chat request as not being processed on server side error
The temp kwarg is already being passed in model_kwargs, so it does not
need to be passed explicitly as well.
This code path isn't being used currently, but it is better to fix it
for if/when it is used.
- Set the agent of the current conversation in the agent dropdown when a new conversation with a non-default agent is initialized. This was unset previously.
- Pass the current selected agent in the dropdown when creating new chat
- Correctly select the `khoj-header-agent-select` element
- A regression had stopped indicating to the user that the websocket
connection had broken. Now the interrupt has some visual indication.
- Websocket disconnects from client didn't trigger the partial
research to be saved. Now we use an interrupt signal to save partial
research before closing task.
Although we had handling in place for retrying after the gemini
suggested backoff on hitting rate limits, the actual rate limit
exception was getting caught to render a friendly message, so the retry
wasn't actually getting triggered.
This change allows both:
- Retry on hitting 429 rate limit exceptions
- Return friendly message if rate limit triggered retry eventually fails
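The intended ordering, retry first and fall back to a friendly message only once retries are exhausted, can be sketched as below. `RateLimitError`, `generate_with_retry`, `FRIENDLY_MESSAGE` and the `retry_after` attribute are hypothetical stand-ins for illustration; they are not the gemini client's or khoj's actual API.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 rate limit error."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after  # provider suggested backoff in seconds, if any


FRIENDLY_MESSAGE = "The model is busy right now. Please try again in a moment."


def generate_with_retry(call, max_attempts=3, default_delay=1.0, sleep=time.sleep):
    """Retry call() on rate limits; return a friendly message only if every attempt fails."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimitError as error:
            if attempt == max_attempts:
                # Retries exhausted: only now convert the error into a user-facing message.
                return FRIENDLY_MESSAGE
            # Prefer the provider suggested backoff, else use a default delay.
            delay = error.retry_after if error.retry_after is not None else default_delay
            sleep(delay)
```

The key point is that the rate limit exception must reach the retry loop before any handler turns it into a message. Catching it earlier is what suppressed the retry.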
Related:
- Changes to retry with gemini suggested backoff time in 0f953f9