klbr/khoj

mirror of https://github.com/khoaliber/khoj.git synced 2026-04-19 17:14:35 +00:00

Author	SHA1	Message	Date
Debanjum	2091044db5	Prefer agent chat model to extract document search queries Make chat model preference order for document search consistent with all other tools.	2025-08-27 13:55:50 -07:00
Debanjum	7a42042488	Share context builder for chat final response across model types The context building logic was nearly identical across all model types. This change extracts that logic into a shared function and calls it once in `agenerate_chat_response`, the entrypoint to the converse methods for all 3 model types. Main differences handled are - Gemini system prompt had additional verbosity instructions. Keep it - Pass system message via chatml messages list to anthropic, gemini models as well (like openai models) instead of passing it as separate arg to chat_completion_* funcs. The model specific message formatters for both already extract system instruction from the messages list. So system messages will be automatically extracted from the chat_completion_* funcs to pass as separate arg required by anthropic, gemini api libraries.	2025-08-27 13:48:33 -07:00
Debanjum	02e220f5f5	Pass args to context builder funcs grouped consistently Put context params together, followed by model params Use consistent ordering to improve readability	2025-08-27 13:45:36 -07:00
Debanjum	4976b244a4	Set fast, deep think models for intermediary steps via admin panel Overview Enable improving speed and cost of chat by setting fast, deep think models for intermediate steps and non user facing operations. Details - Allow decoupling default chat models from models used for intermediate steps by setting server chat settings on admin panel - Use deep think models for most intermediate steps like tool selection, subquery construction etc. in default and research mode - Use fast think models for webpage read, chat title setting etc. Faster webpage read should improve conversation latency	2025-08-27 13:45:36 -07:00
Debanjum	a99eb841ff	Do not search documents when default tool selected by agent What Explicit selection of notes tool/conversation command by agent is required now. Why - Newer models are good at deciding when to look up notes - Modern khoj is less of a notes only chat to search notes by default	2025-08-27 13:45:28 -07:00
Debanjum	a52a06ad9d	Prefer olostep over firecrawl for webpage read by default Default to Olostep as it is faster and has a higher webpage read success rate. Fallback logic will use Firecrawl if Olostep fails.	2025-08-27 13:45:09 -07:00
Debanjum	e150dc5a91	Improve copying message with math, file links to clipboard on web app	2025-08-27 13:45:09 -07:00
Debanjum	15d1f39d0b	Improve instruction to ai model for writing math expressions in LaTeX	2025-08-27 13:45:09 -07:00
Debanjum	05dbb6a7c1	Drop unused generated_files arg from chat context generated_files wasn't being set (anymore?). But it was being passed around through for chat context and being saved to db. Also reduce variables used to set mermaid diagram description	2025-08-27 13:45:09 -07:00
Debanjum	8a16f5a2af	Reduce logical complexity of constructing context from chat history - Process chat history in default order instead of processing it in reverse. Improve legibility of context construction for minor performance hit in dropping message from front of list. - Handle multiple system messages by collating them into list - Remove logic to drop system role for gemma-2, o1 models. Better to make code more readable than support old models.	2025-08-27 13:43:10 -07:00
Debanjum	1e81b51abc	Support generating images with different aspect ratios You can now specify shape of images to be generated. It can be one of portrait, landscape or square.	2025-08-27 13:43:10 -07:00
Debanjum	5a2cae3756	Improve, simplify image generation prompts and context flow Use seed to stabilize image change consistency across turns when - KHOJ_LLM_SEED env var is set - Using image models via Replicate. OpenAI, Google do not support image seed	2025-08-27 13:43:10 -07:00
Debanjum	0fb6020f30	Remove model type check to construct structured messages All model types use a normalized, chatml structured message format This check isn't used since offline model support was dropped.	2025-08-27 13:43:10 -07:00
Debanjum	386a17371d	Fix identifying deepseek r1 model to process its thinking tokens	2025-08-27 13:43:04 -07:00
Debanjum	ff004d31ef	Fix extracting inferred queries from chat history db Inferred queries are stored with underscore in db but aliased with - in memory. This conversation.messages logic was broken, so inferred queries field of chat message history was getting ignored. This change fixes that issue and improves previous image generation description for better context for subsequent image generation attempts.	2025-08-25 14:19:27 -07:00
Debanjum	892e4d4077	Fix system prompt construction for gemini models System prompt was duplicating instructions for gemini models previously	2025-08-24 18:22:31 -07:00
Debanjum	00c5aec614	Release Khoj version 2.0.0-beta.14	2025-08-23 12:12:33 -07:00
Debanjum	b99ccbc4c3	Improve table styling, fix chat sidebar height on web app	2025-08-23 02:05:50 -07:00
Debanjum	29ae476a26	Use groq with service tier auto to fallback to flex on rate limit Merge gpt-oss config with openai reasoning config as similar tuning. Add pricing for gpt oss 20b model	2025-08-23 02:05:50 -07:00
Debanjum	c89c5c7b46	Reorder ai model api columns on admin panel for readability	2025-08-23 01:40:05 -07:00
Debanjum	464c1546b7	Support deepseek v3.1 via official deepseek api The new deepseek-chat is powered by deepseek v3.1, which is a hybrid reasoning model unlike its predecessor, deepseek v3.	2025-08-23 01:40:05 -07:00
Debanjum	40488b3b68	Remove redundant exception for retry calls to gemini api httpx ReadError inherits from NetworkError so not required to mention it explicitly in gemini api call retry check	2025-08-23 00:48:10 -07:00
Debanjum	8aa9c0f534	Reduce max reasoning tokens for gemini models A high reasoning token limit does not seem to help for standard Khoj use cases. And hopefully reducing it may avoid repetition loops by model.	2025-08-23 00:48:10 -07:00
Debanjum	2823c84bb4	Default to gemini 2.5 model series on init and for eval	2025-08-22 20:34:38 -07:00
Debanjum	c53a70c997	Share debug logs from github eval run for debugging	2025-08-22 19:06:37 -07:00
Debanjum	e2f377c27b	Render file reference as link with file preview on hover/click in web app Overview - Khoj references files it used in its response as markdown links. For example [1](file://path/to/file.txt#line=121) - Previously these file links were just shown as raw text - This change renders khoj's inline file references as a proper links and shows file content preview (around specified line if deeplink) on hover or click in the web app Details - Render inline file references as links in chat message on web app. Previously references like [1](file://path/to/file.txt#line=120) would be shown as plain text. Now they are rendered as links - Preview file content of referenced files on click or hover. If reference uses a deeplink with line number, the file content around that line is shown on hover, click. Click allows viewing file preview on mobile, unlike hover. Hover is easier with mouse.	2025-08-22 18:24:27 -07:00
Debanjum	d8b7e9c8a5	Handle unset content type key when indexing knowledge base on server	2025-08-22 18:24:27 -07:00
Debanjum	3c3205bb06	Fix and improve file read, write handling in Obsidian Fixes - Fix to allow khoj to delete content in obsidian write mode - Do not throw error when no edit blocks in write mode on obsidian - Limit retries to fix invalid edit blocks in obsidian write mode Improvements - Only show 3 recent files as context in obsidian file read, write mode - Persist open file access mode setting across restarts in obsidian - Make khoj obsidian keyboard shortcuts toggle voice chat, chat history - Do not show <SYSTEM> instructions in chat session title on obsidian Closes #1209	2025-08-20 20:20:12 -07:00
Debanjum	48ed7afab8	Do not show <SYSTEM> instructions in chat session title on obsidian In obsidian we have a hacky system instruction being passed in read, write file access modes. This shouldn't be shown in chat sessions list during view or edit. It is an internal implementation detail.	2025-08-20 20:18:27 -07:00
Debanjum	82dc7b115b	Fix to allow khoj to delete content in obsidian write mode Previous regex and replacement logic did not allow replace block to be empty	2025-08-20 20:18:27 -07:00
Debanjum	7645cbea3b	Do not throw error when no edit blocks in write mode on obsidian Editing is an option, not a requirement in file write/edit mode.	2025-08-20 20:18:27 -07:00
Debanjum	2e6928c582	Limit retries to fix invalid edit blocks in obsidian write mode	2025-08-20 20:18:27 -07:00
Debanjum	c5e2373d73	Make khoj obsidian keyboard shortcuts toggle voice chat, chat history Previously hitting voice chat keybinding would just start voice chat, not end it and just open chat history and not close it. This is unintuitive and different from the equivalent button click behaviors. Fix toggles voice chat on/off and shows/hides chat history when hit Ctrl+Alt+V, Ctrl+Alt+O keybindings in khoj obsidian chat view	2025-08-20 20:18:27 -07:00
Debanjum	d8b2df4107	Only show 3 recent files as context in obsidian file read, write mode Related #1209	2025-08-20 20:18:27 -07:00
Debanjum	eb2f0ec6bc	Persist open file access mode setting across restarts in obsidian Allows a lightweight mechanism to persist this user preference. Improve hover text a bit for readability. Resolves #1209	2025-08-20 20:18:27 -07:00
Debanjum	2884853c98	Make plugin object accessible to chat, find similar panes in obsidian Allows ability to access, save settings in a cleaner way	2025-08-20 20:18:27 -07:00
Debanjum	9f6aa922a2	Improve Khoj research tools, gpt-oss support and ai api usage Better support for GPT OSS - Tune reasoning effort, temp, top_p for gpt-oss models - Extract thoughts of openai style models like gpt-oss from api response Tool use improvements - Improve view file, code tool prompts. Format other research tool prompts - Truncate long words in code tool stdout, stderr for context efficiency - Use instruction instead of query as code tool argument - Simplify view file tool. Limit viewing up to 50 lines at a time - Make regex search tool results look more like grep results - Update khoj personality prompts with better style, capability guide Web UX improvements - Wrap long words in train of thought shown on web app - Do not overwrite charts created in previous code tool use during research - Update web UX when server side error or hit stop + no task running Fix AI API Usage - Use subscriber type specific context window to generate response - Fix max thinking budget for gemini models to generate final response - Fix passing temp kwarg to non-streaming openai completion endpoint - Handle unset reasoning, response chunk from openai api while streaming - Fix using non-reasoning openai model via responses API - Fix to calculate usage from openai api streaming completion	2025-08-20 20:06:18 -07:00
Debanjum	13d26ae8b8	Wrap long words in train of thought shown on web app	2025-08-20 19:07:28 -07:00
Debanjum	fb0347a388	Truncate long words in stdout, stderr for context efficiency Avoid long base64 images etc. in stdout, stderr to result in context limits being hit.	2025-08-20 19:07:28 -07:00
Debanjum	dbc3330610	Tune reasoning effort, temp, top_p for gpt-oss models	2025-08-20 19:07:28 -07:00
Debanjum	83d725d2d8	Extract thoughts of openai style models like gpt-oss from api response They use delta.reasoning instead of delta.reasoning_content to share model reasoning	2025-08-20 19:07:28 -07:00
Debanjum	f483a626b8	Simplify view file tool. Limit viewing up to 50 lines at a time We were previously truncating by characters. Limiting by max lines allows model to control line ranges they request	2025-08-20 19:07:28 -07:00
Debanjum	f5a4d106d1	Use instruction instead of query as code tool argument	2025-08-20 19:07:28 -07:00
Debanjum	c5a9c81479	Update khoj personality prompts with better style, capability guide - Add more color to personality and communication style - Split prompt into capabilities and style sections - Remove directives in personality meant for older, less smart models. - Discourage model from unnecessarily sharing code snippets in final response unless explicitly requested.	2025-08-20 19:07:28 -07:00
Debanjum	2c91edbb25	Improve view file, code tool prompts. Format other research tool prompts	2025-08-20 19:07:28 -07:00
Debanjum	452c794e93	Make regex search tool results look more like grep results	2025-08-20 19:07:28 -07:00
Debanjum	9a8c707f84	Do not overwrite charts created in previous code tool use during research	2025-08-20 19:07:28 -07:00
Debanjum	e0007a31bb	Update web UX when server side error or hit stop + no task running - Ack websocket interrupt even when no task running Otherwise chat UX isn't updated to indicate query has stopped processing for this edge case - Mark chat request as not being processed on server side error	2025-08-20 19:07:28 -07:00
Debanjum	222cc19b7f	Use subscriber type specific context window to generate response	2025-08-20 19:07:28 -07:00
Debanjum	ff73d30106	Fix max thinking budget for gemini models to generate final response	2025-08-20 19:07:28 -07:00

5134 Commits