
klbr/khoj

mirror of https://github.com/khoaliber/khoj.git synced 2026-03-02 13:18:18 +00:00

Author	SHA1	Message	Date
Debanjum	2694734d22	Update truncation logic to handle multi-part message content	2025-05-17 17:37:15 -07:00
Debanjum	a337d9e4b8	Structure research iteration msgs for more granular context management Previously research iterations and conversation logs were added to a single user message. This prevented truncating each past iteration separately on hitting context limits. So the whole past research context had to be dropped on hitting context limits. This change splits each research iteration into a separate item in a message content list. It uses the ability for message content to be a list, that is supported by all major ai model apis like openai, anthropic and gemini. The change in message format seen by pick next tool chat actor: - New Format - System: System Message - User/Assistant: Chat History - User: Raw Query - Assistant: Iteration History - Iteration 1 - Iteration 2 - User: Query with Pick Next Tool Nudge - Old Format - User: System + Chat History + Previous Iterations Message - User: Query - Collateral Changes The construct_structured_message function has been updated to always return a list[dict[str, Any]]. Previously it'd only use list if attached_file_context or vision model with images for wider compatibility with other openai compatible api	2025-05-17 17:37:15 -07:00
Debanjum	0f53a67837	Prompt web page reader to extract quantitative data as is from pages Previously the research agent would have a hard time getting quantitative data extracted by the web page reader tool AI. This change aims to encourage the web page reader tool to extract relevant data in verbatim form for higher granularity research and responses.	2025-05-17 17:37:15 -07:00
Debanjum	99a2305246	Improve tool chat history constructor and fix its usage during research. Code tool should see code context and webpage tool should see online context during research runs. Fix to include code context from past conversations to answer queries. Add all queries to tool chat history when no specific tool to limit extracting inferred queries for is provided.	2025-05-17 17:37:15 -07:00
Debanjum	8050173ee1	Timeout calls to khoj api in evals to continue to next question	2025-05-17 17:37:11 -07:00
Debanjum	442c7b6153	Retry running code on more request exception	2025-05-17 17:37:11 -07:00
Debanjum	10a5d68a2c	Improve retry, increase timeouts of gemini api calls - Catch specific retryable exceptions for retry - Increase httpx timeout from default of 5s to 20s	2025-05-17 16:38:55 -07:00
Debanjum	20f08ca564	Reduce timeouts on calling local and online llms via openai api - Use much larger read, connect timeout if llm served over local url - Use larger timeout duration than default (5s) for online llms too This matches the timeout duration increase for calls to gemini api	2025-05-17 16:38:55 -07:00
Debanjum	e0352cd8e1	Handle unset ttft in metadata of failed chat response. Fixes evals. This was causing evals to stop processing rest of batch as well.	2025-05-17 15:06:22 -07:00
Debanjum	673a15b6eb	Upgrade hf hub package to include hf_xet for faster downloads	2025-05-17 15:06:22 -07:00
Debanjum	d867dca310	Fix send_message_to_model_wrapper by using sync is_user_subscribed check Calling an async function from a sync function wouldn't work.	2025-05-17 15:06:22 -07:00
Sajjad Baloch	a4ab498aec	Update README for better contributions (#1170) - Improve overall flow of the contribute section of Readme - Fix where to look for good first issues. The contributors board is outdated. Easier to maintain and view good-first-issue with issue tags directly. Co-authored-by: Debanjum <debanjum@gmail.com>	2025-05-12 09:51:01 -06:00
Debanjum	2feed544a6	Add Gemini 2.0 flash back to default gemini chat models list Remove once gemini 2.5 flash is GA	2025-05-11 19:05:09 -06:00
Debanjum	2e290ea690	Pass conversation history to generate non-streaming chat model responses Allows send_message_to_model_wrapper func to also use conversation logs as context to generate response. This is an optional parameter	2025-05-09 00:02:14 -06:00
Debanjum	8787586e7e	Dedupe code to format messages before sending to appropriate chat model Fallback to assume not a subscribed user if user not passed. This allows user arg to be actually optional in the async send_message_to_model_wrapper function	2025-05-09 00:02:14 -06:00
Debanjum	e94bf00e1e	Add cancellation support to research mode via asyncio.Event	2025-05-09 00:01:45 -06:00
Debanjum	1572781946	Parse and show reasoning model thoughts (#1172) ### Major All reasoning models return thoughts differently due to lack of standardization. We normalize thoughts by reasoning models and providers to ease handling within Khoj. The model thoughts are parsed during research mode when generating final response. These model thoughts are returned by the chat API and shown in train of thought shown on web app. Thoughts are enabled for Deepseek, Anthropic, Grok and Qwen3 reasoning models served via API. Gemini and Openai reasoning models do not show their thoughts via standard APIs. ### Minor - Fix ability to use Deepseek reasoner for intermediate stages of chat - Enable handling Qwen3 reasoning models	2025-05-02 20:29:38 -06:00
Debanjum	2cd7302966	Parse Grok reasoning model thoughts returned by API	2025-05-02 19:59:17 -06:00
Debanjum	8cadb0dbc0	Parse Anthropic reasoning model thoughts returned by API	2025-05-02 19:59:13 -06:00
Debanjum	ae4e352b42	Fix formatting to use Deepseek reasoner for completion via OpenAI API Previously Deepseek reasoner couldn't be used via API for completion because of the additional formatting constraints it required were being applied in this function. The formatting fix was being applied in the chat completion endpoint.	2025-05-02 19:11:16 -06:00
Debanjum	61a50efcc3	Parse DeepSeek reasoning model thoughts served via OpenAI compatible API DeepSeek reasoners return reasoning in the reasoning_content field. Create an async stream processor to parse the reasoning out when using the deepseek reasoner model.	2025-05-02 19:11:16 -06:00
Debanjum	16f3c85dde	Handle thinking by reasoning models. Show in train of thought on web client	2025-05-02 19:11:16 -06:00
Debanjum	d10dcc83d4	Only enable reasoning by qwen3 models in deepthought mode	2025-05-02 18:36:49 -06:00
Debanjum	6eaf54eb7a	Parse Qwen3 reasoning model thoughts served via OpenAI compatible API The Qwen3 reasoning models return thoughts within <think></think> tags before response. This change parses the thoughts out from final response from the response stream and returns as structured response with thoughts. These thoughts aren't passed to client yet	2025-05-02 18:36:45 -06:00
Debanjum	7b9f2c21c7	Parse thoughts from thinking models served via OpenAI compatible API OpenAI API doesn't support thoughts via chat completion by default. But there are thinking models served via OpenAI compatible APIs like deepseek and qwen3. Add stream handlers and modified response types that can contain thoughts as well apart from content returned by a model. This can be used to instantiate stream handlers for different model types like deepseek, qwen3 etc served over an OpenAI compatible API.	2025-05-02 17:49:16 -06:00
Debanjum	6843db1647	Use conversation specific chat model to respond to free tier users Recent changes enabled free tier users to switch free tier chat models per conversation or the default. This change enables free tier users to generate responses with their conversation specific chat model. Related: #725, #1151	2025-05-02 17:48:48 -06:00
Debanjum	5b5efe463d	Remove inline base64 images from webpages read with Firecrawl	2025-05-02 14:11:27 -06:00
Debanjum	559b323475	Support attaching jupyter/ipython notebooks from the web app to chat	2025-05-02 14:11:27 -06:00
sabaimran	dab6977fed	add number 1 repo of day badge	2025-04-23 16:49:12 -07:00
Debanjum	964a784acf	Release Khoj version 1.41.0	2025-04-23 19:01:27 +05:30
Debanjum	23dae72420	Update default models: Gemini models to 2.5 series, Gpt 4o to 4.1	2025-04-23 18:40:38 +05:30
Debanjum	d84a0f6e2c	Use latest node base image to build web app for khoj docker image	2025-04-23 17:53:33 +05:30
Debanjum	dd46bcabc2	Track gpt-4.1 model costs. Set prompt size of new gemini, openai models	2025-04-23 17:53:33 +05:30
Debanjum	87262d15bb	Save conversation to DB in the background, as an asyncio task	2025-04-22 17:42:33 +05:30
Debanjum	f929ff8438	Simplify AI Chat Response Streaming (#1167) Reason --- - Simplify code and logic to stream chat response by solely relying on asyncio event loop. - Reduce overhead of managing threads to increase efficiency and throughput (where possible). Details --- - Use async/await with no threading when generating chat response via OpenAI, Gemini, Anthropic AI model APIs - Use threading for offline chat model as llama-cpp doesn't support async streaming yet	2025-04-21 14:28:02 +05:30
Debanjum	a4b5842ac3	Remove ThreadedGenerator class, previously used to stream chat response	2025-04-21 14:16:40 +05:30
Debanjum	763fa2fa79	Refactor Offline chat response to stream async, with separate thread	2025-04-21 10:48:38 +05:30
Debanjum	932a9615ef	Refactor Anthropic chat response to stream async, no separate thread	2025-04-21 10:46:07 +05:30
Debanjum	a557031447	Refactor Gemini chat response to stream async, no separate thread	2025-04-21 10:46:07 +05:30
Debanjum	0751f2ea30	Refactor Openai chat response to stream async, no separate thread - Refactor chat API to use async/await for Openai streaming - Fix and clean Openai chat response async streaming	2025-04-21 10:44:49 +05:30
Debanjum	c93c0d982e	Create async get anthropic, openai client funcs, move to reusable package This package is where the get openai client functions also reside.	2025-04-21 09:30:26 +05:30
Debanjum	973aded6c5	Fix system prompt to make openai reasoning models md format response	2025-04-20 20:33:45 +05:30
Debanjum	21d19163ba	Just pass user rather than whole request object to doc search func	2025-04-20 20:33:45 +05:30
Debanjum	b2390fa977	Allow attaching typescript files to chat on web app	2025-04-19 19:08:11 +05:30
Debanjum	4d331e5ad2	Bump documentation dependencies	2025-04-19 18:38:31 +05:30
Debanjum	d6aafef464	Fix formatting of FAQ section in README.md	2025-04-19 18:31:16 +05:30
Debanjum	8f9090940b	Resolve datetime utcnow deprecation warnings (#1164) # PR Summary This small PR resolves the deprecation warnings on `datetime` in Python3.12+. You can find them in the [CI logs](https://github.com/khoj-ai/khoj/actions/runs/14538833837/job/40792624987#step:9:134): ```python /__w/khoj/khoj/src/khoj/processor/content/images/image_to_entries.py:61: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC). timestamp_now = datetime.utcnow().timestamp() ```	2025-04-19 18:26:52 +05:30
Debanjum	5441793a10	Allow AI model switching based on User Tier (#1151) Overview --- Enable free tier users to chat with any AI model made available on free tier of production deployments like [Khoj cloud](https://app.khoj.dev). Previously model switching was completely disabled for users on free tier. Details --- - Track price tier of each Chat, Speech, Image, Voice AI model in DB - Update API to allow free tier users to switch between free models - Update web app to allow model switching on agent creation, settings chat page (via right side pane), even for free tier users.	2025-04-19 18:14:37 +05:30
Debanjum	ab29ffd799	Fix web app packaging for pypi since upgrade to python 3.11.12 in CI	2025-04-19 18:03:29 +05:30
Debanjum	79fc911633	Enable free tier users to switch between free tier AI models - Update API to allow free tier users to switch between free models - Update web app to allow model switching on agent creation, settings chat page (via right side pane), even for free tier users. Previously the model switching APIs and UX fields on web app were completely disabled for free tier users	2025-04-19 17:29:53 +05:30
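
Commit 6eaf54eb7a describes parsing Qwen3 thoughts out of `<think></think>` tags that precede the response. The following is a minimal sketch of that idea on a complete (non-streamed) string, not the code from the commit: the names `ParsedResponse` and `split_thoughts` are hypothetical, and the handling of a missing closing tag is an assumption.

```python
from typing import NamedTuple


class ParsedResponse(NamedTuple):
    """Hypothetical container for a model reply split into thought and response."""

    thought: str
    response: str


def split_thoughts(
    text: str, open_tag: str = "<think>", close_tag: str = "</think>"
) -> ParsedResponse:
    """Separate a leading <think>...</think> block from the rest of the reply."""
    stripped = text.lstrip()
    if not stripped.startswith(open_tag):
        # No leading thought block: return the text unchanged as the response.
        return ParsedResponse("", text)

    body = stripped[len(open_tag):]
    thought, sep, rest = body.partition(close_tag)
    if not sep:
        # Assumption: an unterminated block means everything so far is thought.
        return ParsedResponse(thought.strip(), "")
    return ParsedResponse(thought.strip(), rest.lstrip())
```

A streaming parser, as the commit mentions, would also have to buffer partial tags that are split across chunks; this sketch leaves that out.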


4623 Commits
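
The deprecation warning quoted in commit 8f9090940b recommends timezone-aware datetimes in place of `datetime.utcnow()`. A minimal sketch of that replacement follows; the helper name `utc_timestamp_now` is hypothetical and this is not the code from the PR. It uses `timezone.utc`, which behaves like the `datetime.UTC` alias named in the warning but is also available on older Python versions.

```python
from datetime import datetime, timezone


def utc_timestamp_now() -> float:
    """Return the current POSIX timestamp from a timezone-aware UTC datetime."""
    # datetime.now(timezone.utc) is aware, so .timestamp() does not depend on
    # the machine's local timezone. A naive datetime from utcnow() would be
    # interpreted as local time by .timestamp().
    return datetime.now(timezone.utc).timestamp()
```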