Commit Graph

3583 Commits

Author SHA1 Message Date
Debanjum
e0352cd8e1 Handle unset ttft in metadata of failed chat response. Fixes evals.
This was causing evals to stop processing rest of batch as well.
2025-05-17 15:06:22 -07:00
Debanjum
d867dca310 Fix send_message_to_model_wrapper by using sync is_user_subscribed check
Calling an async function from a sync function wouldn't work.
2025-05-17 15:06:22 -07:00
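A minimal sketch of the failure mode this commit describes, not the project's actual code. The names `is_subscribed_async`, `is_subscribed_sync`, `broken_wrapper` and `fixed_wrapper` are hypothetical. Calling a coroutine function from sync code without awaiting it returns a coroutine object, which is always truthy, so the subscription check never yields a real boolean.

```python
import inspect


async def is_subscribed_async(user: str) -> bool:
    # Hypothetical async check, e.g. one backed by an async DB query.
    return user == "paid"


def is_subscribed_sync(user: str) -> bool:
    # Hypothetical sync twin of the check above.
    return user == "paid"


def broken_wrapper(user: str) -> bool:
    # Calling the async function without await yields a coroutine object.
    result = is_subscribed_async(user)
    got_coroutine = inspect.iscoroutine(result)
    result.close()  # close it to avoid a "never awaited" warning
    return got_coroutine


def fixed_wrapper(user: str) -> bool:
    # A sync caller uses the sync check and gets a real boolean back.
    return is_subscribed_sync(user)
```

Here `broken_wrapper` returns whether it received a coroutine instead of a boolean, which it does for every user.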
Debanjum
2feed544a6 Add Gemini 2.0 flash back to default gemini chat models list
Remove once gemini 2.5 flash is GA
2025-05-11 19:05:09 -06:00
Debanjum
2e290ea690 Pass conversation history to generate non-streaming chat model responses
Allows send_message_to_model_wrapper func to also use conversation
logs as context to generate a response. This is an optional parameter.
2025-05-09 00:02:14 -06:00
Debanjum
8787586e7e Dedupe code to format messages before sending to appropriate chat model
Fallback to assume not a subscribed user if user not passed.
This allows user arg to be actually optional in the async
send_message_to_model_wrapper function
2025-05-09 00:02:14 -06:00
Debanjum
e94bf00e1e Add cancellation support to research mode via asyncio.Event 2025-05-09 00:01:45 -06:00
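A small sketch of cooperative cancellation with `asyncio.Event`, assuming the research loop checks the event between steps. The function names and the step structure are illustrative and not taken from the repository.

```python
import asyncio


async def research(steps: int, cancel: asyncio.Event) -> list[str]:
    """Run up to `steps` iterations, stopping early once `cancel` is set."""
    done = []
    for i in range(steps):
        if cancel.is_set():
            break  # stop cleanly between iterations
        done.append(f"step {i}")
        await asyncio.sleep(0)  # yield control, stand-in for real async work
    return done


async def demo() -> list[str]:
    cancel = asyncio.Event()
    task = asyncio.create_task(research(1000, cancel))
    await asyncio.sleep(0)  # let the research task start
    cancel.set()            # request cancellation
    return await task       # the task returns what it finished so far
```

Unlike `task.cancel()`, this approach lets the loop finish its current step and return partial results rather than raising `CancelledError` inside it.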
Debanjum
2cd7302966 Parse Grok reasoning model thoughts returned by API 2025-05-02 19:59:17 -06:00
Debanjum
8cadb0dbc0 Parse Anthropic reasoning model thoughts returned by API 2025-05-02 19:59:13 -06:00
Debanjum
ae4e352b42 Fix formatting to use Deepseek reasoner for completion via OpenAI API
Previously Deepseek reasoner couldn't be used via API for completion
because of the additional formatting constraints it required were being
applied in this function.

The formatting fix was being applied in the chat completion endpoint.
2025-05-02 19:11:16 -06:00
Debanjum
61a50efcc3 Parse DeepSeek reasoning model thoughts served via OpenAI compatible API
DeepSeek reasoners return reasoning in the reasoning_content field.

Create an async stream processor to parse the reasoning out when using
the deepseek reasoner model.
2025-05-02 19:11:16 -06:00
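A hedged sketch of such an async stream processor. The `Delta` and `ResponseWithThought` types and the `fake_stream` generator are stand-ins invented for this example. Only the `reasoning_content` field name comes from the commit message.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class Delta:
    # Stand-in for one streamed chunk delta from an OpenAI compatible API.
    content: Optional[str] = None
    reasoning_content: Optional[str] = None


@dataclass
class ResponseWithThought:
    # Hypothetical structured chunk that keeps thoughts apart from the response.
    response: str = ""
    thought: str = ""


async def fake_stream() -> AsyncIterator[Delta]:
    # Simulated model stream: reasoning chunks first, then answer chunks.
    for delta in [
        Delta(reasoning_content="Add 2 and 2. "),
        Delta(reasoning_content="That is 4."),
        Delta(content="The answer "),
        Delta(content="is 4."),
    ]:
        yield delta


async def split_thoughts(stream: AsyncIterator[Delta]) -> AsyncIterator[ResponseWithThought]:
    # Route each delta to the thought or the response field.
    async for delta in stream:
        if delta.reasoning_content:
            yield ResponseWithThought(thought=delta.reasoning_content)
        if delta.content:
            yield ResponseWithThought(response=delta.content)


async def collect() -> tuple[str, str]:
    thought, response = "", ""
    async for part in split_thoughts(fake_stream()):
        thought += part.thought
        response += part.response
    return thought, response
```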
Debanjum
16f3c85dde Handle thinking by reasoning models. Show in train of thought on web client 2025-05-02 19:11:16 -06:00
Debanjum
d10dcc83d4 Only enable reasoning by qwen3 models in deepthought mode 2025-05-02 18:36:49 -06:00
Debanjum
6eaf54eb7a Parse Qwen3 reasoning model thoughts served via OpenAI compatible API
The Qwen3 reasoning models return thoughts within <think></think> tags
before response.

This change parses the thoughts out from final response from the
response stream and returns as structured response with thoughts.

These thoughts aren't passed to the client yet.
2025-05-02 18:36:45 -06:00
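A simplified sketch of separating `<think></think>` thoughts from the response. It assumes the streamed chunks are buffered and joined before parsing, which sidesteps tags split across chunk boundaries. The commit's actual stream handler may work incrementally. `split_think` is a hypothetical name.

```python
def split_think(text: str) -> tuple[str, str]:
    """Return (thought, response) parsed from a full model output."""
    start, end = "<think>", "</think>"
    if start in text and end in text:
        before, _, rest = text.partition(start)
        thought, _, after = rest.partition(end)
        return thought.strip(), (before + after).strip()
    # No complete think block: treat everything as the response.
    return "", text.strip()


# Tags may arrive split across chunks, so join the chunks before parsing.
chunks = ["<thi", "nk>User wants a greeting.</th", "ink>\n\nHello", " there!"]
thought, response = split_think("".join(chunks))
```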
Debanjum
7b9f2c21c7 Parse thoughts from thinking models served via OpenAI compatible API
OpenAI API doesn't support thoughts via chat completion by default.
But there are thinking models served via OpenAI compatible APIs like
deepseek and qwen3.

Add stream handlers and modified response types that can contain
thoughts in addition to the content returned by a model.

This can be used to instantiate stream handlers for different model
types like deepseek, qwen3 etc served over an OpenAI compatible API.
2025-05-02 17:49:16 -06:00
Debanjum
6843db1647 Use conversation specific chat model to respond to free tier users
Recent changes enabled free tier users to switch free tier chat models
per conversation or the default.

This change enables free tier users to generate responses with their
conversation specific chat model.

Related: #725, #1151
2025-05-02 17:48:48 -06:00
Debanjum
5b5efe463d Remove inline base64 images from webpages read with Firecrawl 2025-05-02 14:11:27 -06:00
Debanjum
559b323475 Support attaching jupyter/ipython notebooks from the web app to chat 2025-05-02 14:11:27 -06:00
Debanjum
964a784acf Release Khoj version 1.41.0 2025-04-23 19:01:27 +05:30
Debanjum
23dae72420 Update default models: Gemini models to 2.5 series, Gpt 4o to 4.1 2025-04-23 18:40:38 +05:30
Debanjum
dd46bcabc2 Track gpt-4.1 model costs. Set prompt size of new gemini, openai models 2025-04-23 17:53:33 +05:30
Debanjum
87262d15bb Save conversation to DB in the background, as an asyncio task 2025-04-22 17:42:33 +05:30
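A minimal sketch of the background save pattern, assuming an in-memory list in place of the database. `save_conversation`, `respond` and `background_tasks` are illustrative names. Keeping a strong reference to the task follows the guidance in the `asyncio.create_task` documentation, since the event loop only holds weak references to tasks.

```python
import asyncio

saved: list[str] = []                       # stand-in for the DB
background_tasks: set[asyncio.Task] = set()  # strong refs to pending saves


async def save_conversation(message: str) -> None:
    await asyncio.sleep(0.01)  # stand-in for a slow DB write
    saved.append(message)


async def respond(message: str) -> str:
    # Schedule the save without awaiting it, so the reply is not delayed.
    task = asyncio.create_task(save_conversation(message))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return f"echo: {message}"


async def demo() -> tuple[str, list[str], list[str]]:
    reply = await respond("hi")
    saved_before = list(saved)                # save has not run yet
    await asyncio.gather(*background_tasks)   # wait for pending saves
    return reply, saved_before, list(saved)
```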
Debanjum
a4b5842ac3 Remove ThreadedGenerator class, previously used to stream chat response 2025-04-21 14:16:40 +05:30
Debanjum
763fa2fa79 Refactor Offline chat response to stream async, with separate thread 2025-04-21 10:48:38 +05:30
Debanjum
932a9615ef Refactor Anthropic chat response to stream async, no separate thread 2025-04-21 10:46:07 +05:30
Debanjum
a557031447 Refactor Gemini chat response to stream async, no separate thread 2025-04-21 10:46:07 +05:30
Debanjum
0751f2ea30 Refactor Openai chat response to stream async, no separate thread
- Refactor chat API to use async/await for Openai streaming
- Fix and clean Openai chat response async streaming
2025-04-21 10:44:49 +05:30
Debanjum
c93c0d982e Create async get anthropic, openai client funcs, move to reusable package
This package is where the get openai client functions also reside.
2025-04-21 09:30:26 +05:30
Debanjum
973aded6c5 Fix system prompt to make openai reasoning models md format response 2025-04-20 20:33:45 +05:30
Debanjum
21d19163ba Just pass user rather than whole request object to doc search func 2025-04-20 20:33:45 +05:30
Debanjum
b2390fa977 Allow attaching typescript files to chat on web app 2025-04-19 19:08:11 +05:30
Debanjum
8f9090940b Resolve datetime utcnow deprecation warnings (#1164)
# PR Summary
This small PR resolves the deprecation warnings on `datetime` in
Python3.12+. You can find them in the [CI
logs](https://github.com/khoj-ai/khoj/actions/runs/14538833837/job/40792624987#step:9:134):
```python
  /__w/khoj/khoj/src/khoj/processor/content/images/image_to_entries.py:61: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
    timestamp_now = datetime.utcnow().timestamp()
```
2025-04-19 18:26:52 +05:30
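A short sketch of the timezone-aware replacement the warning asks for. It uses `timezone.utc`, which also works on Python versions before 3.11, where the `datetime.UTC` alias is not available. Whether the PR used this exact spelling is an assumption.

```python
from datetime import datetime, timezone

# Timezone-aware replacement for the deprecated datetime.utcnow().
aware_now = datetime.now(timezone.utc)

# timestamp() on an aware datetime is unambiguous. On a naive datetime,
# such as the one utcnow() returns, timestamp() assumes local time.
timestamp_now = aware_now.timestamp()
```

Because `timestamp()` assumes local time for naive values, the old `datetime.utcnow().timestamp()` call could be off by the machine's UTC offset.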
Debanjum
ab29ffd799 Fix web app packaging for pypi since upgrade to python 3.11.12 in CI 2025-04-19 18:03:29 +05:30
Debanjum
79fc911633 Enable free tier users to switch between free tier AI models
- Update API to allow free tier users to switch between free models
- Update web app to allow model switching on agent creation, settings
  chat page (via right side pane), even for free tier users.

Previously the model switching APIs and UX fields on web app were
completely disabled for free tier users
2025-04-19 17:29:53 +05:30
Debanjum
30570e3e06 Track Price tier for each Chat, Speech, Image, Voice AI model in DB
Enables users on free plan to choose AI models marked for free tier
2025-04-19 09:44:33 +05:30
Emmanuel Ferdman
fee1d3682b Resolve datetime deprecation warnings
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-04-18 10:41:16 -07:00
Debanjum
eb1406bcb4 Support deepthought in research mode with new Gemini 2.5 reasoning model
The 2.5 flash model is the first hybrid reasoning model by Google

- Track costs of thoughts separately as they are priced differently
2025-04-18 17:37:05 +05:30
Debanjum
f95173bb0a Support deepthought in research mode with new Grok 3 reasoning model
Rely on deepthought flag to control reasoning effort of low/high for
the grok model

This is different from the openai reasoning models which support
low/medium/high and for which we use low/medium effort based on the
deepthought flag

Note: grok is accessible over an openai compatible API
2025-04-18 17:37:05 +05:30
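The mapping described in this commit message can be sketched as below. The function name and the `model_family` strings are hypothetical. Only the low/high choice for grok and the low/medium choice for openai reasoning models come from the message.

```python
def reasoning_effort(model_family: str, deepthought: bool) -> str:
    """Pick a reasoning effort level from the deepthought flag."""
    if model_family == "grok":
        # Grok exposes two levels: low and high.
        return "high" if deepthought else "low"
    if model_family == "openai":
        # Openai reasoning models support low/medium/high; only low and medium are used.
        return "medium" if deepthought else "low"
    raise ValueError(f"unknown reasoning model family: {model_family}")
```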
Debanjum
9c70a0f3f5 Support recently released Openai reasoning models
- Rely on deepthought flag to control reasoning effort
- Generalize Openai reasoning model check for all o- series models
2025-04-18 17:32:29 +05:30
Debanjum
2f8283935a Warn and drop empty messages when formatting messages for Anthropic
Log dropped empty messages to debug this unexpected state.

Related 0eb2d17
2025-04-18 17:32:29 +05:30
Debanjum
51e19c6199 Simplify KHOJ_DOMAIN states. All production deployments should set it.
Do not need KHOJ_DOMAIN to be tri-state.
KHOJ_DOMAIN set to empty does not change behavior anymore.

Related 5a3c7b1
2025-04-18 17:32:29 +05:30
Debanjum
e072530471 Deduplicate images generated using the e2b code tool
Disregard chart types, as rich chart rendering is not used
and they duplicate the chart images that are rendered

Disregard text output associated with generated image files
2025-04-18 17:32:29 +05:30
sabaimran
6a30da3e9e Fix default state for tools in the agent settings for the chat sidebar 2025-04-11 11:12:22 -07:00
Debanjum
2470eea421 Release Khoj version 1.40.0 2025-04-11 18:10:56 +05:30
Debanjum
d0a933b072 Add email based rate limiting to email login API endpoint
Server:
 - Rate limit based on unverified email before creating user
 - Check email address for deliverability before creating user
 - Track rate limit for unverified email in new non-user keyed table

Web app:
 - Show error in login popup to user on failure/throttling
 - Simplify login popup logic by moving magic link handling logic
   into EmailSigninContext instead of passing required props via parent
2025-04-11 17:49:18 +05:30
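A rough sketch of rate limiting keyed on an unverified email address. It uses an in-memory sliding window, whereas the commit tracks limits in a database table. The class name, limits and email normalization are assumptions made for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class EmailRateLimiter:
    """Allow at most `max_requests` per email within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, email: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        key = email.strip().lower()  # normalize so variants share one bucket
        hits = self._hits[key]
        # Drop requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # throttled; the caller can show an error to the user
        hits.append(now)
        return True
```

Keying on the email rather than on a user record means the check can run before any user is created, as the commit message describes.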
Debanjum
fe308c2911 Handle scenario where no valid otps for selected users on admin panel 2025-04-11 17:49:18 +05:30
Debanjum
2935ea52cf Set chatSidebar prompt, Setting name fields to empty str if value null
TextArea and Input field values cannot be null.
2025-04-10 19:59:01 +05:30
Debanjum
aea7b90fec Track if agent modified in chatSidebar to simplify code, fix looping
Previously the sidebar could recurse on opening chat page (from home?)
due to child modelSelector component updating parent chatSidebar prop
which was passed back down to it in a loop.

The chatSidebar decides if agent has been modified in a single
useEffect and enables the Save button accordingly.
- Track agent modification wrt agent info received from server in
  chatSidebar instead.
- Reduce modelSelector's mandate to just notify
  when the user changes the model.

- Fix to infer, show & update agent state from chat sidebar on web app
  This logic is fragile and convoluted because:
  - the default agent chat model is dynamically determined.
  - need to disambiguate tools not set vs none set vs all set by user
    The default agent's tool selection is stored as undefined to show
    not set scenario, which allows for all tools to be dynamically
    used by the agent.
    But the user can also set no tools or all tools for their agents.
    All 3 scenarios are handled differently.
  - Track tools to be displayed vs tools to be stored
2025-04-10 19:59:01 +05:30
Debanjum
e9ee9004fb Suppress spurious dark mode hydration warnings on the web app
This is triggered by mismatch between "dark" class present on server
sent layout but not in client sent layout on initial render.

That mismatch exists because the server applies dark-mode styling
early to avoid FOUC flickering of UX.

Related 43e032e
2025-04-10 19:59:01 +05:30
Debanjum
9ab5ead3ca Set key for chatMessage parent to get UX efficiently updated by react
This fixes the missing key prop error in ChatHistory on the web app.
2025-04-10 19:59:01 +05:30
Debanjum
1ad7314fe6 Let only root next.js layout handle html, body tags, not child layouts
Remove html, body elements from child page layouts. Let only the root
layout handle it.

Next.js router structure mounts child layouts inside parent layouts,
as defined by their directory hierarchy. So the html, body component
should only be defined in the parent layout.

This avoids the child layout mounting its html, body component within
the actual root layout's existing html, body component.
2025-04-10 19:59:01 +05:30