Commit Graph

3659 Commits

Author SHA1 Message Date
Debanjum
0eb2d17771 Warn and drop invalid messages when format messages for gemini
Previously we were setting message content part with empty text. This
results in error from Gemini API. Warn and drop such messages instead.

Log empty message content found during construction to root-cause the
issue but allow Khoj to respond without the offending messages in
context for call to Gemini API.
2025-04-03 01:11:22 +05:30
sabaimran
d53b740197 Improve online search and allow server to skip auto webpage read 2025-03-31 13:52:48 -05:00
Debanjum
d62dd4ef61 Support Firecrawl as a online search provider 2025-03-31 17:06:00 +05:30
Debanjum
3939e995e4 Fallback to enabled, lower priority online search providers on error
Make serper.dev higher priority than official google serp api because
it provides more detailed results with knowledge cards etc.
2025-03-31 17:05:44 +05:30
Debanjum
9b7442f28f Truncate online query to Serper if query exceeds max supported length
Previously query to serper with longer than max supported would throw
error instead of returning at least some results.

Truncating the onlien search query to serper to max supported length
mitigates that issue.
2025-03-31 17:01:42 +05:30
Debanjum
db7eba56f6 Fix webpage read and improve web search with Jina
- Improve webpage read to include image alt text
- Improve Jina webpage search to not include each page content
- Use POST instead of GET for web search, webpage read with Jina
2025-03-31 17:01:42 +05:30
Debanjum
db68372b81 Update code sandbox prompts to allow network access when using E2B
Tell Khoj code writing chat actor that it has access to the network
and can use the python requests library in the E2B code sandbox.
2025-03-31 15:33:47 +05:30
Debanjum
5b8c2989d6 Add hover text on button to unshare a conversation on web app 2025-03-31 15:32:43 +05:30
Debanjum
713ba06a8d Release Khoj version 1.38.0 2025-03-29 18:30:06 +05:30
Debanjum
e9132d4fee Support attaching programming language file types to web app for chat 2025-03-29 01:22:35 +05:30
Debanjum
5ee513707e Use embedded postgres db to simplify self-hosted setup (#1141)
Use pgserver python package as an embedded postgres db,
installed directly as a khoj python package dependency.

This significantly simplifies self-hosting with just a `pip install khoj'. 
No need to also install postgres separately.

Still use standard postgres server for multi-user, production use-cases.
2025-03-29 00:03:55 +05:30
Debanjum
56b63f95ea Suggest Google image gen model, new Anthropic chat models on first run
- Update default anthropic chat models to latest good models.

- Now that Google supports a good text to image model. Suggest adding
  that if Google AI API is setup on first run.
2025-03-28 23:07:17 +05:30
Debanjum
72986c905a Fix default agent creation to allow chat on first run
Previously agent slug was not considered on create even when passed
explicitly in agent creation step.

This made the default agent slug different until next run when it was
updated after creation. And didn't allow chat to work on first run

The fix to use the agent slug when explicitly passed allows users to
chat on first run.
2025-03-28 22:49:00 +05:30
Debanjum
03de2803f0 Fallback to default agent for chat when unset in get conversation API 2025-03-28 00:56:18 +05:30
Debanjum
a387f638cd Enforce json schema on more chat actors to improve schema compliance
Including infer webpage urls, gemini documents search, pick default
mode tools chat actors
2025-03-28 00:56:18 +05:30
Debanjum
ccd9de7792 Improve safety settings for Gemini chat models
- Align remaining harm categories to only refuse in high harm
  scenarios as well
- Handle response for new "negligible" harm probability as well
2025-03-28 00:56:18 +05:30
Debanjum
2ec5cf3ae7 Normalize type of chat messages arg sent to Anthropic completion funcs
Previously messages got Anthropic specific formatting done before
being passed to Anthropic (chat) completion functions.

Move the code to format messages of type list[ChatMessage] into Anthropic
specific format down to the Anthropic (chat) completion functions.

This allows the rest of the functionality like prompt tracing to work
with normalized list[ChatMesssage] type of chat messages across AI API
providers
2025-03-26 18:24:17 +05:30
Debanjum
4085c9b991 Fix infer webpage url step actor to request upto specified max urls
Previously we'd always request up to 3 webpage url via the prompt but
read only one of the requested webpage url.

This would degrade quality of research and default mode. As model may
request reading upto 3 webpage links but get only one of the requested
webpages read.

This change passes the number of webpages to read down to the AI model
dynamically via the updated prompt. So number of webpages requested to
be read should mostly be same as number of webpages actually read.

Note: For now, the max webpages to read is kept same as before at 1.
2025-03-26 18:24:17 +05:30
Debanjum
c337c53452 Fix to use agent chat model for research model planning
Previously the research mode planner ignored the current agent or
conversation specific chat model the user was chatting with. Only the
server chat settings, user default chat model, first created chat model
were considered to decide the planner chat model.

This change considers the agent chat model to be used for the planner
as well. The actual chat model picked is decided by the existing
prioritization of server > agent > user > first chat model.
2025-03-25 18:31:55 +05:30
Debanjum
9dfa7757c5 Unshare public conversations from the title pane on web app
Only show the unshare button on public conversations created by the
currently logged in user. Otherwise hide the button

Set conversation.isOwner = true only if currently logged in user
shared the current conversation.

This isOwner information is passed by the get shared conversation API
endpoint
2025-03-25 14:05:29 +05:30
Debanjum
d9c758bcd2 Create API endpoint to unshare a public conversation
Pass isOwner field from the get shared conversation API endpoint if
the currently authenticated user created the requested public
conversation
2025-03-25 14:05:29 +05:30
Debanjum
e3f6d241dd Normalize chat messages sent to gemini funcs to work with prompt tracer
Previously messages passed to gemini (chat) completion functions
got a little of Gemini specific formatting mixed in.

These functions expect a message of type list[ChatMessage] to work
with prompt tracer etc.

Move the code to format messages of type list[ChatMessage] into gemini
specific format down to the gemini (chat) completion functions.

This allows the rest of the functionality like prompt tracing to work
with normalize list[ChatMesssage] type of chat messages across
providers
2025-03-25 14:04:16 +05:30
Debanjum
7976aa30f8 Terminate research if query or tool is empty 2025-03-25 14:04:16 +05:30
Debanjum
39aa48738f Set effort for openai reasoning models to pick tool in research mode
This is analogous to how we enable extended thinking for claude models
in research mode.

Default to medium effort irrespective of deepthought for openai
reasoning models as high effort is currently flaky with regular
timeouts and low effort isn't great.
2025-03-25 14:04:16 +05:30
Debanjum
b4929905b2 Add costs of ai prompt cache read, write. Use for calls to Anthropic 2025-03-25 14:04:16 +05:30
sabaimran
a8285deed7 Release Khoj version 1.37.2 2025-03-23 11:38:25 -07:00
sabaimran
12e7409da9 Release Khoj version 1.37.1 2025-03-23 11:10:34 -07:00
Debanjum
d1df9586ca Standardize AI model response temperature across provider specific ranges
- Anthropic expects a 0-1 range. Gemini & OpenAI expect a 0-2 range
- Anneal temperature to explore reasoning trajectories but respond factually
- Default send_message_to_model and extract_question temps to the same
2025-03-23 18:09:22 +05:30
Debanjum
55ae0eda7a Upgrade package dependencies nextjs for web app and torch on server 2025-03-23 17:10:40 +05:30
Debanjum
7153d27528 Cache Google AI API client for reuse 2025-03-23 16:12:46 +05:30
Debanjum
da33c7d83c Support access to Gemini models via GCP Vertex AI 2025-03-23 16:12:46 +05:30
Debanjum
603c4bf2df Support access to Anthropic models via GCP Vertex AI
Enable configuring a Khoj AI model API for Vertex AI using GCP credentials.

Specifically use the api key & api base url fields of the AI Model API
associated with the current chat model to extract gcp region, gcp
project id & credentials. This helps create a AnthropicVertex client.

The api key field should contain the GCP service account keyfile as a
base64 encoded string.

The api base url field should be of the form
`https://{MODEL_GCP_REGION}-aiplatform.googleapis.com/v1/projects/{YOUR_GCP_PROJECT_ID}`

Accepting GCP credentials via the AI model API makes it easy to use
across local and cloud environments. As it bypasses the need for a
separate service account key file on the Khoj server.
2025-03-23 16:12:46 +05:30
Debanjum
8bebcd5f81 Support longer API key field in DB to store GCP service account keyfile 2025-03-23 14:55:50 +05:30
Debanjum
510cbed61c Make google auth package dependency explicit to simplify code
Previously google auth library was explicitly installed only for the
cloud variant of Khoj to minimize packages installed for non
production use-cases.

But it was being implicitly installed as a dependency of an explicit
package in the default installation anyway.

Making the dependency on google auth package explicit simplifies
the conditional import of google auth in code while not incurring any
additional cost in terms of space or complexity.
2025-03-23 09:02:57 +05:30
Debanjum
5fff05add3 Set seed for Google Gemini models using KHOJ_LLM_SEED env variable
This env var was already being used to set seed for OpenAI and Offline
models
2025-03-22 08:59:31 +05:30
Debanjum
45015dae27 Limit to json enforcement via json object with DeepInfra hosted models
DeepInfra based models do not seem to support json schema. See
https://deepinfra.com/docs/advanced/json_mode for reference
2025-03-22 08:04:09 +05:30
Debanjum
80d864ada7 Release Khoj version 1.37.0 2025-03-20 14:06:57 +05:30
Debanjum
0c53106b30 Fix passing inline images to vision models
- Fix regression: Inline images were not getting passed to the AI
  models since #992
- Format inline images passed to Gemini models correctly
- Format inline images passed to Anthropic models correctly

Verified vision working with inline and url images for OpenAI,
Anthropic and Gemini models.

Resolves #1112
2025-03-20 13:22:46 +05:30
Debanjum
1ce1d2f5ab Deduplicate, clean code for S3 images uploads 2025-03-20 12:30:07 +05:30
Debanjum
f15a95dccf Show Khoj agent in agent dropdown by default on mobile in web app home
Previously on slow connection you'd see the agent dropdown flicker
from undefined to Khoj default agent on phones and other thin screens.

This is unnecessary and jarring. Populate with default agent to remove
this issue
2025-03-20 12:27:52 +05:30
Debanjum
9a0b126f12 Allow chat input on web app while Khoj responds to speed interactions
Previously the chat input area didn't allow inputting text while Khoj is
researching and generating response.

This change allows the user to add their next text while Khoj
responds. This should speed up interaction cycles as user can have
their next query ready to send when Khoj finishes its response.
2025-03-19 23:08:22 +05:30
Debanjum
a5627ef787 Use json schema to enforce generate online queries format 2025-03-19 22:32:53 +05:30
Debanjum
2c53eb9de1 Use json schema to enforce research mode tool pick format 2025-03-19 22:32:53 +05:30
Debanjum
6980014838 Support constraining Gemini model output to specified response schema
If the response_schema argument is passed to
send_message_to_model_wrapper it is used to constrain output by Gemini
models
2025-03-19 22:32:53 +05:30
Debanjum
ac4b36b9fd Support constraining OpenAI model output to specified response schema 2025-03-19 22:32:52 +05:30
Debanjum
4a4d225455 Only enforce json output in supported AI model APIs
Deepseek reasoner does not support json object or schema via deepseek API
Azure Ai API does not support json schema

Resolves #1126
2025-03-19 22:32:11 +05:30
Debanjum
d74c3a1db4 Simplify OpenAI reasoning model specific arguments to OpenAI API
Previously OpenAI reasoning models didn't support stream_options and
response_format

Add reasoning_effort arg for calls to OpenAI reasoning models via API.
Right now it defaults to medium but can be changed to low or high
2025-03-19 21:12:02 +05:30
Debanjum
9b6d626a09 Fix to store e2b code execution text output file content as string
Previously was encoding E2B code execution text output content as b64.

This was breaking
- The AI model's ability to see the content of the file
- Downloading the output text file with appropriately encoded content

Issue created when adding E2B code sandbox in #1120
2025-03-19 20:09:41 +05:30
Debanjum
931f555cf8 Configure max allowed iterations in research mode via env var 2025-03-18 18:15:50 +05:30
Debanjum
2ab8e711d3 Fix Gemini models to output valid json when configured 2025-03-18 17:02:45 +05:30