Set env vars to empty if the condition is not met, so that:
- Terrarium (not e2b) used as code sandbox on release triggered eval
- Internet turned off for math500 eval
- Anthropic expects temperature in a 0-1 range. Gemini & OpenAI expect a 0-2 range
- Anneal temperature to explore reasoning trajectories but respond factually
- Default send_message_to_model and extract_question temps to the same
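A minimal sketch of keeping temperature within each provider's range, using the bounds stated above. The provider keys and the `clamp_temperature` helper are hypothetical names for illustration, not the actual implementation.

```python
# Upper temperature bounds per provider, taken from the ranges noted above.
# The provider keys are illustrative names, not identifiers from the codebase.
MAX_TEMPERATURE = {"anthropic": 1.0, "google": 2.0, "openai": 2.0}


def clamp_temperature(provider: str, temperature: float) -> float:
    """Clamp a requested temperature into the range the provider accepts."""
    upper = MAX_TEMPERATURE[provider]
    return max(0.0, min(temperature, upper))
```

With this, a temperature of 1.5 stays at 1.5 for OpenAI and Gemini but is reduced to 1.0 for Anthropic.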
Enable configuring a Khoj AI model API for Vertex AI using GCP credentials.
Specifically use the api key & api base url fields of the AI Model API
associated with the current chat model to extract gcp region, gcp
project id & credentials. This helps create an AnthropicVertex client.
The api key field should contain the GCP service account keyfile as a
base64 encoded string.
The api base url field should be of the form
`https://{MODEL_GCP_REGION}-aiplatform.googleapis.com/v1/projects/{YOUR_GCP_PROJECT_ID}`
Accepting GCP credentials via the AI model API makes it easy to use
across local and cloud environments, as it bypasses the need for a
separate service account key file on the Khoj server.
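A rough sketch of how the two fields could be unpacked, assuming the url form and base64 encoded keyfile described above. The function names and the regex are illustrative assumptions. Handing the decoded keyfile to the google auth library and the AnthropicVertex client is left out.

```python
import base64
import json
import re

# Matches the api base url form described above and captures region and project id.
VERTEX_URL_PATTERN = re.compile(
    r"^https://(?P<region>[a-z0-9-]+)-aiplatform\.googleapis\.com"
    r"/v1/projects/(?P<project_id>[^/]+)"
)


def parse_vertex_api_base_url(api_base_url: str) -> tuple[str, str]:
    """Extract (gcp region, gcp project id) from the api base url field."""
    match = VERTEX_URL_PATTERN.match(api_base_url)
    if match is None:
        raise ValueError(f"Unexpected Vertex AI api base url: {api_base_url}")
    return match["region"], match["project_id"]


def decode_service_account_key(api_key: str) -> dict:
    """Decode the base64 encoded service account keyfile held in the api key field."""
    return json.loads(base64.b64decode(api_key))
```

For example, `parse_vertex_api_base_url("https://us-east5-aiplatform.googleapis.com/v1/projects/my-project")` yields the region `us-east5` and the project id `my-project`.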
- The 3.4.1 release of sentence transformers fixes offline load latency
of sentence transformer models (and Khoj) by avoiding a call to HF
- The 4.50.0 release of transformers results in a
jax error (unexpected keyword argument 'flatten_with_keys') on load.
Previously google auth library was explicitly installed only for the
cloud variant of Khoj to minimize packages installed for non
production use-cases.
But it was being implicitly installed as a dependency of an explicit
package in the default installation anyway.
Making the dependency on google auth package explicit simplifies
the conditional import of google auth in code while not incurring any
additional cost in terms of space or complexity.
Khoj now reaches >94% on SimpleQA in research mode. The eval becomes
too easy when answers can be researched online, and the FRAMES eval
does a more thorough job of evaluating that use-case anyway.
- Fix regression: Inline images were not getting passed to the AI
models since #992
- Format inline images passed to Gemini models correctly
- Format inline images passed to Anthropic models correctly
Verified vision working with inline and url images for OpenAI,
Anthropic and Gemini models.
Resolves #1112
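For illustration, a sketch of the inline image content shapes that the OpenAI and Anthropic chat APIs accept for base64 encoded images. The helper names and the default media type are assumptions, and the Gemini formatting is omitted here.

```python
def to_openai_image_part(b64_data: str, media_type: str = "image/webp") -> dict:
    """Wrap a base64 encoded image as an OpenAI style image_url content part."""
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{b64_data}"},
    }


def to_anthropic_image_part(b64_data: str, media_type: str = "image/webp") -> dict:
    """Wrap a base64 encoded image as an Anthropic style image content block."""
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": b64_data},
    }
```

The same base64 payload is carried in both cases. Only the envelope around it differs per provider.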
Previously on slow connections you'd see the agent dropdown flicker
from undefined to the Khoj default agent on phones and other thin screens.
This is unnecessary and jarring. Populate with the default agent to remove
this issue.
Previously the chat input area didn't allow inputting text while Khoj was
researching and generating a response.
This change allows the user to type their next message while Khoj
responds. This should speed up interaction cycles as the user can have
their next query ready to send when Khoj finishes its response.
- Trigger
Gemini 2.0 Flash doesn't always follow JSON schema in research prompt
- Details
- Use json schema to enforce generate online queries format
- Use json schema to enforce research mode tool pick format
- Support constraining Gemini model output to specified response schema
- Support constraining OpenAI model output to specified response schema
- Only enforce json output in supported AI model APIs
- Simplify OpenAI reasoning model specific arguments to OpenAI API
Previously OpenAI reasoning models didn't support stream_options and
response_format
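A sketch of how schema enforcement could be attached only for APIs that support it. The schema contents, the provider names and the `build_response_format_kwargs` helper are hypothetical. The OpenAI `response_format` shape and the Gemini `response_mime_type` / `response_schema` fields follow those APIs' structured output options as I understand them.

```python
# Hypothetical schema for the research mode tool pick. Field names are illustrative.
TOOL_PICK_SCHEMA = {
    "type": "object",
    "properties": {
        "scratchpad": {"type": "string"},
        "tool": {"type": "string", "enum": ["online", "notes", "code"]},
        "query": {"type": "string"},
    },
    "required": ["scratchpad", "tool", "query"],
    "additionalProperties": False,
}

# Assumed set of providers whose API can constrain output to a schema.
SCHEMA_CAPABLE_PROVIDERS = {"openai", "google"}


def build_response_format_kwargs(provider: str, schema: dict) -> dict:
    """Return schema enforcement arguments, or nothing for unsupported APIs."""
    if provider not in SCHEMA_CAPABLE_PROVIDERS:
        return {}
    if provider == "openai":
        return {
            "response_format": {
                "type": "json_schema",
                "json_schema": {"name": "tool_pick", "schema": schema, "strict": True},
            }
        }
    # Gemini takes the schema as part of its generation config.
    return {"response_mime_type": "application/json", "response_schema": schema}
```

Returning an empty dict for unsupported providers lets callers spread the result into the request without branching at every call site.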
Add reasoning_effort arg for calls to OpenAI reasoning models via API.
Right now it defaults to medium but can be changed to low or high
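A small sketch of the default described above. The `reasoning_model_kwargs` helper and the name-prefix check used to detect reasoning models are assumptions for illustration.

```python
VALID_REASONING_EFFORTS = ("low", "medium", "high")


def reasoning_model_kwargs(model_name: str, reasoning_effort: str = "medium") -> dict:
    """Build extra API arguments for OpenAI reasoning models."""
    # Assumption: reasoning models are recognised by an "o1" or "o3" name prefix.
    if not model_name.startswith(("o1", "o3")):
        return {}
    if reasoning_effort not in VALID_REASONING_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {VALID_REASONING_EFFORTS}")
    return {"reasoning_effort": reasoning_effort}
```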
Previously, the text output of E2B code execution was being encoded as b64.
This was breaking
- The AI model's ability to see the content of the file
- Downloading the output text file with appropriately encoded content
Issue created when adding E2B code sandbox in #1120
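A sketch of the intended behavior: keep text output as plain text and base64 encode only binary output. The extension list, the field names and the `encode_output_file` helper are hypothetical.

```python
import base64

# Assumed set of text-like output extensions. Adjust to what the sandbox produces.
TEXT_EXTENSIONS = (".txt", ".csv", ".json", ".md")


def encode_output_file(filename: str, content: bytes) -> dict:
    """Return text output as plain text and binary output as base64."""
    if filename.endswith(TEXT_EXTENSIONS):
        # Text stays readable for the AI model and downloads with correct content.
        return {"filename": filename, "content": content.decode("utf-8"), "encoding": "utf-8"}
    encoded = base64.b64encode(content).decode("ascii")
    return {"filename": filename, "content": encoded, "encoding": "base64"}
```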
* Implement better bug issue template
* Fix IDs in new bug issue template
* Reduce, reorder and improve field descriptions in the bug issue template
---------
Co-authored-by: Debanjum <debanjum@gmail.com>
Claude 3.7 Sonnet is Anthropic's first reasoning model. It provides a
single model/api capable of standard and extended thinking. Utilize
the extended thinking in Khoj's research mode.
Increase default max output tokens to 8K for Anthropic models.
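A sketch of the request arguments this implies, assuming the Anthropic Messages API `thinking` parameter. The helper name, the `deepthought` flag, the 8000 token reading of "8K" and the thinking budget value are assumptions.

```python
def anthropic_request_kwargs(
    deepthought: bool, max_tokens: int = 8000, thinking_budget: int = 4000
) -> dict:
    """Build output token and extended thinking arguments for an Anthropic request."""
    kwargs: dict = {"max_tokens": max_tokens}
    if deepthought:
        # The thinking budget has to fit within the overall output token limit.
        if thinking_budget >= max_tokens:
            raise ValueError("thinking_budget must be smaller than max_tokens")
        kwargs["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
    return kwargs
```

Standard chat would call this with `deepthought=False`, while research mode would pass `deepthought=True` to turn on extended thinking.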
# Improve Code Tool, Sandbox
- Improve code gen chat actor to output code in inline md code blocks
- Stop code sandbox on request timeout to allow sandbox process restarts
- Use tenacity retry decorator to retry executing code in sandbox
- Add retry logic to code execution and add health check to sandbox container
- Add E2B as an optional code sandbox provider
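The retry behavior above can be sketched with a small stdlib decorator. This is a stand-in for the tenacity decorator, not the tenacity API itself, and the parameter names are illustrative. With tenacity, the equivalent would combine `stop_after_attempt`, `wait_fixed` and `retry_if_exception_type`.

```python
import functools
import time


def retry(attempts: int = 3, wait_seconds: float = 0.0, retry_on: tuple = (Exception,)):
    """Retry the wrapped function on the given exceptions, re-raising on the last attempt."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == attempts:
                        raise
                    # Give the sandbox process time to restart before retrying.
                    time.sleep(wait_seconds)

        return wrapper

    return decorator
```

A code execution call decorated with `@retry(attempts=3, retry_on=(TimeoutError,))` would then survive two sandbox timeouts before giving up on the third.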
# Improve Gemini Chat Models
- Default to non-zero temperature for all queries to Gemini models
- Default to Gemini 2.0 flash instead of 1.5 flash on setup
- Set default chat model to KHOJ_CHAT_MODEL env var if set
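A minimal sketch of the default model selection described in the last two items. The function name is hypothetical and the fallback model string is an assumed identifier for Gemini 2.0 flash.

```python
import os

# Assumed identifier for the Gemini 2.0 flash default.
FALLBACK_CHAT_MODEL = "gemini-2.0-flash"


def default_chat_model_name() -> str:
    """Prefer the KHOJ_CHAT_MODEL env var when set, else use the fallback model."""
    return os.getenv("KHOJ_CHAT_MODEL") or FALLBACK_CHAT_MODEL
```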
Simplify code gen chat actor to improve correct code gen success,
especially for smaller models & models with limited json mode support
Allow specifying code blocks inline with reasoning to try to improve
code quality
Infer input files based on user file paths referenced in code.
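A sketch of both ideas, assuming the model returns code in markdown code blocks alongside its reasoning. The helper names and the simple substring match for file paths are illustrative assumptions.

```python
import re

FENCE = "`" * 3  # Built indirectly so this example nests cleanly inside markdown
CODE_BLOCK = re.compile(FENCE + r"(?:python)?\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response: str) -> str:
    """Join all markdown code blocks found in the model response into one program."""
    return "\n".join(block.strip() for block in CODE_BLOCK.findall(response))


def infer_input_files(code: str, user_files: list[str]) -> list[str]:
    """Pick the user files whose paths are referenced in the generated code."""
    return [path for path in user_files if path in code]
```

This avoids asking the model for a separate JSON list of input files, which is the part that smaller models tend to get wrong.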