
klbr/khoj

mirror of https://github.com/khoaliber/khoj.git synced 2026-03-02 13:18:18 +00:00

Author	SHA1	Message	Date
Debanjum	16ffebf765	Document how to configure using AI models via GCP Vertex AI	2025-03-23 16:12:46 +05:30
Debanjum	7153d27528	Cache Google AI API client for reuse	2025-03-23 16:12:46 +05:30
Debanjum	da33c7d83c	Support access to Gemini models via GCP Vertex AI	2025-03-23 16:12:46 +05:30
Debanjum	603c4bf2df	Support access to Anthropic models via GCP Vertex AI. Enable configuring a Khoj AI model API for Vertex AI using GCP credentials. Specifically, use the api key & api base url fields of the AI Model API associated with the current chat model to extract the gcp region, gcp project id & credentials. This helps create an AnthropicVertex client. The api key field should contain the GCP service account keyfile as a base64 encoded string. The api base url field should be of the form `https://{MODEL_GCP_REGION}-aiplatform.googleapis.com/v1/projects/{YOUR_GCP_PROJECT_ID}` Accepting GCP credentials via the AI model API makes it easy to use across local and cloud environments, as it bypasses the need for a separate service account key file on the Khoj server.	2025-03-23 16:12:46 +05:30
Debanjum	8bebcd5f81	Support longer API key field in DB to store GCP service account keyfile	2025-03-23 14:55:50 +05:30
Debanjum	f2b438145f	Upgrade sentence-transformers. Avoid transformers v4.50.0 as problematic - The 3.4.1 release of sentence-transformers fixes offline load latency of sentence transformer models (and Khoj) by avoiding a call to HF - The 4.50.0 release of transformers is resulting in a jax error (unexpected keyword argument 'flatten_with_keys') on load.	2025-03-23 09:02:57 +05:30
Debanjum	510cbed61c	Make google auth package dependency explicit to simplify code Previously google auth library was explicitly installed only for the cloud variant of Khoj to minimize packages installed for non production use-cases. But it was being implicitly installed as a dependency of an explicit package in the default installation anyway. Making the dependency on google auth package explicit simplifies the conditional import of google auth in code while not incurring any additional cost in terms of space or complexity.	2025-03-23 09:02:57 +05:30
Debanjum	5fff05add3	Set seed for Google Gemini models using KHOJ_LLM_SEED env variable This env var was already being used to set seed for OpenAI and Offline models	2025-03-22 08:59:31 +05:30
Debanjum	6cc5a10b09	Disable SimpleQA eval on release as saturated & low signal for usecase Reaching >94% in research mode on SimpleQA. When answers can be researched online, it becomes too easy. And the FRAMES eval does a more thorough job of evaluating that use-case anyway.	2025-03-22 08:05:12 +05:30
Debanjum	45015dae27	Limit to json enforcement via json object with DeepInfra hosted models DeepInfra based models do not seem to support json schema. See https://deepinfra.com/docs/advanced/json_mode for reference	2025-03-22 08:04:09 +05:30
Debanjum	dc473015fe	Set default model, sandbox to display in eval workflow summary on release	2025-03-20 14:44:56 +05:30
Debanjum	80d864ada7	Release Khoj version 1.37.0	2025-03-20 14:06:57 +05:30
Debanjum	0c53106b30	Fix passing inline images to vision models - Fix regression: Inline images were not getting passed to the AI models since #992 - Format inline images passed to Gemini models correctly - Format inline images passed to Anthropic models correctly Verified vision working with inline and url images for OpenAI, Anthropic and Gemini models. Resolves #1112	2025-03-20 13:22:46 +05:30
Debanjum	1ce1d2f5ab	Deduplicate, clean code for S3 images uploads	2025-03-20 12:30:07 +05:30
Debanjum	f15a95dccf	Show Khoj agent in agent dropdown by default on mobile in web app home Previously on slow connection you'd see the agent dropdown flicker from undefined to Khoj default agent on phones and other thin screens. This is unnecessary and jarring. Populate with default agent to remove this issue	2025-03-20 12:27:52 +05:30
Debanjum	9a0b126f12	Allow chat input on web app while Khoj responds to speed interactions Previously the chat input area didn't allow inputting text while Khoj is researching and generating response. This change allows the user to add their next text while Khoj responds. This should speed up interaction cycles as user can have their next query ready to send when Khoj finishes its response.	2025-03-19 23:08:22 +05:30
Debanjum	e68428dd24	Support enforcing json schema in supported AI model APIs (#1133) - Trigger: Gemini 2.0 Flash doesn't always follow JSON schema in research prompt - Details: - Use json schema to enforce generate online queries format - Use json schema to enforce research mode tool pick format - Support constraining Gemini model output to specified response schema - Support constraining OpenAI model output to specified response schema - Only enforce json output in supported AI model APIs - Simplify OpenAI reasoning model specific arguments to OpenAI API	2025-03-19 22:59:23 +05:30
Debanjum	a5627ef787	Use json schema to enforce generate online queries format	2025-03-19 22:32:53 +05:30
Debanjum	2c53eb9de1	Use json schema to enforce research mode tool pick format	2025-03-19 22:32:53 +05:30
Debanjum	6980014838	Support constraining Gemini model output to specified response schema If the response_schema argument is passed to send_message_to_model_wrapper it is used to constrain output by Gemini models	2025-03-19 22:32:53 +05:30
Debanjum	ac4b36b9fd	Support constraining OpenAI model output to specified response schema	2025-03-19 22:32:52 +05:30
Debanjum	4a4d225455	Only enforce json output in supported AI model APIs Deepseek reasoner does not support json object or schema via deepseek API Azure Ai API does not support json schema Resolves #1126	2025-03-19 22:32:11 +05:30
Debanjum	d74c3a1db4	Simplify OpenAI reasoning model specific arguments to OpenAI API Previously OpenAI reasoning models didn't support stream_options and response_format Add reasoning_effort arg for calls to OpenAI reasoning models via API. Right now it defaults to medium but can be changed to low or high	2025-03-19 21:12:02 +05:30
Debanjum	9b6d626a09	Fix to store e2b code execution text output file content as string Previously was encoding E2B code execution text output content as b64. This was breaking - The AI model's ability to see the content of the file - Downloading the output text file with appropriately encoded content Issue created when adding E2B code sandbox in #1120	2025-03-19 20:09:41 +05:30
Artem Yurchenko	a7e261a191	Implement better bug issue template (#1129) * Implement better bug issue template * Fix IDs in new bug issue template * Reduce, reorder and improve field descriptions in the bug issue template --------- Co-authored-by: Debanjum <debanjum@gmail.com>	2025-03-18 20:53:57 +05:30
Debanjum	931f555cf8	Configure max allowed iterations in research mode via env var	2025-03-18 18:15:50 +05:30
Debanjum	2ab8e711d3	Fix Gemini models to output valid json when configured	2025-03-18 17:02:45 +05:30
sabaimran	ce60cb9779	Remove max-w 80vw, which was smushing AI responses	2025-03-13 13:44:05 -07:00
sabaimran	a3c4347c11	Add a one-click action to export all conversations. Add a self-service delete account action to the settings page	2025-03-12 23:54:02 -07:00
Debanjum	79816d2b9b	Upgrade package dependencies of server, clients and docs	2025-03-12 00:22:08 +05:30
Debanjum	7bb6facdea	Add support for Google Imagen AI models for image generation Use the new Google GenAI client to generate images with Imagen	2025-03-11 23:39:46 +05:30
Debanjum	bd06fcd9be	Stop using old google generativeai package to raise, catch exceptions	2025-03-11 23:39:46 +05:30
Debanjum	bdfa6400ef	Upgrade to new Gemini package to interface with Google AI	2025-03-11 22:18:07 +05:30
Debanjum	2790ba3121	Update default temperature for calls to Gemini models to 0.6 from 0.2 This aligns with default temperature used by google ai studio and may reduce loops and repetitions	2025-03-11 21:28:04 +05:30
Debanjum	50f71be03d	Support Claude 3.7 and use its extended thinking in research mode Claude 3.7 Sonnet is Anthropic's first reasoning model. It provides a single model/api capable of standard and extended thinking. Utilize the extended thinking in Khoj's research mode. Increase default max output tokens to 8K for Anthropic models.	2025-03-11 21:27:59 +05:30
Debanjum	69048a859f	Fix E2B tool description prompt to mention plotly package available	2025-03-11 02:20:06 +05:30
Debanjum	9751adb1a2	Improve Code Tool, Sandbox and Eval (#1120) # Improve Code Tool, Sandbox - Improve code gen chat actor to output code in inline md code blocks - Stop code sandbox on request timeout to allow sandbox process restarts - Use tenacity retry decorator to retry executing code in sandbox - Add retry logic to code execution and add health check to sandbox container - Add E2B as an optional code sandbox provider # Improve Gemini Chat Models - Default to non-zero temperature for all queries to Gemini models - Default to Gemini 2.0 flash instead of 1.5 flash on setup - Set default chat model to KHOJ_CHAT_MODEL env var if set	2025-03-09 18:49:59 +05:30
Debanjum	c133d11556	Improvements based on code feedback	2025-03-09 18:23:30 +05:30
Debanjum	94ca458639	Set default chat model to KHOJ_CHAT_MODEL env var if set Simplify code log to set default_use_model during init for readability	2025-03-09 18:23:30 +05:30
Debanjum	7b2d0fdddc	Improve code gen chat actor to output code in inline md code blocks Simplify code gen chat actor to improve correct code gen success, especially for smaller models & models with limited json mode support. Allow specifying code blocks inline with reasoning to try to improve code quality. Infer input files based on user file paths referenced in code.	2025-03-09 18:23:30 +05:30
Debanjum	8305fddb14	Default to non-zero temperature for all queries to Gemini models. It may mitigate the intermittent invalid json output issues. The model may be going into repetition loops; a non-zero temperature may avoid that.	2025-03-09 18:23:30 +05:30
Debanjum	45fb85f1df	Add E2B as an optional code sandbox provider - Specify E2B api key and template to use via env variables - Try load, use e2b library when E2B api key set - Fallback to try use terrarium sandbox otherwise - Enable more python packages in e2b sandbox like rdkit via custom e2b template - Use Async E2B Sandbox - Parallelize file IO with sandbox - Add documentation on how to enable E2B as code sandbox instead of Terrarium	2025-03-09 18:23:30 +05:30
Debanjum	b4183c7333	Default to gemini 2.0 flash instead of 1.5 flash on Gemini setup Add price of gemini 2.0 flash for cost calculations	2025-03-07 13:48:15 +05:30
Debanjum	701a7be291	Stop code sandbox on request timeout to allow sandbox process restarts	2025-03-07 13:48:15 +05:30
Debanjum	ecc2f79571	Use tenacity retry decorator to retry executing code in sandbox	2025-03-07 13:48:15 +05:30
sabaimran	4a28714a08	Add retry logic to code execution and add health check to sandbox container	2025-03-07 13:48:15 +05:30
Debanjum	f13bdc5135	Log eval run progress percentage for orientation	2025-03-07 13:48:15 +05:30
Debanjum	bbe1b63361	Improve Obsidian Sync for Large Vaults (#1078) - Batch sync files by size to try not to exceed API request payload size limits - Fix force sync of large vaults from Obsidian - Add API endpoint to delete all indexed files by file type - Fix to also delete file objects when calling the DELETE content source API	2025-03-07 13:47:21 +05:30
Debanjum	043de068ff	Fix force sync of large vaults from Obsidian Previously if you tried to force sync a vault with more than 1000 files it would only end up keeping the last batch because the PUT API call would delete all previous entries. This change calls DELETE for all previously indexed data first, followed by a PATCH to index current vault on a force sync (regenerate) request. This ensures that files from previous batches are not deleted.	2025-03-07 13:34:48 +05:30
Debanjum	86fa528a73	Add API endpoint to delete all indexed files by file type	2025-03-07 13:28:53 +05:30
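
Commit 603c4bf2df describes passing GCP credentials through the AI Model API fields: the api key holds the base64 encoded service account keyfile, and the api base url encodes the region and project id. The following is a minimal sketch of that extraction step, assuming the URL form quoted in the commit message. The names `parse_vertex_config`, `VertexConfig` and `VERTEX_URL_PATTERN` are hypothetical and are not taken from the Khoj codebase; the actual implementation may differ.

```python
import base64
import json
import re
from typing import NamedTuple

# Hypothetical pattern for the URL form quoted in the commit message:
# https://{MODEL_GCP_REGION}-aiplatform.googleapis.com/v1/projects/{YOUR_GCP_PROJECT_ID}
VERTEX_URL_PATTERN = re.compile(
    r"^https://(?P<region>[a-z0-9-]+)-aiplatform\.googleapis\.com"
    r"/v1/projects/(?P<project_id>[^/]+)/?$"
)


class VertexConfig(NamedTuple):
    """Illustrative container for the values extracted from the two fields."""

    region: str
    project_id: str
    service_account_info: dict


def parse_vertex_config(api_base_url: str, api_key: str) -> VertexConfig:
    """Extract region, project id and keyfile contents (hypothetical helper)."""
    match = VERTEX_URL_PATTERN.match(api_base_url.strip())
    if match is None:
        raise ValueError(f"Not a Vertex AI base url: {api_base_url!r}")
    # The api key field is assumed to hold the JSON keyfile, base64 encoded.
    keyfile_json = base64.b64decode(api_key).decode("utf-8")
    return VertexConfig(
        region=match["region"],
        project_id=match["project_id"],
        service_account_info=json.loads(keyfile_json),
    )
```

The decoded `service_account_info` dictionary is the kind of value a GCP auth library would take to build credentials. That step is left out here because it needs a real keyfile.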


4527 Commits