klbr/khoj

mirror of https://github.com/khoaliber/khoj.git synced 2026-03-04 21:29:12 +00:00

Author	SHA1	Message	Date
Debanjum	b660c494bc	Use recognizable DB model names to ease selection UX on Admin Panel Previously id were used (by default) for model display strings. This made it hard to select chat model options, server chat settings etc. in the admin panel dropdowns. This change uses more recognizable names for the DB objects to ease selection in dropdowns and display in general on the admin panel.	2024-12-08 20:34:50 -08:00
Debanjum	9dd3782f5c	Rename OpenAIProcessorConversationConfig DB model to more apt AiModelApi (#998 ) * Rename OpenAIProcessorConversationConfig to more apt AiModelAPI The DB model name had drifted from what it is being used for, a general chat api provider that supports other chat api providers like anthropic and google chat models apart from openai based chat models. This change renames the DB model and updates the docs to remove this confusion. Using Ai Model Api we catch most use-cases including chat, stt, image generation etc.	2024-12-08 18:02:29 -08:00
sabaimran	a2251f01eb	Make result optional for code context, relevant when code execution was unsuccessful	2024-12-08 13:27:33 -08:00
sabaimran	7cd2855146	Make attributes optional in the knowledge graph model	2024-12-08 12:23:17 -08:00
sabaimran	2af687d1c5	Allow snippetHighlighted to also be nullable	2024-12-08 11:51:24 -08:00
sabaimran	efa23a8ad8	Update validation requirements for online searches	2024-12-08 11:30:17 -08:00
sabaimran	3552032827	Rename additional context to additional_context_for_llm_response	2024-12-03 21:23:15 -08:00
sabaimran	991577aa17	Allow a None turnId to accommodate historic chat messages	2024-11-30 14:39:08 -08:00
sabaimran	a0b00ce4a1	Don't include null attributes when filling in stored conversation metadata - Prompt adjustments to indicate to LLM what context it has	2024-11-29 18:10:14 -08:00
sabaimran	d91935c880	Initial commit of a functional but not yet elegant prototype for this concept	2024-11-28 17:28:23 -08:00
sabaimran	b6714c202f	Increase the title character limit to 500 for conversations	2024-11-12 01:51:19 -08:00
sabaimran	807687a0ac	Automatically generate titles for conversations from history	2024-11-08 16:02:34 -08:00
sabaimran	1e89baca7b	Deprecate the UserSearchModelConfig and remove all references - The server has moved to a model of standardization for the embeddings generation workflow. Remove references to the support for differentiated models. - The migration script for a new model needs to be updated to accommodate full regeneration.	2024-11-04 12:24:41 -08:00
sabaimran	5120597d4e	Remove user customized search model (#946 ) - Use a single standard search model across the server. There's diminishing benefits for having multiple user-customizable search models. - We may want to add server-level customization for specific tasks - Store the search model used to generate a given entry on the `Entry` object - Remove user-facing APIs and view - Add a management command for migrating the default search model on the server In a future PR (after running the migration), we'll also remove the `UserSearchModelConfig`	2024-10-23 17:38:37 -07:00
sabaimran	f3ce47b445	Create explicit flow to enable the free trial (#944 ) * Create explicit flow to enable the free trial The current design is confusing. It obfuscates the fact that the user is on a free trial. This design will make the opt-in explicit and more intuitive. * Use the Subscription Type enum instead of hardcoded strings everywhere * Use length of free trial in the frontend code as well	2024-10-23 15:29:23 -07:00
sabaimran	59fec37943	Improve agents management, and limit agents view to private and official agents - Default to None for the input_tools and output_modes so that they can be managed in the admin panel - Hold off on showing off all Public Agents until we have a better experience for user profiles etc.	2024-10-20 22:24:51 -07:00
Debanjum Singh Solanky	0db52786ed	Make web scraper priority configurable via admin panel - Simplifies changing order in which web scrapers are invoked to read web page by just changing their priority number on the admin panel. Previously you'd have to delete, re-add the scrapers to change their priority. - Add help text for each scraper field to ease admin setup experience - Friendlier env var to use Firecrawl's LLM to extract content - Remove use of separate friendly name for scraper types. Reuse actual name and just make actual name better	2024-10-17 17:42:42 -07:00
Debanjum Singh Solanky	20b6f0c2f4	Access internal links directly via a simple get request The other webpage scrapers will not work for internal webpages. Try accessing those URLs directly if they are visible to the Khoj server over the network. Only enable this by default for self-hosted, single user setups. Otherwise the ability to scan the internal network would be a liability! For use-cases where it makes sense, the Khoj server admin can explicitly add the direct webpage scraper via the admin panel	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	d94abba2dc	Fallback through enabled scrapers to reduce web page read failures - Set up scrapers via API keys, explicitly adding them via admin panel or enabling only a single scraper to use via server chat settings. - Use validation to ensure only valid scrapers are added via admin panel. For example, an API key is present for scrapers that require it etc. - Modularize the read webpage functions to take api key, url as args Removes dependence on constants loaded in online_search. Functions are now mostly self contained - Improve ability to read webpages by using the speed, success rate of different scrapers. Optimal configuration needs to be discovered	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	c841abe13f	Change webpage scraper to use via server admin panel	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	884fe42602	Allow automation as an output mode supported by custom agents	2024-10-17 11:58:52 -07:00
sabaimran	405c047c0c	Include agent personality through subtasks and support custom agents (#916 ) Currently, the personality of the agent is only included in the final response that it returns to the user. Historically, this was because models were quite bad at navigating the additional context of personality, and there was a bias towards having more control over certain operations (e.g., tool selection, question extraction). Going forward, it should be more approachable to have prompts included in the sub tasks that Khoj runs in order to respond to a given query. Make this possible in this PR. This also sets us up for agent creation becoming available soon. Create custom agents in #928 Agents are useful insofar as you can personalize them to fulfill specific subtasks you need to accomplish. In this PR, we add support for using custom agents that can be configured with a custom system prompt (aka persona) and knowledge base (from your own indexed documents). Once created, private agents can be accessible only to the creator, and protected agents can be accessible via a direct link. Custom tool selection for agents in #930 Expose the functionality to select which tools a given agent has access to. By default, they have all. Can limit both information sources and output modes. Add new tools to the agent modification form	2024-10-07 00:21:55 -07:00
sabaimran	06777e1660	Convert the default conversation id to a uuid, plus other fixes (#918 ) * Update the conversation_id primary key field to be a uuid - update associated API endpoints - this is to improve the overall application health, by obfuscating some information about the internal database - conversation_id type is now implicitly a string, rather than an int - ensure automations are also migrated in place, such that the conversation_ids they're pointing to are now mapped to the new IDs * Update client-side API calls to correctly query with a string field * Allow modifying of conversation properties from the chat title * Improve drag and drop file experience for chat input area * Use a phosphor icon for the copy to clipboard experience for code snippets * Update conversation_id parameter to be a str type * If django_apscheduler is not in the environment, skip the migration script * Fix create automation flow by storing conversation id as string The new UUID used for conversation id can't be directly serialized. Convert to string for serializing it for later execution --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-09-24 14:12:50 -07:00
sabaimran	0a568244fd	Revert "Convert conversationId int to string before making api request to bulk update file filters" This reverts commit `c9665fb20b`. Revert "Fix handling for new conversation in agents page" This reverts commit `3466f04992`. Revert "Add a unique_id field for identifying conversations (#914)" This reverts commit `ece2ec2d90`.	2024-09-18 20:36:57 -07:00
sabaimran	ece2ec2d90	Add a unique_id field for identifying conversations (#914 ) * Add a unique_id field to the conversation object - This helps us keep track of the unique identity of the conversation without exposing the internal id - Create three staged migrations in order to first add the field, then add unique values to pre-fill, and then set the unique constraint. Without this, it tries to initialize all the existing conversations with the same ID. * Parse and utilize the unique_id field in the query parameters of the front-end view - Handle the unique_id field when creating a new conversation from the home page - Parse the id field with a lightweight parameter called v in the chat page - Share page should not be affected, as it uses the public slug * Fix suggested card category	2024-09-16 12:19:16 -07:00
Debanjum Singh Solanky	1b82aea753	Support using image generation models like Flux via Replicate Enables using any image generation model on Replicate's Predictions API endpoints. The server admin just needs to add text-to-image model on the server/admin panel in organization/model_name format and input their Replicate API key with it Create db migration (including merge)	2024-09-12 19:58:56 -07:00
Alexander Matyasko	9570933506	Support Google's Gemini model series (#902 ) * Add functions to chat with Google's gemini model series * Gracefully close thread when there's an exception in the gemini llm thread * Use enums for verifying the chat model option type * Add a migration to add the gemini chat model type to the db model * Fix chat model selection verification and math prompt tuning * Fix extract questions method with gemini. Enforce json response in extract questions. * Add standard stop sequence for Gemini chat response generation --------- Co-authored-by: sabaimran <narmiabas@gmail.com> Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-09-12 18:17:55 -07:00
Raghav Tirumale	549686a7a4	Add Vision Support (#889 ) # Summary of Changes * New UI to show preview of image uploads * ChatML message changes to support gpt-4o vision based responses on images * AWS S3 image uploads for persistent image context in conversations * Database changes to have `vision_enabled` option in server admin panel while configuring models * Render previously uploaded images in the chat history, show uploaded images for pending msgs * Pass the uploaded_image_url through to subqueries * Allow image to render upon first message from the homepage * Add rendering support for images to shared chat as well * Fix some UI/functionality bugs in the share page * Convert user attached images for chat to webp format before upload * Use placeholder to attached image for data source, response mode actors * Update all clients to call /api/chat as a POST instead of GET request * Fix copying chat messages with images to clipboard TLDR; Add vision support for openai models on Khoj via the web UI! --------- Co-authored-by: sabaimran <narmiabas@gmail.com> Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-09-09 15:22:18 -07:00
sabaimran	e919d28f1c	Add support for custom search model-specific thresholds	2024-08-24 19:28:26 -07:00
Debanjum Singh Solanky	58c8068079	Upgrade default offline chat model to llama 3.1	2024-08-20 09:28:56 -07:00
sabaimran	c0316a6b5d	Enable free tier users to have unlimited chats with the default chat model (#886 ) - Allow free tier users to have unlimited chats with default chat model. It'll only be rate-limited and at the same rate as subscribed users - In the server chat settings, replace the concept of default/summarizer models with default/advanced chat models. Use the advanced models as a default for subscribed users. - For each `ChatModelOption` configuration, allow the admin to specify a separate value of `max_tokens` for subscribed users. This allows server admins to configure different max token limits for unsubscribed and subscribed users - Show error message in web app when hit rate limit or other server errors	2024-08-16 12:14:44 -07:00
Alexander Matyasko	823f8d58bb	Add model_config for crossencoder model Add model_config for crossencoder model, so the user can use models which require trust_remote_code.	2024-08-07 18:00:12 +08:00
sabaimran	1eab6c8590	Add additional icons for agents, pencil line and chalkboard	2024-08-05 17:23:29 +05:30
sabaimran	e0775446c9	fix spelling of fuschia :(	2024-08-05 11:50:11 +05:30
sabaimran	c837f3779e	Update the agents page with new UX (#850 ) - Use icons/colors for setting the styling of agents - Update automations page to use the shadcn cards: https://github.com/shadcn-ui/ui	2024-07-16 10:10:55 +05:30
Debanjum Singh Solanky	a353d883a0	Make it optional to set the encoder, cross-encoder configs via admin UI Upgrade sentence-transformer, add einops dependency for some sentence transformer models like nomic	2024-07-05 16:09:30 +05:30
Debanjum	826c3dc9cc	Enable using Stable Diffusion 3 for Image Generation via API (#830 ) - Support Stable Diffusion 3 via API Server Admin needs to setup model similar to DALLE-3 via Django Admin Panel - Use shorter prompt generator to prompt SD3 to create better images - Allow users to set paint model to use from web client config page	2024-07-02 17:28:50 +05:30
sabaimran	c83b8f2768	Allow just one worker to be the background schedule leader (#836 ) * Add a leader election mechanism to circumvent runtime issues for multiple schedulers - Reduce the load on the DB and risk of issues on the service side by limiting the execution environment to one elected leader at a given time. This one is responsible for managing all of the execution of the jobs, though all workers are capable of adding and removing jobs * Set a max duration for the schedule leader position (12 hrs), add some error if automation not added successfully	2024-06-28 13:13:25 +05:30
sabaimran	870d9ecdbf	Add a fact checker feature with updated styling (#835 ) - Add an experimental feature used for fact-checking falsifiable statements with customizable models. See attached screenshot for example. Once you input a statement that needs to be fact-checked, Khoj goes on a research spree to verify or refute it. - Integrate frontend libraries for [Tailwind](https://tailwindcss.com/) and [ShadCN](https://ui.shadcn.com/) for easier UI development. Update corresponding styling for some existing UI components. - Add component for model selection - Add backend support for sharing arbitrary packets of data that will be consumed by specific front-end views in shareable scenarios	2024-06-27 18:45:38 +05:30
Debanjum Singh Solanky	c793d8a69e	Add Validation logic to save PaintModel. Use API key from Paint Model Rename Paint Model, Adapters to TextToImage for consistency	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	2c4bf91a61	Allow user to set paint model to use from web client config page	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	eda33e092f	Enable using Stable Diffusion 3 for Image Generation via API	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	732332a3c5	Spell fix s/e.g/e.g./ across code, tests and docs	2024-06-24 15:24:45 +05:30
sabaimran	b9966eb3d4	Add support for text to speech in chat responses (#821 ) * Enable text to speech responses in khoj chat - Current issue: reads out all the markdown formatting, plus waits for the whole result to be streamed before playing it * Extract content from markdown-formatted text * Add a loader for while you're waiting for Khoj's response * Add user configuration option for chat model options, allow server side configuration for option list * Join up APIs, views, admin pages to allow configuring custom voice models	2024-06-21 11:30:28 +05:30
sabaimran	3cfe5aabe5	Add support for magic link email sign-in (#820 ) * Add magic link email sign-in option * Adding backend routes and model changes to keep state of email verification code and status * Test and fix end to end email verification flow * Add documentation for how to use the magic link sign-in when self-hosting Khoj * Add magic link sign in to public conversation page	2024-06-20 13:32:58 +05:30
Raghav Tirumale	bd3b590153	Support Indexing Docx Files (#801 ) * Add support for indexing docx files and associated unit tests --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-06-20 11:18:01 +05:30
Raghav Tirumale	d4e5c95711	Add Ability to Summarize Documents (#800 ) * Uses entire file text and summarizer model to generate document summary. * Uses the contents of the user's query to create a tailored summary. * Integrates with File Filters #788 for a better UX.	2024-06-18 19:31:07 +05:30
Raghav Tirumale	ba16afd3c2	New Feature: Adding File Filtering to Conversations (#788 ) * UI update for file filtered conversations * Interactive file menu #UI to add/remove files on each conversation as references. * Backend changes implemented to load selected file filters from a conversation into the querying process. --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-06-07 10:53:37 +05:30
sabaimran	01cdc54ad0	Add support for Anthropic models (#760 ) * Add support for chatting with Anthropic's suite of models - Had to use a custom class because there was enough nuance with how the anthropic SDK works that it would be better to simply separate out the logic. The extract questions flow needed modification of the system prompt in order to work as intended with the haiku model	2024-05-26 22:50:34 +05:30
sabaimran	3f9c20a399	Make it easier to manage server-level chat settings (#729 ) * Add support for server-wide model settings fix web page reading results returning logic	2024-05-24 20:15:18 +05:30
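
The scraper commits above (0db52786ed, d94abba2dc) describe trying the enabled web scrapers in priority order and falling through to the next one when a read fails. A minimal sketch of that idea follows. It is not the Khoj implementation: the `Scraper` class, the `read_webpage` function, and the convention that a lower priority number is tried first are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Scraper:
    # Hypothetical stand-in for a scraper entry configured on the admin panel.
    name: str
    priority: int  # assumption: a lower number is tried earlier
    read: Callable[[str], Optional[str]]  # returns page text, or None on failure


def read_webpage(url: str, scrapers: list[Scraper]) -> tuple[Optional[str], Optional[str]]:
    """Try each scraper in priority order; return (content, scraper name)."""
    for scraper in sorted(scrapers, key=lambda s: s.priority):
        try:
            content = scraper.read(url)
        except Exception:
            # A failing scraper should not abort the read; fall through to the next.
            continue
        if content:
            return content, scraper.name
    return None, None


def failing(url: str) -> Optional[str]:
    raise TimeoutError("upstream timed out")


def direct(url: str) -> Optional[str]:
    return f"<html>{url}</html>"


# The higher-priority scraper fails, so the read falls back to the next one.
scrapers = [Scraper("direct", 2, direct), Scraper("hosted", 1, failing)]
content, used = read_webpage("https://example.com", scrapers)
```

Under this sketch, changing the order in which scrapers are invoked only requires changing the `priority` values, which matches the motivation given in 0db52786ed.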

81 Commits
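
Commit c83b8f2768 describes electing a single worker as the background schedule leader and capping how long it may hold that position (12 hrs). A rough sketch of a time-limited leader lock is shown below. The `LeaderLock` class and its in-memory state are hypothetical; a real deployment would keep this state in a shared database row and update it atomically, which this sketch does not attempt.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_LEADER_DURATION = timedelta(hours=12)  # the 12 hr cap named in the commit message


@dataclass
class LeaderLock:
    # Hypothetical in-memory stand-in for a single shared lock record.
    holder: Optional[str] = None
    started_at: Optional[datetime] = None

    def try_acquire(self, worker_id: str, now: datetime) -> bool:
        """Return True if worker_id is the leader after this call."""
        expired = self.started_at is None or now - self.started_at >= MAX_LEADER_DURATION
        if self.holder is None or expired:
            # No leader yet, or the current term ran out: this worker takes over.
            self.holder, self.started_at = worker_id, now
            return True
        return self.holder == worker_id


lock = LeaderLock()
t0 = datetime(2024, 6, 28, tzinfo=timezone.utc)
first = lock.try_acquire("worker-a", t0)                         # becomes leader
second = lock.try_acquire("worker-b", t0 + timedelta(hours=1))   # term still held by worker-a
third = lock.try_acquire("worker-b", t0 + timedelta(hours=12))   # term expired, worker-b takes over
```

Only the elected worker would run scheduled jobs; as the commit notes, the other workers can still add and remove jobs.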