Commit Graph

28 Commits

Author SHA1 Message Date
Debanjum
2823c84bb4 Default to gemini 2.5 model series on init and for eval 2025-08-22 20:34:38 -07:00
Debanjum
892d57314e Update test setup to index test data after old indexing code removed
- Delete tests testing deprecated server side indexing flows
- Delete `Local(Plaintext|Org|Markdown|Pdf)Config' methods, files and
  references in tests
- Index test data via new helper method, `get_index_files'
  - It is modelled after the old `get_org_files' variants in main app
  - It passes the test data in required format to `configure_content'
    Allows maintaining the more realistic tests from before while
    using new indexing mechanism (rather than the deprecated server
    side indexing mechanism
2025-07-31 18:25:32 -07:00
Debanjum
b1f2737c9a Drop native offline chat support with llama-cpp-python
It is recommended to chat with open-source models by running an
open-source server like Ollama, Llama.cpp on your GPU powered machine
or use a commercial provider of open-source models like DeepInfra or
OpenRouter.

These chat model serving options provide a mature Openai compatible
API that already works with Khoj.

Directly using offline chat models only worked reasonably with pip
install on a machine with GPU. Docker setup of khoj had trouble with
accessing GPU. And without GPU access offline chat is too slow.

Deprecating support for an offline chat provider directly from within
Khoj will reduce code complexity and increase developement velocity.
Offline models are subsumed to use existing Openai ai model provider.
2025-07-31 18:25:32 -07:00
Debanjum
d2c7e5516f Fix online chat actor tests, improve offline chat actor tests
The chat actor (and director) tests haven't been looked into in a long
while. They'd gone stale in how they were calling thee functions. And
what was required to run them. Now the online chat actor tests work
again.
2025-06-06 13:28:18 -07:00
Debanjum
05d4e19cb8 Pass deep typed chat history for more ergonomic, readable, safe code
The chat dictionary is an artifact from earlier non-db chat history
storage. We've been ensuring new chat messages have valid type before
being written to DB for more than 6 months now.

Move to using the deeply typed chat history helps avoids null refs,
makes code more readable and easier to reason about.

Next Steps:
The current update entangles chat_history written to DB
with any virtual chat history message generated for intermediate
steps. The chat message type written to DB should be decoupled from
type that can be passed to AI model APIs (maybe?).

For now we've made the ChatMessage.message type looser to allow
for list[dict] type (apart from string). But later maybe a good idea
to decouple the chat_history recieved by send_message_to_model from
the chat_history saved to DB (which can then have its stricter type check)
2025-06-04 00:03:14 -07:00
Debanjum
01bc6d35dc Rename Chat Model Options table to Chat Model as short & readable (#1003)
- Previous was incorrectly plural but was defining only a single model
- Rename chat model table field to name
- Update documentation
- Update references functions and variables to match new name
2024-12-12 11:24:16 -08:00
Debanjum
9dd3782f5c Rename OpenAIProcessorConversationConfig DB model to more apt AiModelApi (#998)
* Rename OpenAIProcessorConversationConfig to more apt AiModelAPI

The DB model name had drifted from what it is being used for,
a general chat api provider that supports other chat api providers like
anthropic and google chat models apart from openai based chat models.

This change renames the DB model and updates the docs to remove this
confusion.

Using Ai Model Api we catch most use-cases including chat, stt, image generation etc.
2024-12-08 18:02:29 -08:00
Debanjum
45c623f95c Dedupe, organize chat actor, director tests
- Move Chat actor tests that were previously in chat director tests file
- Dedupe online, offline io selector chat actor tests
2024-11-18 16:10:50 -08:00
Debanjum
2a76c69d0d Run online, offine chat actor, director tests for any supported provider
- Previously online chat actors, director tests only worked with openai.
  This change allows running them for any supported onlnie provider
  including Google, Anthropic and Openai.

- Enable online/offline chat actor, director in two ways:
  1. Explicitly setting KHOJ_TEST_CHAT_PROVIDER environment variable to
     google, anthropic, openai, offline
  2. Implicitly by the first API key found from openai, google or anthropic.

- Default offline chat provider to use Llama 3.1 3B for faster, lower
  compute test runs
2024-11-18 15:11:37 -08:00
sabaimran
f3ce47b445 Create explicit flow to enable the free trial (#944)
* Create explicit flow to enable the free trial

The current design is confusing. It obfuscates the fact that the user is on a free trial. This design will make the opt-in explicit and more intuitive.

* Use the Subscription Type enum instead of hardcoded strings everywhere

* Use length of free trial in the frontend code as well
2024-10-23 15:29:23 -07:00
Debanjum Singh Solanky
58c8068079 Upgrade default offline chat model to llama 3.1 2024-08-20 09:28:56 -07:00
Raghav Tirumale
d4e5c95711 Add Ability to Summarize Documents (#800)
* Uses entire file text and summarizer model to generate document summary.
* Uses the contents of the user's query to create a tailored summary.
* Integrates with File Filters #788 for a better UX.
2024-06-18 19:31:07 +05:30
Debanjum Singh Solanky
f440ddbe1d Fix openai chat actor, director tests
- Update test ChatModelOptions setup since update to it's schema
- Fix stale function calls using their updated signatures
2024-06-09 07:24:47 +05:30
Debanjum Singh Solanky
ec41482324 Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall
Previous cross-encoder model was a few years old, newer models should
have improved in quality. Model size increases by 50% compared to
previous for better performance, at least on benchmarks
2024-04-24 09:50:09 +05:30
sabaimran
60658a8037 Get rid of enable flag for the offline chat processor config
- Default, assume that offline chat is enabled if there is an offline chat model option configured
2024-04-23 23:08:29 +05:30
sabaimran
3132430737 Add tests for the db lock 2024-04-17 13:22:41 +05:30
Debanjum Singh Solanky
8ca39a436c Use llama.cpp for offline chat models
- Benefits of moving to llama-cpp-python from gpt4all:
  - Support for all GGUF format chat models
  - Support for AMD, Nvidia, Mac, Vulcan GPU machines (instead of just Vulcan, Mac)
  - Supports models with more capabilities like tools, schema
    enforcement, speculative ddecoding, image gen etc.
- Upgrade default chat model, prompt size, tokenizer for new supported
  chat models

- Load offline chat model when present on disk without requiring internet
  - Load model onto GPU if not disabled and device has GPU
  - Load model onto CPU if loading model onto GPU fails
  - Create helper function to check and load model from disk, when model
    glob is present on disk.

    `Llama.from_pretrained' needs internet to get repo info from
    HuggingFace. This isn't required, if the model is already downloaded

    Didn't find any existing HF or llama.cpp method that looked for model
    glob on disk without internet
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
7dfbcd2e5a Handle subscribe renew date, langchain, pydantic & logger.warn warnings
- Ensure langchain less than 0.2.0 is used, to prevent breaking
  ChatOpenAI, PyMuPDF usage due to their deprecation after 0.2.0
- Set subscription renewal date to a timezone aware datetime
- Use logger.warning instead of logger.warn as latter is deprecated
- Use `model_dump' not deprecated dict to get all configured content_types
2024-01-12 01:46:52 +05:30
sabaimran
79913d4c17 Add isort to the pre-commit configuration and apply it to the whole project (#595)
* Apply isort to the entire repository
* Fix missing import issues in text_to_entries
* Fix imports in migration files
2023-12-28 18:04:02 +05:30
sabaimran
a474c31e02 Move the django app into the src/khoj folder for better organization and functionality
- Our pypi package currently does not work because the django app and associated database is not included. To remedy this issue, move the app into the src/khoj folder. This has the added benefit of improved organization of the codebase, as all server related code is now in a single folder
- Update associated file paths and system references
2023-11-21 10:56:04 -08:00
Debanjum Singh Solanky
08a057bdd5 Rename SearchModel to SearchModelConfig DB model, Require Cross-Encoder 2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky
4af194d74b Make search model configurable on server
- Expose ability to modify search model via Django admin interface
- Previously the bi_encoder and cross_encoder models to use were set
  in code
- Now it's user configurable but with a default config generated by
  default
2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky
e98141f4c3 Subscribe default user to standard plan with a far away renewal date
Self hosted users in anonymous mode have all capabilities unlocked
2023-11-14 16:31:39 -08:00
sabaimran
e2e96f9aa4 Add default settings to let new users be subscribed on trial
- Add the default user to a subscription trial
- Update associated unit tests
2023-11-10 22:38:28 -08:00
Debanjum Singh Solanky
345856e7be Merge branch 'master' of github.com:khoj-ai/khoj into features/multi-user-support-khoj
Merge changes to use latest GPT4All with GPU, GGUF model support into
khoj multi-user support rearchitecture branch
2023-11-02 22:44:25 -07:00
sabaimran
fe6720fa06 [Multi-User Part 8]: Make conversation processor settings server-wide (#529)
- Rather than having each individual user configure their conversation settings, allow the server admin to configure the OpenAI API key or offline model once, and let all the users re-use that code.
- To configure the settings, the admin should go to the `django/admin` page and configure the relevant chat settings. To create an admin, run `python3 src/manage.py createsuperuser` and enter in the details. For simplicity, the email and username should match.
- Remove deprecated/unnecessary endpoints and views for configuring per-user chat settings
2023-11-02 10:43:27 -07:00
Debanjum
9acc722f7f [Multi-User Part 4]: Authenticate using API Tokens (#513)
###  New
- Use API keys to authenticate from Desktop, Obsidian, Emacs clients
- Create API, UI on web app config page to CRUD API Keys
- Create user API keys table and functions to CRUD them in Database

### 🧪 Improve
- Default to better search model, [gte-small](https://huggingface.co/thenlper/gte-small), to improve search quality
- Only load chat model to GPU if enough space, throw error on load failure
- Show encoding progress, truncate headings to max chars supported
- Add instruction to create db in Django DB setup Readme

### ⚙️ Fix
- Fix error handling when configure offline chat via Web UI
- Do not warn in anon mode about Google OAuth env vars not being set
- Fix path to load static files when server started from project root
2023-10-26 12:33:03 -07:00
sabaimran
4b6ec248a6 [Multi-User Part 3]: Separate chat sesssions based on authenticated users (#511)
- Add a data model which allows us to store Conversations with users. This does a minimal lift over the current setup, where the underlying data is stored in a JSON file. This maintains parity with that configuration.
- There does _seem_ to be some regression in chat quality, which is most likely attributable to search results.

This will help us with #275. It should become much easier to maintain multiple Conversations in a given table in the backend now. We will have to do some thinking on the UI.
2023-10-26 11:37:41 -07:00