Document using OpenAI-compatible LLM API server for Khoj chat

This allows using open or commercial, local or hosted LLM models that are not supported in Khoj by default. It also allows users to use other local LLM API servers that support their GPU. Closes #407
2026-03-02 13:18:18 +00:00 · 2024-02-02 10:31:27 +05:30
parent 1c6f1d94f5
commit 474afa5efe
1 changed files with 21 additions and 0 deletions
--- a/documentation/docs/get-started/setup.mdx
+++ b/documentation/docs/get-started/setup.mdx
@@ -264,6 +264,27 @@ You can head to http://localhost:42110 to use the web interface. You can also us
  </Tabs>
 ```

+## Advanced
+### Use OpenAI-compatible LLM API Server
+Use this if you want to use non-standard, open or commercial, local or hosted LLM models for Khoj chat.
+1. Install an OpenAI-compatible LLM API server like [LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start) or [Llama-cpp-python](https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#openai-compatible-web-server).
+2. Set the `OPENAI_API_BASE="<url-of-your-llm-server>"` environment variable before starting Khoj.
+
+#### Sample Setup using LiteLLM and Mistral API
+
+```shell
+# Install LiteLLM (quoted so shells like zsh do not expand the brackets)
+pip install 'litellm[proxy]'
+
+# Start LiteLLM and use Mistral tiny via Mistral API (replace the placeholder with your key)
+export MISTRAL_API_KEY="<MISTRAL_API_KEY>"
+litellm --model mistral/mistral-tiny --drop_params
+
+# Set OpenAI API Base to LiteLLM server URL and start Khoj
+export OPENAI_API_BASE='http://localhost:8000'
+khoj --anonymous-mode
+```
+
 ## Troubleshoot

 #### Install fails while building Tokenizer dependency