Update references to all documentation to reflect instructions for managed service
- By default, assume the audience of this website is people looking to understand the feature offering of Khoj, and then people who are looking to self-host
@@ -24,13 +24,13 @@
|
|||||||
</div>
|
</div>
|
||||||
|
|
||||||
## Introduction
|
## Introduction
|
||||||
Welcome to the Khoj Docs! This is the best place to [get started](./setup.md) with Khoj. Unless otherwise mentioned, the docs only pertain to self-hosted Khoj instances.
|
Welcome to the Khoj Docs! This is the best place to [get started](./setup.md) with Khoj. We have instructions on self-hosting, using Khoj with Emacs, Obsidian, and the Web, and more. We also include setup instructions for users on the hosted instance at [app.khoj.dev](https://app.khoj.dev).
|
||||||
|
|
||||||
- Khoj is an application to dynamically engage with your notes, documents and images. We support APIs for [semantic search](./search.md) and [chat](./chat.md).
|
- Khoj is an application to dynamically engage with your notes, documents and images. We support APIs for [semantic search](./search.md) and [chat](./chat.md).
|
||||||
- It can be easily self-hosted and run on your consumer hardware or private cloud.
|
- It can be easily self-hosted and run on your consumer hardware or private cloud.
|
||||||
- It provides an open source, AI personal assistant accessible from your [Emacs](./emacs.md), [Obsidian](./obsidian.md) or [Web browser](./web.md), or our [desktop app](https://khoj.dev/downloads).
|
- It provides an open source, AI personal assistant accessible from your [Emacs](./emacs.md), [Obsidian](./obsidian.md) or [Web browser](./web.md), or our [desktop app](https://khoj.dev/downloads).
|
||||||
- It works with plaintext, markdown, [notion](./notion_integration.md), org-mode, pdf files and [github repositories](./github_integration.md)
|
- It works with plaintext, markdown, [notion](./notion_integration.md), org-mode, pdf files and [github repositories](./github_integration.md)
|
||||||
- It can support use with multiple users, so you and your family, friends, or team can have a shared assistance server. As the admin, you can configure the server settings at `/server/admin`.
|
- It supports multiple users. If you're self-hosting, your family, friends, or team can share one assistant server. You'll find the suite of server admin settings at `/server/admin`.
|
||||||
|
|
||||||
## Quickstart
|
## Quickstart
|
||||||
[Click here](./setup.md) for full setup instructions
|
[Click here](./setup.md) for full setup instructions
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
- Get Started
|
- Get Started
|
||||||
- [Overview](README.md)
|
- [Overview](README.md)
|
||||||
- [Install](setup.md)
|
- [Self-Host](setup.md)
|
||||||
- [Demos](demos.md)
|
- [Demos](demos.md)
|
||||||
- Use
|
- Use
|
||||||
- [Features](features.md)
|
- [Features](features.md)
|
||||||
|
|||||||
@@ -1,63 +1,11 @@
|
|||||||
|
|
||||||
## Advanced Usage
|
## Advanced Usage
|
||||||
### Search across Different Languages
|
|
||||||
|
### Search across Different Languages (Self-Hosting)
|
||||||
To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />
|
To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />
|
||||||
For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
|
For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
|
||||||
1. Manually update `search-type > asymmetric > encoder` to `paraphrase-multilingual-MiniLM-L12-v2` in your `~/.khoj/khoj.yml` file for now. See diff of `khoj.yml` below for illustration:
|
1. Manually update the search config in the server's admin settings page. Go to [the search config](http://localhost:42110/server/admin/database/searchmodelconfig/). Create a new one if none exists, or update the existing one. Set the bi_encoder to `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` and the cross_encoder to `cross-encoder/ms-marco-MiniLM-L-6-v2`.
|
||||||
|
2. Regenerate your content index from all the relevant clients. This step is very important, as you'll need to re-encode all your content with the new model.
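The sketch below illustrates why the content index has to be regenerated: a query is only comparable to entries that were embedded by the same bi-encoder. It is not Khoj's implementation. The helper names (`cosine_similarity`, `rank_entries`, `embed`, `demo`) are hypothetical, and it assumes the `sentence-transformers` package is installed and that the model can be downloaded from HuggingFace.

```python
import math

# Model named in the docs above; loading it is an assumption about your environment.
MODEL_NAME = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_entries(query_vector, entry_vectors):
    """Return entry indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vector, v) for v in entry_vectors]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def embed(texts):
    """Encode texts with the multilingual bi-encoder (downloads the model on first use)."""
    from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

    model = SentenceTransformer(MODEL_NAME)
    return model.encode(texts).tolist()


def demo():
    """Rank a French and an English note against an English query."""
    notes = ["Recette de soupe à l'oignon", "Quarterly budget review notes"]
    query_vector, *note_vectors = embed(["onion soup recipe"] + notes)
    return [notes[i] for i in rank_entries(query_vector, note_vectors)]
```

Calling `demo()` embeds the query and both notes with the same model before ranking them. Vectors produced by a different encoder live in a different space, so mixing an old index with a new query encoder would make the similarity scores meaningless.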
|
||||||
```diff
|
|
||||||
asymmetric:
|
|
||||||
- encoder: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
|
|
||||||
+ encoder: paraphrase-multilingual-MiniLM-L12-v2
|
|
||||||
cross-encoder: cross-encoder/ms-marco-MiniLM-L-6-v2
|
|
||||||
model_directory: "~/.khoj/search/asymmetric/"
|
|
||||||
```
|
|
||||||
|
|
||||||
2. Regenerate your content index. For example, by opening [\<khoj-url\>/api/update?t=force](http://localhost:42110/api/update?t=force)
|
|
||||||
|
|
||||||
### Access Khoj on Mobile
|
|
||||||
1. [Setup Khoj](/#/setup) on your personal server. This can be any always-on machine, e.g. an old computer, RaspberryPi(?) etc
|
|
||||||
2. [Install](https://tailscale.com/kb/installation/) [Tailscale](tailscale.com/) on your personal server and phone
|
|
||||||
3. Open the Khoj web interface of the server from your phone browser.<br /> It should be `http://tailscale-ip-of-server:42110` or `http://name-of-server:42110` if you've setup [MagicDNS](https://tailscale.com/kb/1081/magicdns/)
|
|
||||||
4. Click the [Add to Homescreen](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Add_to_home_screen) button
|
|
||||||
5. Enjoy exploring your notes, documents and images from your phone!
|
|
||||||
|
|
||||||

|
|
||||||
|
|
||||||
### Use OpenAI Models for Search
|
|
||||||
#### Setup
|
|
||||||
1. Set `encoder-type`, `encoder` and `model-directory` under `asymmetric` and/or `symmetric` `search-type` in your `khoj.yml` (at `~/.khoj/khoj.yml`):
|
|
||||||
```diff
|
|
||||||
asymmetric:
|
|
||||||
- encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
|
|
||||||
+ encoder: text-embedding-ada-002
|
|
||||||
+ encoder-type: khoj.utils.models.OpenAI
|
|
||||||
cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
|
|
||||||
- encoder-type: sentence_transformers.SentenceTransformer
|
|
||||||
- model_directory: "~/.khoj/search/asymmetric/"
|
|
||||||
+ model-directory: null
|
|
||||||
```
|
|
||||||
2. [Setup your OpenAI API key in Khoj](/#/chat?id=setup)
|
|
||||||
3. Restart Khoj server to generate embeddings. It will take longer than with the offline search models.
|
|
||||||
|
|
||||||
#### Warnings
|
|
||||||
This configuration *uses an online model*
|
|
||||||
- It will **send all notes to OpenAI** to generate embeddings
|
|
||||||
- **All queries will be sent to OpenAI** when you search with Khoj
|
|
||||||
- You will be **charged by OpenAI** based on the total tokens processed
|
|
||||||
- It *requires an active internet connection* to search and index
|
|
||||||
|
|
||||||
### Bootstrap Khoj Search for Offline Usage later
|
|
||||||
|
|
||||||
You can bootstrap Khoj pre-emptively to run on machines that do not have internet access. An example use-case would be to run Khoj on an air-gapped machine.
|
|
||||||
Note: *Only search can currently run in fully offline mode, not chat.*
|
|
||||||
|
|
||||||
- With Internet
|
|
||||||
1. Manually download the [asymmetric text](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [symmetric text](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) and [image search](https://huggingface.co/sentence-transformers/clip-ViT-B-32) models from HuggingFace
|
|
||||||
2. Pip install khoj (and dependencies) in an associated virtualenv. E.g `python -m venv .venv && source .venv/bin/activate && pip install khoj-assistant`
|
|
||||||
- Without Internet
|
|
||||||
1. Copy each of the search models into their respective folders, `asymmetric`, `symmetric` and `image` under the `~/.khoj/search/` directory on the air-gapped machine
|
|
||||||
2. Copy the khoj virtual environment directory onto the air-gapped machine, activate the environment and start khoj as normal. E.g `source .venv/bin/activate && khoj`
|
|
||||||
|
|
||||||
### Query Filters
|
### Query Filters
|
||||||
|
|
||||||
|
|||||||
17
docs/chat.md
@@ -5,9 +5,9 @@
|
|||||||
- Supports multi-turn conversations with the relevant notes for context
|
- Supports multi-turn conversations with the relevant notes for context
|
||||||
- Shows reference notes used to generate a response
|
- Shows reference notes used to generate a response
|
||||||
|
|
||||||
### Setup
|
### Setup (Self-Hosting)
|
||||||
#### Offline Chat
|
#### Offline Chat
|
||||||
Offline chat stays completely private and works without internet. But it is slower, lower quality and more compute intensive.
|
Offline chat stays completely private and works without internet using open-source models.
|
||||||
|
|
||||||
> **System Requirements**:
|
> **System Requirements**:
|
||||||
> - Minimum 8 GB RAM. Recommend **16 GB VRAM**
|
> - Minimum 8 GB RAM. Recommend **16 GB VRAM**
|
||||||
@@ -15,9 +15,10 @@ Offline chat stays completely private and works without internet. But it is slow
|
|||||||
> - A CPU supporting [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) is required
|
> - A CPU supporting [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) is required
|
||||||
> - A Mac M1+ or [Vulkan supported GPU](https://vulkan.gpuinfo.org/) should significantly speed up chat response times
|
> - A Mac M1+ or [Vulkan supported GPU](https://vulkan.gpuinfo.org/) should significantly speed up chat response times
|
||||||
|
|
||||||
- Open your [Khoj settings](http://localhost:42110/config/) and click *Enable* on the Offline Chat card
|
1. Open your [Khoj offline settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and click *Enable* on the Offline Chat configuration.
|
||||||
|
2. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the offline chat model you want to use. Make sure to use `Offline` as its type. We currently only support offline models that use the [Llama chat prompt](https://replicate.com/blog/how-to-prompt-llama#wrap-user-input-with-inst-inst-tags) format. We recommend using `mistral-7b-instruct-v0.1.Q4_0.gguf`.
|
||||||
|
|
||||||

|
!> **Note**: Offline chat is not supported for a multi-user scenario. The host machine will encounter segmentation faults if multiple users try to use offline chat at the same time.
|
||||||
|
|
||||||
#### Online Chat
|
#### Online Chat
|
||||||
Online chat requires internet to use ChatGPT but is faster, higher quality and less compute intensive.
|
Online chat requires internet to use ChatGPT but is faster, higher quality and less compute intensive.
|
||||||
@@ -25,14 +26,12 @@ Online chat requires internet to use ChatGPT but is faster, higher quality and l
|
|||||||
!> **Warning**: This will enable Khoj to send your chat queries and query relevant notes to OpenAI for processing
|
!> **Warning**: This will enable Khoj to send your chat queries and query relevant notes to OpenAI for processing
|
||||||
|
|
||||||
1. Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)
|
1. Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)
|
||||||
2. Open your [Khoj Online Chat settings](http://localhost:42110/config/processor/conversation), add your OpenAI API key, and click *Save*. Then go to your [Khoj settings](http://localhost:42110/config) and click `Configure`. This will refresh Khoj with your OpenAI API key.
|
2. Open your [Khoj Online Chat settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/). Add a new setting with your OpenAI API key, and click *Save*. Only one configuration will be used, so make sure that's the only one you have.
|
||||||
|
3. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the OpenAI chat model you want to use. Make sure to use `OpenAI` as its type.
|
||||||

|
|
||||||
|
|
||||||
|
|
||||||
### Use
|
### Use
|
||||||
1. Open Khoj Chat
|
1. Open Khoj Chat
|
||||||
- **On Web**: Open [/chat](http://localhost:42110/chat) in your web browser
|
- **On Web**: Open [/chat](https://app.khoj.dev/chat) in your web browser
|
||||||
- **On Obsidian**: Search for *Khoj: Chat* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
- **On Obsidian**: Search for *Khoj: Chat* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||||
- **On Emacs**: Run `M-x khoj <user-query>`
|
- **On Emacs**: Run `M-x khoj <user-query>`
|
||||||
2. Enter your queries to chat with Khoj. Use [slash commands](#commands) and [query filters](./advanced.md#query-filters) to change what Khoj uses to respond
|
2. Enter your queries to chat with Khoj. Use [slash commands](#commands) and [query filters](./advanced.md#query-filters) to change what Khoj uses to respond
|
||||||
|
|||||||
@@ -25,13 +25,7 @@ pip install -e .'[dev]'
|
|||||||
khoj -vv
|
khoj -vv
|
||||||
```
|
```
|
||||||
2. Configure Khoj
|
2. Configure Khoj
|
||||||
- **Via the Settings UI**: Add files, directories to index the [Khoj settings](http://localhost:42110/config) UI once Khoj has started up. Once you've saved all your settings, click `Configure`.
|
- **Via the Desktop application**: Add files, directories to index using the settings page of your desktop application. Click "Save" to immediately trigger indexing.
|
||||||
- **Manually**:
|
|
||||||
- Copy the `config/khoj_sample.yml` to `~/.khoj/khoj.yml`
|
|
||||||
- Set `input-files` or `input-filter` in each relevant `content-type` section of `~/.khoj/khoj.yml`
|
|
||||||
- Set `input-directories` field in `image` `content-type` section
|
|
||||||
- Delete `content-type` and `processor` sub-section(s) irrelevant for your use-case
|
|
||||||
- Restart khoj
|
|
||||||
|
|
||||||
Note: After configuration, wait for khoj to load the ML model, generate embeddings and expose the API to query the notes, images, documents etc specified in the config YAML
|
Note: After configuration, wait for khoj to load the ML model, generate embeddings and expose the API to query the notes, images, documents etc specified in the config YAML
|
||||||
|
|
||||||
|
|||||||
@@ -4,11 +4,11 @@ The Github integration allows you to index as many repositories as you want. It'
|
|||||||
|
|
||||||
# Configure your settings
|
# Configure your settings
|
||||||
|
|
||||||
1. Go to [http://localhost:42110/config](http://localhost:42110/config) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
|
1. Go to [https://app.khoj.dev/config](https://app.khoj.dev/config) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
|
||||||
|
|
||||||
## Use the Github plugin
|
## Use the Github plugin
|
||||||
|
|
||||||
1. Generate a [classic PAT (personal access token)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) from [Github](https://github.com/settings/tokens) with `repo` and `admin:org` scopes at least.
|
1. Generate a [classic PAT (personal access token)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) from [Github](https://github.com/settings/tokens) with `repo` and `admin:org` scopes at least.
|
||||||
2. Navigate to [http://localhost:42110/config/content-source/github](http://localhost:42110/config/content-source/github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
|
2. Navigate to [https://app.khoj.dev/config/content-source/github](https://app.khoj.dev/config/content-source/github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
|
||||||
3. Click `Save`. Go back to the settings page and click `Configure`.
|
3. Click `Save`. Go back to the settings page and click `Configure`.
|
||||||
4. Go to [http://localhost:42110/](http://localhost:42110/) and start searching!
|
4. Go to [https://app.khoj.dev/](https://app.khoj.dev/) and start searching!
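Before pasting the PAT into the Github settings page, it can help to confirm that the token carries the `repo` and `admin:org` scopes from step 1. The sketch below is one way to do that and is not part of Khoj. It relies on the `X-OAuth-Scopes` response header that the GitHub REST API returns for classic tokens. The function names are hypothetical.

```python
import urllib.request

# Scopes the steps above ask for.
REQUIRED_SCOPES = {"repo", "admin:org"}


def parse_scopes(header_value):
    """Split the comma-separated X-OAuth-Scopes header into a set of scope names."""
    return {scope.strip() for scope in (header_value or "").split(",") if scope.strip()}


def missing_scopes(header_value, required=REQUIRED_SCOPES):
    """Return the required scopes that the token does not carry, sorted for display."""
    return sorted(set(required) - parse_scopes(header_value))


def fetch_scope_header(token):
    """Ask the GitHub REST API which scopes a classic PAT carries (needs network access)."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("X-OAuth-Scopes", "")
```

For example, `missing_scopes(fetch_scope_header(token))` returns an empty list when the token is ready to use, and otherwise names the scopes that still need to be granted.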
|
||||||
|
|||||||
@@ -17,6 +17,7 @@
|
|||||||
repo: 'https://github.com/khoj-ai/khoj',
|
repo: 'https://github.com/khoj-ai/khoj',
|
||||||
loadSidebar: true,
|
loadSidebar: true,
|
||||||
themeColor: '#c2a600',
|
themeColor: '#c2a600',
|
||||||
|
auto2top: true,
|
||||||
// coverpage: true,
|
// coverpage: true,
|
||||||
}
|
}
|
||||||
</script>
|
</script>
|
||||||
|
|||||||
@@ -8,7 +8,7 @@ We haven't setup a fancy integration with OAuth yet, so this integration still r
|
|||||||

|

|
||||||
3. Share all the workspaces that you want to integrate with the Khoj integration you just made in the previous step
|
3. Share all the workspaces that you want to integrate with the Khoj integration you just made in the previous step
|
||||||

|

|
||||||
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at http://localhost:42110/config/content-source/notion. Click `Save`.
|
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at https://app.khoj.dev/config/content-source/notion. Click `Save`.
|
||||||
5. Click `Configure` in http://localhost:42110/config to index your Notion workspace(s).
|
5. Click `Configure` in https://app.khoj.dev/config to index your Notion workspace(s).
|
||||||
|
|
||||||
That's it! You should be ready to start searching and chatting. Make sure you've configured your OpenAI API Key for chat.
|
That's it! You should be ready to start searching and chatting. Make sure you've configured your OpenAI API Key for chat.
|
||||||
|
|||||||
@@ -1,7 +1,7 @@
|
|||||||
## Khoj Search
|
## Khoj Search
|
||||||
### Use
|
### Use
|
||||||
1. Open Khoj Search
|
1. Open Khoj Search
|
||||||
- **On Web**: Open <http://localhost:42110/> in your web browser
|
- **On Web**: Open <https://app.khoj.dev/> in your web browser
|
||||||
- **On Obsidian**: Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or Search for *Khoj: Search* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
- **On Obsidian**: Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or Search for *Khoj: Search* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||||
- **On Emacs**: Run `M-x khoj <user-query>`
|
- **On Emacs**: Run `M-x khoj <user-query>`
|
||||||
2. Query using natural language to find relevant entries from your knowledge base. Use [query filters](./advanced.md#query-filters) to limit entries to search
|
2. Query using natural language to find relevant entries from your knowledge base. Use [query filters](./advanced.md#query-filters) to limit entries to search
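The same search can in principle be scripted against the search API mentioned in the overview. The sketch below only builds the request URL. The `/api/search` path and the `q` and `n` parameter names are assumptions that this page does not confirm, so check them against your server before relying on them. The hosted instance at app.khoj.dev will likely also require authentication, which is not shown here.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, max_results=5):
    """Build a search request URL for a Khoj server.

    The endpoint path and parameter names are assumed, not taken from this page.
    """
    params = urlencode({"q": query, "n": max_results})
    return f"{base_url.rstrip('/')}/api/search?{params}"


# Example: a self-hosted server on the default local port used in these docs.
url = build_search_url("http://localhost:42110/", "trip to japan", max_results=3)
print(url)
```

Keeping the URL construction in one helper means the base address can be switched between a self-hosted server and the managed service without touching the query logic.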
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
# Telemetry
|
# Telemetry (self-hosting)
|
||||||
|
|
||||||
We collect some high level, anonymized metadata about usage of Khoj. This includes:
|
We collect some high level, anonymized metadata about usage of Khoj. This includes:
|
||||||
- Client (Web, Emacs, Obsidian)
|
- Client (Web, Emacs, Obsidian)
|
||||||
|
|||||||