mirror of
https://github.com/khoaliber/khoj.git
synced 2026-03-02 13:18:18 +00:00
Advanced Usage
Search across Different Languages
To search for notes in multiple, different languages, you can use a multi-lingual model.
For example, the paraphrase-multilingual-MiniLM-L12-v2 model supports 50+ languages and offers good search quality and speed. To use it:
- Manually update search-type > asymmetric > encoder to paraphrase-multilingual-MiniLM-L12-v2 in your ~/.khoj/khoj.yml file for now. See the diff of khoj.yml below for illustration:

      asymmetric:
      - encoder: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
      + encoder: paraphrase-multilingual-MiniLM-L12-v2
        cross-encoder: cross-encoder/ms-marco-MiniLM-L-6-v2
        model_directory: "~/.khoj/search/asymmetric/"

- Regenerate your content index. For example, by opening <khoj-url>/api/update?t=force
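The re-index step can also be scripted instead of opened in a browser. The following is a minimal sketch, assuming the server answers a plain GET on the update URL given above; the function names and the example address `http://localhost:42110` are illustrative and not part of Khoj.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_update_url(khoj_url: str) -> str:
    """Return the forced index-update URL for the Khoj server at `khoj_url`."""
    base = khoj_url.rstrip("/") + "/api/update"
    return base + "?" + urlencode({"t": "force"})


def regenerate_index(khoj_url: str, timeout: float = 600.0) -> int:
    """Request a forced re-index (assumed to be a GET) and return the HTTP status."""
    with urlopen(build_update_url(khoj_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical address; replace with your own <khoj-url>.
    print(build_update_url("http://localhost:42110"))
```

Calling `regenerate_index("http://localhost:42110")` would then trigger the re-index. The generous timeout is a guess, since re-encoding all notes with a new model may take a while.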
Access Khoj on Mobile
- Set up Khoj on your personal server. This can be any always-on machine, e.g. an old computer or possibly a Raspberry Pi
- Install Tailscale on your personal server and phone
- Open the Khoj web interface of the server from your phone browser. It should be http://tailscale-ip-of-server:42110 or http://name-of-server:42110 if you've set up MagicDNS
- Click the Add to Homescreen button
- Enjoy exploring your notes, documents and images from your phone!
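If the page does not load on the phone, it can help to first confirm from another device on the same Tailscale network that the server's port is reachable. This is a small sketch using only the Python standard library; the helper name is made up for illustration, and the port default simply mirrors the 42110 used in the URLs above.

```python
import socket


def khoj_is_reachable(host: str, port: int = 42110, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `khoj_is_reachable("name-of-server")` checks the MagicDNS name, and passing the Tailscale IP of the server checks the other URL form. A `True` result only shows that something is listening on the port, not that Khoj itself is healthy.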
Use OpenAI Models for Search
Setup
- Set encoder-type, encoder and model-directory under the asymmetric and/or symmetric search-type in your khoj.yml (at ~/.khoj/khoj.yml):

      asymmetric:
      - encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
      + encoder: text-embedding-ada-002
      + encoder-type: khoj.utils.models.OpenAI
        cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
      - encoder-type: sentence_transformers.SentenceTransformer
      - model_directory: "~/.khoj/search/asymmetric/"
      + model-directory: null

- Set up your OpenAI API key in Khoj
- Restart the Khoj server to generate embeddings. This will take longer than with the offline search models.
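The configuration change in the first step can be expressed as a small transformation on the parsed config. The sketch below assumes khoj.yml has already been loaded into a nested dict (for instance with a YAML library, which is left out here) and that it follows the search-type > asymmetric layout described earlier; `use_openai_encoder` is a hypothetical helper, not a Khoj function.

```python
import copy


def use_openai_encoder(config: dict, search_type: str = "asymmetric") -> dict:
    """Return a copy of `config` with `search_type` switched to the OpenAI encoder."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    section = updated["search-type"][search_type]
    section["encoder"] = "text-embedding-ada-002"
    section["encoder-type"] = "khoj.utils.models.OpenAI"
    # The diff drops the local model directory and sets model-directory to null.
    section.pop("model_directory", None)
    section["model-directory"] = None
    return updated
```

Passing `search_type="symmetric"` applies the same change to the symmetric section. Writing the result back to ~/.khoj/khoj.yml is not shown.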
Warnings
This configuration uses an online model
- It will send all notes to OpenAI to generate embeddings
- All queries will be sent to OpenAI when you search with Khoj
- You will be charged by OpenAI based on the total tokens processed
- It requires an active internet connection to search and index
Bootstrap Khoj Search for Offline Usage Later
You can bootstrap Khoj pre-emptively to run on machines that do not have internet access. An example use-case would be to run Khoj on an air-gapped machine. Note: Only search can currently run in fully offline mode, not chat.
- With Internet
  - Manually download the asymmetric text, symmetric text and image search models from HuggingFace
  - Pip install khoj (and dependencies) in an associated virtualenv. E.g.
    python -m venv .venv && source .venv/bin/activate && pip install khoj-assistant
- Without Internet
  - Copy each of the search models into their respective folders, asymmetric, symmetric and image, under the ~/.khoj/search/ directory on the air-gapped machine
  - Copy the khoj virtual environment directory onto the air-gapped machine, activate the environment and start khoj as normal. E.g.
    source .venv/bin/activate && khoj
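The model-copy step on the air-gapped machine could be scripted roughly as follows. This is a sketch under the assumption that the downloaded models were carried over in one directory containing asymmetric, symmetric and image subfolders; that layout, the function name and the example path are illustrative choices, not something Khoj prescribes.

```python
import shutil
from pathlib import Path
from typing import List

SEARCH_TYPES = ("asymmetric", "symmetric", "image")


def install_models(source_dir: Path, khoj_search_dir: Path) -> List[Path]:
    """Copy source_dir/<type> into khoj_search_dir/<type> for each search type found."""
    installed = []
    for search_type in SEARCH_TYPES:
        source = source_dir / search_type
        if not source.is_dir():
            continue  # skip search types that were not downloaded
        target = khoj_search_dir / search_type
        # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing folder.
        shutil.copytree(source, target, dirs_exist_ok=True)
        installed.append(target)
    return installed
```

For example, `install_models(Path("/media/usb/models"), Path.home() / ".khoj" / "search")` would populate the directories named above from a hypothetical USB mount point.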
