mirror of
https://github.com/khoaliber/khoj.git
synced 2026-03-02 21:19:12 +00:00
Use url fragment schema for deep link URIs, borrowing from URL/PDF schemas. E.g file:///path/to/file.txt#line=<line_no>&#page=<page_no> Compute line number during (recursive) org-mode entry chunking. Thoroughly test line number in URI maps to line number of chunk in actual org mode file. This deeplink URI with line number is passed to llm as context to better combine with line range based view file tool. Grep tool already passed matching line number. This change passes line number in URIs of org entries matched by the semantic search tool
48 lines
2.1 KiB
Org Mode
Vendored
48 lines
2.1 KiB
Org Mode
Vendored
* Khoj
|
|
/Allow natural language search on user content like notes, images using transformer based models/
|
|
|
|
All data is processed locally. User can interface with khoj app via [[./interface/emacs/khoj.el][Emacs]], API or Commandline
|
|
|
|
** Dependencies [[id:123-421-121-12]] :TAG1:@TAG1_1:
|
|
- Python3
|
|
- [[https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links][Miniconda]]
|
|
|
|
** Install
|
|
#+begin_src shell
|
|
git clone https://github.com/khoj-ai/khoj && cd khoj
|
|
conda env create -f environment.yml
|
|
conda activate khoj
|
|
#+end_src
|
|
|
|
** Run
|
|
Load ML model, generate embeddings and expose API to query specified org-mode files
|
|
|
|
#+begin_src shell
|
|
python3 main.py --input-files ~/Notes/Schedule.org ~/Notes/Incoming.org --verbose
|
|
#+end_src
|
|
|
|
** Use
|
|
*** *Khoj via Emacs* [[https://khoj.dev][link to khoj website]] :@EMACS:CLIENT_1:KHOJ:
|
|
- [[https://github.com/khoj-ai/khoj/tree/master/interface/emacs#installation][Install]] [[./interface/emacs/khoj.el][khoj.el]]
|
|
- Run ~M-x khoj <user-query>~ or Call ~C-c C-s~
|
|
|
|
*** *Khoj via API*
|
|
- Query: ~GET~ [[http://localhost:42110/api/search?q=%22what%20is%20the%20meaning%20of%20life%22][http://localhost:42110/api/search?q="What is the meaning of life"]]
|
|
- Update Index: ~GET~ [[http://localhost:42110/api/update][http://localhost:42110/api/update]]
|
|
- [[http://localhost:42110/docs][Khoj API Docs]]
|
|
|
|
*** *Call Khoj via Python Script Directly*
|
|
#+begin_src shell
|
|
python3 search_types/asymmetric.py \
|
|
--compressed-jsonl .notes.jsonl.gz \
|
|
--embeddings .notes_embeddings.pt \
|
|
--results-count 5 \
|
|
--verbose \
|
|
--interactive
|
|
#+end_src
|
|
|
|
** Acknowledgments
|
|
- [[https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1][MiniLM Model]] for Asymmetric Text Search. See [[https://www.sbert.net/examples/applications/retrieve_rerank/README.html][SBert Documentation]]
|
|
- [[https://github.com/openai/CLIP][OpenAI CLIP Model]] for Image Search. See [[https://www.sbert.net/examples/applications/image-search/README.html][SBert Documentation]]
|
|
- Charles Cave for [[http://members.optusnet.com.au/~charles57/GTD/orgnode.html][OrgNode Parser]]
|