mirror of
https://github.com/khoaliber/khoj.git
synced 2026-03-03 05:29:12 +00:00
16c6bfce8e49f5c7ed6a4230b30cd8adcd7331c0
# Incoming ## Major ### Fix Prompt Size Exceeded Issue - Fix issues related to prompt size, Closes #386. Use the correct tokenizer to calculate whether the input needs to be truncated or not. ### Improve Llama 2 Model Download - Use the correct download link for LlamaV2 -- should have been using the small model, but was using the medium - Add better downloading logic to retry download if it failed, Closes #379 ### Fix Segmentation Fault due to Race - Add a lock around generating chat responses from the offline model to avoid segmentation faults. Closes #367. - Add a loading symbol to the web chat UI when the model is thinking. Closes #392 ### Improve Chat Response Latency - Improve performance of offline chat by increasing batch size (via `n_batch`) to automatically engage more cores/GPU, using smaller model and fixing prompt vs response token generation numbers. Closes #363 ### Fix Fake Dialogue Continuation - Fix formatting of user query with offline chat, this was contributing to #398 - Stop Llama 2 from Creating Fake Dialogue Continuations. Closes #398 ## Minor - Improve default message for Chat window on web when it's not configured. Include hint to use offline chat. - Add null check in `perform_chat_checks` method - Add offline chat director unit tests ## Performance Analysis (Time to First Token) | | v0.10.0 | this branch | |-|-|-| | Query 1 | 52s | 28s | | Query 2 | 33s| 42s | | Query 3 | 67s| 38s|
An AI personal assistant for your digital brain
Khoj is a desktop application to search and chat with your notes, documents and images.
It is an offline-first, open source AI personal assistant accessible from your Emacs, Obsidian or Web browser.
It works with jpeg, markdown, notion, org-mode, pdf files and github repositories.
Languages
Python
51%
TypeScript
36.1%
CSS
4.1%
HTML
3.2%
Emacs Lisp
2.4%
Other
3.1%

