Debanjum 16c6bfce8e Improve Quality and Reliability of Offline Chat (#393)
# Incoming
## Major
### Fix Prompt Size Exceeded Issue
- Fix prompt size exceeded errors. Use the correct tokenizer to calculate whether the input needs to be truncated. Closes #386
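
The truncation check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `truncate_to_token_budget`, `whitespace_token_count` and `max_prompt_tokens` are hypothetical names, and the whitespace counter only stands in for the chat model's real tokenizer.

```python
from typing import Callable, List


def truncate_to_token_budget(
    messages: List[str],
    count_tokens: Callable[[str], int],
    max_prompt_tokens: int,
) -> List[str]:
    """Keep the newest messages whose combined token count fits the budget."""
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the most recent context survives truncation.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_prompt_tokens:
            break
        kept.append(message)
        used += cost
    # Restore chronological order before building the prompt.
    return list(reversed(kept))


def whitespace_token_count(text: str) -> int:
    """Stand-in tokenizer (assumption): one token per whitespace-separated word."""
    return len(text.split())
```

The point of passing `count_tokens` in is that the budget is only meaningful when the count comes from the same tokenizer the model uses. A mismatched tokenizer can undercount and let an oversized prompt through.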

### Improve Llama 2 Model Download
- Use the correct download link for Llama 2. It should have been pointing to the small model, but was pointing to the medium one
- Add better download logic to retry the download if it fails. Closes #379
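
A retry loop of the kind described above might look like the sketch below. It assumes a caller-supplied `download` callable and exponential backoff between attempts. The function name, parameters and backoff policy are illustrative assumptions, not the logic added in this change.

```python
import time
from typing import Callable, Optional


def download_with_retry(
    download: Callable[[], bytes],
    max_attempts: int = 3,
    base_delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call `download` until it succeeds or `max_attempts` is exhausted."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return download()
        except OSError as error:  # covers ConnectionError, TimeoutError, etc.
            last_error = error
            if attempt < max_attempts:
                # Back off exponentially: 1x, 2x, 4x the base delay.
                sleep(base_delay_seconds * 2 ** (attempt - 1))
    raise RuntimeError(f"download failed after {max_attempts} attempts") from last_error
```

Injecting `sleep` keeps the helper easy to test without waiting on real delays.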

### Fix Segmentation Fault due to Race
- Add a lock around generating chat responses from the offline model to avoid segmentation faults. Closes #367.
- Add a loading symbol to the web chat UI when the model is thinking. Closes #392
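
The locking approach above can be illustrated with a small wrapper that serializes calls into a model that is not safe to use from several threads at once. `SerializedChatModel` and its `generate` callable are hypothetical stand-ins for the offline model interface, which the notes above do not show.

```python
import threading
from typing import Callable


class SerializedChatModel:
    """Allow only one thread at a time to generate a response."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._lock = threading.Lock()

    def respond(self, prompt: str) -> str:
        # Concurrent requests queue here instead of entering the model together.
        with self._lock:
            return self._generate(prompt)
```

The trade-off is that requests are answered one at a time, which is presumably why a loading indicator in the chat UI is useful while a response is pending.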

### Improve Chat Response Latency
- Improve performance of offline chat by increasing the batch size (via `n_batch`) to automatically engage more cores/GPU, using a smaller model, and fixing the prompt vs response token generation numbers. Closes #363

### Fix Fake Dialogue Continuation
- Fix formatting of the user query with offline chat. This was contributing to #398
- Stop Llama 2 from creating fake dialogue continuations. Closes #398
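
The notes above do not say how the fake continuations are stopped. One common technique is to cut the response at the first marker that signals the model has started writing the next speaker's turn. The sketch below shows that idea only. The function name and the example stop strings are assumptions, not the markers used in this change.

```python
from typing import Sequence


def truncate_at_stop_sequence(response: str, stop_sequences: Sequence[str]) -> str:
    """Return the response up to the earliest stop sequence, if any occurs."""
    cut = len(response)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop string would match at index 0
        index = response.find(stop)
        if index != -1:
            cut = min(cut, index)
    return response[:cut].rstrip()
```

For example, with the hypothetical stop strings `"\nHuman:"` and `"\nAI:"`, a response that drifts into an invented follow-up question is trimmed to the first answer only.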

## Minor
- Improve default message for Chat window on web when it's not configured. Include hint to use offline chat.
- Add null check in `perform_chat_checks` method
- Add offline chat director unit tests

## Performance Analysis (Time to First Token)
| Query | v0.10.0 | this branch |
|-|-|-|
| Query 1 | 52s | 28s |
| Query 2 | 33s | 42s |
| Query 3 | 67s | 38s |
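
To read the table as relative changes, the short script below computes the per-query percentage change and the mean time to first token from the three measurements above. It uses only the numbers in the table. Note that Query 2 is slower on this branch, and three queries are too few to treat the averages as more than indicative.

```python
from statistics import mean

# (v0.10.0, this branch) time to first token in seconds, copied from the table.
timings = {
    "Query 1": (52, 28),
    "Query 2": (33, 42),
    "Query 3": (67, 38),
}


def percent_change(before: float, after: float) -> float:
    """Negative values mean the branch is faster than v0.10.0."""
    return (after - before) / before * 100


for query, (before, after) in timings.items():
    print(f"{query}: {percent_change(before, after):+.1f}%")

mean_before = mean(before for before, _ in timings.values())
mean_after = mean(after for _, after in timings.values())
print(f"Mean: {mean_before:.1f}s -> {mean_after:.1f}s")
```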
2023-08-01 22:07:27 -07:00

An AI personal assistant for your digital brain


Khoj is a desktop application to search and chat with your notes, documents and images.
It is an offline-first, open source AI personal assistant accessible from your Emacs, Obsidian or Web browser.
It works with JPEG, Markdown, Notion, Org-mode and PDF files, and with GitHub repositories.


| 🔎 Search | 💬 Chat |
|-|-|
| Quickly retrieve relevant documents using natural language | Get answers and create content from your existing knowledge base |
| Does not need internet | Can be configured to work without internet |
Readme AGPL-3.0 116 MiB
Languages
Python 51%
TypeScript 36.1%
CSS 4.1%
HTML 3.2%
Emacs Lisp 2.4%
Other 3.1%