Upgrade to latest GPT4All. Use Mistral as default offline chat model

GPT4All now supports gguf llama.cpp chat models. The latest
GPT4All with Mistral responds at least 3x faster.

On a MacBook Pro, responses now start in ~10s vs 30s-120s earlier.
Mistral is also a better chat model, although it hallucinates more
than llama-2.
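
For context, a minimal sketch of loading the new default model through the gpt4all Python package; the model name comes from this commit, while the prompt and token limit are illustrative assumptions:

    from gpt4all import GPT4All

    # Downloads the gguf weights on first use, then loads them via llama.cpp
    model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")
    print(model.generate("What is Khoj?", max_tokens=64))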
Author: Debanjum Singh Solanky
Date: 2023-10-22 18:16:02 -07:00
Parent: 6dc0df3afb
Commit: 0f1ebcae18
10 changed files with 84 additions and 11 deletions


@@ -206,7 +206,7 @@ def processor_config_offline_chat(tmp_path_factory):
     # Setup conversation processor
     processor_config = ProcessorConfig()
-    offline_chat = OfflineChatProcessorConfig(enable_offline_chat=True)
+    offline_chat = OfflineChatProcessorConfig(enable_offline_chat=True, chat_model="mistral-7b-instruct-v0.1.Q4_0.gguf")
     processor_config.conversation = ConversationProcessorConfig(
         offline_chat=offline_chat,
         conversation_logfile=processor_dir.joinpath("conversation_logs.json"),


@@ -24,7 +24,7 @@ from khoj.processor.conversation.gpt4all.utils import download_model
 from khoj.processor.conversation.utils import message_to_log
-MODEL_NAME = "llama-2-7b-chat.ggmlv3.q4_0.bin"
+MODEL_NAME = "mistral-7b-instruct-v0.1.Q4_0.gguf"
 @pytest.fixture(scope="session")
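
As a hedged usage sketch (not part of this diff), the updated MODEL_NAME can be exercised for multi-turn chat through the gpt4all package's chat_session context manager; the prompts here are illustrative assumptions:

    from gpt4all import GPT4All

    MODEL_NAME = "mistral-7b-instruct-v0.1.Q4_0.gguf"

    model = GPT4All(MODEL_NAME)
    # chat_session keeps conversation history, so follow-ups retain context
    with model.chat_session():
        print(model.generate("Who are you?", max_tokens=64))
        print(model.generate("Summarize that in five words.", max_tokens=32))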