It is recommended to chat with open-source models by running an
open-source server like Ollama, Llama.cpp on your GPU powered machine
or use a commercial provider of open-source models like DeepInfra or
OpenRouter.
These chat model serving options provide a mature Openai compatible
API that already works with Khoj.
Directly using offline chat models only worked reasonably with pip
install on a machine with GPU. Docker setup of khoj had trouble with
accessing GPU. And without GPU access offline chat is too slow.
Deprecating support for an offline chat provider directly from within
Khoj will reduce code complexity and increase developement velocity.
Offline models are subsumed to use existing Openai ai model provider.
There seems to be a more standard mechanism of specifying launch.json
params for devcontainers. Previous mechanism to write launch.json to
.vscode/launch.json in post creation step does not work.
Improve default launch.json to include khoj admin username, password
with placeholder values to get started with local development faster.
Define dockerfile for devcontainer to pre-built server, web app
dependencies during dev container image creation stage. So install on
dev container startup is sped up as no need to install dependencies.