Not sure why, but in some cases when interacting with o3 (which needs
non-streaming requests) the stream_options argument seems to be set.
Cannot reproduce, but explicitly dropping stream_options should
hopefully resolve this issue.
Related 985a98214
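A minimal sketch of the fix, assuming the request kwargs are assembled as a plain dict before being passed to the OpenAI client (the helper name is illustrative, not the actual Khoj code):

```python
def clean_request_kwargs(model_kwargs: dict, stream: bool) -> dict:
    """Drop stream_options when not streaming, as models like o3
    reject it on non-streaming requests."""
    cleaned = dict(model_kwargs)
    if not stream:
        # stream_options is only valid alongside stream=True
        cleaned.pop("stream_options", None)
    return cleaned
```

For example, `clean_request_kwargs({"stream_options": {"include_usage": True}, "temperature": 1}, stream=False)` returns just `{"temperature": 1}`.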
Older packages (like 1.84.0) seem to always pass the reasoning_effort
argument to the OpenAI API, which now throws an unexpected request
argument error when used with non-reasoning models (like 4o-mini).
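The gating can be sketched like this; the prefix-based model check is an assumption for illustration, real reasoning-model detection may differ:

```python
def build_chat_kwargs(model_name: str, reasoning_effort: str = "medium") -> dict:
    """Only include reasoning_effort for reasoning models; non-reasoning
    models (like gpt-4o-mini) reject the argument."""
    # Hypothetical check: treat o-series models as reasoning models
    is_reasoning_model = model_name.startswith(("o1", "o3", "o4"))
    kwargs = {"model": model_name}
    if is_reasoning_model:
        kwargs["reasoning_effort"] = reasoning_effort
    return kwargs
```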
There had been a regression that made all agents display the default
chat model instead of the chat model actually associated with the agent.
This change resolves that issue by prioritizing the agent-specific chat
model from the DB (over the user or server chat model).
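The prioritization amounts to a simple fallback chain (function and argument names are illustrative):

```python
def resolve_chat_model(agent_model, user_model, server_model):
    """Pick the agent-specific chat model first, then the user's
    configured model, then the server default."""
    return agent_model or user_model or server_model
```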
- Fix the code context data type for validation on the server. The
  invalid type would prevent the chat message from being written to history.
- Handle null code results in the web app.
We now pass deeply typed chat messages throughout the application to
construct tool-specific chat history views since 05d4e19cb.
That ChatMessageModel didn't allow intent.query to be unset, but
interrupted research iteration history can have an unset query. This
change makes intent.query optional.
It also uses the message-by-user entry to populate the user message in
tool chat history views. Using the query from the Khoj intent was an
earlier shortcut to avoid dealing with the message by user, but that
doesn't scale to the current scenario, where turns are not always
required to have a single user, assistant message pair.
Specifically, a chat history can now contain multiple user messages
followed by a single Khoj message. The new change constructs a chat
history that handles this scenario naturally and makes the code more
readable.
Also, only previous research iterations that completed are now
populated; incomplete ones do not serve much purpose.
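A simplified sketch of the model change and the history construction, using dataclasses; the field names (`by`, "you"/"khoj") mirror Khoj's message shape but the exact types here are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Intent:
    type: str = "remember"
    # Now optional: interrupted research iterations may lack a query
    query: Optional[str] = None

@dataclass
class ChatMessageModel:
    by: str  # "you" or "khoj"
    message: str = ""
    intent: Optional[Intent] = None

def to_history_view(messages: list[ChatMessageModel]) -> list[dict]:
    """Build user turns from the message-by-user entries themselves
    (instead of reading the query off the Khoj intent), so multiple
    user messages before a single khoj reply are handled naturally."""
    return [
        {"role": "user" if m.by == "you" else "assistant", "content": m.message}
        for m in messages
    ]
```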
Remove unhelpful slash commands to make the chat API more maintainable.
- App version and chat model, shown via /help, are visible in other
  parts of the UX. Asking help questions with the site:docs.khoj.dev
  filter isn't used or known to folks.
- /summarize is esoterically tuned and should be rewritten if added
  back. It wasn't being used by /research anyway.
- Automations can be configured via the UX. The command wasn't being
  shown in the UX anyway.
The chat actor (and director) tests hadn't been looked at in a long
while. They'd gone stale in how they called the functions and in what
was required to run them. Now the online chat actor tests work again.
Using model-specific extract questions was an artifact from older
times, with less guidable models.
The new changes collate and reuse logic:
- Rely on send_message_to_model_wrapper for model-specific formatting.
- Use the same prompt and context for all LLMs, as they can handle prompt variation.
- Use the response schema enforcer to ensure response consistency across models.
Extract questions (because of its age) was the only tool implemented
directly within each provider's code. Move it into helpers so all the
(mini) tools live in one place.
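One way the shared schema-enforced response might be consumed; the schema and parser below are illustrative, not Khoj's actual helpers:

```python
import json

# Illustrative response schema for extract-questions, shared across providers
QUESTIONS_SCHEMA = {
    "type": "object",
    "properties": {"queries": {"type": "array", "items": {"type": "string"}}},
    "required": ["queries"],
}

def parse_extracted_questions(raw_response: str, fallback_query: str) -> list[str]:
    """Parse the schema-enforced JSON response uniformly for all models;
    fall back to the original query on malformed output."""
    try:
        parsed = json.loads(raw_response.strip())
    except json.JSONDecodeError:
        return [fallback_query]
    queries = parsed.get("queries", []) if isinstance(parsed, dict) else []
    if queries and all(isinstance(q, str) for q in queries):
        return queries
    return [fallback_query]
```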
- Rename GET /api/automations to GET /api/automation
- Rename POST /api/trigger/automation to POST /api/automation/trigger
- Update calls to the automations API from the web app.
- Add context based on the information provided rather than conversation
  commands. Let the caller handle passing appropriate context to the AI
  provider converse methods.
Increase the timeout to 180s (from the previous 120s) and the graceful
timeout to 90s (from the 30s default) to reduce timeouts on
long-running requests.
Increase the default number of gunicorn workers and make it
configurable to better utilize available (v)CPUs. This is configured
manually (instead of using multiprocessing.cpu_count()) as
VMs/containers may report the CPU count of the host machine instead of
that of the VM/container itself.
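A sketch of what the gunicorn config could look like; the env var name and default worker count are assumptions for illustration:

```python
# gunicorn.conf.py sketch
import os

# multiprocessing.cpu_count() can report the host's CPUs inside a
# VM/container, so default to a fixed value and let deployments override it.
workers = int(os.getenv("KHOJ_GUNICORN_WORKERS", "4"))

timeout = 180          # hard timeout, up from 120s
graceful_timeout = 90  # graceful timeout, up from the 30s default
```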
The chat dictionary is an artifact from the earlier non-DB chat history
storage. We've been ensuring new chat messages have a valid type before
being written to the DB for more than 6 months now.
Moving to the deeply typed chat history helps avoid null refs and makes
the code more readable and easier to reason about.
Next steps:
The current update entangles the chat_history written to the DB with
any virtual chat history messages generated for intermediate steps. The
chat message type written to the DB should (maybe) be decoupled from
the type that can be passed to AI model APIs.
For now we've made the ChatMessage.message type looser to allow a
list[dict] type (apart from string). But later it may be a good idea
to decouple the chat_history received by send_message_to_model from
the chat_history saved to the DB (which can then keep its stricter type check).
- Convert the response schema into an Anthropic tool call definition.
- Works with simple enums, without needing to rely on $defs, $refs,
  which are unsupported by the Anthropic API.
- Do not force specific tool use, as that is not supported with deep thought.
This puts Anthropic models on parity with OpenAI and Gemini models for
response schema following, and reduces the need for complex JSON
response parsing on Khoj's end.
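The conversion boils down to wrapping a flat JSON schema (enums inlined, no $defs/$refs) in the Anthropic tool shape; the tool name below is illustrative:

```python
def schema_to_anthropic_tool(name: str, description: str, schema: dict) -> dict:
    """Wrap a flat JSON schema into an Anthropic tool definition so the
    structured response arrives as a tool call instead of free-form JSON."""
    return {"name": name, "description": description, "input_schema": schema}

# With deep thought (extended thinking) enabled, a specific tool cannot be
# forced, so leave tool choice on auto rather than pinning this tool.
TOOL_CHOICE = {"type": "auto"}
```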