- Process chat history in default order instead of processing it in
reverse. This makes context construction more legible, at the cost of
a minor performance hit from dropping messages from the front of the
list.
- Handle multiple system messages by collating them into a list
- Remove logic to drop the system role for gemma-2 and o1 models. It
is better to make the code more readable than to support old models.
- Cache the last Anthropic message, given research mode now uses the
function calling paradigm and not the old research mode structure.
- Cache tool definitions passed to anthropic models
- Stop dropping the first message if it is by the assistant, as the
Anthropic API doesn't seem to complain about it any more.
- Drop the tool result when its tool call is truncated, as a tool
result without its tool call is an invalid state
- Do not truncate tool use message content, just drop the whole tool
use message.
AI model APIs need tool use assistant message content in a specific
form (e.g. with thinking etc.), so dropping content items breaks the
expected tool use message content format.
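
A minimal sketch of the truncation behaviour described in the list
above, under stated assumptions: the Message class and truncate
function are hypothetical names, not the actual implementation, and
character counts stand in for real token counting. It walks the
history in default order, keeps system messages collated in a list,
drops the oldest messages from the front, and drops a tool use message
whole along with its tool result.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    # Hypothetical message shape, for illustration only
    role: str  # "system", "user", "assistant" or "tool"
    content: str
    tool_call_id: Optional[str] = None  # set on tool use and tool result messages


def size(messages: List[Message]) -> int:
    # Assumption: character count is a stand-in for a token count
    return sum(len(m.content) for m in messages)


def truncate(messages: List[Message], max_chars: int) -> List[Message]:
    """Drop the oldest non-system messages until the history fits max_chars."""
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]
    while rest and size(system) + size(rest) > max_chars:
        dropped = rest.pop(0)  # drop from the front of the list
        if dropped.role == "assistant" and dropped.tool_call_id:
            # A tool result without its tool call is an invalid state,
            # so drop the matching tool result as well
            rest = [
                m
                for m in rest
                if not (m.role == "tool" and m.tool_call_id == dropped.tool_call_id)
            ]
    return system + rest
```

Message content is never trimmed in this sketch. A message is either
kept intact or dropped whole, so tool use content keeps the form the
model API expects.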
Handle tool use scenarios where iteration query isn't set for retry
The fix for the issue is in tenacity 9.0.0, but older langchain
required tenacity <9.0.0.
Explicitly pin versions of langchain sub-packages to avoid indexing
and doc parsing breakage.
Gemini doesn't work well when trying to output JSON objects. Using it
to output raw JSON strings with complex, multi-line structures
requires more intense clean-up of the raw JSON string before parsing.
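
As an illustration only, the kind of clean-up this refers to might look
like the sketch below. The function names clean_json and load_json are
hypothetical and the actual clean-up may differ. It assumes the
response holds one JSON value, possibly wrapped in code fences or
surrounding chatter, and that multi-line strings may contain literal
newlines.

```python
import json


def clean_json(raw: str) -> str:
    """Keep only the span from the first opening bracket to the last closing one."""
    text = raw.strip()
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    end = max(text.rfind("}"), text.rfind("]"))
    if not starts or end == -1:
        return text  # nothing that looks like JSON, leave as-is
    # Code fences and chatter around the JSON fall outside this slice
    return text[min(starts) : end + 1]


def load_json(raw: str):
    # strict=False lets the parser accept literal control characters,
    # such as raw newlines, inside JSON strings
    return json.loads(clean_json(raw), strict=False)
```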
GPT-4o-mini is cheaper, smarter and can hold more context than
GPT-3.5-turbo. In production, we also default to gpt-4o-mini, so it
makes sense to upgrade defaults and tests to work with it.
Previously we assumed the system prompt was always passed as the
first message, so we expected at least 2 messages in logs. This broke
chat actors querying with a single long non-system message.
A more robust way to extract the system prompt is via the message
role instead.
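
A minimal sketch of role-based extraction, assuming messages are dicts
with "role" and "content" keys. The function name
extract_system_prompt is hypothetical, not the actual helper.

```python
from typing import Dict, List, Tuple


def extract_system_prompt(messages: List[Dict[str, str]]) -> Tuple[str, List[Dict[str, str]]]:
    """Split messages into a joined system prompt and the remaining messages."""
    # Select by role rather than by position in the list
    system = [m["content"] for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    return "\n".join(system), rest
```

With a single non-system message this returns an empty system prompt
and the message untouched, instead of treating the first message as
the system prompt.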