Reason
---
- Simplify code and logic to stream chat response by solely relying on
asyncio event loop.
- Reduce overhead of managing threads to increase efficiency and
throughput (where possible).
Details
---
- Use async/await with no threading when generating chat response via
OpenAI, Gemini, Anthropic AI model APIs
- Use threading for offline chat model as llama-cpp doesn't support
async streaming yet
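As a rough illustration of the offline-model case above, a blocking token generator can be bridged into an async generator with one worker thread and an `asyncio.Queue`. This is a minimal sketch, not the code from this PR: `stream_from_thread`, `fake_offline_model` and `collect` are hypothetical names, and the fake model stands in for a blocking llama-cpp style generator.

```python
import asyncio
import threading
from typing import AsyncIterator, Callable, Iterator


async def stream_from_thread(make_chunks: Callable[[], Iterator[str]]) -> AsyncIterator[str]:
    """Expose a blocking generator as an async generator using one worker thread."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    done = object()  # sentinel marking the end of the stream

    def worker() -> None:
        try:
            for chunk in make_chunks():
                # asyncio.Queue is not thread-safe, so hand chunks over via the loop
                loop.call_soon_threadsafe(queue.put_nowait, chunk)
        finally:
            # Always signal completion, even if the generator raised
            loop.call_soon_threadsafe(queue.put_nowait, done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = await queue.get()
        if item is done:
            break
        yield item


def fake_offline_model() -> Iterator[str]:
    # Hypothetical stand-in for a blocking offline model token stream
    yield from ["Hello", ", ", "world"]


async def collect() -> str:
    return "".join([chunk async for chunk in stream_from_thread(fake_offline_model)])
```

Online model APIs that already expose async streaming would not need this bridge; they can be consumed directly with `async for`.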
# PR Summary
This small PR resolves the deprecation warnings on `datetime` in
Python 3.12+. You can find them in the [CI
logs](https://github.com/khoj-ai/khoj/actions/runs/14538833837/job/40792624987#step:9:134):
```python
/__w/khoj/khoj/src/khoj/processor/content/images/image_to_entries.py:61: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
timestamp_now = datetime.utcnow().timestamp()
```
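For reference, the kind of replacement the warning asks for looks roughly like this. It is a sketch rather than the exact diff in this PR, and `utc_timestamp_now` is a hypothetical helper name. `timezone.utc` is used here because it works on older Python versions too; `datetime.UTC` is an alias for it added in Python 3.11.

```python
from datetime import datetime, timezone


def utc_timestamp_now() -> float:
    """Return the current POSIX timestamp using a timezone-aware datetime."""
    # Before: datetime.utcnow().timestamp()
    # utcnow() returns a naive datetime, and .timestamp() on a naive datetime
    # interprets it as local time, so the old call could also be off by the
    # local UTC offset.
    return datetime.now(timezone.utc).timestamp()
```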
Overview
---
Enable free tier users to chat with any AI model made available on free tier
of production deployments like [Khoj cloud](https://app.khoj.dev).
Previously model switching was completely disabled for users on free tier.
Details
---
- Track price tier of each Chat, Speech, Image, Voice AI model in DB
- Update API to allow free tier users to switch between free models
- Update web app to allow model switching on agent creation, settings
and chat page (via right side pane), even for free tier users.
Previously the model switching APIs and UX fields on web app were
completely disabled for free tier users
Rely on deepthought flag to control reasoning effort of low/high for
the grok model
This is different from the OpenAI reasoning models, which support
low/medium/high and for which we use low/medium effort based on the
deepthought flag
Note: grok is accessible over an OpenAI-compatible API
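The mapping described above could be sketched as follows. The function name and the `startswith("grok")` check are assumptions made for illustration; the actual code may detect the model differently.

```python
def reasoning_effort(model_name: str, deepthought: bool) -> str:
    """Map the deepthought flag to a reasoning effort level (hypothetical helper)."""
    if model_name.lower().startswith("grok"):
        # grok supports only low/high reasoning effort
        return "high" if deepthought else "low"
    # OpenAI reasoning models support low/medium/high; low/medium are used
    return "medium" if deepthought else "low"
```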
Disregard chart type outputs, as rich chart rendering is not used
and they duplicate the chart images that are already rendered
Disregard text output associated with generated image files
Added a “Troubleshooting & Tips” section to the GCP Vertex documentation.
This section provides guidance for self-hosted users on common issues
they may encounter when setting up Google Vertex AI integration in Khoj.
Topics covered include permissions, region compatibility, prompt size
limits, API key testing, and secure key management with environment
variables. The goal is to improve the onboarding experience and reduce
setup errors for contributors and self-hosters using Vertex AI models
like Claude and Gemini.
Signed off by: brightally6@gmail.com
Server:
- Rate limit based on unverified email before creating user
- Check email address for deliverability before creating user
- Track rate limit for unverified email in new non-user keyed table
Web app:
- Show error in login popup to user on failure/throttling
- Simplify login popup logic by moving magic link handling logic
into EmailSigninContext instead of passing required props via parent
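To illustrate the server-side idea of throttling by email address before any user record exists, here is a minimal in-memory sketch. It is an assumption-laden stand-in: the change itself tracks this in a database table, and the class name, limits and sliding-window strategy below are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class EmailRateLimiter:
    """Sliding-window rate limiter keyed by email address, not by user."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: dict = defaultdict(deque)  # email -> request times

    def allow(self, email: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[email.strip().lower()]
        # Drop requests that have fallen out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Keying on the normalized email means the check can run before the user is created, which is the point of the non-user keyed table.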
- Set chatSidebar prompt, Setting name fields to empty str if value null
- Track if agent modified in chatSidebar to simplify code, fix looping
- Suppress spurious dark mode hydration warnings on the web app
- Set key for chatMessage parent to get UX efficiently updated by react
- Let only root next.js layout handle html, body tags, not child layouts
Previously the sidebar could recurse on opening chat page (from home?)
due to child modelSelector component updating parent chatSidebar prop
which was passed back down to it in a loop.
The chatSidebar decides if agent has been modified in a single
useEffect and enables the Save button accordingly.
- Track agent modification wrt agent info received from server in
chatSidebar instead.
- Reduce modelSelector's mandate to just notify
when the user changes the model.
- Fix to infer, show & update agent state from chat sidebar on web app
This logic is fragile and convoluted because:
- the default agent chat model is dynamically determined.
- need to disambiguate tools not set vs none set vs all set by user
The default agent's tool selection is stored as undefined to show
not set scenario, which allows for all tools to be dynamically
used by agent.
But the user can also set no tools or all tools for their agents.
All 3 scenarios are handled differently.
- Track tools to be displayed vs tools to be stored
This is triggered by mismatch between "dark" class present on server
sent layout but not in client sent layout on initial render.
That mismatch exists because the server applies dark-mode styling
early to avoid FOUC flickering of UX.
Related 43e032e
Remove html, body elements from child page layouts. Let only the root
layout handle it.
Next.js router structure mounts child layouts inside parent layouts,
as defined by their directory hierarchy. So the html, body component
should only be defined in the parent layout.
This avoids the child layout mounting its html, body component within
the actual root layout's existing html, body component.
Previously the chat model associated with the default agent was always
the first chat model populated on the server. This doesn't match the
behavior of the rest of the system, where the server chat settings are
preferred over the user chat settings, which are preferred over the
first chat model.
This change brings the default agent's chat model in line with the
preference order used in the rest of the system.
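The preference order can be sketched as a simple fallback chain. The function and parameter names are hypothetical, and plain strings stand in for the chat model records.

```python
from typing import Optional, Sequence


def pick_default_chat_model(
    server_default: Optional[str],
    user_default: Optional[str],
    all_models: Sequence[str],
) -> Optional[str]:
    """Prefer server chat settings, then user chat settings, then the first model."""
    if server_default is not None:
        return server_default
    if user_default is not None:
        return user_default
    return all_models[0] if all_models else None
```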
Previous change to fallback to default agent was not functional. It
would error out if the conversation agent wasn't set when trying to
get conversation.agent.slug for calling aget_agent_by_slug func
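The failure mode is an attribute access on a missing agent. A guarded lookup along these lines avoids it; this is only a sketch with stand-in dataclasses and a hypothetical helper name, not the actual model classes or fix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    slug: str


@dataclass
class Conversation:
    agent: Optional[Agent] = None


def agent_slug_or_default(conversation: Conversation, default_slug: str) -> str:
    """Return the conversation agent's slug, or the default slug if no agent is set."""
    # Accessing conversation.agent.slug directly raises AttributeError when agent is None
    return conversation.agent.slug if conversation.agent is not None else default_slug
```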
We were previously relying on an older, unmaintained version of
pgvector docker image, ankane/pgvector.
Moving to new docker image requires selecting from tags based on the
pg major version (14, 15, 16 or 17).
This change uses pg15 tag to resolve image pull.
Note: we use postgres 15 for khoj docker images currently
Fixes #1154
Issue introduced in commit 5a3c7b1.
Usage of KHOJ_DOMAIN
---
KHOJ_DOMAIN is tri-state for local, official and other production deployments:
- If KHOJ_DOMAIN is unset (for local):
  - sets CSRF cookie to localhost
  - adds khoj.dev variants to ALLOWED_HOSTS, CSRF_TRUSTED_ORIGINS
  - adds app.khoj.dev variants to CORS origins
- If KHOJ_DOMAIN is set to empty (for official):
  - sets CSRF cookie to khoj.dev
  - adds khoj.dev variants to ALLOWED_HOSTS, CSRF_TRUSTED_ORIGINS
  - adds app.khoj.dev variants to CORS origins
- If KHOJ_DOMAIN is set (for other prod deployments):
  - sets CSRF cookie to KHOJ_DOMAIN
  - adds KHOJ_DOMAIN variants to ALLOWED_HOSTS, CSRF_TRUSTED_ORIGINS
  - adds KHOJ_DOMAIN variants to CORS origins
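The key detail in the three cases above is that an unset variable and an empty variable are treated differently. A minimal sketch of that distinction, assuming a hypothetical `resolve_domains` helper that returns only the CSRF cookie domain and the base domain used for ALLOWED_HOSTS and CSRF_TRUSTED_ORIGINS:

```python
from typing import Mapping, Tuple


def resolve_domains(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (csrf_cookie_domain, trusted_base_domain) from KHOJ_DOMAIN."""
    domain = env.get("KHOJ_DOMAIN")  # None when unset, "" when set to empty
    if domain is None:
        # Local: cookie on localhost, khoj.dev variants still trusted
        return ("localhost", "khoj.dev")
    if domain == "":
        # Official deployment
        return ("khoj.dev", "khoj.dev")
    # Other production deployments
    return (domain, domain)
```

Passing `os.environ` as `env` would apply this to the real environment. Note that a plain truthiness check such as `if not domain` would collapse the unset and empty cases into one.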
Related #1137, #1152
Resolves #1123
Unsure why this error triggers on every request to the Django admin
panel these days but all the requests are completing fine and the
client is clearly not aborting the request when the RequestAborted
exception is raised.
Suppress these errors for now via middleware to prevent them from
unnecessarily cluttering up the server logs and confusing folks.
Related #1152