klbr/khoj - khoj - Gitea: Git with a cup of tea

klbr/khoj

mirror of https://github.com/khoaliber/khoj.git synced 2026-03-02 21:19:12 +00:00

Author	SHA1	Message	Date
Debanjum	bdb6e33108	Install pgserver only when `pip install khoj[local]' is enabled This avoids installing pgserver on linux arm64 docker builds, which it doesn't currently support and isn't required to support as Khoj docker images can use standard postgres server made available via our docker-compose.yml	2025-03-29 00:27:19 +05:30
Debanjum	5ee513707e	Use embedded postgres db to simplify self-hosted setup (#1141 ) Use pgserver python package as an embedded postgres db, installed directly as a khoj python package dependency. This significantly simplifies self-hosting with just a `pip install khoj'. No need to also install postgres separately. Still use standard postgres server for multi-user, production use-cases.	2025-03-29 00:03:55 +05:30
Debanjum	55ae0eda7a	Upgrade package dependencies nextjs for web app and torch on server	2025-03-23 17:10:40 +05:30
Debanjum	f2b438145f	Upgrade sentence-transformers. Avoid transformers v4.50.0 as problematic - The 3.4.1 release of sentence tranformer fixes offline load latency of sentence transformer models (and Khoj) by avoiding call to HF - The 4.50.0 release of transformers is resulting in jax error (unexpected keyword argument 'flatten_with_keys') on load.	2025-03-23 09:02:57 +05:30
Debanjum	510cbed61c	Make google auth package dependency explicit to simplify code Previously google auth library was explicitly installed only for the cloud variant of Khoj to minimize packages installed for non production use-cases. But it was being implicitly installed as a dependency of an explicit package in the default installation anyway. Making the dependency on google auth package explicit simplifies the conditional import of google auth in code while not incurring any additional cost in terms of space or complexity.	2025-03-23 09:02:57 +05:30
Debanjum	79816d2b9b	Upgrade package dependencies of server, clients and docs	2025-03-12 00:22:08 +05:30
Debanjum	bdfa6400ef	Upgrade to new Gemini package to interface with Google AI	2025-03-11 22:18:07 +05:30
Debanjum	50f71be03d	Support Claude 3.7 and use its extended thinking in research mode Claude 3.7 Sonnet is Anthropic's first reasoning model. It provides a single model/api capable of standard and extended thinking. Utilize the extended thinking in Khoj's research mode. Increase default max output tokens to 8K for Anthropic models.	2025-03-11 21:27:59 +05:30
Debanjum	45fb85f1df	Add E2B as an optional code sandbox provider - Specify E2B api key and template to use via env variables - Try load, use e2b library when E2B api key set - Fallback to try use terrarium sandbox otherwise - Enable more python packages in e2b sandbox like rdkit via custom e2b template - Use Async E2B Sandbox - Parallelize file IO with sandbox - Add documentation on how to enable E2B as code sandbox instead of Terrarium	2025-03-09 18:23:30 +05:30
Debanjum	6e955e158b	Use normalized email address for new users Not check email deliverability for now to allow air-gapped usage or authenticated/multi-user setups with admin managed otp Closes #1069	2025-01-11 12:28:40 +07:00
Debanjum	1a43ca75f3	Update to latest jinja python package dependency	2024-12-27 01:44:41 -08:00
sabaimran	3b050a33bb	Include resend as a default dependency, rather than restricting to prod	2024-12-16 22:24:41 -08:00
Debanjum	4bc5c1357a	Upgrade server, documentation dependencies. Spell fix docker-compose.yml	2024-12-10 15:47:47 -08:00
Debanjum	354dc12b3b	Style the Admin Panel with a modern theme and Khoj branding (#999 ) Overview - The default django admin panel UI looks pretty dated and didn't have any Khoj specific branding - Used the Unfold Django admin panel theme for a modern look - Used the Khoj logo and name in Admin panel title, headings, favicons Details: All models shown on Admin panel need to inherit from unfold's ModelAdmin to get styling applied. So - Make all models on Admin panel inherit from unfold's ModelAdmin - Subclassed UserAdmin to inherit from unfold's ModelAdmin - Deregistered the unused Auth Group model from the Admin panel We can add it back when its actually used. Avoid confusion for now - Explicitly register DjangoJobExecution on admin panel and again make it inherit from the unfold.admin.ModelAdmin	2024-12-04 23:53:43 -08:00
Debanjum	8c120a5139	Fallback to json5 loader if json.loads cannot parse complex json str JSON5 spec is more flexible, try to load using a fast json5 parser if the stricter json.loads from the standard library can't load the raw complex json string into a python dictionary/list	2024-11-26 21:17:00 -08:00
Debanjum	96904e0769	Add script to evaluate khoj on Google's FRAMES benchmark Google's FRAMES benchmark evaluates multi-step retrieval and reasoning capabilities of an agent. The script uses Gemini as an LLM Judge to evaluate Khoj responses to the FRAMES benchmark prompts against the ground truth provided by it.	2024-11-02 04:57:42 -07:00
Debanjum	31b5fde163	Only enable prompt tracer if git python is installed	2024-11-02 02:07:02 -07:00
sabaimran	8d1b1bc78e	Move the git python dependency into top level dependencies	2024-11-01 22:51:00 -07:00
Debanjum	b3a63017b5	Support setting seed for reproducible LLM response generation Anthropic models do not support seed. But offline, gemini and openai models do. Use these to debug and test Khoj via KHOJ_LLM_SEED env var	2024-10-30 14:00:21 -07:00
Debanjum Singh Solanky	10c8fd3b2a	Save conversation traces to git for visualization	2024-10-26 04:59:19 -07:00
sabaimran	db959a504d	Fix the version of pymupdf to avert build errors	2024-10-21 12:56:51 -07:00
Debanjum Singh Solanky	7ebfc24a96	Upgrade Django version used by Khoj server	2024-10-17 11:58:52 -07:00
Debanjum Singh Solanky	9b10b3e7a1	Remove unused langchain openai server dependency	2024-09-29 04:06:35 -07:00
Debanjum Singh Solanky	077b88bafa	Make RapidOCR dependency optional as flaky requirements RapidOCR depends on OpenCV which by default requires a bunch of GUI paramters. This system package dependency set (like libgl1) is flaky Making the RapidOCR dependency optional should allow khoj to be more resilient to setup/dependency failures Trade-off is that OCR for documents may not always be available and it'll require looking at server logs to find out when this happens	2024-09-19 15:10:31 -07:00
Alexander Matyasko	9570933506	Support Google's Gemini model series (#902 ) * Add functions to chat with Google's gemini model series * Gracefully close thread when there's an exception in the gemini llm thread * Use enums for verifying the chat model option type * Add a migration to add the gemini chat model type to the db model * Fix chat model selection verification and math prompt tuning * Fix extract questions method with gemini. Enforce json response in extract questions. * Add standard stop sequence for Gemini chat response generation --------- Co-authored-by: sabaimran <narmiabas@gmail.com> Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-09-12 18:17:55 -07:00
Debanjum Singh Solanky	72fbbc092c	Upgrade Django, FastAPI, Uvicorn packages - Update Django to 5.0.8 - Update Uvicorn to 0.30.6 - Update FastAPI minimum versions to 0.110.0	2024-09-09 10:40:53 -07:00
sabaimran	7b8b3a66ae	Revert django version to previous patch	2024-08-23 11:12:41 -07:00
Debanjum Singh Solanky	a60baa55fb	Upgrade Django, a Khoj server dependency, to version 5.0.8	2024-08-20 12:32:00 -07:00
Debanjum Singh Solanky	acdc3f9470	Unwrap any json in md code block, when parsing chat actor responses This is a more robust way to extract json output requested from gemma-2 (2B, 9B) models which tend to return json in md codeblocks. Other models should remain unaffected by this change. Also removed request to not wrap json in codeblocks from prompts. As code is doing the unwrapping automatically now, when present	2024-08-16 14:16:29 -05:00
Debanjum Singh Solanky	1cdfa8087c	Update Khoj tagline to "Your Second Brain"	2024-08-05 02:27:05 +05:30
Debanjum Singh Solanky	53eabe0c06	Support Gemma 2 for Offline Chat - Pass system message as the first user chat message as Gemma 2 doesn't support system messages - Use gemma-2 chat format - Pass chat model name to generic, extract questions chat actors Used to figure out chat template to use for model For generic chat actor argument was anyway available but not being passed, which is confusing	2024-07-18 03:09:38 +05:30
Debanjum Singh Solanky	583fa3c188	Migrate the pypi package to khoj project name. Update references - Deprecate khoj-assistant pypi package. Use more accurate and succinct pypi project name, khoj - Update references to sye khoj pypi package in docs and code instead of the legacy khoj-assistant pypi package - Update pypi workflow to publish to both khoj, khoj-assistant for now - Update stale python 3.9 support mentioned in our pyproject. Can't support python 3.9 as depend on latest django which support >=3.10	2024-07-17 10:41:16 +05:30
Debanjum Singh Solanky	02658ad4fd	Upgrade Django version	2024-07-11 16:35:10 +05:30
sabaimran	260aa61818	Remove tests for python3.9	2024-07-09 12:28:11 +05:30
sabaimran	4471c1e37f	Apply mitigations for piling up open connections - Because we're using a FastAPI api framework with a Django ORM, we're running into some interesting conditions around connection pooling and clean-up. We're ending up with a large pile-up of open, stale connections to the DB recurringly when the server has been running for a while. To mitigate this problem, given starlette and django run in different python threads, add a middleware that will go and call the connection clean up method in each of the threads.	2024-07-09 12:22:58 +05:30
Debanjum Singh Solanky	a353d883a0	Make it optional to set the encoder, cross-encoder configs via admin UI Upgrade sentence-transformer, add einops dependency for some sentence transformer models like nomic	2024-07-05 16:09:30 +05:30
Debanjum Singh Solanky	0d04018622	Install pydantic with optional email validator package Otherwise Khoj fails on startup. Not sure why, must be new changes to pydantic?	2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky	22f6db0a6b	Upgrade RapidOCR and enable for Python 3.12. Fix PDF OCR test	2024-06-22 16:01:55 +05:30
Debanjum Singh Solanky	55a23eae25	Upgrade pillow to fix pytest workflow failure	2024-06-22 15:17:43 +05:30
Raghav Tirumale	bd3b590153	Support Indexing Docx Files (#801 ) * Add support for indexing docx files and associated unit tests --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-06-20 11:18:01 +05:30
sabaimran	a57e1e7a14	Fix langchain, tenacity versions	2024-06-17 14:52:11 +05:30
sabaimran	ce9c14f894	Fix more packages related to langchain in the pyproject.toml	2024-06-17 14:38:05 +05:30
Debanjum Singh Solanky	179c70dba8	Upgrade Khoj llama-cpp, django and jinja dependencies	2024-06-04 09:05:53 +05:30
sabaimran	4aac84e1c1	Pin rsesend verison in pyproject.toml	2024-05-30 07:05:11 +05:30
sabaimran	01cdc54ad0	Add support for Anthropic models (#760 ) * Add support for chatting with Anthropic's suite of models - Had to use a custom class because there was enough nuance with how the anthropic SDK works that it would be better to simply separate out the logic. The extract questions flow needed modification of the system prompt in order to work as intended with the haiku model	2024-05-26 22:50:34 +05:30
sabaimran	0b7910d4af	Pin th elangchain-community version explicitly	2024-05-21 05:26:17 -05:00
sabaimran	2b8e5a86cc	Update version for resent library in pyproject.toml	2024-05-09 13:43:27 -07:00
sabaimran	eb65532386	Use Django ap scheduler in place of the sqlalchemy one	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	230d160602	Improve rendering task scheduled settings view and message - Render crontime string in natural language in message & settings UI - Show more fields in tasks web config UI - Add link to the tasks settings page in task scheduled chat response - Improve task variables names Rename executing_query to query_to_run. scheduling_query to scheduling_request	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	3ce06a938c	Render scheduled task response as html to improve readability in email	2024-05-01 08:30:10 +05:30

1 2 3 4

158 Commits