klbr/khoj

mirror of https://github.com/khoaliber/khoj.git synced 2026-03-02 13:18:18 +00:00

Author	SHA1	Message	Date
Debanjum	ddf028f7af	Fix khoj computer image name used in docker-compose.yml instead	2025-06-01 16:44:28 -07:00
Debanjum	c6cc709f62	Fix khoj computer image name and only build it once for each arch	2025-06-01 16:36:46 -07:00
Debanjum	ceb1d82bf6	Create khoj computer via cloud build. Add computer to docker-compose.yml	2025-05-31 21:39:38 -07:00
Debanjum	22cd638add	Fix handling unset openai_base_url to run eval with openai chat models. The github run_eval workflow sets OPENAI_BASE_URL to empty string. The ai model api created during initialization for openai models gets set to empty string rather than None or the actual openai base url. This tries to call the llm at an empty string base url instead of the default openai api base url, which obviously fails. Fix is to map empty base urls to the actual openai api base url.	2025-05-19 16:19:43 -07:00
Debanjum	ab29ffd799	Fix web app packaging for pypi since upgrade to python 3.11.12 in CI	2025-04-19 18:03:29 +05:30
Debanjum	c1912f8ca7	Default eval to use 10 iterations for research mode	2025-04-05 10:09:58 +05:30
Debanjum	e9928d3c50	Eval more models, control randomization & auto read webpage via workflow. - Control auto read webpage via eval workflow. Prefix env var with KHOJ_. Default to false as it is the default that is going to be used in prod going forward. - Set openai api key via input param in manual eval workflow runs. - Simplify evaluating other chat models available over openai compatible api via eval workflow. - Mask input api key as secret in workflow. - Discard unnecessary null setting of env vars. - Control randomization of samples in eval workflow. If randomization is turned off, it'll take the first SAMPLE_SIZE items from the eval dataset instead of a random collection of SAMPLE_SIZE items.	2025-04-04 20:11:00 +05:30
Debanjum	0dcb2544d7	Use embedded postgres instead of postgres server for eval workflow	2025-04-04 20:11:00 +05:30
Debanjum	66e9ddb6be	Support OpenAI (API compatible) models and Firecrawl in eval workflow	2025-04-03 14:03:29 +05:30
Debanjum	d4b0ef5e93	Fix ability to disable code and internet providers in eval workflow Sets env vars to empty if condition not met so: - Terrarium (not e2b) used as code sandbox on release triggered eval - Internet turned off for math500 eval	2025-03-25 14:04:16 +05:30
Debanjum	6cc5a10b09	Disable SimpleQA eval on release as saturated & low signal for usecase Reaching >94% in research mode on SimpleQA. When answers can be researched online, it becomes too easy. And the FRAMES eval does a more thorough job of evaluating that use-case anyway.	2025-03-22 08:05:12 +05:30
Debanjum	dc473015fe	Set default model, sandbox to display in eval workflow summary on release	2025-03-20 14:44:56 +05:30
Debanjum	931f555cf8	Configure max allowed iterations in research mode via env var	2025-03-18 18:15:50 +05:30
Debanjum	c133d11556	Improvements based on code feedback	2025-03-09 18:23:30 +05:30
Debanjum	94ca458639	Set default chat model to KHOJ_CHAT_MODEL env var if set Simplify code log to set default_use_model during init for readability	2025-03-09 18:23:30 +05:30
Debanjum	45fb85f1df	Add E2B as an optional code sandbox provider - Specify E2B api key and template to use via env variables - Try load, use e2b library when E2B api key set - Fallback to try use terrarium sandbox otherwise - Enable more python packages in e2b sandbox like rdkit via custom e2b template - Use Async E2B Sandbox - Parallelize file IO with sandbox - Add documentation on how to enable E2B as code sandbox instead of Terrarium	2025-03-09 18:23:30 +05:30
Debanjum	b4183c7333	Default to gemini 2.0 flash instead of 1.5 flash on Gemini setup Add price of gemini 2.0 flash for cost calculations	2025-03-07 13:48:15 +05:30
Debanjum	701a7be291	Stop code sandbox on request timeout to allow sandbox process restarts	2025-03-07 13:48:15 +05:30
sabaimran	fd90842d38	Bump postgresql server dev version to 16 for latest ubuntu	2025-01-22 19:07:54 -08:00
sabaimran	8fe08eecce	add --break-system-packages to bypass venv requirement	2025-01-20 00:21:27 -08:00
sabaimran	bf58d9430b	downgrade postgres server pkg to 16	2025-01-20 00:15:56 -08:00
sabaimran	95ad1f936e	upgrade postgres server to 17	2025-01-20 00:10:20 -08:00
sabaimran	a214bd4100	upgrade pg server dev version to 15	2025-01-20 00:05:35 -08:00
sabaimran	82ff74cfa9	Run on container with ubuntu latest for pytest gh action workflow	2025-01-19 23:57:57 -08:00
sabaimran	af9e906cb5	Use python3 instead of python when running pip install commands in gh actions	2025-01-17 17:48:42 -08:00
Debanjum	6bd9f6bb61	Give a shorter, simpler name to github workflow to deploy docs	2025-01-12 10:54:56 +07:00
sabaimran	bac90ad69d	Upgrade deploy-pages action to v4	2025-01-09 19:04:31 -08:00
Debanjum	2069f571c8	Upgrade upload-artifact gh action to v4 as <=v3 deprecated. This started failing github workflow jobs	2025-01-10 00:41:24 +07:00
sabaimran	92144c8102	Remove release step in todesktop flow, since we need to run releases manually now - Leaving it commented out for the time being so we can revisit automating this later	2024-12-17 16:02:45 -08:00
Debanjum	10bd56d2b9	Attest Khoj pypi package by upgrading pypi publish gh action - Print hash in CI to ease verifying ci built python package matches khoj package published on pypi - Newer pypi publish github action should speed up workflow by ~30s	2024-12-17 13:40:39 -08:00
Debanjum	df15f00243	Tag docker images with latest tag in dockerize workflow on release	2024-12-17 13:18:51 -08:00
sabaimran	f6abfcfa6b	Use latest release version for pypi gh action to publish	2024-12-17 12:19:42 -08:00
sabaimran	e74e922cea	Update file path of python installation	2024-12-12 16:50:32 -08:00
Debanjum	2db7a1ca6b	Restart code sandbox on crash in eval github workflow (#1007). See `e3fed3750b` for corresponding change to use pm2 to auto-restart code sandbox	2024-12-12 14:32:03 -08:00
Debanjum	9eb863e964	Restart code sandbox on crash in eval github workflow	2024-12-12 11:28:54 -08:00
Debanjum	59008ae90e	Use buildx to create multi platform docker image	2024-12-11 00:21:29 -08:00
Debanjum	ec797bc6b8	Build docker imgs on native arch runners to avoid manifest list error This also avoids the need to use --amend and annotate steps when creating the multi-arch docker images	2024-12-10 23:16:36 -08:00
Debanjum	5f7b13df2d	Fix new docker tags in workflow to not include forward slashes	2024-12-10 22:55:33 -08:00
Debanjum	ba6237b5c0	Fix to create multi-arch builds. Stop docker image overwrites in workflow	2024-12-10 21:08:17 -08:00
sabaimran	44ede26e67	Temporarily disable cloud arm builds while we disambiguate the build issues	2024-12-10 20:00:59 -08:00
sabaimran	9c403d24e1	Fix reference to directory in the eval workflow for starting terrarium	2024-12-08 13:03:05 -08:00
sabaimran	6940c6379b	Add sudo when running installations in order to install relevant packages. Add --legacy-peer-deps temporarily to see if it helps mitigate the issue	2024-12-08 11:11:13 -08:00
sabaimran	4c4b7120c6	Use Khoj terrarium fork instead of building from official Cohere repo	2024-12-08 11:06:33 -08:00
sabaimran	2dfd163430	Add more explicit run strategies in the runner matrix	2024-11-28 19:31:34 -08:00
sabaimran	80cd902c86	Since linux/amd64 images aren't being created, try setting a custom description on the image Refer to this GH documentation on working with multi arch images in the container registry: https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#adding-a-description-to-multi-arch-images	2024-11-28 19:14:06 -08:00
Debanjum	29e801c381	Add MATH500 dataset to eval Evaluate simpler MATH500 responses with gemini 1.5 flash This improves both the speed and cost of running this eval	2024-11-28 12:48:25 -08:00
Debanjum	22aef9bf53	Add GPQA (diamond) dataset to eval	2024-11-28 12:48:25 -08:00
Debanjum	8cb0db0051	Fix llama-cpp-python install by pytest github workflow - Use pre-built wheels for torch and llama-cpp-python - Install and link musl as it's used by llama-cpp-python pre-built wheel instead of glibc - Join Install git and Install Dependencies steps in pytest workflow To remove unnecessary steps	2024-11-26 02:04:36 -08:00
Debanjum	e088fcbc7b	Build for arm64 on arm64 runner. Parallelize arm64, x64 docker builds. - Building arm64 image on an ubuntu arm64 runner reduces `yarn build` step time by 75% from 12mins to 3mins. - This is because no QEMU emulation for arm64 on x86 is required now. - Parallelizing x64 and arm64 platform builds halves build time on top. - Revert to use standard ubuntu-latest runner as large x64 runner doesn't give much more speed improvements. This results in an effective additional 50%-66% reduction in build time on top of #987. So a full dockerize workflow run now takes 10 mins vs previous 35+mins. This is a total of 72% improvement in max dockerize run time. Get additional speed improvements when docker layer cache hit.	2024-11-24 23:18:55 -08:00
Debanjum	4a5646c8da	Cache docker layers, nextjs builds in dockerize github workflow	2024-11-24 21:06:22 -08:00
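
Two of the commit messages above describe behaviour that is easy to illustrate: 22cd638add maps an empty OpenAI base URL to the default API URL, and e9928d3c50 takes the first SAMPLE_SIZE items when randomization is off. The sketch below is only an illustration of those two descriptions, not the code from the khoj repository. The function names, the seed parameter, and the default URL constant are assumptions introduced here.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

# Assumption: the public OpenAI API endpoint. The commit message only says
# "the actual openai api base url" and does not spell it out.
DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(base_url: Optional[str]) -> str:
    """Hypothetical helper: treat None, "" and whitespace as "use the default"."""
    if base_url is None or not base_url.strip():
        return DEFAULT_OPENAI_BASE_URL
    return base_url


def select_samples(
    dataset: Sequence[T],
    sample_size: int,
    randomize: bool,
    seed: Optional[int] = None,
) -> List[T]:
    """Hypothetical helper: pick eval items as described in e9928d3c50."""
    if not randomize:
        # Randomization off: take the first sample_size items in dataset order.
        return list(dataset[:sample_size])
    # Randomization on: draw a random collection without replacement,
    # capped at the dataset size so random.sample does not raise.
    rng = random.Random(seed)
    return rng.sample(list(dataset), min(sample_size, len(dataset)))


if __name__ == "__main__":
    # An empty string, as set by the workflow, falls back to the default URL.
    print(resolve_base_url(""))
    print(select_samples(list(range(10)), 3, randomize=False))
```

An explicitly configured URL, such as one for an OpenAI-compatible server, passes through resolve_base_url unchanged.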

215 Commits