Debanjum
f0513cbbb1
Fix to run new automation api tests in ci
2025-07-08 23:27:09 -07:00
Debanjum
58f44ad43b
Make release/1.x a privileged branch to run workflows, create releases
...
It'll work similarly to the master branch but with pre-1x and latest-1x
tagged series of docker images.
This should ease deployment changes from 1.x vs 2.x series
2025-07-06 09:28:16 -07:00
Debanjum
e87be4edf4
Pin python version used by github workflow to publish to pypi
...
Avoids having to update the python path that web app static build
files are written to every time the patch version of python is updated
2025-06-11 13:30:15 -07:00
Debanjum
ddf028f7af
Fix khoj computer image name used in docker-compose.yml instead
2025-06-01 16:44:28 -07:00
Debanjum
c6cc709f62
Fix khoj computer image name and only build it once for each arch
2025-06-01 16:36:46 -07:00
Debanjum
ceb1d82bf6
Create khoj computer via cloud build. Add computer to docker-compose.yml
2025-05-31 21:39:38 -07:00
Debanjum
22cd638add
Fix handling unset openai_base_url to run eval with openai chat models
...
The github run_eval workflow sets OPENAI_BASE_URL to an empty string.
The ai model api created during initialization for openai models then gets
its base url set to an empty string rather than None or the actual openai base url.
This tries to call the llm at an empty string base url instead of the
default openai api base url, which fails.
Fix is to map empty base urls to the actual openai api base url.
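
A minimal sketch of the mapping described above, not the actual change in this commit. The function name and the constant are hypothetical; the default endpoint is assumed to be the public OpenAI API url.

```python
from typing import Optional

# Assumption: the default endpoint of the public OpenAI API.
DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


def resolve_openai_base_url(raw: Optional[str]) -> str:
    """Map an unset, empty or whitespace-only base url to the default.

    Hypothetical helper: a CI workflow that exports OPENAI_BASE_URL=""
    would otherwise pass the empty string straight through to the client.
    """
    if raw is None or not raw.strip():
        return DEFAULT_OPENAI_BASE_URL
    return raw.strip()
```

Treating None and the empty string the same way keeps the caller from having to know whether the variable was unset or set to nothing by the workflow.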
2025-05-19 16:19:43 -07:00
Debanjum
ab29ffd799
Fix web app packaging for pypi since upgrade to python 3.11.12 in CI
2025-04-19 18:03:29 +05:30
Debanjum
c1912f8ca7
Default eval to use 10 iterations for research mode
2025-04-05 10:09:58 +05:30
Debanjum
e9928d3c50
Eval more models, control randomization & auto read webpage via workflow
...
- Control auto read webpage via eval workflow. Prefix env var with KHOJ_
Default to false as it is the default that is going to be used in prod
going forward.
- Set openai api key via input param in manual eval workflow runs
- Simplify evaluating other chat models available over openai
compatible api via eval workflow.
- Mask input api key as secret in workflow.
- Discard unnecessary null setting of env vars.
- Control randomization of samples in eval workflow.
If randomization is turned off, it'll take the first SAMPLE_SIZE
items from the eval dataset instead of a random collection of
SAMPLE_SIZE items.
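
The sampling behaviour in the last item can be sketched as follows. This is an illustration under assumptions, not the eval script itself: the function name, its parameters and the optional seed are hypothetical.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_samples(
    dataset: Sequence[T],
    sample_size: int,
    randomize: bool,
    seed: Optional[int] = None,
) -> List[T]:
    """Pick sample_size items from dataset (hypothetical helper).

    With randomize off, take the first sample_size items so that
    repeated runs evaluate the same questions. With it on, draw a
    random collection of the same size without replacement.
    """
    # Never ask for more items than the dataset holds.
    size = min(sample_size, len(dataset))
    if randomize:
        return random.Random(seed).sample(list(dataset), size)
    return list(dataset[:size])
```

Turning randomization off makes runs comparable across models, since every model sees the same items.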
2025-04-04 20:11:00 +05:30
Debanjum
0dcb2544d7
Use embedded postgres instead of postgres server for eval workflow
2025-04-04 20:11:00 +05:30
Debanjum
66e9ddb6be
Support OpenAI (API compatible) models and Firecrawl in eval workflow
2025-04-03 14:03:29 +05:30
Debanjum
d4b0ef5e93
Fix ability to disable code and internet providers in eval workflow
...
Sets env vars to empty if condition not met so:
- Terrarium (not e2b) used as code sandbox on release triggered eval
- Internet turned off for math500 eval
2025-03-25 14:04:16 +05:30
Debanjum
6cc5a10b09
Disable SimpleQA eval on release as saturated & low signal for usecase
...
Reaching >94% in research mode on SimpleQA. When answers can be
researched online, it becomes too easy. And the FRAMES eval does a
more thorough job of evaluating that use-case anyway.
2025-03-22 08:05:12 +05:30
Debanjum
dc473015fe
Set default model, sandbox to display in eval workflow summary on release
2025-03-20 14:44:56 +05:30
Debanjum
931f555cf8
Configure max allowed iterations in research mode via env var
2025-03-18 18:15:50 +05:30
Debanjum
c133d11556
Improvements based on code feedback
2025-03-09 18:23:30 +05:30
Debanjum
94ca458639
Set default chat model to KHOJ_CHAT_MODEL env var if set
...
Simplify code logic to set default_use_model during init for readability
2025-03-09 18:23:30 +05:30
Debanjum
45fb85f1df
Add E2B as an optional code sandbox provider
...
- Specify E2B api key and template to use via env variables
- Try to load and use the e2b library when the E2B api key is set
- Fall back to trying the terrarium sandbox otherwise
- Enable more python packages in e2b sandbox like rdkit via custom e2b template
- Use Async E2B Sandbox
- Parallelize file IO with sandbox
- Add documentation on how to enable E2B as code sandbox instead of Terrarium
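
The provider selection in the list above could look roughly like this. It is a sketch under stated assumptions: the env variable name E2B_API_KEY, the module name e2b_code_interpreter and the function itself are hypothetical stand-ins, not names confirmed by this commit.

```python
import importlib.util
import os
from typing import Mapping


def pick_sandbox(
    env: Mapping[str, str] = os.environ,
    module_name: str = "e2b_code_interpreter",  # assumed library module name
) -> str:
    """Return "e2b" or "terrarium" (hypothetical helper).

    E2B is used only when its api key is set and its library can be
    found. In every other case fall back to the terrarium sandbox.
    """
    has_key = bool(env.get("E2B_API_KEY"))  # assumed env variable name
    # find_spec returns None for a top-level module that is not installed.
    has_library = importlib.util.find_spec(module_name) is not None
    if has_key and has_library:
        return "e2b"
    return "terrarium"
```

Checking for the library separately from the key keeps E2B an optional dependency: a deployment without the package installed still starts with terrarium.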
2025-03-09 18:23:30 +05:30
Debanjum
b4183c7333
Default to gemini 2.0 flash instead of 1.5 flash on Gemini setup
...
Add price of gemini 2.0 flash for cost calculations
2025-03-07 13:48:15 +05:30
Debanjum
701a7be291
Stop code sandbox on request timeout to allow sandbox process restarts
2025-03-07 13:48:15 +05:30
sabaimran
fd90842d38
Bump postgresql server dev version to 16 for latest ubuntu
2025-01-22 19:07:54 -08:00
sabaimran
8fe08eecce
add --break-system-packages to bypass venv requirement
2025-01-20 00:21:27 -08:00
sabaimran
bf58d9430b
downgrade postgres server pkg to 16
2025-01-20 00:15:56 -08:00
sabaimran
95ad1f936e
upgrade postgres server to 17
2025-01-20 00:10:20 -08:00
sabaimran
a214bd4100
upgrade pg server dev version to 15
2025-01-20 00:05:35 -08:00
sabaimran
82ff74cfa9
Run on container with ubuntu latest for pytest gh action workflow
2025-01-19 23:57:57 -08:00
sabaimran
af9e906cb5
Use python3 instead of python when running pip install commands in gh actions
2025-01-17 17:48:42 -08:00
Debanjum
6bd9f6bb61
Give a shorter, simpler name to github workflow to deploy docs
2025-01-12 10:54:56 +07:00
sabaimran
bac90ad69d
Upgrade deploy-pages action to v4
2025-01-09 19:04:31 -08:00
Debanjum
2069f571c8
Upgrade upload-artifact gh action to v4 as <=v3 deprecated
...
This started failing github workflow jobs
2025-01-10 00:41:24 +07:00
sabaimran
92144c8102
Remove release step in todesktop flow, since we need to run releases manually now
...
- Leaving it commented out for the time being so we can revisit automating this later
2024-12-17 16:02:45 -08:00
Debanjum
10bd56d2b9
Attest Khoj pypi package by upgrading pypi publish gh action
...
- Print hash in CI to ease verifying ci built python package matches
khoj package published on pypi
- Newer pypi publish github action should speed up workflow by ~30s
2024-12-17 13:40:39 -08:00
Debanjum
df15f00243
Tag docker images with latest tag in dockerize workflow on release
2024-12-17 13:18:51 -08:00
sabaimran
f6abfcfa6b
Use latest release version for pypi gh action to publish
2024-12-17 12:19:42 -08:00
sabaimran
e74e922cea
Update file path of python installation
2024-12-12 16:50:32 -08:00
Debanjum
2db7a1ca6b
Restart code sandbox on crash in eval github workflow ( #1007 )
...
See e3fed3750b for corresponding change to use pm2 to auto-restart code sandbox
2024-12-12 14:32:03 -08:00
Debanjum
9eb863e964
Restart code sandbox on crash in eval github workflow
2024-12-12 11:28:54 -08:00
Debanjum
59008ae90e
Use buildx to create multi platform docker image
2024-12-11 00:21:29 -08:00
Debanjum
ec797bc6b8
Build docker imgs on native arch runners to avoid manifest list error
...
This also avoids the need to use --amend and annotate steps when
creating the multi-arch docker images
2024-12-10 23:16:36 -08:00
Debanjum
5f7b13df2d
Fix new docker tags in workflow to not include forward slashes
2024-12-10 22:55:33 -08:00
Debanjum
ba6237b5c0
Fix to create multi-arch builds. Stop docker image overwrites in workflow
2024-12-10 21:08:17 -08:00
sabaimran
44ede26e67
Temporarily disable cloud arm builds while we disambiguate the build issues
2024-12-10 20:00:59 -08:00
sabaimran
9c403d24e1
Fix reference to directory in the eval workflow for starting terrarium
2024-12-08 13:03:05 -08:00
sabaimran
6940c6379b
Add sudo when running installations in order to install relevant packages
...
add --legacy-peer-deps temporarily to see if it helps mitigate the issue
2024-12-08 11:11:13 -08:00
sabaimran
4c4b7120c6
Use Khoj terrarium fork instead of building from official Cohere repo
2024-12-08 11:06:33 -08:00
sabaimran
2dfd163430
Add more explicit run strategies in the runner matrix
2024-11-28 19:31:34 -08:00
sabaimran
80cd902c86
Since linux/amd64 images aren't being created, try setting a custom description on the image
...
Refer to this GH documentation on working with multi arch images in the container registry:
https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#adding-a-description-to-multi-arch-images
2024-11-28 19:14:06 -08:00
Debanjum
29e801c381
Add MATH500 dataset to eval
...
Evaluate simpler MATH500 responses with gemini 1.5 flash
This improves both the speed and cost of running this eval
2024-11-28 12:48:25 -08:00
Debanjum
22aef9bf53
Add GPQA (diamond) dataset to eval
2024-11-28 12:48:25 -08:00