Commit Graph

  • ceb1d82bf6 Create khoj computer via cloud build. Add computer to docker-compose.yml Debanjum 2025-05-31 21:39:38 -07:00
  • 68f7aae71c Install claude 4 sonnet, latest gemini 2.5s when configure on first run Debanjum 2025-05-31 20:51:32 -07:00
  • b90b724f9a Disable openai, binary operator agents until they become useful Debanjum 2025-05-31 18:48:40 -07:00
  • 830a1af69e Render operator train of thought as video on web app to ease viewing Debanjum 2025-05-31 04:31:23 -07:00
  • 6821bd38ed Fix mypy typing errors in operator environment files Debanjum 2025-05-31 02:59:53 -07:00
  • c5c06a086e Fix, improve openai operator agent for interrupts, computer environment Debanjum 2025-05-30 21:14:43 -07:00
  • f517566560 Improve invoking keybindings on computer always using lowercase keys Debanjum 2025-05-31 02:15:54 -07:00
  • 2558ac7f18 Show thinking and engage deep thought for gemini 2.5 model series Debanjum 2025-05-30 21:10:18 -07:00
  • cecbfe35e2 Rename compile response into a private operator agents function Debanjum 2025-05-30 20:31:43 -07:00
  • ded1db642c Get max context for user, operator model pair for context compression Debanjum 2025-05-30 16:44:01 -07:00
  • 7eaf0e80c5 Get max prompt size for given user, model via reusable functions Debanjum 2025-05-30 16:40:53 -07:00
  • 3797f03625 Log ai model usage on every call to get_chat_usage_metrics in debug mode Debanjum 2025-05-30 16:02:45 -07:00
  • 4cb900658d Cache system prompt, tools of anthropic operator agent for efficiency Debanjum 2025-05-30 15:56:11 -07:00
  • 928e5ee8ad Cache messages to anthropic models from chat actors for efficiency Debanjum 2025-05-30 15:47:43 -07:00
  • 0d1e6b0d53 Do not overwrite system_prompt for idempotent AI API calls retry Debanjum 2025-05-30 15:43:25 -07:00
  • e0ea151f20 Implement file editor and terminal tools, in-built in claude Debanjum 2025-05-30 04:21:45 -07:00
  • 21bf7f1d6d Continue interrupted operator run with new query and previous context Debanjum 2025-05-29 20:49:55 -07:00
  • de35d91e1d Pass previous trajectory to operator agents for context Debanjum 2025-05-29 18:31:01 -07:00
  • 864e0ac8b5 Simplify research iteration and main research function names Debanjum 2025-05-29 15:04:35 -07:00
  • 6c9d569a22 Fix to get user questions in chat history from user not khoj message Debanjum 2025-05-29 11:02:24 -07:00
  • b6aa77a6f5 Lookback 3 previous turns to select next tool, for questions history Debanjum 2025-05-29 11:23:01 -07:00
  • d511cbfa34 Extract constructing question history into shared function for reuse Debanjum 2025-05-29 10:59:13 -07:00
  • da663e184c Type operator results. Enable storing, loading operator trajectories. Debanjum 2025-05-28 21:53:33 -07:00
  • 675fc0ad05 Decouple trajectory compression from `act'. Reuse func to call llm api Debanjum 2025-05-28 16:06:24 -07:00
  • b027024c42 Handle failed operator agent calls to anthropic api more gracefully Debanjum 2025-05-28 00:33:30 -07:00
  • d54bfc19e5 Add trajectory compression to anthropic operator agent Debanjum 2025-05-28 00:28:34 -07:00
  • cb451fa67c Put default summarize prompt into operator agent Debanjum 2025-05-27 21:06:21 -07:00
  • 99fdd91a01 Latch to bottom instantly and well when auto scroll chat stream on web Debanjum 2025-05-27 17:53:25 -07:00
  • 253656b634 Fix engaging anthropic api cache for operator trajectories. Debanjum 2025-05-27 17:34:23 -07:00
  • faecbdb7d8 Enable operators to use computers Debanjum 2025-05-26 18:16:54 -07:00
  • 771909f76a Implement docker computer environment for operator Debanjum 2025-05-27 15:22:07 -07:00
  • e117f57f64 Implement local computer environment for operator Debanjum 2025-05-26 15:53:00 -07:00
  • 7eab87bfdf Generalize operator to operate multiple types of environment Debanjum 2025-05-26 17:56:34 -07:00
  • c0689b2740 Easily interrupt and redirect khoj's research direction via chat Debanjum 2025-05-27 17:57:21 -07:00
  • c9e6b8e88d Align expected types to actual returned types by AI APIs, operator Debanjum 2025-05-26 00:37:26 -07:00
  • c1c1fc6265 Make send message validation more robust on web app Debanjum 2025-05-22 16:51:01 -07:00
  • 6cb512d9cf Support natural interrupt and send query behavior from web app Debanjum 2025-05-21 13:28:12 -07:00
  • 2b7dd7401b Continue interrupt queries only after previous query written to DB Debanjum 2025-05-22 16:26:07 -07:00
  • 3cd6e1a9a6 Save and restore research from partial state Debanjum 2025-05-20 16:00:40 -07:00
  • a83c36fa05 Validate operator, research, context.query fields of ChatMessage Debanjum 2025-05-23 02:36:58 -07:00
  • 02ee4e90a2 Pass doc/web/code/operator context as list[dict] of message content Debanjum 2025-05-21 23:04:37 -07:00
  • 98b56316e4 Support constructing chat message as a list of dictionaries Debanjum 2025-05-21 22:57:56 -07:00
  • df9ab51fd0 Track research results as iteration list instead of iteration summaries Debanjum 2025-05-20 15:31:33 -07:00
  • 5d65fa8698 Use Django timezone funcs to make datetimes in DB timezone aware Debanjum 2025-05-24 18:22:42 -07:00
  • 231aa1c0df Support claude 4 models. Engage reasoning, operator. Track costs etc. Debanjum 2025-05-22 14:57:53 -07:00
  • dca17591f3 Handle parsing json from string with plain text suffix Debanjum 2025-05-22 20:54:28 -07:00
  • acebb90643 Mention keys expected in prompt to next research tool selector Debanjum 2025-05-22 14:59:54 -07:00
  • e968cca273 Clean usage of conversation_id in chat API function Debanjum 2025-05-23 14:34:55 -07:00
  • a76032522e Add type hints to function args calling anthropic model api Debanjum 2025-05-22 14:56:09 -07:00
  • 97c5222b04 Set type hints and reorder args of all converse_[provider] methods Debanjum 2025-05-21 16:25:32 -07:00
  • 2ea16298aa Create Operator Framework. Enable Khoj to Operate Web Browser (#1174) Debanjum 2025-05-20 01:30:36 -07:00
  • 19b4c18b69 Configure max iterations per operator run via environment variable Debanjum 2025-05-20 01:03:11 -07:00
  • 06a1a22e3b Align generic grounding agent's interface with uitars grounding agent Debanjum 2025-05-19 10:02:14 -07:00
  • 0ce74e0329 Show operator context when use operator in default and research mode Debanjum 2025-05-17 17:47:39 -07:00
  • cc355f93fc Use operator context consistently as a dict[str, str] of query, result Debanjum 2025-05-19 09:21:42 -07:00
  • 07e33994f0 Reduce scroll amount to have previous page stay a bit on screen Debanjum 2025-05-13 00:01:29 -06:00
  • e2c1b1fcd3 Add dev container config to ease setup for remote development Debanjum 2025-05-19 19:23:57 -07:00
  • fdb681ca0e Only install desktop, obsidian app from dev_setup.sh with --full flag Debanjum 2025-05-19 22:15:10 -07:00
  • 33dd4c8c33 Handle gemini returning simple string in response candidates Debanjum 2025-05-19 18:35:09 -07:00
  • 626ced8b8b Fix adding code results to chatml messages context Debanjum 2025-05-19 18:33:32 -07:00
  • ded753ff9a Improve parsing tool use coordinate returned by claude operator agent Debanjum 2025-05-12 17:30:08 -06:00
  • 473dd006d5 Remove unnecessary images conversion to png in binary operator agent. Debanjum 2025-05-11 15:20:45 -06:00
  • 9f3fbf9021 Encourage reasoner, grounder to work better together in binary operator Debanjum 2025-05-10 16:33:44 -06:00
  • ac19f6d336 Improve operator exception handling Debanjum 2025-05-10 16:31:48 -06:00
  • 59e0e092b0 Remove deprecated prompt for grounding model to choose goto, back func Debanjum 2025-05-10 16:27:53 -06:00
  • 1442a4f6fb Handle reasoning messages returned by openai cua model Debanjum 2025-05-10 02:17:58 -06:00
  • 95f211d03c Resolve mypy typing errors in operator code Debanjum 2025-05-09 19:51:57 -06:00
  • 33689feb91 Handle more openai response types for better rendering and error avoidance Debanjum 2025-05-09 19:07:39 -06:00
  • 3a75cd3c3d Only trigger claude, openai monolithic operators with specific models Debanjum 2025-05-09 15:26:54 -06:00
  • 258b5a0372 Show operator screenshots with reasoning in train of thought on web app Debanjum 2025-05-09 14:49:37 -06:00
  • 21a9556b06 Show formatted action, env screenshot after action on each operator step Debanjum 2025-05-09 14:47:31 -06:00
  • a1d712e031 Add current cursor position to browser screenshots for ai, human view Debanjum 2025-05-09 14:46:03 -06:00
  • 1be3986537 Require explicit switch to enable operator locally for now Debanjum 2025-05-09 00:41:39 -06:00
  • b395a438d0 Fix handling multiple actions requested by grounding agent in an iteration Debanjum 2025-05-08 23:35:11 -06:00
  • e5415bdaee Only reasoning agent should terminate run, not the grounding agent. Debanjum 2025-05-08 23:31:55 -06:00
  • ffe58d2ec1 Parse goto, back actions directly from instruction for uitars grounder Debanjum 2025-05-08 21:31:01 -06:00
  • 7395af3c3a Allow visual grounder of binary operator agent to see past actions Debanjum 2025-05-08 20:27:09 -06:00
  • d8bc6239f8 Bifurcate visual grounder into a ui-tars specific & generic grounder Debanjum 2025-05-08 18:25:56 -06:00
  • c3bfb15fab Support KeyUp, KeyDown operator actions. Make coordinates into floats Debanjum 2025-05-08 15:18:48 -06:00
  • b279060e2c Enable using Operator with Gemini models Debanjum 2025-05-08 11:11:28 -06:00
  • 0d8fb667ec Add action results for multiple actions similar to other operator agents Debanjum 2025-05-08 11:02:19 -06:00
  • e17c06b798 Set operator query on init. Pass summarize prompt to summarize func Debanjum 2025-05-08 09:25:43 -06:00
  • 38bcba2f4b Make back action in browser environment use goto to avoid timeouts Debanjum 2025-05-08 08:26:47 -06:00
  • fd139d4708 Improve termination on task completion for binary operator agent Debanjum 2025-05-08 08:24:44 -06:00
  • 680c226137 Use any supported vision model as reasoner for binary operator agent Debanjum 2025-05-07 19:25:48 -06:00
  • 3839d83b90 Modularize operator into separate files for agent, action, environment etc Debanjum 2025-05-06 12:52:22 -06:00
  • 833c8ed150 Add a flexible operator agent using separate reasoning, grounder models Debanjum 2025-05-05 23:22:56 -06:00
  • 773d20a26f Improve instructions to the openai operator agent. Debanjum 2025-05-05 09:38:26 -06:00
  • 4db888cd62 Simplify operator loop. Make each OperatorAgent manage state internally. Debanjum 2025-05-04 18:39:12 -06:00
  • a1c9c6b2e3 Add pages visited via browser operator to references returned to clients Debanjum 2025-05-04 01:27:18 -06:00
  • e71575ad1a Render screenshot in train of thought on openai agent screenshot action Debanjum 2025-05-04 00:24:25 -06:00
  • 78e052bfcb Decouple environment from operator agent to improve modularity Debanjum 2025-05-03 18:38:19 -06:00
  • 7c60e04efb Pull out common iteration loop into main browser operator method Debanjum 2025-05-03 15:24:39 -06:00
  • 08e93c64ab Render screenshot in train of thought on browser screenshot action Debanjum 2025-04-28 16:49:22 -06:00
  • 188b3c85ae Force open links in current page to stay in operator page context Debanjum 2025-04-28 14:06:31 -06:00
  • 20f87542e5 Add cancellation support to browser operator via asyncio.Event Debanjum 2025-04-07 21:22:44 +05:30
  • 9f75622346 Allow browser operator to use browser with existing context over CDP Debanjum 2025-03-29 18:03:02 +05:30
  • b9ea538b02 Support operating web browser with Anthropic models Debanjum 2025-03-24 15:18:20 +05:30
  • 2e86141575 Enable Khoj to use a GUI web browser. Operate it with Openai models Debanjum 2025-03-12 17:14:33 +05:30
  • ab5d0b5878 Upgrade server dependencies Debanjum 2025-05-19 16:28:21 -07:00