Commit Graph

  • 0521ea10d6 Put image score breakdown under `additional' field in search response Debanjum Singh Solanky 2022-09-15 13:44:00 +03:00
  • e42a38e825 Version Khoj API, Update frontends, tests and docs to reflect it Debanjum Singh Solanky 2022-09-14 21:22:20 +03:00
  • d25e1d8e86 fix: explicitly set url-request-method Robert Irelan 2022-09-19 15:46:46 -04:00
  • ee65a4f2c7 Merge /reload, /regenerate into single /update API endpoint Debanjum Singh Solanky 2022-09-14 14:01:09 +03:00
  • 02d944030f Use Base TextToJsonl class to standardize <text>_to_jsonl processors Debanjum Singh Solanky 2022-09-14 10:53:43 +03:00
  • c16ae9e344 Ignore "Legacy way to download model" warning for upstream dependency Debanjum Singh Solanky 2022-09-14 21:29:48 +03:00
  • 3169e3b78e Use ellipsis instead of pass in base filter abstract methods for aesthetic Debanjum Singh Solanky 2022-09-14 13:29:58 +03:00
  • bf1ae038cb Get XMP metadata from image using Pillow. Remove ExifTool dependency Debanjum Singh Solanky 2022-09-14 13:22:27 +03:00
  • a53094ec92 Add workflow dispatch support in build.yml. To support dispatch, set the image label based on the branch name. Master build should still be tagged with latest to get benefit of the standard production Docker label Saba 2022-09-15 20:28:41 +03:00
  • 8f57a62675 Remove unused imports. Fix typing and indentation Debanjum Singh Solanky 2022-09-14 02:58:49 +03:00
  • be57c711fd Revert OrgNode.hasTag func to method instead of property as accepts argument Debanjum Singh Solanky 2022-09-14 02:46:41 +03:00
  • 0109c7bd91 Disable ability to call <text>_to_jsonl, <type>_search packages directly Debanjum Singh Solanky 2022-09-14 02:17:06 +03:00
  • 1680a617da Reflect updates to query and results count in URL Debanjum Singh Solanky 2022-09-13 23:39:24 +03:00
  • 34314e859a Call /reload instead of /regenerate API to update index from web interface Debanjum Singh Solanky 2022-09-12 22:47:07 +03:00
  • 13b5d5082f Create input field to set results count on the web interface Debanjum Singh Solanky 2022-09-12 22:46:48 +03:00
  • 0ce0c00090 Bump khoj version to 0.1.10 Debanjum Singh Solanky 2022-09-12 23:02:26 +03:00
  • 1bfe9c4ef2 Handle filter only queries. Short-circuit and return filtered results Debanjum Singh Solanky 2022-09-12 16:42:41 +03:00
  • afc84de234 Make word filter regex explicit. Allow hyphen in word filters Debanjum Singh Solanky 2022-09-12 16:48:58 +03:00
  • 3d86d763c5 Support Multiple Input Filters to Configure Content to Index Debanjum 2022-09-12 08:19:52 +00:00
  • 536f03af8f Process text content files in sorted order for stable indexing Debanjum Singh Solanky 2022-09-12 11:02:05 +03:00
  • a701ad08b9 Support multiple input-filters to configure content to index via khoj.yml Debanjum Singh Solanky 2022-09-12 10:39:39 +03:00
  • 940c8fac8c Use app LRU, not functools LRU decorator, to cache search results in router Debanjum Singh Solanky 2022-09-12 09:28:49 +03:00
  • c6fa09d8fc Fix querying with include word filter from web interface Debanjum Singh Solanky 2022-09-12 09:27:02 +03:00
  • 1502fbc9e9 Add index_heading_entries flag to default and sample khoj configs Debanjum Singh Solanky 2022-09-11 17:30:02 +03:00
  • 7216cdff58 Add Date, Word filter for Org-Music content Debanjum Singh Solanky 2022-09-11 17:29:34 +03:00
  • 182fbbd8df Allow Indexing Heading Entries. Improve Org, TextToJsonl Parser Debanjum 2022-09-11 13:46:11 +00:00
  • 9d369ae4df Fix OrgNode render of entries with property drawers and empty body Debanjum Singh Solanky 2022-09-11 15:54:26 +03:00
  • 253c9eae9a Set index_heading_entries field in config to index entries with no body Debanjum Singh Solanky 2022-09-11 12:40:58 +03:00
  • 1d3b3d5f39 Convert field get/set methods in OrgNode class to @property Debanjum Singh Solanky 2022-09-11 12:25:26 +03:00
  • db37e38df7 Create OrgNode hasBody method. Use it in org_to_jsonl checks Debanjum Singh Solanky 2022-09-11 10:47:44 +03:00
  • b4878d76ea Extract entries from scratch when regenerate requested Debanjum Singh Solanky 2022-09-11 10:14:08 +03:00
  • 52e3dd9835 Pass the whole TextContentConfig as argument to text_to_jsonl methods Debanjum Singh Solanky 2022-09-11 10:09:17 +03:00
  • e951ba37ad Raise exception when org file not found Debanjum Singh Solanky 2022-09-11 01:09:24 +03:00
  • c415af32d5 Support Incremental Update of Entries, Embeddings for OrgMode, Markdown, Beancount Content Debanjum 2022-09-10 21:38:05 +00:00
  • 9b2845de06 Add basic tests for beancount to jsonl conversion Debanjum Singh Solanky 2022-09-11 00:16:02 +03:00
  • d3267554ae Add basic tests for markdown to jsonl conversion Debanjum Singh Solanky 2022-09-10 23:57:17 +03:00
  • 2e1bbe0cac Fix stripping empty escape sequences from strings Debanjum Singh Solanky 2022-09-10 23:55:09 +03:00
  • a7cf6c8458 Use dictionary instead of list to track entry to file maps Debanjum Singh Solanky 2022-09-10 23:08:30 +03:00
  • 3e1323971b Stack function calls in jsonl converters to avoid unneeded variables Debanjum Singh Solanky 2022-09-10 22:56:06 +03:00
  • 4eb84c7f51 Log performance metrics for beancount, markdown to jsonl conversion Debanjum Singh Solanky 2022-09-10 22:47:54 +03:00
  • ebd5039bd1 Merge branch 'master' into support-incremental-updates-of-embeddings Debanjum Singh Solanky 2022-09-10 22:11:43 +03:00
  • ed8d432fdd Clean-up generated file after image search test run Debanjum Singh Solanky 2022-09-07 03:07:30 +03:00
  • 030fab9bb2 Support incremental update of Markdown entries, embeddings Debanjum Singh Solanky 2022-09-10 21:30:04 +03:00
  • 91aac83c6a Support incremental update of Beancount transactions, embeddings Debanjum Singh Solanky 2022-09-10 20:55:32 +03:00
  • cfaf7aa6f4 Update Indexing Performance Section in Readme Debanjum Singh Solanky 2022-09-07 14:10:38 +03:00
  • b01b4d7daa Extract logic to mark entries for embeddings update into helper function Debanjum Singh Solanky 2022-09-07 03:31:48 +03:00
  • f97308bef2 Fix log message on writing JSONL data to file Debanjum Singh Solanky 2022-09-10 21:40:08 +03:00
  • 899bfc5c3e Test incremental update triggered on calling text_search.setup Debanjum Singh Solanky 2022-09-07 03:06:29 +03:00
  • c17a0fd05b Do not store word filters index to file. Not necessary for now Debanjum Singh Solanky 2022-09-07 02:43:58 +03:00
  • 91d11ccb49 Only hash compiled entry to identify new/updated entries to update Debanjum Singh Solanky 2022-09-07 02:36:38 +03:00
  • b9a6e80629 Make OrgNode tags stable sorted to find new entries for incremental updates Debanjum Singh Solanky 2022-09-07 01:38:30 +03:00
  • 2f7a6af56a Support incremental update of org-mode entries and embeddings Debanjum Singh Solanky 2022-09-07 00:16:48 +03:00
  • ec675d27d3 Suppress non-actionable HuggingFace FutureWarning shown on app start Debanjum Singh Solanky 2022-09-10 16:40:22 +03:00
  • 1ac6a71ff0 Add --version flag to show installed version of khoj Debanjum Singh Solanky 2022-09-10 16:30:58 +03:00
  • 372dcd2dbc Handle Empty Org Files or Org Files with No Headings Debanjum 2022-09-10 12:42:07 +00:00
  • 976397bd82 Ignore empty #+TITLE, merge multiple #+TITLE for 0th level headings Debanjum Singh Solanky 2022-09-10 15:22:26 +03:00
  • 2b58218b56 Reuse search models across sessions. Merge unused pytest fixtures Debanjum Singh Solanky 2022-09-10 14:15:43 +03:00
  • 11917c6ddd Do not normalize absolute filenames for creating links in OrgNode Debanjum Singh Solanky 2022-09-10 13:26:03 +03:00
  • 07b98d35f1 Use filename or #+TITLE as heading for 0th level content in org files Debanjum Singh Solanky 2022-09-10 13:18:39 +03:00
  • d6bd7bf3e1 Fix initializing OrgNode level to string to parse org files Debanjum Singh Solanky 2022-09-10 13:11:58 +03:00
  • d835467f2c Throw exception if no valid entries found in specified content files Debanjum Singh Solanky 2022-09-10 13:05:21 +03:00
  • e00bb53336 Init word filter dictionary with default value as set to simplify code Debanjum Singh Solanky 2022-09-10 12:16:53 +03:00
  • 4d776d9c7a Bump khoj version to 0.1.9 Debanjum Singh Solanky 2022-09-09 07:50:15 +03:00
  • b58b7d7483 Create App Directory, Fix Initialization GUI on First Run Debanjum 2022-09-09 04:40:22 +00:00
  • 588f598949 Pass empty list of `input_files' to FileBrowser on first run Debanjum Singh Solanky 2022-09-09 07:18:05 +03:00
  • 3ddffdfba4 Create config directory before setting up logging to file under it Debanjum Singh Solanky 2022-09-09 07:15:28 +03:00
  • 79894efc7a Resolve GUI Issues in Docker Build Debanjum 2022-09-08 07:55:06 +00:00
  • 26ff66f38b (Re-)Enable image search via Docker image as image search issues fixed Debanjum Singh Solanky 2022-09-08 10:42:34 +03:00
  • 17354aaffd Install pyqt system package in Docker image to get qt dependencies Debanjum Singh Solanky 2022-09-08 10:39:11 +03:00
  • 5d3aeba22f Use --no-gui flag on starting Khoj from docker-compose Debanjum Singh Solanky 2022-09-08 10:37:39 +03:00
  • e4d40e4d4d Update setup.py version, Readme. Remove faulty release badge for now Debanjum Singh Solanky 2022-09-07 14:51:03 +03:00
  • 35d81de1a1 Update khoj version to 0.1.7 in setup.py Debanjum Singh Solanky 2022-09-07 13:38:15 +03:00
  • 762607fc9f Log processed entries by org_to_jsonl only if verbosity > 2 Debanjum Singh Solanky 2022-09-06 23:01:38 +03:00
  • 490157cafa Setup File Filter for Markdown and Ledger content types Debanjum Singh Solanky 2022-09-06 15:27:31 +03:00
  • 94cf3e97f3 Log app logs to file for posthoc debugging and performance analysis Debanjum Singh Solanky 2022-09-06 14:51:48 +03:00
  • 0a78cd5477 Create File Filter. Improve, Consolidate Filter Code Debanjum 2022-09-05 15:29:55 +00:00
  • 3707a4cdd4 Improve date filter perf. Precompute date to entry map, Cache results Debanjum Singh Solanky 2022-09-05 18:21:29 +03:00
  • 31503e7afd Do not pass embeddings as argument to filter.apply method Debanjum Singh Solanky 2022-09-05 15:46:54 +03:00
  • 965bd052f1 Make search filters return entry ids satisfying filter Debanjum Singh Solanky 2022-09-05 03:17:41 +03:00
  • 7dd20d764c Pre-compute file to entry map in file filter to mark ids to include faster Debanjum Singh Solanky 2022-09-05 02:51:15 +03:00
  • 2890b4cd44 Simplify extracting entries satisfying file filter Debanjum Singh Solanky 2022-09-05 02:09:36 +03:00
  • 7606724dbc Add file of each entry to entry dict in org_to_jsonl converter Debanjum Singh Solanky 2022-09-05 01:57:17 +03:00
  • 7e083d3e96 Cache results for file filters passed in query for faster filtering Debanjum Singh Solanky 2022-09-05 01:51:11 +03:00
  • f634399f23 Convert simple file filters with no path separator into regex Debanjum Singh Solanky 2022-09-05 01:45:18 +03:00
  • 092b9e329d Setup Filters when configuring Text Search for each Search Type Debanjum Singh Solanky 2022-09-05 01:05:13 +03:00
  • 1f9fd28b34 Create File Filter to filter files to query. Add tests for file filter Debanjum Singh Solanky 2022-09-04 19:38:29 +03:00
  • e4418746f2 Create Abstract Base Class for Filters. Make Word, Date Filter Child of BaseFilter Debanjum Singh Solanky 2022-09-04 18:05:38 +03:00
  • c9f6200007 Ignore pytest_cache directory from git using .gitignore Debanjum Singh Solanky 2022-09-04 17:19:22 +03:00
  • f930324350 Rename explicit filter to word filter to be more specific Debanjum Singh Solanky 2022-09-04 17:18:47 +03:00
  • d153d420fc Improve Latency of Explicit Filter Debanjum 2022-09-04 13:55:17 +00:00
  • 6087862521 Use LRU helper class for explicit filter cache Debanjum Singh Solanky 2022-09-04 16:42:28 +03:00
  • 8f3326c8d4 Create LRU helper class for caching Debanjum Singh Solanky 2022-09-04 16:31:46 +03:00
  • 191a656ed7 Use word to entry map, list comprehension to speed up explicit filter Debanjum Singh Solanky 2022-09-04 15:09:09 +03:00
  • 28d3dc1434 Deep copy entries, embeddings in filters. Defer till actual filtering Debanjum Singh Solanky 2022-09-04 02:22:42 +03:00
  • 3308e68edf Cache explicitly filtered entries, embeddings by required, blocked words Debanjum Singh Solanky 2022-09-04 02:21:10 +03:00
  • cdcee89ae5 Wrap words in quotes to trigger explicit filter from query Debanjum Singh Solanky 2022-09-04 02:12:56 +03:00
  • 8d9f507df3 Load entries_by_word_set from file only once on first load of explicit filter Debanjum Singh Solanky 2022-09-04 00:37:37 +03:00
  • 858d86075b Use regexes to check if any explicit filters in query. Test can_filter Debanjum Singh Solanky 2022-09-03 23:47:28 +03:00
  • 546fad570d Use regex to extract include, exclude filter words from query Debanjum Singh Solanky 2022-09-03 23:33:52 +03:00
  • b7d259b1ec Test Explicit Include, Exclude Filters Debanjum Singh Solanky 2022-09-03 23:00:09 +03:00
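
Several commits above (8f3326c8d4, 6087862521, 940c8fac8c) refer to an LRU helper class used to cache filter and search results. The log does not show that class, so the following is only a minimal sketch of what such a helper might look like. The class name `LRUCache`, its `get`/`put` methods, and the `capacity` parameter are assumptions made for illustration, not the project's actual API.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical least-recently-used cache with a fixed capacity.

    Illustrative only: names and behavior are assumptions, not the
    helper class referenced in the commit log.
    """

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        # A hit marks the key as most recently used.
        if key not in self._items:
            return default
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        # Insert or overwrite, then mark the key as most recently used.
        self._items[key] = value
        self._items.move_to_end(key)
        # Evict the least recently used entry once over capacity.
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)


# Usage: cache entry ids per filter term (the keys and ids are made up).
cache = LRUCache(capacity=2)
cache.put('+"khoj"', {1, 4})
cache.put('-"draft"', {2})
cache.get('+"khoj"')          # touch: '+"khoj"' is now most recent
cache.put('file:"notes.org"', {3})  # evicts '-"draft"'
```

Keeping the cache as an explicit object, rather than a `functools.lru_cache` decorator, lets the application clear or replace it when the index is updated, which may be one reason to prefer an app-level helper.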