Commit Graph

9 Commits

Author SHA1 Message Date
Debanjum Singh Solanky
66238004d8 Use verbosity level instead of bool across application
For consistent, more granular verbosity controls across app
Allows user to increase verbosity by passing -vvv flags to main.py
2021-08-16 17:15:41 -07:00
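
The counted-verbosity pattern described in this commit can be sketched with argparse's built-in `count` action. This is a minimal illustration, not the code from main.py: the `parse_args` and `log` helpers and the `--verbose` long option name are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse a repeatable -v flag into an integer verbosity level."""
    parser = argparse.ArgumentParser(description="Hypothetical entry point")
    # action="count" increments the value once per occurrence of -v,
    # so -vvv yields 3 and omitting the flag yields the default 0.
    parser.add_argument("--verbose", "-v", action="count", default=0,
                        help="Increase verbosity; repeat for more detail")
    return parser.parse_args(argv)


def log(message, verbose, level=1):
    """Print message only when the verbosity level is at least `level`."""
    if verbose >= level:
        print(message)


if __name__ == "__main__":
    args = parse_args()
    log("Basic progress message", args.verbose, level=1)
    log("Detailed debugging message", args.verbose, level=3)
```

An integer level lets each call site pick its own threshold, which a single boolean flag cannot express.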
Debanjum Singh Solanky
649e5d1327 Allow reuse of get_absolute_path, is_none_or_empty methods
- Move them to utils.helper.py for reuse
- Import those modules where required
- Delete duplicate methods defined in org_to_jsonl.py, asymmetric.py
2021-08-16 16:33:43 -07:00
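
The commit names the two shared helpers but does not show their bodies. A plausible sketch follows; the function names come from the commit message, while the implementations are assumptions.

```python
from pathlib import Path


def is_none_or_empty(item):
    """Return True for None and for empty sized values such as '' or []."""
    # Assumed behavior: values without a length (e.g. numbers) are never empty.
    return item is None or (hasattr(item, "__len__") and len(item) == 0)


def get_absolute_path(filepath):
    """Expand a leading ~ and return an absolute path as a string."""
    # Assumed behavior: relative paths resolve against the current directory.
    return str(Path(filepath).expanduser().absolute())
```

Keeping such helpers in one module lets org_to_jsonl.py and asymmetric.py import them instead of each defining its own copy.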
Debanjum Singh Solanky
19d6678eb1 Allow importing org-to-jsonl as module for reuse
To allow importing org-to-jsonl as module
  - Wrap code in __main__ into an org-to-jsonl method
  - Rename processor/org-mode to processor/org_mode
  - Add __init__.py to processor directory
2021-08-16 16:31:30 -07:00
Debanjum Singh Solanky
85bf15628d Use better cmdline argument names. Drop unneeded no-compress argument
Can infer to compress or not via the output_file suffix
2021-08-16 13:49:39 -07:00
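
Inferring compression from the output file suffix could look like the sketch below. The `write_jsonl` name and the choice of `.gz` as the triggering suffix are assumptions; the commit only says the decision is derived from the suffix.

```python
import gzip
from pathlib import Path


def write_jsonl(jsonl_data, output_file):
    """Write JSONL text, gzip-compressing it only if the path ends in .gz.

    Returns True when the output was compressed, False otherwise.
    """
    output_path = Path(output_file)
    if output_path.suffix == ".gz":
        # "wt" opens the gzip stream in text mode so a str can be written.
        with gzip.open(output_path, "wt", encoding="utf-8") as f:
            f.write(jsonl_data)
        return True
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(jsonl_data)
    return False
```

With this approach a separate no-compress flag becomes redundant: the caller chooses `notes.jsonl.gz` or `notes.jsonl` and the writer follows.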
Debanjum Singh Solanky
d9f60c00bf Warn if any input files to org-to-json are potentially non org-mode files
That is, if the file paths in the input set don't end with .org
2021-08-16 13:49:39 -07:00
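
A suffix check of this kind might be sketched as follows. The function name and warning text are hypothetical; only the `.org` suffix rule comes from the commit message.

```python
import sys


def warn_on_non_org_files(org_files):
    """Warn about input paths that do not end with .org and return them."""
    non_org_files = [f for f in org_files if not str(f).endswith(".org")]
    if non_org_files:
        # Warn on stderr but do not abort: the files may still be valid org-mode.
        print(f"[Warning] Possibly non org-mode files: {non_org_files}",
              file=sys.stderr)
    return non_org_files
```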
Debanjum Singh Solanky
3aa0c30fee Use absolute file path to open files in org-to-jsonl.py, asymmetric.py
Exit script if neither org_files, org_file_filter is present
2021-08-16 13:49:39 -07:00
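
The early exit described here can be sketched as a small guard. The parameter names `org_files` and `org_file_filter` come from the commit message; the function name, message, and exit code are assumptions.

```python
import sys


def require_input_source(org_files=None, org_file_filter=None):
    """Exit the script when neither explicit files nor a filter is given."""
    if not org_files and not org_file_filter:
        print("At least one of org_files or org_file_filter is required",
              file=sys.stderr)
        sys.exit(1)
```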
Debanjum Singh Solanky
e773611558 Remove unused jsonl_file argument from convert_org_entries_to_jsonl
2021-08-16 13:49:35 -07:00
Debanjum Singh Solanky
8b29e272d3 Standardize interface, better default args for org-to-json.py script
- Remove non-standard, unnecessary argument for org-directory
  Pass the path to each file directly in the org-files and org-files-filter arguments
- Allow shorthand -i, -o for input files, output files
- Default to compress, unless user explicitly specifies not to
2021-08-16 11:29:08 -07:00
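
The interface described in this commit might look roughly like the argparse sketch below. The `org-files`, `org-files-filter`, and `no-compress` names appear in the commit log; the `--jsonl-file` output option, the help text, and the mapping of `-i` to `--org-files` are assumptions.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser for the org-to-jsonl script."""
    parser = argparse.ArgumentParser(
        description="Convert org-mode files to JSONL")
    # -i and -o are shorthand for the input and output file options.
    parser.add_argument("--org-files", "-i", nargs="*", default=[],
                        help="Paths of org-mode files to process")
    parser.add_argument("--org-files-filter", default=None,
                        help="Glob pattern selecting org-mode files")
    parser.add_argument("--jsonl-file", "-o", required=True,
                        help="Path of the output file (assumed option name)")
    # Compression is the default; the flag only switches it off.
    parser.add_argument("--no-compress", action="store_true", default=False,
                        help="Write uncompressed output")
    return parser
```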
Debanjum Singh Solanky
354c541b62 Add org processor to generate compressed jsonl from org-mode files
The corpus embeddings are generated from this compressed JSONL
using the specified transformer ML model
2021-08-15 22:52:31 -07:00
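
The serialization step can be sketched independently of org-mode parsing: each entry becomes one JSON object per line, and the result is gzip-compressed. The function names and the entry fields used in the usage note are hypothetical, and parsing the org files themselves is out of scope here.

```python
import gzip
import json


def entries_to_jsonl(entries):
    """Serialize a list of dict entries as JSON Lines (one object per line)."""
    return "".join(json.dumps(entry, ensure_ascii=False) + "\n"
                   for entry in entries)


def compress_jsonl(jsonl_text):
    """Return the gzip-compressed UTF-8 bytes of the JSONL text."""
    return gzip.compress(jsonl_text.encode("utf-8"))
```

For example, `compress_jsonl(entries_to_jsonl([{"Title": "Note", "Body": "Text"}]))` yields bytes that can be written to a `.jsonl.gz` file and later read back line by line to build the corpus.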