Do not wrap filepath in Path to fix indexing markdown files on Windows

Issue
- Path with / are converted to \\ on Windows using the Path operator.
- The markdown to entries method for some reason was doing this.
  This would store the file paths in DB entry differently than the file
  to entries map. Resulting in a KeyError when trying to look up the
  entry file path from file_to_text_map in the
  text_to_entries:update_embeddings() function.

Fix
- Removing the unnecessary OS dependendent Path normalization in
  markdown_to_entries should keep the file path storage consistent
  across file_to_text_map var, FileObjectAdaptor, Entry DB tables on
  Windows for Markdown files as well

This issue would only affect users hosting Khoj server on Windows and
attempting to index markdown files.

Resolves #984
This commit is contained in:
Debanjum
2024-12-01 22:37:57 -08:00
parent 9e0a2c7a98
commit dffdd81345

View File

@@ -139,7 +139,7 @@ class MarkdownToEntries(TextToEntries):
# Escape the URL to avoid issues with special characters
entry_filename = urllib3.util.parse_url(raw_filename).url
else:
entry_filename = str(Path(raw_filename))
entry_filename = raw_filename
heading = parsed_entry.splitlines()[0] if re.search(r"^#+\s", parsed_entry) else ""
# Append base filename to compiled entry for context to model