- Reason:
Allow natural search on markdown based notes, documentation,
websites etc
- Details:
- Create markdown processor to extract Markdown entries (identified by
Heading) into standard jsonl format required by text_search
- Update API, Configs to support interfacing with new markdown type
- Update Emacs, Web clients to support interfacing with new markdown
type via API
- Update Readme to mentiond markdown is also supported
Closes#35
- The code for both the text search types were mostly the same
It was earlier done this way for expedience while experimenting
- The minor differences were reconciled and merged into a single
text_search type
- This simplifies the app and making it easier to process other
text types
Now that the logic to compile entries is in the processor layer, the
extract_entries method is standard across (text) search_types
Extract the load_jsonl method as a utility helper method.
Use it in (a)symmetric search types
- The all-MiniLM-L6-v2 is more accurate
- The exact previous model isn't benchmarked but based on the
performance of the closest model to it. Seems like the new model
maybe similar in speed and size
- On very preliminary evaluation of the model, the new model seems
faster, with pretty decent results
- The multi-qa-MiniLM-L6-cos-v1 is more extensively benchmarked[1]
- It has the right mix of model query speed, size and performance on benchmarks
- On hugging face it has way more downloads and likes than the msmarco model[2]
- On very preliminary evaluation of the model
- It doubles the encoding speed of all entries (down from ~8min to 4mins)
- It gave more entries that stay relevant to the query (3/5 vs 1/5 earlier)
[1]: https://www.sbert.net/docs/pretrained_models.html
[2]: https://huggingface.co/sentence-transformers
Conversation logs structure now has session info too instead of just chat info
Session info will allow loading past conversation summaries as context for AI in new conversations
{
"session": [
{
"summary": <chat_session_summary>,
"session-start": <session_start_index_in_chat_log>,
"session-end": <session_end_index_in_chat_log>
}],
"chat": [
{
"intent": <intent-object>
"trigger-emotion": <emotion-triggered-by-message>
"by": <AI|Human>
"message": <chat_message>
"created": <message_created_date>
}]
}
Details
- Rename method query_* to query in search_types for standardization
- Wrapping Config code in classes simplified mocking test config
- Reduce args beings passed to a function by passing it as single
argument wrapped in a class
- Minimize setup in main.py:__main__. Put most of it into functions
These functions can be mocked if required in tests later too
Setup Flow:
CLI_Args|Config_YAML -> (Text|Image)SearchConfig -> (Text|Image)SearchModel
- Wrap Image, Music, Ledger search into the type of SearchModel they use
Similar to what was done for notes model by wrapping it's config
into an AsymmetricSearchModel.
- Use the uber wrapper class to expose all type specific search models
- Wrap asymmetric search model parameters into AsymmetricSearchModel class
- Create wrapper for all search type models. Put notes search model into it
- Test notes search end-to-end from client API layer to results.
Use model build on test data
- Cleaner, more idiomatic usage of a global variable
- Simplifies mocking when testing client in pytest as setting wrapped
in object rather than a simple type. So passed around by reference
- Use a SearchType to limit types that can be passed by user
- FastAPI automatically validates type passed in query param
- Available type options show up in Swagger UI, FastAPI docs
- controller code looks neater instead of doing string comparisons for type
- Test invalid, valid search types via pytest
- Break the compute embeddings method into separate methods:
compute_image_embeddings and compute_metadata_embeddings
- If image_metadata_embeddings isn't defined, do not use it to enhance
search results. Given image_metadata_embeddings wouldn't be defined
if use_xmp_metadata is False, we can avoid unnecessary addition of
args to query method
- Details
- The CLIP model can represent images, text in the same vector space
- Enhance CLIP's image understanding by augmenting the plain image
with it's text based metadata.
Specifically with any subject, description XMP tags on the image
- Improve results by combining plain image similarity score with
metadata similarity scores for the highest ranked images
- Minor Fixes
- Convert verbose to integer from bool in image_search.
It's already passed as integer from the main program entrypoint
- Process images with ".jpeg" extensions too