Use a better model for asymmetric semantic search

- The multi-qa-MiniLM-L6-cos-v1 model is more extensively benchmarked[1]
- It offers the right mix of query encoding speed, model size and benchmark performance
- On Hugging Face it has far more downloads and likes than the msmarco model[2]
- In a preliminary evaluation of the model:
  - It halves the time to encode all entries (from ~8min down to ~4min)
  - It returns more entries that stay relevant to the query (3/5 vs 1/5 earlier)

[1]: https://www.sbert.net/docs/pretrained_models.html
[2]: https://huggingface.co/sentence-transformers
Debanjum Singh Solanky
2022-07-18 20:00:19 +04:00
parent 5e302dbcda
commit 4a90972e38
5 changed files with 5 additions and 5 deletions

@@ -33,7 +33,7 @@ search-type:
   model_directory: "/data/models/symmetric"
 asymmetric:
-  encoder: "sentence-transformers/msmarco-MiniLM-L-6-v3"
+  encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
   cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
   model_directory: "/data/models/asymmetric"
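
For reference, a minimal sketch of how a config loader might pick up the new encoder name from this YAML; the loader code is hypothetical, only the keys and values come from the diff above:

```python
import yaml  # PyYAML

# Fragment of the search config after this commit (asymmetric section only)
config_yaml = """
search-type:
  asymmetric:
    encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
    cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
    model_directory: "/data/models/asymmetric"
"""

config = yaml.safe_load(config_yaml)
asymmetric = config["search-type"]["asymmetric"]
print(asymmetric["encoder"])
# → sentence-transformers/multi-qa-MiniLM-L6-cos-v1
```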