Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall

The previous cross-encoder model was a few years old, and newer models should offer better quality. The new model is about 50% larger than the previous one, in exchange for better performance, at least on benchmarks.
2026-03-02 21:19:12 +00:00 · 2024-04-24 09:13:14 +05:30
parent 7eaf9367fe
commit ec41482324
4 changed files with 4 additions and 4 deletions
--- a/documentation/docs/miscellaneous/advanced.md
+++ b/documentation/docs/miscellaneous/advanced.md
@@ -7,7 +7,7 @@ sidebar_position: 3
 ## Search across Different Languages (Self-Hosting)
 To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />
 For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
-1. Manually update the search config in server's admin settings page. Go to [the search config](http://localhost:42110/server/admin/database/searchmodelconfig/). Either create a new one, if none exists, or update the existing one. Set the bi_encoder to `sentence-transformers/multi-qa-MiniLM-L6-cos-v1` and the cross_encoder to `cross-encoder/ms-marco-MiniLM-L-6-v2`.
+1. Manually update the search config in server's admin settings page. Go to [the search config](http://localhost:42110/server/admin/database/searchmodelconfig/). Either create a new one, if none exists, or update the existing one. Set the bi_encoder to `sentence-transformers/multi-qa-MiniLM-L6-cos-v1` and the cross_encoder to `mixedbread-ai/mxbai-rerank-xsmall-v1`.
 2. Regenerate your content index from all the relevant clients. This step is very important, as you'll need to re-encode all your content with the new model.

 ## Query Filters