Improve test data organization and update corresponding conftests

- Put test data for each content type into separate directories
- Makes config.yml for docker and local host consistent
  - Prepending tests to /data in sample_config.yml makes application
    run on local host using test data
  - Allows mounting separate volume for each content type in docker-compose
- Ignore gitignore to only add tests content, not generated models or embeddings
Debanjum Singh Solanky
2022-01-29 01:57:08 -05:00
parent 3e889760c7
commit 79c2224eaa
9 changed files with 15 additions and 14 deletions


@@ -0,0 +1,44 @@
* Emacs Semantic Search
/An Emacs interface for [[https://github.com/debanjum/semantic-search][semantic-search]]/
** Requirements
- Install and Run [[https://github.com/debanjum/semantic-search][semantic-search]]
** Installation
- Direct Install
- Put ~semantic-search.el~ in your Emacs load path, e.g. =~/.emacs.d/lisp=
- Load it via ~use-package~ by adding the snippet below to your =~/.emacs.d/init.el= or =.emacs= file
#+begin_src elisp
;; Org-Semantic Search Library
;; :load-path takes the directory that contains semantic-search.el, not the file itself
(use-package semantic-search
  :load-path "~/.emacs.d/lisp/"
  :bind ("C-c s" . 'semantic-search))
#+end_src
- Use [[https://github.com/quelpa/quelpa#installation][Quelpa]]
- Ensure [[https://github.com/quelpa/quelpa#installation][Quelpa]] and [[https://github.com/quelpa/quelpa-use-package#installation][quelpa-use-package]] are installed
- Add the snippet below to your =~/.emacs.d/init.el= or =.emacs= config file and evaluate it.
#+begin_src elisp
;; Org-Semantic Search Library
(use-package semantic-search
  :quelpa (semantic-search :fetcher url :url "https://raw.githubusercontent.com/debanjum/semantic-search/master/interface/emacs/semantic-search.el")
  :bind ("C-c s" . 'semantic-search))
#+end_src
** Usage
1. Call ~semantic-search~ using keybinding ~C-c s~ or ~M-x semantic-search~
2. Enter a query in natural language
e.g. "What is the meaning of life?" or "What are my life goals?"
3. Wait for results
*Note: It takes about 15s on a Mac M1 with a corpus of about 100K lines of org-mode files*
4. (Optional) Narrow down results further
Include or exclude specific words from results by adding them to the query with a ~+~ or ~-~ prefix
e.g. "What is the meaning of life? -god +none"
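
The include/exclude syntax in step 4 can be pictured with a small Python sketch. This is only an illustration of one plausible reading of the ~+word~ / ~-word~ convention, not the project's actual implementation; the names ~split_query~ and ~keep_entry~ are hypothetical.

#+begin_src python
def split_query(raw_query):
    """Split a raw query into free text, required words and blocked words.

    Assumption: "+word" marks a word results must contain and
    "-word" marks a word results must not contain.
    """
    plain, required, blocked = [], [], []
    for token in raw_query.split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:].lower())
        elif token.startswith("-") and len(token) > 1:
            blocked.append(token[1:].lower())
        else:
            plain.append(token)
    return " ".join(plain), required, blocked


def keep_entry(entry_text, required, blocked):
    """Return True if an entry has every required word and no blocked word."""
    words = set(entry_text.lower().split())
    has_required = all(word in words for word in required)
    has_blocked = any(word in words for word in blocked)
    return has_required and not has_blocked
#+end_src

Under these assumptions, the free-text part would be sent to the semantic search and ~keep_entry~ would then be applied to each returned entry.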


@@ -0,0 +1,47 @@
* Semantic Search
/Natural language search over user content such as notes and images, using transformer-based models/
All data is processed locally. Users can interact with the semantic-search app via [[./interface/emacs/semantic-search.el][Emacs]], the API, or the command line
** Dependencies
- Python3
- [[https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links][Miniconda]]
** Install
#+begin_src shell
git clone https://github.com/debanjum/semantic-search && cd semantic-search
conda env create -f environment.yml
conda activate semantic-search
#+end_src
** Run
Load the ML model, generate embeddings, and expose an API to query the specified org-mode files
#+begin_src shell
python3 main.py --input-files ~/Notes/Schedule.org ~/Notes/Incoming.org --verbose
#+end_src
** Use
- *Semantic Search via Emacs*
- [[https://github.com/debanjum/semantic-search/tree/master/interface/emacs#installation][Install]] [[./interface/emacs/semantic-search.el][semantic-search.el]]
- Run ~M-x semantic-search <user-query>~ or call ~C-c s~
- *Semantic Search via API*
- Query: ~GET~ [[http://localhost:8000/search?q=%22what%20is%20the%20meaning%20of%20life%22][http://localhost:8000/search?q="What is the meaning of life"]]
- Regenerate Embeddings: ~GET~ [[http://localhost:8000/regenerate][http://localhost:8000/regenerate]]
- [[http://localhost:8000/docs][Semantic Search API Docs]]
- *Call Semantic Search via Python Script Directly*
#+begin_src shell
python3 search_types/asymmetric.py \
--compressed-jsonl .notes.jsonl.gz \
--embeddings .notes_embeddings.pt \
--results-count 5 \
--verbose \
--interactive
#+end_src
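
The API query listed above can also be issued from a short Python script. The sketch below assumes the server is already running on ~localhost:8000~ and that ~/search~ responds with JSON; the response format is an assumption, and the function names ~build_search_url~ and ~search~ are hypothetical helpers, not part of the project.

#+begin_src python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_search_url(query, host="http://localhost:8000"):
    """Return the /search URL for a natural language query, percent-encoded."""
    return f"{host}/search?{urlencode({'q': query}, quote_via=quote)}"


def search(query, host="http://localhost:8000"):
    """Send the query to a running server and decode the reply.

    Assumption: the endpoint returns a JSON body.
    """
    with urlopen(build_search_url(query, host)) as response:
        return json.loads(response.read().decode("utf-8"))
#+end_src

With the server started as described under Run, calling ~search("What is the meaning of life")~ would request the same URL as the Query link above, minus the surrounding quotes.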
** Acknowledgments
- [[https://huggingface.co/sentence-transformers/msmarco-MiniLM-L-6-v3][MiniLM Model]] for Asymmetric Text Search. See [[https://www.sbert.net/examples/applications/retrieve_rerank/README.html][SBert Documentation]]
- [[https://github.com/openai/CLIP][OpenAI CLIP Model]] for Image Search. See [[https://www.sbert.net/examples/applications/image-search/README.html][SBert Documentation]]
- Charles Cave for [[http://members.optusnet.com.au/~charles57/GTD/orgnode.html][OrgNode Parser]]