Interfaces for thesaurus datatype #57

Merged
merged 9 commits into from
Dec 11, 2024

Conversation

@CascadingRadium (Member) commented Oct 14, 2024

  • Add interfaces that abstract the thesaurus and its helper iterator methods.
  • Extend the Fuzzy and Regex FieldDict interfaces to return abstracted automatons for
    computing Damerau-Levenshtein distance and regex term matching, respectively. Each
    automaton is built from the original term or pattern.
  • Add interfaces for special synonym documents and synonym fields, so that the index can
    distinguish synonym documents from regular documents during processing.
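
To illustrate the third point, here is a minimal Go sketch of how an extended interface can let an index tell synonym documents apart from regular ones with a type assertion. Every name in it (`Document`, `SynonymDocument`, `VisitSynonyms`, `classify`, and the two concrete types) is hypothetical and chosen for illustration only; it is not the API added by this pull request.

```go
package main

import "fmt"

// Document is a hypothetical stand-in for a regular indexable document.
type Document interface {
	ID() string
}

// SynonymDocument is a hypothetical extension of Document. A document that
// also exposes synonym definitions can be routed differently by the index.
type SynonymDocument interface {
	Document
	// VisitSynonyms calls visit once per term with that term's synonyms.
	VisitSynonyms(visit func(term string, synonyms []string))
}

// plainDoc is a regular document: it only satisfies Document.
type plainDoc struct{ id string }

func (d plainDoc) ID() string { return d.id }

// synonymDoc satisfies both Document and SynonymDocument.
type synonymDoc struct {
	id   string
	defs map[string][]string
}

func (d synonymDoc) ID() string { return d.id }

func (d synonymDoc) VisitSynonyms(visit func(string, []string)) {
	for term, syns := range d.defs {
		visit(term, syns)
	}
}

// classify reports which processing path a document would take, based only
// on whether it implements the extended interface.
func classify(doc Document) string {
	if _, ok := doc.(SynonymDocument); ok {
		return "synonym"
	}
	return "regular"
}

func main() {
	docs := []Document{
		plainDoc{id: "doc-1"},
		synonymDoc{id: "syn-1", defs: map[string][]string{"quick": {"fast", "rapid"}}},
	}
	for _, d := range docs {
		fmt.Printf("%s -> %s\n", d.ID(), classify(d))
	}
}
```

The type assertion keeps the regular document path unchanged: code that only knows about `Document` keeps working, and only the synonym-aware path needs the extra method.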

@CascadingRadium CascadingRadium marked this pull request as ready for review December 9, 2024 11:07
@CascadingRadium CascadingRadium changed the title interfaces for thesaurus datatype Interfaces for thesaurus datatype Dec 9, 2024
index.go (review thread, resolved)
index.go (review thread, outdated, resolved)
@abhinavdangeti abhinavdangeti merged commit bc5aa25 into master Dec 11, 2024
9 checks passed
@abhinavdangeti abhinavdangeti deleted the synonyms branch December 11, 2024 16:31
abhinavdangeti added a commit to blevesearch/bleve that referenced this pull request Dec 19, 2024
- Allow setting up `synonym_sources` in the index mapping. Each source
follows its own ingest pipeline, ingesting special synonym definitions
through the `IndexSynonym()` API.
- A `synonym_source` can be set on a field mapping, like an analyzer, and
can be set as a default option at the document mapping or the index
mapping level.
- Each `synonym_source` can have its own analyzer, which keeps it
compatible with the language analyzer specified for its corresponding
mapping.
- Works with every term-based query: the term is expanded to include its
synonyms at query time.
- Dependencies:
  - blevesearch/bleve_index_api: blevesearch/bleve_index_api#57
  - blevesearch/scorch_segment_api: blevesearch/scorch_segment_api#46
  - blevesearch/vellum: blevesearch/vellum#22
  - blevesearch/zapx@v16@latest: blevesearch/zapx#268
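
The query-time expansion described in the commit message can be sketched as follows. This is a self-contained toy model, assuming an in-memory thesaurus and a lowercase analyzer; `synonymSource`, `newSynonymSource`, `add`, `expand`, and `lowercase` are hypothetical names and do not correspond to bleve's mapping or query API.

```go
package main

import (
	"fmt"
	"strings"
)

// synonymSource is a hypothetical in-memory stand-in for a synonym source:
// an analyzer plus a thesaurus keyed by analyzed term.
type synonymSource struct {
	analyze   func(string) string
	thesaurus map[string][]string
}

// lowercase is a trivial analyzer used for this illustration.
func lowercase(s string) string { return strings.ToLower(s) }

// newSynonymSource builds an empty source that uses the given analyzer.
func newSynonymSource(analyze func(string) string) *synonymSource {
	return &synonymSource{analyze: analyze, thesaurus: map[string][]string{}}
}

// add registers a group of equivalent terms. Each term is analyzed first,
// then mapped to every other term in the group.
func (s *synonymSource) add(terms ...string) {
	analyzed := make([]string, len(terms))
	for i, t := range terms {
		analyzed[i] = s.analyze(t)
	}
	for i, t := range analyzed {
		for j, other := range analyzed {
			if i != j {
				s.thesaurus[t] = append(s.thesaurus[t], other)
			}
		}
	}
}

// expand returns the analyzed query term followed by its synonyms, which is
// the set of terms a term-based query would then search for.
func (s *synonymSource) expand(term string) []string {
	t := s.analyze(term)
	return append([]string{t}, s.thesaurus[t]...)
}

func main() {
	src := newSynonymSource(lowercase)
	src.add("Quick", "fast", "rapid")
	fmt.Println(src.expand("QUICK"))
	fmt.Println(src.expand("slow"))
}
```

Running the same analyzer over both the synonym definitions and the query term is what lets `QUICK` match the definition stored as `quick`; this mirrors the point that a synonym source's analyzer should be compatible with the analyzer of the mapping it serves.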

---------

Co-authored-by: Abhinav Dangeti <[email protected]>