Class EmailCjkSynonymAnalyzer

java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
com.apple.foundationdb.record.lucene.EmailCjkSynonymAnalyzer
All Implemented Interfaces:
Closeable, AutoCloseable

public class EmailCjkSynonymAnalyzer extends org.apache.lucene.analysis.StopwordAnalyzerBase
An analyzer that can handle emails, CJK, and synonyms. It essentially combines UAX29URLEmailAnalyzer, CJKUnigramFilter, and SynonymGraphFilter.
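
To make the description above concrete, here is a minimal plain-Java sketch of what "unigram" handling of CJK text alongside intact e-mail addresses could look like. It is not this class's implementation, which is built from Lucene tokenizers and filters. The class `CjkUnigramSketch` and its methods are hypothetical names introduced only for illustration, and whitespace splitting stands in for the far richer UAX#29 segmentation rules.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkUnigramSketch {
    // True for code points in the Han, Hiragana, Katakana, or Hangul scripts.
    static boolean isCjk(int codePoint) {
        Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
        return script == Character.UnicodeScript.HAN
                || script == Character.UnicodeScript.HIRAGANA
                || script == Character.UnicodeScript.KATAKANA
                || script == Character.UnicodeScript.HANGUL;
    }

    // Splits on whitespace, then emits each CJK code point as its own token
    // while leaving non-CJK runs (including e-mail addresses) intact.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String chunk : text.trim().split("\\s+")) {
            StringBuilder run = new StringBuilder();
            for (int i = 0; i < chunk.length(); ) {
                int cp = chunk.codePointAt(i);
                if (isCjk(cp)) {
                    // Flush any pending non-CJK run before the unigram.
                    if (run.length() > 0) {
                        tokens.add(run.toString());
                        run.setLength(0);
                    }
                    tokens.add(new String(Character.toChars(cp)));
                } else {
                    run.appendCodePoint(cp);
                }
                i += Character.charCount(cp);
            }
            if (run.length() > 0) {
                tokens.add(run.toString());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // U+6771 U+4EAC U+90FD are three Han characters.
        System.out.println(tokenize("mail jane@example.com \u6771\u4EAC\u90FD"));
    }
}
```

In this sketch the address `jane@example.com` survives as a single token, while the three Han characters become three separate one-character tokens. Synonym expansion is left out entirely, since it depends on the contents of a `SynonymMap`.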
  • Nested Class Summary

    Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer

    org.apache.lucene.analysis.Analyzer.ReuseStrategy, org.apache.lucene.analysis.Analyzer.TokenStreamComponents
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
     

    Fields inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase

    stopwords

    Fields inherited from class org.apache.lucene.analysis.Analyzer

    GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
  • Constructor Summary

    Constructors
    Constructor
    Description
    EmailCjkSynonymAnalyzer(org.apache.lucene.analysis.CharArraySet stopwords, int minTokenLength, int minAlphanumericTokenLength, int maxTokenLength, boolean withEmailTokenizer, boolean withSynonymGraphFilter, org.apache.lucene.analysis.synonym.SynonymMap synonymMap)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents
    createComponents(String fieldName)
    Deprecated.
    int
    getMaxTokenLength()
     
    int
    getMinTokenLength()
     
    protected org.apache.lucene.analysis.synonym.SynonymMap
    getSynonymMap()
     
    protected org.apache.lucene.analysis.TokenStream
    normalize(String fieldName, org.apache.lucene.analysis.TokenStream in)
     
    protected boolean
    withEmailTokenizer()
     
    protected boolean
    withSynonymGraphFilter()
     

    Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase

    getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet

    Methods inherited from class org.apache.lucene.analysis.Analyzer

    attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, initReaderForNormalization, normalize, setVersion, tokenStream, tokenStream

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

    • EmailCjkSynonymAnalyzer

      public EmailCjkSynonymAnalyzer(@Nonnull org.apache.lucene.analysis.CharArraySet stopwords, int minTokenLength, int minAlphanumericTokenLength, int maxTokenLength, boolean withEmailTokenizer, boolean withSynonymGraphFilter, @Nullable org.apache.lucene.analysis.synonym.SynonymMap synonymMap)
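
This page does not document the constructor parameters. As a rough illustration only, the sketch below assumes that `stopwords`, `minTokenLength`, and `maxTokenLength` behave like Lucene's stock `StopFilter` and `LengthFilter`: stopwords are dropped, and a token is kept only when its length falls within the inclusive bounds. That is an assumption, not something this page confirms. `TokenBoundsSketch` and `filter` are hypothetical names, and `minAlphanumericTokenLength` is deliberately omitted because its semantics are not stated here.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class TokenBoundsSketch {
    // Keeps tokens whose length lies in [minLen, maxLen] (inclusive, assumed)
    // and that do not appear in the stopword set.
    static List<String> filter(List<String> tokens, Set<String> stopwords, int minLen, int maxLen) {
        return tokens.stream()
                .filter(t -> t.length() >= minLen && t.length() <= maxLen)
                .filter(t -> !stopwords.contains(t))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("a", "the", "quick", "supercalifragilistic");
        // "a" is too short, "the" is a stopword, the last token is too long.
        System.out.println(filter(tokens, Set.of("the"), 2, 10));
    }
}
```

With bounds of 2 and 10 and `the` as the only stopword, only `quick` remains. The actual analyzer may apply these limits at a different stage or with different rules, so check the source before relying on this behavior.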
  • Method Details

    • createComponents

      @Deprecated protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents createComponents(String fieldName)
      Deprecated.
      Specified by:
      createComponents in class org.apache.lucene.analysis.Analyzer
    • normalize

      protected org.apache.lucene.analysis.TokenStream normalize(String fieldName, org.apache.lucene.analysis.TokenStream in)
      Overrides:
      normalize in class org.apache.lucene.analysis.Analyzer
    • getMinTokenLength

      public int getMinTokenLength()
    • getMaxTokenLength

      public int getMaxTokenLength()
    • withSynonymGraphFilter

      protected boolean withSynonymGraphFilter()
    • withEmailTokenizer

      protected boolean withEmailTokenizer()
    • getSynonymMap

      @Nonnull protected org.apache.lucene.analysis.synonym.SynonymMap getSynonymMap()