Class AxiomaticF3EXP


public class AxiomaticF3EXP extends Axiomatic
F3EXP is defined as Sum(tf(term_doc_freq) * IDF(term) - gamma(docLen, queryLen)), where:
  IDF(t) = pow((N + 1) / df(t), k), with N = total number of documents and df(t) = document frequency of term t
  gamma(docLen, queryLen) = (docLen - queryLen) * queryLen * s / avdl, with avdl = average document length
NOTE: the gamma function of this similarity can produce negative scores.
WARNING: This API is experimental and might change in incompatible ways in the next release.
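The formula above can be illustrated with a minimal, standalone sketch. It is not the Lucene implementation: the class name F3ExpSketch and its method signatures are hypothetical, the term-frequency component is passed in as an already computed value (tfValue) because its exact form is not given here, and the sample statistics are made up for illustration.

```java
// Hypothetical illustration of the F3EXP formula; not part of Lucene.
public class F3ExpSketch {

    // IDF(t) = pow((N + 1) / df(t), k)
    public static double idf(long numDocs, long docFreq, double k) {
        return Math.pow((numDocs + 1.0) / docFreq, k);
    }

    // gamma(docLen, queryLen) = (docLen - queryLen) * queryLen * s / avdl
    public static double gamma(double docLen, int queryLen, double s, double avgDocLen) {
        return (docLen - queryLen) * queryLen * s / avgDocLen;
    }

    // Contribution of one term: tf * IDF - gamma.
    // tfValue is assumed to be the precomputed term frequency component.
    public static double termScore(double tfValue, long numDocs, long docFreq, double k,
                                   double docLen, int queryLen, double s, double avgDocLen) {
        return tfValue * idf(numDocs, docFreq, k) - gamma(docLen, queryLen, s, avgDocLen);
    }

    public static void main(String[] args) {
        // Made-up statistics: 9 documents, term appears in 2 of them,
        // document length 120, query length 3, average document length 100.
        double strong = termScore(1.0, 9, 2, 0.5, 120.0, 3, 0.25, 100.0);
        double weak = termScore(0.1, 9, 2, 0.5, 120.0, 3, 0.25, 100.0);
        System.out.println("tf component 1.0: " + strong);
        // With a small tf component, gamma dominates and the result is negative.
        System.out.println("tf component 0.1: " + weak);
    }
}
```

The second call shows why the note about negative scores matters: for documents much longer than the query, the gamma penalty can exceed the tf * IDF product.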
  • Nested Class Summary

    Nested classes/interfaces inherited from class org.apache.lucene.search.similarities.Similarity

    Similarity.SimScorer
  • Field Summary

    Fields inherited from class org.apache.lucene.search.similarities.Axiomatic

    k, queryLen, s
  • Constructor Summary

    Constructors
    Constructor
    Description
    AxiomaticF3EXP(float s, int queryLen)
    Constructor setting s and queryLen, leaving k at its default value
    AxiomaticF3EXP(float s, int queryLen, float k)
    Constructor setting all Axiomatic hyperparameters
  • Method Summary

    Modifier and Type
    Method
    Description
    protected double
    gamma(BasicStats stats, double freq, double docLen)
    Compute the gamma component
    protected double
    idf(BasicStats stats, double freq, double docLen)
    Compute the inverse document frequency component
    protected Explanation
    idfExplain(BasicStats stats, double freq, double docLen)
    Explain the score of the inverse document frequency component for a single document
    protected double
    ln(BasicStats stats, double freq, double docLen)
    Compute the document length component
    protected Explanation
    lnExplain(BasicStats stats, double freq, double docLen)
    Explain the score of the document length component for a single document
    protected double
    tf(BasicStats stats, double freq, double docLen)
    Compute the term frequency component
    protected Explanation
    tfExplain(BasicStats stats, double freq, double docLen)
    Explain the score of the term frequency component for a single document
    protected double
    tfln(BasicStats stats, double freq, double docLen)
    Compute the mixed term frequency and document length component
    protected Explanation
    tflnExplain(BasicStats stats, double freq, double docLen)
    Explain the score of the mixed term frequency and document length component for a single document
    String
    toString()
    Name of the axiomatic method.

    Methods inherited from class org.apache.lucene.search.similarities.Axiomatic

    explain, explain, score

    Methods inherited from class org.apache.lucene.search.similarities.SimilarityBase

    fillBasicStats, log2, newStats, scorer

    Methods inherited from class org.apache.lucene.search.similarities.Similarity

    computeNorm, getDiscountOverlaps

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • AxiomaticF3EXP

      public AxiomaticF3EXP(float s, int queryLen, float k)
      Constructor setting all Axiomatic hyperparameters
      Parameters:
      s - hyperparameter for the growth function
      queryLen - the query length
      k - hyperparameter for the primitive weighting function
    • AxiomaticF3EXP

      public AxiomaticF3EXP(float s, int queryLen)
      Constructor setting s and queryLen, leaving k at its default value
      Parameters:
      s - hyperparameter for the growth function
      queryLen - the query length
  • Method Details

    • toString

      public String toString()
      Description copied from class: Axiomatic
      Name of the axiomatic method.
      Specified by:
      toString in class Axiomatic
    • tf

      protected double tf(BasicStats stats, double freq, double docLen)
      Compute the term frequency component
      Specified by:
      tf in class Axiomatic
    • ln

      protected double ln(BasicStats stats, double freq, double docLen)
      Compute the document length component
      Specified by:
      ln in class Axiomatic
    • tfln

      protected double tfln(BasicStats stats, double freq, double docLen)
      Compute the mixed term frequency and document length component
      Specified by:
      tfln in class Axiomatic
    • idf

      protected double idf(BasicStats stats, double freq, double docLen)
      Compute the inverse document frequency component
      Specified by:
      idf in class Axiomatic
    • gamma

      protected double gamma(BasicStats stats, double freq, double docLen)
      Compute the gamma component
      Specified by:
      gamma in class Axiomatic
    • tfExplain

      protected Explanation tfExplain(BasicStats stats, double freq, double docLen)
      Description copied from class: Axiomatic
      Explain the score of the term frequency component for a single document
      Specified by:
      tfExplain in class Axiomatic
      Parameters:
      stats - the corpus level statistics
      freq - number of occurrences of term in the document
      docLen - the document length
      Returns:
      Explanation of how the tf component was computed
    • lnExplain

      protected Explanation lnExplain(BasicStats stats, double freq, double docLen)
      Description copied from class: Axiomatic
      Explain the score of the document length component for a single document
      Specified by:
      lnExplain in class Axiomatic
      Parameters:
      stats - the corpus level statistics
      freq - number of occurrences of term in the document
      docLen - the document length
      Returns:
      Explanation of how the ln component was computed
    • tflnExplain

      protected Explanation tflnExplain(BasicStats stats, double freq, double docLen)
      Description copied from class: Axiomatic
      Explain the score of the mixed term frequency and document length component for a single document
      Specified by:
      tflnExplain in class Axiomatic
      Parameters:
      stats - the corpus level statistics
      freq - number of occurrences of term in the document
      docLen - the document length
      Returns:
      Explanation of how the tfln component was computed
    • idfExplain

      protected Explanation idfExplain(BasicStats stats, double freq, double docLen)
      Description copied from class: Axiomatic
      Explain the score of the inverse document frequency component for a single document
      Specified by:
      idfExplain in class Axiomatic
      Parameters:
      stats - the corpus level statistics
      freq - number of occurrences of term in the document
      docLen - the document length
      Returns:
      Explanation of how the idf component was computed