Record Class PartitionEvaluator.Parameters

java.lang.Object
java.lang.Record
com.apple.foundationdb.kmeans.PartitionEvaluator.Parameters
Record Components:
distanceEstimator - the distance estimator used for all distance computations
minRelativeSseGain - minimum relative SSE (sum of squared errors) improvement required; candidates with less improvement are rejected. May be negative if the caller wants to accept some SSE increase (e.g. for merges).
minSeparation - minimum inter-cluster separation required; not checked when the candidate has fewer than two clusters
maxLowMarginRate - maximum fraction of vectors with low assignment margin; not checked when the candidate has fewer than two clusters
minSmallestFrac - minimum fraction of vectors in the candidate's smallest cluster; candidates that violate this are reported as PartitionEvaluator.Decision.INVALID_CANDIDATE
maxLargestFrac - maximum fraction of vectors in the candidate's largest cluster; candidates that violate this are reported as PartitionEvaluator.Decision.KEEP_CURRENT. Use 1.0 to disable this check.
lowMarginThreshold - distance threshold below which a vector's assignment margin is considered "low"; if non-positive, a metric-dependent default is used (0.02 for cosine, 5% of p95 for L2)
alphaSseGain - weight for the SSE gain component in the composite score
betaSeparationGain - weight for the separation gain component in the composite score
gammaImbalancePenalty - weight for the imbalance penalty in the composite score
deltaLowMarginPenalty - weight for the low-margin-rate penalty in the composite score
minScoreGain - minimum composite score improvement the candidate must achieve over the current partitioning to be accepted
Enclosing class:
PartitionEvaluator

public static record PartitionEvaluator.Parameters(@Nonnull DistanceEstimator distanceEstimator, double minRelativeSseGain, double minSeparation, double maxLowMarginRate, double minSmallestFrac, double maxLargestFrac, double lowMarginThreshold, double alphaSseGain, double betaSeparationGain, double gammaImbalancePenalty, double deltaLowMarginPenalty, double minScoreGain) extends Record
Tuning parameters that control when a candidate repartitioning is accepted or rejected, and how the composite quality score is computed.
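
This page does not state how the four weights are combined. As a rough sketch only, one plausible reading is a linear score in which the gains are added and the penalties subtracted. The class `CompositeScoreSketch`, its method names, the linear form itself, and the metric inputs are all assumptions made for illustration; they are not taken from the library.

```java
/** Hypothetical sketch; the linear form is an assumption, not the library's code. */
final class CompositeScoreSketch {

    /** Assumed form: weighted gains minus weighted penalties. */
    static double score(double alpha, double beta, double gamma, double delta,
                        double sseGain, double separationGain,
                        double imbalancePenalty, double lowMarginRate) {
        return alpha * sseGain + beta * separationGain
                - gamma * imbalancePenalty - delta * lowMarginRate;
    }

    /** Assumed acceptance rule: the candidate must beat the current score by minScoreGain. */
    static boolean improvesEnough(double currentScore, double candidateScore, double minScoreGain) {
        return candidateScore - currentScore >= minScoreGain;
    }

    public static void main(String[] args) {
        // Weights taken from the documented defaults; metric values are made up.
        double current = score(1.0, 0.5, 1.0, 0.75, 0.0, 0.0, 0.0, 0.0);
        double candidate = score(1.0, 0.5, 1.0, 0.75, 0.25, 0.5, 0.25, 0.0);
        System.out.println(candidate - current);
        System.out.println(improvesEnough(current, candidate, 0.05));
    }
}
```

Under this reading, raising gammaImbalancePenalty or deltaLowMarginPenalty makes lopsided or ambiguous candidates harder to accept without touching the hard thresholds.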

The minSmallestFrac and maxLargestFrac thresholds apply to the candidate regardless of k. For a single-cluster candidate (k == 1), smallestFrac and largestFrac are both trivially 1.0, so callers who want merges to a single cluster to be admissible should keep minSmallestFrac <= 1.0 and maxLargestFrac >= 1.0. Callers should choose minSmallestFrac and maxLargestFrac to suit the transition being evaluated (e.g. tighter for an initial 1 → 2 split, looser for 2 → 3 or for merges).
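
The k == 1 case can be checked with a small calculation. The sketch below computes both fractions from cluster sizes and compares them to the two thresholds. `FractionGateSketch` and its methods are hypothetical names, and the inclusive comparisons are an assumption inferred from the `<=` and `>=` guidance in the paragraph above.

```java
/** Hypothetical sketch of the cluster-size checks; not the library's implementation. */
final class FractionGateSketch {

    /** Fraction of all vectors in the smallest cluster. Assumes a non-empty array with a positive total. */
    static double smallestFrac(int[] clusterSizes) {
        int total = 0;
        int min = Integer.MAX_VALUE;
        for (int size : clusterSizes) {
            total += size;
            min = Math.min(min, size);
        }
        return (double) min / total;
    }

    /** Fraction of all vectors in the largest cluster. Assumes a non-empty array with a positive total. */
    static double largestFrac(int[] clusterSizes) {
        int total = 0;
        int max = 0;
        for (int size : clusterSizes) {
            total += size;
            max = Math.max(max, size);
        }
        return (double) max / total;
    }

    /** True when both fractions satisfy their thresholds (inclusive comparison assumed). */
    static boolean withinBounds(int[] clusterSizes, double minSmallestFrac, double maxLargestFrac) {
        return smallestFrac(clusterSizes) >= minSmallestFrac
                && largestFrac(clusterSizes) <= maxLargestFrac;
    }

    public static void main(String[] args) {
        int[] merged = {1000};    // single-cluster candidate: both fractions are 1.0
        int[] split = {990, 10};  // smallest cluster holds 1% of the vectors
        System.out.println(withinBounds(merged, 0.015, 1.0));
        System.out.println(withinBounds(merged, 0.015, 0.9));
        System.out.println(withinBounds(split, 0.015, 1.0));
    }
}
```

With maxLargestFrac below 1.0, the single-cluster candidate can never pass, which is why merges to one cluster need maxLargestFrac >= 1.0.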

  • Constructor Details

    • Parameters

      public Parameters(@Nonnull DistanceEstimator distanceEstimator)
      Convenience constructor that applies moderately permissive defaults: a minimum relative SSE gain of 10%, a separation floor of 0.3, a maximum low-margin rate of 25%, a minimum smallest-cluster fraction of 1.5%, no upper bound on the largest cluster, the metric-dependent default for lowMarginThreshold, score weights of alpha=1.0, beta=0.5, gamma=1.0, and delta=0.75, and a minScoreGain of 0.05. To tighten or loosen any of these, use the canonical constructor.
      Parameters:
      distanceEstimator - the distance estimator used for all distance computations
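
To show how the documented defaults line up with the twelve record components, here is a minimal stand-in record. `ParametersSketch` and `withMinSmallestFrac` are hypothetical, and `ToDoubleBiFunction` stands in for the library's DistanceEstimator type. Two values are assumptions: 1.0 for maxLargestFrac (the documented "disabled" value) and 0.0 for lowMarginThreshold (any non-positive value is documented to select the metric default).

```java
import java.util.function.ToDoubleBiFunction;

/** Hypothetical mirror of PartitionEvaluator.Parameters, for illustration only. */
record ParametersSketch(
        ToDoubleBiFunction<double[], double[]> distanceEstimator, // stand-in for DistanceEstimator
        double minRelativeSseGain,
        double minSeparation,
        double maxLowMarginRate,
        double minSmallestFrac,
        double maxLargestFrac,
        double lowMarginThreshold,
        double alphaSseGain,
        double betaSeparationGain,
        double gammaImbalancePenalty,
        double deltaLowMarginPenalty,
        double minScoreGain) {

    /** Delegates to the canonical constructor with the defaults described above. */
    ParametersSketch(ToDoubleBiFunction<double[], double[]> distanceEstimator) {
        this(distanceEstimator,
                0.10,   // minRelativeSseGain: 10%
                0.3,    // minSeparation
                0.25,   // maxLowMarginRate: 25%
                0.015,  // minSmallestFrac: 1.5%
                1.0,    // maxLargestFrac: assumed 1.0, i.e. check disabled
                0.0,    // lowMarginThreshold: assumed 0.0, i.e. use the metric default
                1.0,    // alphaSseGain
                0.5,    // betaSeparationGain
                1.0,    // gammaImbalancePenalty
                0.75,   // deltaLowMarginPenalty
                0.05);  // minScoreGain
    }

    /** Returns a copy with a different minSmallestFrac, built through the canonical constructor. */
    ParametersSketch withMinSmallestFrac(double newMinSmallestFrac) {
        return new ParametersSketch(distanceEstimator, minRelativeSseGain, minSeparation,
                maxLowMarginRate, newMinSmallestFrac, maxLargestFrac, lowMarginThreshold,
                alphaSseGain, betaSeparationGain, gammaImbalancePenalty,
                deltaLowMarginPenalty, minScoreGain);
    }

    public static void main(String[] args) {
        ParametersSketch defaults = new ParametersSketch((a, b) -> 0.0);
        ParametersSketch tighter = defaults.withMinSmallestFrac(0.05);
        System.out.println(defaults.minSmallestFrac());
        System.out.println(tighter.minSmallestFrac());
    }
}
```

Because records are immutable, changing one default means calling the canonical constructor with all twelve values; a small copy helper like the one above keeps call sites readable.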
    • Parameters

      public Parameters(@Nonnull DistanceEstimator distanceEstimator, double minRelativeSseGain, double minSeparation, double maxLowMarginRate, double minSmallestFrac, double maxLargestFrac, double lowMarginThreshold, double alphaSseGain, double betaSeparationGain, double gammaImbalancePenalty, double deltaLowMarginPenalty, double minScoreGain)
      Creates an instance of a Parameters record class.
      Parameters:
      distanceEstimator - the value for the distanceEstimator record component
      minRelativeSseGain - the value for the minRelativeSseGain record component
      minSeparation - the value for the minSeparation record component
      maxLowMarginRate - the value for the maxLowMarginRate record component
      minSmallestFrac - the value for the minSmallestFrac record component
      maxLargestFrac - the value for the maxLargestFrac record component
      lowMarginThreshold - the value for the lowMarginThreshold record component
      alphaSseGain - the value for the alphaSseGain record component
      betaSeparationGain - the value for the betaSeparationGain record component
      gammaImbalancePenalty - the value for the gammaImbalancePenalty record component
      deltaLowMarginPenalty - the value for the deltaLowMarginPenalty record component
      minScoreGain - the value for the minScoreGain record component
  • Method Details

    • toString

      public final String toString()
      Returns a string representation of this record class. The representation contains the name of the class, followed by the name and value of each of the record components.
      Specified by:
      toString in class Record
      Returns:
      a string representation of this object
    • hashCode

      public final int hashCode()
      Returns a hash code value for this object. The value is derived from the hash code of each of the record components.
      Specified by:
      hashCode in class Record
      Returns:
      a hash code value for this object
    • equals

      public final boolean equals(Object o)
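      Indicates whether some other object is "equal to" this one. The objects are equal if the other object is of the same class and if all the record components are equal. Reference components are compared with Objects::equals(Object,Object); primitive components are compared with the compare method of their corresponding wrapper classes (Double.compare for the double components of this record, so NaN is equal to NaN and 0.0 is not equal to -0.0).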
      Specified by:
      equals in class Record
      Parameters:
      o - the object with which to compare
      Returns:
      true if this object is the same as the o argument; false otherwise.
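
Since eleven of the twelve components are doubles, the floating-point edge cases of record equality are worth seeing once. Per the java.lang.Record specification, the implicit equals compares primitive components through the wrapper class's compare method rather than `==`. The `Weights` record below is a hypothetical two-component stand-in, not part of the library.

```java
/** Hypothetical two-component record used to illustrate double equality in records. */
record Weights(double alpha, double beta) {

    public static void main(String[] args) {
        // Identical finite values compare equal.
        System.out.println(new Weights(1.0, 0.5).equals(new Weights(1.0, 0.5)));
        // NaN == NaN is false, but Double.compare(NaN, NaN) == 0, so the records are equal.
        System.out.println(new Weights(Double.NaN, 0.5).equals(new Weights(Double.NaN, 0.5)));
        // 0.0 == -0.0 is true, but Double.compare(0.0, -0.0) != 0, so the records differ.
        System.out.println(new Weights(0.0, 0.5).equals(new Weights(-0.0, 0.5)));
    }
}
```

This matters mainly when Parameters instances are used as map keys or compared in tests with computed thresholds.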
    • distanceEstimator

      @Nonnull public DistanceEstimator distanceEstimator()
      Returns the value of the distanceEstimator record component.
      Returns:
      the value of the distanceEstimator record component
    • minRelativeSseGain

      public double minRelativeSseGain()
      Returns the value of the minRelativeSseGain record component.
      Returns:
      the value of the minRelativeSseGain record component
    • minSeparation

      public double minSeparation()
      Returns the value of the minSeparation record component.
      Returns:
      the value of the minSeparation record component
    • maxLowMarginRate

      public double maxLowMarginRate()
      Returns the value of the maxLowMarginRate record component.
      Returns:
      the value of the maxLowMarginRate record component
    • minSmallestFrac

      public double minSmallestFrac()
      Returns the value of the minSmallestFrac record component.
      Returns:
      the value of the minSmallestFrac record component
    • maxLargestFrac

      public double maxLargestFrac()
      Returns the value of the maxLargestFrac record component.
      Returns:
      the value of the maxLargestFrac record component
    • lowMarginThreshold

      public double lowMarginThreshold()
      Returns the value of the lowMarginThreshold record component.
      Returns:
      the value of the lowMarginThreshold record component
    • alphaSseGain

      public double alphaSseGain()
      Returns the value of the alphaSseGain record component.
      Returns:
      the value of the alphaSseGain record component
    • betaSeparationGain

      public double betaSeparationGain()
      Returns the value of the betaSeparationGain record component.
      Returns:
      the value of the betaSeparationGain record component
    • gammaImbalancePenalty

      public double gammaImbalancePenalty()
      Returns the value of the gammaImbalancePenalty record component.
      Returns:
      the value of the gammaImbalancePenalty record component
    • deltaLowMarginPenalty

      public double deltaLowMarginPenalty()
      Returns the value of the deltaLowMarginPenalty record component.
      Returns:
      the value of the deltaLowMarginPenalty record component
    • minScoreGain

      public double minScoreGain()
      Returns the value of the minScoreGain record component.
      Returns:
      the value of the minScoreGain record component