Package com.apple.foundationdb.kmeans
Record Class PartitionEvaluator.Parameters
java.lang.Object
java.lang.Record
com.apple.foundationdb.kmeans.PartitionEvaluator.Parameters
- Record Components:
distanceEstimator - the distance estimator used for all distance computations
minRelativeSseGain - minimum relative SSE (sum of squared errors) improvement required; candidates with less improvement are rejected. May be negative if the caller wants to accept some SSE increase (e.g. for merges).
minSeparation - minimum inter-cluster separation required; not checked when the candidate has fewer than two clusters
maxLowMarginRate - maximum fraction of vectors with low assignment margin; not checked when the candidate has fewer than two clusters
minSmallestFrac - minimum fraction of vectors in the candidate's smallest cluster; candidates that violate this are reported as PartitionEvaluator.Decision.INVALID_CANDIDATE
maxLargestFrac - maximum fraction of vectors in the candidate's largest cluster; candidates that violate this are reported as PartitionEvaluator.Decision.KEEP_CURRENT. Use 1.0 to disable this check.
lowMarginThreshold - distance threshold below which a vector's assignment margin is considered "low"; if non-positive, a metric-dependent default is used (0.02 for cosine, 5% of p95 for L2)
alphaSseGain - weight for the SSE gain component in the composite score
betaSeparationGain - weight for the separation gain component in the composite score
gammaImbalancePenalty - weight for the imbalance penalty in the composite score
deltaLowMarginPenalty - weight for the low-margin-rate penalty in the composite score
minScoreGain - minimum composite score improvement the candidate must achieve over the current partitioning to be accepted
- Enclosing class:
PartitionEvaluator
public static record PartitionEvaluator.Parameters(@Nonnull DistanceEstimator distanceEstimator, double minRelativeSseGain, double minSeparation, double maxLowMarginRate, double minSmallestFrac, double maxLargestFrac, double lowMarginThreshold, double alphaSseGain, double betaSeparationGain, double gammaImbalancePenalty, double deltaLowMarginPenalty, double minScoreGain)
extends Record
Tuning parameters that control when a candidate repartitioning is accepted or rejected, and
how the composite quality score is computed.
The minSmallestFrac and maxLargestFrac thresholds apply to the candidate
regardless of k; for a single-cluster candidate (k == 1) both
smallestFrac and largestFrac are trivially 1.0, so callers should
keep minSmallestFrac <= 1.0 and maxLargestFrac >= 1.0 if they want merges
to a single cluster to be admissible. Callers should pick minSmallestFrac and
maxLargestFrac based on their transition (e.g. tighter for an initial 1 → 2 split,
looser for 2 → 3 or for merges).
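The interaction between the two size thresholds and a single-cluster candidate can be illustrated with a small, self-contained sketch. It does not use the library: SizeCheckSketch, its Outcome enum, the PASSES_SIZE_CHECKS value, the strict comparisons, and the order of the two checks are all assumptions made for illustration. Only the names INVALID_CANDIDATE and KEEP_CURRENT come from the documentation above.

```java
// Illustrative sketch only; the evaluator's real implementation is not shown on this page.
final class SizeCheckSketch {
    // INVALID_CANDIDATE and KEEP_CURRENT are named in the documentation above;
    // PASSES_SIZE_CHECKS is a placeholder invented for this sketch.
    enum Outcome { INVALID_CANDIDATE, KEEP_CURRENT, PASSES_SIZE_CHECKS }

    // Fraction of all vectors that fall in the smallest cluster (sizes must be non-empty).
    static double smallestFrac(int[] clusterSizes) {
        int total = 0;
        int smallest = Integer.MAX_VALUE;
        for (int size : clusterSizes) {
            total += size;
            smallest = Math.min(smallest, size);
        }
        return (double) smallest / total;
    }

    // Fraction of all vectors that fall in the largest cluster.
    static double largestFrac(int[] clusterSizes) {
        int total = 0;
        int largest = 0;
        for (int size : clusterSizes) {
            total += size;
            largest = Math.max(largest, size);
        }
        return (double) largest / total;
    }

    // Assumed order: the minimum-size check runs before the maximum-size check.
    static Outcome check(int[] clusterSizes, double minSmallestFrac, double maxLargestFrac) {
        if (smallestFrac(clusterSizes) < minSmallestFrac) {
            return Outcome.INVALID_CANDIDATE;
        }
        if (largestFrac(clusterSizes) > maxLargestFrac) {
            return Outcome.KEEP_CURRENT;
        }
        return Outcome.PASSES_SIZE_CHECKS;
    }

    public static void main(String[] args) {
        // k == 1: both fractions are 1.0, so maxLargestFrac must be at least 1.0.
        System.out.println(check(new int[] {100}, 0.015, 1.0));
        System.out.println(check(new int[] {100}, 0.015, 0.9));
        // A 99/1 split has a smallest fraction of 0.01, below a 1.5% floor.
        System.out.println(check(new int[] {99, 1}, 0.015, 1.0));
    }
}
```

With maxLargestFrac below 1.0, the single-cluster candidate in this sketch can never pass, which is why the text recommends maxLargestFrac >= 1.0 when merges to one cluster should be admissible.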
-
Constructor Summary
Constructors
Constructor - Description
Parameters(DistanceEstimator distanceEstimator) - Convenience constructor that picks a moderately permissive set of defaults: 10% minimum relative SSE gain, separation floor of 0.3, max low-margin rate of 25%, smallest cluster fraction of 1.5%, no upper bound on the largest cluster, metric-default lowMarginThreshold, and the score weights (alpha=1.0, beta=0.5, gamma=1.0, delta=0.75) with a minScoreGain of 0.05.
Parameters(DistanceEstimator distanceEstimator, double minRelativeSseGain, double minSeparation, double maxLowMarginRate, double minSmallestFrac, double maxLargestFrac, double lowMarginThreshold, double alphaSseGain, double betaSeparationGain, double gammaImbalancePenalty, double deltaLowMarginPenalty, double minScoreGain) - Creates an instance of a Parameters record class.
-
Method Summary
Modifier and Type, Method - Description
double alphaSseGain() - Returns the value of the alphaSseGain record component.
double betaSeparationGain() - Returns the value of the betaSeparationGain record component.
double deltaLowMarginPenalty() - Returns the value of the deltaLowMarginPenalty record component.
DistanceEstimator distanceEstimator() - Returns the value of the distanceEstimator record component.
final boolean equals(Object o) - Indicates whether some other object is "equal to" this one.
double gammaImbalancePenalty() - Returns the value of the gammaImbalancePenalty record component.
final int hashCode() - Returns a hash code value for this object.
double lowMarginThreshold() - Returns the value of the lowMarginThreshold record component.
double maxLargestFrac() - Returns the value of the maxLargestFrac record component.
double maxLowMarginRate() - Returns the value of the maxLowMarginRate record component.
double minRelativeSseGain() - Returns the value of the minRelativeSseGain record component.
double minScoreGain() - Returns the value of the minScoreGain record component.
double minSeparation() - Returns the value of the minSeparation record component.
double minSmallestFrac() - Returns the value of the minSmallestFrac record component.
final String toString() - Returns a string representation of this record class.
-
Constructor Details
-
Parameters
public Parameters(DistanceEstimator distanceEstimator)
Convenience constructor that picks a moderately permissive set of defaults: 10% minimum relative SSE gain, separation floor of 0.3, max low-margin rate of 25%, smallest cluster fraction of 1.5%, no upper bound on the largest cluster, metric-default lowMarginThreshold, and the score weights (alpha=1.0, beta=0.5, gamma=1.0, delta=0.75) with a minScoreGain of 0.05. Tighten or loosen via the canonical constructor as needed.
- Parameters:
distanceEstimator - the distance estimator used for all distance computations
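To make the listed defaults concrete, the following sketch mirrors the record's shape with local stand-ins so that it compiles without the library. The nested DistanceEstimator interface and its distance method are hypothetical, and passing 0.0 for lowMarginThreshold is an assumption: the documentation only says that a non-positive value selects the metric-dependent default, not which value the convenience constructor passes.

```java
// Stand-alone mirror of the documented record; not the library class itself.
final class DefaultsSketch {
    // Hypothetical stand-in; the real DistanceEstimator type is not described on this page.
    interface DistanceEstimator {
        double distance(double[] a, double[] b);
    }

    record Parameters(DistanceEstimator distanceEstimator,
                      double minRelativeSseGain,
                      double minSeparation,
                      double maxLowMarginRate,
                      double minSmallestFrac,
                      double maxLargestFrac,
                      double lowMarginThreshold,
                      double alphaSseGain,
                      double betaSeparationGain,
                      double gammaImbalancePenalty,
                      double deltaLowMarginPenalty,
                      double minScoreGain) {

        // Documented defaults; 0.0 for lowMarginThreshold is an assumed "use metric default" value.
        Parameters(DistanceEstimator distanceEstimator) {
            this(distanceEstimator,
                    0.10,   // minRelativeSseGain: 10%
                    0.3,    // minSeparation
                    0.25,   // maxLowMarginRate: 25%
                    0.015,  // minSmallestFrac: 1.5%
                    1.0,    // maxLargestFrac: check disabled
                    0.0,    // lowMarginThreshold: non-positive, metric default
                    1.0, 0.5, 1.0, 0.75, // alpha, beta, gamma, delta
                    0.05);  // minScoreGain
        }
    }

    public static void main(String[] args) {
        Parameters defaults = new Parameters((a, b) -> 0.0);
        // Tighten only the SSE gain requirement, copying every other value from the defaults.
        Parameters stricter = new Parameters(
                defaults.distanceEstimator(), 0.20, defaults.minSeparation(),
                defaults.maxLowMarginRate(), defaults.minSmallestFrac(),
                defaults.maxLargestFrac(), defaults.lowMarginThreshold(),
                defaults.alphaSseGain(), defaults.betaSeparationGain(),
                defaults.gammaImbalancePenalty(), defaults.deltaLowMarginPenalty(),
                defaults.minScoreGain());
        System.out.println(stricter.minRelativeSseGain());
    }
}
```

Because records have no setters, adjusting a single threshold means calling the canonical constructor with the remaining values copied over, as the main method does here.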
-
Parameters
public Parameters(@Nonnull DistanceEstimator distanceEstimator, double minRelativeSseGain, double minSeparation, double maxLowMarginRate, double minSmallestFrac, double maxLargestFrac, double lowMarginThreshold, double alphaSseGain, double betaSeparationGain, double gammaImbalancePenalty, double deltaLowMarginPenalty, double minScoreGain)
Creates an instance of a Parameters record class.
- Parameters:
distanceEstimator - the value for the distanceEstimator record component
minRelativeSseGain - the value for the minRelativeSseGain record component
minSeparation - the value for the minSeparation record component
maxLowMarginRate - the value for the maxLowMarginRate record component
minSmallestFrac - the value for the minSmallestFrac record component
maxLargestFrac - the value for the maxLargestFrac record component
lowMarginThreshold - the value for the lowMarginThreshold record component
alphaSseGain - the value for the alphaSseGain record component
betaSeparationGain - the value for the betaSeparationGain record component
gammaImbalancePenalty - the value for the gammaImbalancePenalty record component
deltaLowMarginPenalty - the value for the deltaLowMarginPenalty record component
minScoreGain - the value for the minScoreGain record component
-
-
Method Details
-
toString
public final String toString()
Returns a string representation of this record class. The representation contains the name of the class, followed by the name and value of each of the record components.
-
hashCode
public final int hashCode()
Returns a hash code value for this object. The value is derived from the hash code of each of the record components.
-
equals
public final boolean equals(Object o)
Indicates whether some other object is "equal to" this one. The objects are equal if the other object is of the same class and if all the record components are equal. Reference components are compared with Objects::equals(Object,Object); primitive components are compared with '=='.
-
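Since every tuning value in this record is a double, the equals description deserves a small illustration. The sketch uses a hypothetical two-component record, Weights, rather than the library class. One detail worth knowing: the java.lang.Record specification defines equality of primitive components through the wrapper class's compare method (Double.compare for double), so results differ from a plain '==' for NaN and for signed zero.

```java
// Hypothetical two-component record used only to demonstrate record equality.
final class RecordEqualitySketch {
    record Weights(double alpha, double beta) { }

    public static void main(String[] args) {
        Weights a = new Weights(1.0, 0.5);
        Weights b = new Weights(1.0, 0.5);
        // Component-wise equality: distinct instances with equal components are equal.
        System.out.println(a.equals(b));
        // Equal records must have equal hash codes.
        System.out.println(a.hashCode() == b.hashCode());
        // NaN components compare as equal, unlike Double.NaN == Double.NaN.
        System.out.println(new Weights(Double.NaN, 0.5).equals(new Weights(Double.NaN, 0.5)));
        // 0.0 and -0.0 compare as different, unlike 0.0 == -0.0.
        System.out.println(new Weights(0.0, 0.5).equals(new Weights(-0.0, 0.5)));
    }
}
```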
distanceEstimator
public DistanceEstimator distanceEstimator()
Returns the value of the distanceEstimator record component.
- Returns:
the value of the distanceEstimator record component
-
minRelativeSseGain
public double minRelativeSseGain()
Returns the value of the minRelativeSseGain record component.
- Returns:
the value of the minRelativeSseGain record component
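The minRelativeSseGain component is described as a minimum relative SSE improvement that may be negative to tolerate some SSE increase, for example on merges. A minimal sketch of such a check follows, assuming the relative gain is defined as (currentSse - candidateSse) / currentSse with a positive currentSse. That formula, the class name SseGainSketch, and both method names are assumptions for illustration; the library's exact definition is not given on this page.

```java
// Illustrative sketch; the formula below is an assumed definition of "relative SSE gain".
final class SseGainSketch {
    // Positive when the candidate lowers the SSE, negative when it raises it.
    static double relativeSseGain(double currentSse, double candidateSse) {
        return (currentSse - candidateSse) / currentSse;
    }

    // A candidate with less improvement than the minimum is rejected.
    static boolean meetsMinGain(double currentSse, double candidateSse, double minRelativeSseGain) {
        return relativeSseGain(currentSse, candidateSse) >= minRelativeSseGain;
    }

    public static void main(String[] args) {
        // Split: SSE drops from 100 to 80, a 20% gain against a 10% floor.
        System.out.println(meetsMinGain(100.0, 80.0, 0.10));
        // Merge: SSE rises from 100 to 105; a floor of -0.10 tolerates the 5% increase.
        System.out.println(meetsMinGain(100.0, 105.0, -0.10));
    }
}
```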
-
minSeparation
public double minSeparation()
Returns the value of the minSeparation record component.
- Returns:
the value of the minSeparation record component
-
maxLowMarginRate
public double maxLowMarginRate()
Returns the value of the maxLowMarginRate record component.
- Returns:
the value of the maxLowMarginRate record component
-
minSmallestFrac
public double minSmallestFrac()
Returns the value of the minSmallestFrac record component.
- Returns:
the value of the minSmallestFrac record component
-
maxLargestFrac
public double maxLargestFrac()
Returns the value of the maxLargestFrac record component.
- Returns:
the value of the maxLargestFrac record component
-
lowMarginThreshold
public double lowMarginThreshold()
Returns the value of the lowMarginThreshold record component.
- Returns:
the value of the lowMarginThreshold record component
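The component description for lowMarginThreshold says that a non-positive value selects a metric-dependent default: 0.02 for cosine, 5% of p95 for L2. The sketch below shows one way that fallback could be resolved. The Metric enum, the resolve method, and the p95 parameter are hypothetical; in particular, the page does not say which quantity the p95 is taken over, so the value is simply passed in.

```java
// Illustrative sketch of the documented fallback rule; names here are not from the library.
final class LowMarginThresholdSketch {
    enum Metric { COSINE, L2 }

    // p95 is whatever 95th-percentile value the evaluator uses for L2 (assumed to be supplied).
    static double resolve(double lowMarginThreshold, Metric metric, double p95) {
        if (lowMarginThreshold > 0.0) {
            return lowMarginThreshold; // an explicit positive threshold is used as given
        }
        return metric == Metric.COSINE ? 0.02 : 0.05 * p95;
    }

    public static void main(String[] args) {
        System.out.println(resolve(0.0, Metric.COSINE, 10.0));
        System.out.println(resolve(-1.0, Metric.L2, 10.0));
        System.out.println(resolve(0.1, Metric.L2, 10.0));
    }
}
```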
-
alphaSseGain
public double alphaSseGain()
Returns the value of the alphaSseGain record component.
- Returns:
the value of the alphaSseGain record component
-
betaSeparationGain
public double betaSeparationGain()
Returns the value of the betaSeparationGain record component.
- Returns:
the value of the betaSeparationGain record component
-
gammaImbalancePenalty
public double gammaImbalancePenalty()
Returns the value of the gammaImbalancePenalty record component.
- Returns:
the value of the gammaImbalancePenalty record component
-
deltaLowMarginPenalty
public double deltaLowMarginPenalty()
Returns the value of the deltaLowMarginPenalty record component.
- Returns:
the value of the deltaLowMarginPenalty record component
-
minScoreGain
public double minScoreGain()
Returns the value of the minScoreGain record component.
- Returns:
the value of the minScoreGain record component
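The four weights and minScoreGain are documented only as weights for gain and penalty terms in a composite score and a minimum improvement over the current partitioning. As a rough illustration of how such parameters could fit together, the sketch below assumes a linear combination in which the gains add and the penalties subtract. That formula, the class CompositeScoreSketch, and the meaning and scale of each input are assumptions; the library may compute its score differently.

```java
// Hypothetical linear composite score; the library's real formula is not given on this page.
final class CompositeScoreSketch {
    // Gains are added and penalties subtracted, each scaled by its weight (assumed form).
    static double score(double alpha, double beta, double gamma, double delta,
                        double sseGain, double separationGain,
                        double imbalance, double lowMarginRate) {
        return alpha * sseGain + beta * separationGain
                - gamma * imbalance - delta * lowMarginRate;
    }

    // The candidate must beat the current partitioning by at least minScoreGain.
    static boolean accept(double currentScore, double candidateScore, double minScoreGain) {
        return candidateScore - currentScore >= minScoreGain;
    }

    public static void main(String[] args) {
        // Weights taken from the documented defaults: alpha=1.0, beta=0.5, gamma=1.0, delta=0.75.
        double candidate = score(1.0, 0.5, 1.0, 0.75, 0.2, 0.1, 0.05, 0.1);
        System.out.println(accept(0.0, candidate, 0.05));
    }
}
```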
-