Package com.apple.foundationdb.kmeans
Class PartitionEvaluator
java.lang.Object
com.apple.foundationdb.kmeans.PartitionEvaluator
Evaluates whether a candidate partitioning should replace a current partitioning. Works
symmetrically across splits (more clusters in the candidate), merges (fewer clusters in the
candidate), and same-k re-partitionings; the k == 1 case on either side is handled by
treating missing per-cluster statistics (separation, low-margin rate) as neutral zero
contributions to the composite score and by skipping separation/margin hard rejects when the
candidate has fewer than two clusters.
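As a rough illustration of the neutral-zero rule described above, the sketch below models a per-cluster statistic as an OptionalDouble that is empty when k < 2 and substitutes 0 before the value enters a score. The class NeutralStatSketch and its methods are hypothetical and are not part of PartitionEvaluator. Defining a gain as a plain candidate-minus-current difference is also an assumption made for the example.

```java
import java.util.OptionalDouble;

// Hypothetical helper, not part of com.apple.foundationdb.kmeans.
final class NeutralStatSketch {
    private NeutralStatSketch() {
    }

    // A statistic that is undefined (k < 2) contributes a neutral 0.
    static double neutral(OptionalDouble stat) {
        return stat.orElse(0.0);
    }

    // Assumed definition of a gain: candidate value minus current value,
    // with undefined values on either side replaced by 0.
    static double gain(OptionalDouble current, OptionalDouble candidate) {
        return neutral(candidate) - neutral(current);
    }

    // Separation/margin hard rejects are only meaningful for k >= 2.
    static boolean separationRejectsApply(int candidateK) {
        return candidateK >= 2;
    }
}
```

With this shape, a merge down to a single cluster passes an empty value for the candidate side, and the separation and low-margin terms simply drop out of the score instead of needing a special case.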
Nested Class Summary
Nested Classes
Modifier and Type / Class / Description
static enum: The verdict evaluate(List<V>, PartitionEvaluator.Partition<?>, List<V>, PartitionEvaluator.Partition<?>, Lens<V, RealVector>, PartitionEvaluator.Parameters) returns about a candidate partitioning relative to the current one.
static final record PartitionEvaluator.EvaluationResult: The outcome of evaluating a candidate partitioning against the current layout.
static final record PartitionEvaluator.Parameters: Tuning parameters that control when a candidate repartitioning is accepted or rejected, and how the composite quality score is computed.
static final record PartitionEvaluator.Partition: Represents a partitioning of a set of vectors into k clusters, as passed to evaluate(List<V>, PartitionEvaluator.Partition<?>, List<V>, PartitionEvaluator.Partition<?>, Lens<V, RealVector>, PartitionEvaluator.Parameters).
static final record PartitionEvaluator.PartitionStats: Quality statistics computed for a partitioning (current or candidate).
Constructor Summary
Constructors
PartitionEvaluator()
Method Summary
Modifier and Type / Method / Description
static <V> PartitionEvaluator.EvaluationResult
evaluate(List<V> currentVectors, PartitionEvaluator.Partition<?> current, List<V> candidateVectors, PartitionEvaluator.Partition<?> candidate, Lens<V, RealVector> vectorLens, PartitionEvaluator.Parameters parameters)
Evaluates a candidate partitioning against the current one and returns whether to accept, keep, or reject it.
Constructor Details
PartitionEvaluator
public PartitionEvaluator()
Method Details
evaluate
@Nonnull
public static <V> PartitionEvaluator.EvaluationResult evaluate(@Nonnull List<V> currentVectors, @Nonnull PartitionEvaluator.Partition<?> current, @Nonnull List<V> candidateVectors, @Nonnull PartitionEvaluator.Partition<?> candidate, @Nonnull Lens<V, RealVector> vectorLens, @Nonnull PartitionEvaluator.Parameters parameters)

Evaluates a candidate partitioning against the current one and returns whether to accept, keep, or reject it.

The decision is made by computing a panel of quality statistics for both partitionings (see
PartitionEvaluator.PartitionStats) and combining them into a composite score:

    scoreGain = alphaSseGain * relativeSseGain
              + betaSeparationGain * separationGain
              - gammaImbalancePenalty * imbalancePenalty
              - deltaLowMarginPenalty * lowMarginPenalty

Hard rejects are applied first (minSmallestFrac, maxLargestFrac, candidate separation/low-margin thresholds when candidate.k() >= 2, and the absolute minRelativeSseGain/minScoreGain floors); only if all of those pass is the candidate accepted.

Symmetric handling: separation and low-margin rate are undefined when
k < 2; this method treats them as 0 in the score formula and skips the corresponding hard rejects when candidate.k() < 2, so the same logic correctly handles splits, merges (including merges to a single cluster), and same-k re-partitionings.

Type Parameters:
V - caller's input vector representation
Parameters:
currentVectors - the vectors belonging to the current partitioning. Must be non-empty
current - the current partitioning
candidateVectors - the vectors belonging to the candidate partitioning. Often the same list as currentVectors but may differ when the caller is re-clustering a different point set
candidate - the candidate partitioning to evaluate
vectorLens - lens that extracts a RealVector from each currentVectors/candidateVectors element
parameters - tuning parameters that control thresholds and score weights
Returns:
a PartitionEvaluator.EvaluationResult carrying the decision, both stats, and the metrics that led to the decision
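To make the arithmetic of the composite score concrete, here is a minimal standalone sketch of the scoreGain formula together with the two absolute floors. ScoreGainSketch, its Weights record, and passesFloors are hypothetical names introduced for illustration; only the weight and threshold names come from the description above. The inclusive ">=" comparison against each floor is an assumption, since the documentation does not say how ties are treated.

```java
// Hypothetical illustration, not the PartitionEvaluator implementation.
final class ScoreGainSketch {
    private ScoreGainSketch() {
    }

    // Weight names mirror the terms of the documented formula.
    record Weights(double alphaSseGain,
                   double betaSeparationGain,
                   double gammaImbalancePenalty,
                   double deltaLowMarginPenalty) {
    }

    // Gains add to the score, penalties subtract from it.
    static double scoreGain(Weights w,
                            double relativeSseGain,
                            double separationGain,
                            double imbalancePenalty,
                            double lowMarginPenalty) {
        return w.alphaSseGain() * relativeSseGain
                + w.betaSeparationGain() * separationGain
                - w.gammaImbalancePenalty() * imbalancePenalty
                - w.deltaLowMarginPenalty() * lowMarginPenalty;
    }

    // Assumed floor check: both values must reach their minimum.
    static boolean passesFloors(double relativeSseGain,
                                double scoreGain,
                                double minRelativeSseGain,
                                double minScoreGain) {
        return relativeSseGain >= minRelativeSseGain
                && scoreGain >= minScoreGain;
    }
}
```

For example, with weights (1.0, 0.5, 0.25, 0.5) and inputs relativeSseGain = 0.5, separationGain = 0.25, imbalancePenalty = 0.5, lowMarginPenalty = 0.25, the terms are 0.5 + 0.125 - 0.125 - 0.125, giving a scoreGain of 0.375. When the candidate has k < 2, the separation and low-margin inputs would be passed as 0, so only the SSE and imbalance terms contribute.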