Interface RealVector

All Known Implementing Classes:
AbstractRealVector, DoubleRealVector, EncodedRealVector, FloatRealVector, HalfRealVector, MutableDoubleRealVector

public interface RealVector
Real-valued mathematical vector: the common API every dense vector representation in this package implements. Concrete subtypes differ only in element precision and storage layout: DoubleRealVector stores 64-bit doubles, FloatRealVector stores 32-bit floats that are widened to doubles on read, and HalfRealVector does the same with 16-bit halves. All numerical methods on this interface return values in double regardless of the underlying precision; the precision difference shows up as round-trip loss in toHalfRealVector() / toFloatRealVector() conversions, not in arithmetic.

The interface exposes three families of operations:

  • Field Details

    • EPS

      static final double EPS
      Threshold (in L2-norm units) below which a vector is treated as "effectively zero" — i.e. its direction is considered undefined for metrics that depend on it, principally cosine similarity. Used by isNearlyZeroNorm(), which compares the squared L2 norm against EPS * EPS to avoid a sqrt.

      The value is sized to sit well above the floating-point noise of typical double-precision accumulations and well below any norm a meaningful vector would have in practice. Callers generally shouldn't need to consult this constant directly; prefer isNearlyZeroNorm().

    • VECTOR_TYPES

      static final com.google.common.collect.ImmutableList<VectorType> VECTOR_TYPES
      Cached snapshot of VectorType.values() as an immutable list. Used by fromVectorTypeOrdinal(int) so the type lookup avoids re-cloning the enum's value array on every call.
  • Method Details

    • getNumDimensions

      int getNumDimensions()
      Returns the number of elements in the vector, i.e. the number of dimensions.
      Returns:
      the number of dimensions
    • getComponent

      double getComponent(int dimension)
      Gets the component of this object at the specified dimension.

      The dimension is a zero-based index. For a 3D vector, for example, dimension 0 might correspond to the x-component, 1 to the y-component, and 2 to the z-component. This method provides direct access to the underlying data element.

      Parameters:
      dimension - the zero-based index of the component to retrieve.
      Returns:
      the component at the specified dimension.
      Throws:
      IndexOutOfBoundsException - if the dimension is negative or greater than or equal to the number of dimensions of this object.
    • getData

      @Nonnull double[] getData()
      Returns the underlying data array.

      The returned array is guaranteed to be non-null. Note that this method returns a direct reference to the internal array, not a copy.

      Returns:
      the data array of type double[], never null.
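      Because the returned array is the internal storage itself, writes through it are visible to the vector. The following is a minimal sketch of that aliasing, using a hypothetical stand-in class (ArrayBackedVector is not part of this package) rather than a real implementation:

```java
/** Hypothetical stand-in that, like getData(), hands out its internal array. */
final class ArrayBackedVector {
    private final double[] data;

    ArrayBackedVector(double[] data) {
        this.data = data.clone(); // copy on the way in
    }

    /** Returns the internal array itself, not a copy. */
    double[] getData() {
        return data;
    }

    double getComponent(int dimension) {
        return data[dimension];
    }

    public static void main(String[] args) {
        ArrayBackedVector v = new ArrayBackedVector(new double[] {1.0, 2.0, 3.0});

        double[] copy = v.getData().clone(); // defensive copy: safe to modify
        copy[0] = 42.0;
        System.out.println(v.getComponent(0)); // unchanged

        double[] alias = v.getData(); // direct reference to internal storage
        alias[0] = 42.0;
        System.out.println(v.getComponent(0)); // reflects the write
    }
}
```

      Callers that need to modify the values would typically clone the array first, as in the first half of main.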
    • withData

      @Nonnull RealVector withData(@Nonnull double[] data)
      Returns a new vector of the same precision and length as the receiver but with the given component data. Implementations decide whether the returned vector aliases data (immutable subtypes typically do; mutable subtypes copy through their existing storage).
      Parameters:
      data - the components for the new vector; length must match this vector's dimensionality
      Returns:
      a non-null vector with the given data
    • getRawData

      @Nonnull byte[] getRawData()
      Gets the raw byte data representation of this object.

      This method provides a direct, unprocessed view of the object's underlying data. The format of the byte array is implementation-specific and should be documented by the concrete class that implements this method.

      Returns:
      a non-null byte array containing the raw data.
    • toHalfRealVector

      @Nonnull HalfRealVector toHalfRealVector()
      Converts this object into a RealVector of Half precision floating-point numbers.

      This method is abstract; each implementing class defines how to convert its internal representation to a HalfRealVector, which uses Half objects to serialize and deserialize the vector. If this object is already a HalfRealVector, this method should return this.

      Returns:
      a non-null HalfRealVector containing the Half precision floating-point representation of this object.
    • toFloatRealVector

      @Nonnull FloatRealVector toFloatRealVector()
      Converts this object into a RealVector of single precision floating-point numbers.

      This method is abstract; each implementing class defines how to convert its internal representation to a FloatRealVector, which uses single-precision floating-point numbers to serialize and deserialize the vector. If this object is already a FloatRealVector, this method should return this.

      Returns:
      a non-null FloatRealVector containing the single precision floating-point representation of this object.
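      Narrowing to single precision can lose low-order bits. The sketch below illustrates that round-trip loss with plain Java casts; it does not use this package's classes, and throughFloat is a hypothetical helper name:

```java
final class PrecisionRoundTripSketch {
    /** Narrows to float and widens back, as a 32-bit storage format would. */
    static double throughFloat(double x) {
        return (double) (float) x;
    }

    public static void main(String[] args) {
        // 0.5 is exactly representable as a float, so the round trip is lossless.
        System.out.println(throughFloat(0.5) == 0.5); // true
        // 0.1 is not, so the widened value differs from the original double.
        System.out.println(throughFloat(0.1) == 0.1); // false
    }
}
```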
    • toDoubleRealVector

      @Nonnull DoubleRealVector toDoubleRealVector()
      Converts this vector into a DoubleRealVector.

      This method provides a way to obtain a double-precision floating-point representation of the vector. If the vector is already an instance of DoubleRealVector, this method may return the instance itself. Otherwise, it will create a new DoubleRealVector containing the same elements, which may involve a conversion of the underlying data type.

      Returns:
      a non-null DoubleRealVector representation of this vector.
    • toMutable

      @Nonnull default MutableDoubleRealVector toMutable()
      Returns a MutableDoubleRealVector carrying the same components as this vector. The default implementation clones the underlying data so the returned mutable instance is independent of the receiver; MutableDoubleRealVector.toMutable() overrides this to return this since it's already mutable.
      Returns:
      a fresh (or in the case of MutableDoubleRealVector, the same) mutable double-precision vector
    • toImmutable

      @Nonnull RealVector toImmutable()
      Returns an immutable vector with the same components as this vector, i.e. one whose components cannot subsequently be mutated through any reference. Immutable subtypes return this; MutableDoubleRealVector.toImmutable() returns a fresh DoubleRealVector with cloned data.
      Returns:
      a non-null immutable vector with the same components as this vector
    • dot

      default double dot(@Nonnull RealVector other)
      Returns the dot product Σ_i this[i] * other[i]. The receiver is not mutated.
      Parameters:
      other - the right operand; must have the same dimensionality as this vector
      Returns:
      the dot product
      Throws:
      IllegalArgumentException - if other has a different dimensionality
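      As an illustration of the contract above, here is a minimal sketch on plain double[] arrays. DotSketch is a hypothetical name, and the actual default implementation may be structured differently:

```java
final class DotSketch {
    /** Computes the sum of a[i] * b[i]; rejects operands of different length. */
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimensionality mismatch: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // 1*4 + 2*5 + 3*6 = 32
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6}));
    }
}
```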
    • l2SquaredDistance

      default double l2SquaredDistance(@Nonnull RealVector other)
      Returns the squared Euclidean distance to other, i.e. Σ_i (this[i] - other[i])^2. Equivalent to, but cheaper than, subtract(other).l2SquaredNorm() (no temporary allocation), and also cheaper than Math.pow(estimator.distance(this, other), 2) for the Euclidean metric (it skips a sqrt that would just be squared again).
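      The comparison above can be sketched on plain double[] arrays as follows. The class and method names are hypothetical, and the length check is an assumption, since this entry does not state what happens on a dimensionality mismatch:

```java
final class L2DistanceSketch {
    /** Single pass over both operands; no temporary array is allocated. */
    static double l2SquaredDistance(double[] a, double[] b) {
        if (a.length != b.length) { // assumed behavior, not documented for this method
            throw new IllegalArgumentException("dimensionality mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Same value via an explicit difference vector, which costs one allocation. */
    static double viaSubtract(double[] a, double[] b) {
        double[] diff = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            diff[i] = a[i] - b[i];
        }
        double sum = 0.0;
        for (double d : diff) {
            sum += d * d;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3};
        double[] b = {4, 6, 3};
        // Differences are -3, -4, 0, so both calls yield 9 + 16 + 0 = 25.
        System.out.println(l2SquaredDistance(a, b));
        System.out.println(viaSubtract(a, b));
    }
}
```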
    • isNearlyZeroNorm

      default boolean isNearlyZeroNorm()
      Returns true when this vector's L2 norm is at or below the EPS threshold, i.e. when the vector is "effectively zero" and its direction is undefined for cosine-style metrics. Implemented as a squared-norm comparison so no sqrt is needed.
      Returns:
      true if this vector is effectively zero
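      A minimal sketch of the squared-norm comparison, on a plain double[] array. The EPS value used here is a placeholder chosen for illustration only; the real constant's value is not stated in this documentation:

```java
final class NearlyZeroSketch {
    // Placeholder threshold for illustration; not the library's actual EPS.
    static final double EPS = 1e-12;

    static double l2SquaredNorm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return sum;
    }

    /** Compares the squared norm against EPS * EPS, so no sqrt is needed. */
    static boolean isNearlyZeroNorm(double[] v) {
        return l2SquaredNorm(v) <= EPS * EPS;
    }

    public static void main(String[] args) {
        System.out.println(isNearlyZeroNorm(new double[] {0.0, 0.0}));   // true
        System.out.println(isNearlyZeroNorm(new double[] {1e-13, 0.0})); // true: 1e-26 <= 1e-24
        System.out.println(isNearlyZeroNorm(new double[] {3.0, 4.0}));   // false
    }
}
```

      A caller computing cosine similarity would typically check this before dividing by the norm.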
    • l2Norm

      default double l2Norm()
      Returns the L2 (Euclidean) norm sqrt(Σ_i this[i]^2). Prefer l2SquaredNorm() when you only need to compare or threshold magnitudes — it skips the sqrt.
      Returns:
      the L2 norm of this vector
    • l2SquaredNorm

      double l2SquaredNorm()
      Returns the squared L2 norm Σ_i this[i]^2. Implementations typically memoize this since the value is reused by l2Norm() and several distance helpers.
      Returns:
      the squared L2 norm of this vector
    • normalize

      @Nonnull default RealVector normalize()
      Returns a new vector pointing in the same direction as this vector but scaled to unit L2 norm. The receiver is not mutated; the result is a fresh allocation of the same precision type.
      Returns:
      a non-null unit-norm vector
      Throws:
      IllegalArgumentException - if this vector's L2 norm is zero, infinite, or NaN — direction is undefined in those cases
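      The contract above can be sketched on a plain double[] array as follows. NormalizeSketch is a hypothetical name, and the real default method may perform its checks differently:

```java
final class NormalizeSketch {
    /** Returns a new unit-norm array; the input array is left untouched. */
    static double[] normalize(double[] v) {
        double squared = 0.0;
        for (double x : v) {
            squared += x * x;
        }
        double norm = Math.sqrt(squared);
        // Direction is undefined for a zero, infinite, or NaN norm.
        if (norm == 0.0 || Double.isInfinite(norm) || Double.isNaN(norm)) {
            throw new IllegalArgumentException("cannot normalize: L2 norm is " + norm);
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] unit = normalize(new double[] {3.0, 4.0}); // norm is 5
        System.out.println(unit[0] + ", " + unit[1]);
    }
}
```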
    • add

      @Nonnull default RealVector add(@Nonnull RealVector other)
      Returns a new vector whose components are the element-wise sum of this vector and other. The receiver is not mutated.
      Parameters:
      other - the right operand; must have the same dimensionality as this vector
      Returns:
      a non-null vector with result[i] = this[i] + other[i]
      Throws:
      IllegalArgumentException - if other has a different dimensionality
    • add

      @Nonnull default RealVector add(double scalar)
      Returns a new vector with scalar added to every component of this vector. The receiver is not mutated.
      Parameters:
      scalar - the value to add to each component
      Returns:
      a non-null vector with result[i] = this[i] + scalar
    • subtract

      @Nonnull default RealVector subtract(@Nonnull RealVector other)
      Returns a new vector whose components are the element-wise difference of this vector and other. The receiver is not mutated.
      Parameters:
      other - the right operand; must have the same dimensionality as this vector
      Returns:
      a non-null vector with result[i] = this[i] - other[i]
      Throws:
      IllegalArgumentException - if other has a different dimensionality
    • subtract

      @Nonnull default RealVector subtract(double scalar)
      Returns a new vector with scalar subtracted from every component of this vector. The receiver is not mutated.
      Parameters:
      scalar - the value to subtract from each component
      Returns:
      a non-null vector with result[i] = this[i] - scalar
    • multiply

      @Nonnull default RealVector multiply(double scalar)
      Returns a new vector with every component of this vector scaled by scalar. The receiver is not mutated.
      Parameters:
      scalar - the factor to scale each component by
      Returns:
      a non-null vector with result[i] = this[i] * scalar
    • fromVectorTypeOrdinal

      @Nonnull static VectorType fromVectorTypeOrdinal(int ordinal)
      Returns the VectorType with the given ordinal in VectorType.values(), looked up from the cached VECTOR_TYPES list. Used while deserializing a vector to dispatch to the right subtype's fromBytes.
      Parameters:
      ordinal - the type's enum ordinal
      Returns:
      the matching VectorType; never null
      Throws:
      IndexOutOfBoundsException - if ordinal is not a valid enum ordinal
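      The caching idea can be sketched as follows. An enum's values() method returns a new array on each call, so a one-time snapshot avoids that per-call copy. Kind is a hypothetical stand-in for VectorType (the real constants and their order are not listed here), and java.util.List.of is used in place of Guava's ImmutableList to keep the example dependency-free:

```java
import java.util.List;

final class OrdinalLookupSketch {
    /** Hypothetical stand-in for VectorType. */
    enum Kind { HALF, SINGLE, DOUBLE }

    // Snapshot taken once; values() would otherwise clone its array on every call.
    static final List<Kind> KINDS = List.of(Kind.values());

    /** Looks up a constant by ordinal; an invalid ordinal throws IndexOutOfBoundsException. */
    static Kind fromOrdinal(int ordinal) {
        return KINDS.get(ordinal);
    }

    public static void main(String[] args) {
        System.out.println(fromOrdinal(2)); // DOUBLE
    }
}
```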
    • fromBytes

      @Nonnull static RealVector fromBytes(@Nonnull byte[] vectorBytes)
      Creates a RealVector from a byte array.

      This method reads the first byte of the array as the vector type, then delegates to fromBytes(VectorType, byte[]) to do the actual deserialization.

      Parameters:
      vectorBytes - the non-null byte array to convert.
      Returns:
      a new RealVector instance created from the byte array.
    • fromBytes

      @Nonnull static RealVector fromBytes(@Nonnull VectorType vectorType, @Nonnull byte[] vectorBytes)
      Creates a RealVector from a byte array.

      This implementation dispatches to the deserialization logic for the given vector type, which lives in the respective implementation of RealVector.

      Parameters:
      vectorType - the vector type of the serialized vector
      vectorBytes - the non-null byte array to convert.
      Returns:
      a new RealVector instance created from the byte array.
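      The two-step dispatch described for the fromBytes overloads can be sketched as below. Everything about the byte layout here is an assumption made for illustration: the Kind tags, the big-endian payload, and passing the full array (type byte included) to the two-argument overload. The actual format is defined by the concrete implementations.

```java
import java.nio.ByteBuffer;

final class FromBytesSketch {
    /** Hypothetical type tags; the real VectorType constants may differ. */
    enum Kind { SINGLE, DOUBLE }

    /** Reads the type tag from the first byte, then dispatches. */
    static double[] fromBytes(byte[] bytes) {
        Kind kind = Kind.values()[bytes[0]];
        return fromBytes(kind, bytes);
    }

    /** Decodes the payload after the type byte according to the given tag. */
    static double[] fromBytes(Kind kind, byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes, 1, bytes.length - 1); // skip the type byte
        switch (kind) {
            case SINGLE: {
                double[] out = new double[buf.remaining() / Float.BYTES];
                for (int i = 0; i < out.length; i++) {
                    out[i] = buf.getFloat(); // widened to double
                }
                return out;
            }
            case DOUBLE: {
                double[] out = new double[buf.remaining() / Double.BYTES];
                for (int i = 0; i < out.length; i++) {
                    out[i] = buf.getDouble();
                }
                return out;
            }
            default:
                throw new IllegalArgumentException("unsupported type: " + kind);
        }
    }

    /** Writes a type byte followed by the components as 64-bit doubles. */
    static byte[] toDoubleBytes(double[] v) {
        ByteBuffer buf = ByteBuffer.allocate(1 + v.length * Double.BYTES);
        buf.put((byte) Kind.DOUBLE.ordinal());
        for (double x : v) {
            buf.putDouble(x);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        double[] decoded = fromBytes(toDoubleBytes(new double[] {1.5, -2.0}));
        System.out.println(decoded[0] + ", " + decoded[1]);
    }
}
```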