cblearn.datasets.noisy_triplet_response

cblearn.datasets.noisy_triplet_response(triplets, embedding, result_format=None, noise=None, noise_options={}, noise_target='differences', random_state=None, distance='euclidean')

Compute triplet responses for an embedding, optionally perturbed by noise.

Parameters:
  • triplets (ndarray | COO | spmatrix) – Numpy array or sparse matrix of triplet indices

  • embedding (ndarray) – Numpy array of object coordinates with shape (n_objects, n_components), or a distance matrix with shape (n_objects, n_objects) if distance='precomputed'.

  • result_format (str | None) – Format of the result. If None, the input format is kept.

  • noise (None | str | Callable) – Noise distribution. Can be the name of a distribution function from numpy.random.RandomState or a function accepting the same arguments. If None, no noise will be applied.

  • noise_options (Dict) – Additional arguments passed to the noise function as keyword arguments.

  • noise_target (str | NoiseTarget) – ‘points’ if noise should be added to the coordinates of the triplet objects, or ‘differences’ if noise should be added to the difference between the two distances.

  • random_state (None | int | RandomState) – State or seed for noise sampling.

  • distance (str | Distance) – {‘euclidean’, ‘precomputed’}. Specifies the distance metric between embedding points, or that the distances are passed directly as a distance matrix.
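
The interplay of noise, noise_options and random_state described above can be sketched as follows. This is only an illustration of the documented behaviour, not cblearn's actual code: the helper sample_noise and its size argument are hypothetical, and it assumes that a string is looked up as a method of numpy.random.RandomState while noise_options is forwarded as keyword arguments.

```python
import numpy as np


def sample_noise(noise, noise_options, size, random_state=None):
    """Hypothetical helper for illustration; not part of cblearn."""
    rng = np.random.RandomState(random_state)
    if noise is None:
        # No noise requested: contribute zeros.
        return np.zeros(size)
    # A string is resolved to the RandomState method of the same name,
    # e.g. 'normal' -> rng.normal. A callable is used as-is (assumption:
    # it accepts a size keyword like the RandomState methods do).
    noise_fun = getattr(rng, noise) if isinstance(noise, str) else noise
    return noise_fun(size=size, **noise_options)


normal_noise = sample_noise('normal', {'scale': 1}, size=2, random_state=42)
uniform_noise = sample_noise('uniform', {'low': -0.1, 'high': 0.1}, size=2,
                             random_state=42)
no_noise = sample_noise(None, {}, size=2)
```

With a fixed integer seed, repeated calls draw the same noise values, which is what makes noisy responses reproducible.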

Returns:

Response in the format defined by result_format, either a numpy array (n_triplets,) or a sparse matrix.

If return_indices is True, a tuple of indices and responses can be returned

Return type:

ndarray | COO | spmatrix

>>> from cblearn.datasets import noisy_triplet_response
>>> triplets = [[0, 1, 2], [1, 2, 3]]
>>> embedding = [[0.1], [0.5], [0.9], [1.]]
>>> noisy_triplet_response(triplets, embedding, result_format='list-order')
array([[0, 1, 2],
       [1, 2, 3]], dtype=uint32)
>>> noisy_triplet_response(triplets, embedding, result_format='list-order',
...                       noise='normal', noise_options={'scale': 1}, random_state=42)
array([[0, 2, 1],
       [1, 2, 3]], dtype=uint32)
>>> from sklearn.metrics.pairwise import euclidean_distances
>>> distances = euclidean_distances(embedding)
>>> print(distances.shape)
(4, 4)
>>> noisy_triplet_response(triplets, distances, result_format='list-order', distance='precomputed')
array([[0, 1, 2],
       [1, 2, 3]], dtype=uint32)
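
To make the difference between the two noise_target settings more concrete, here is a minimal NumPy sketch. It is not cblearn's implementation: the function sketch_triplet_response, its noise_scale argument, and the boolean return value are hypothetical, and it assumes that a triplet (i, j, k) is answered correctly when object i is closer to object j than to object k, with normally distributed noise.

```python
import numpy as np


def sketch_triplet_response(triplets, embedding, noise_scale=0.0,
                            noise_target='differences', seed=None):
    """Hypothetical illustration of noise on points vs. distance differences."""
    rng = np.random.RandomState(seed)
    triplets = np.asarray(triplets)
    # Coordinates of the three objects in every triplet:
    # shape (n_triplets, 3, n_components).
    points = np.asarray(embedding, dtype=float)[triplets]
    if noise_target == 'points' and noise_scale > 0:
        # Perturb the coordinates before any distance is computed.
        points = points + rng.normal(scale=noise_scale, size=points.shape)
    near = np.linalg.norm(points[:, 0] - points[:, 1], axis=1)
    far = np.linalg.norm(points[:, 0] - points[:, 2], axis=1)
    differences = far - near
    if noise_target == 'differences' and noise_scale > 0:
        # Perturb the difference of the two distances instead.
        differences = differences + rng.normal(scale=noise_scale,
                                               size=differences.shape)
    # True where the first object is judged closer to the second than the third.
    return differences > 0


triplets = [[0, 1, 2], [1, 2, 3]]
embedding = [[0.1], [0.5], [0.9], [1.]]
print(sketch_triplet_response(triplets, embedding).tolist())  # [True, True]
noisy = sketch_triplet_response(triplets, embedding, noise_scale=1.0,
                                noise_target='points', seed=42)
```

Without noise both triplets are answered in their given order, matching the first example above. With noise, a response may flip, and which triplets flip depends on the seed and on the noise target.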