cblearn.datasets.fetch_musician_similarity
- cblearn.datasets.fetch_musician_similarity(data_home=None, download_if_missing=True, shuffle=True, random_state=None, return_triplets=False, valid_triplets=True)
Load the MusicSeer musician similarity dataset (triplets).
Triplets: 118,263
Objects (Artists): 448
Dimensionality: unknown
Warning
This dataset contains triplets in which the musicians are not always distinct: for some triplets (i, j, k), i==j, j==k, or i==k is possible. By default, this function filters out these triplets; set valid_triplets=False to keep them.
See Musician Similarity dataset for a detailed description.
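To make the validity condition from the warning concrete, here is a minimal sketch of such a filter on a toy array. The triplet values are made up for illustration, and the library's own filtering may be implemented differently.

```python
import numpy as np

# Toy triplets (target, chosen, other); values are illustrative, not real data.
triplets = np.array([
    [0, 1, 2],  # valid: all three indices differ
    [3, 3, 4],  # invalid: i == j
    [5, 6, 6],  # invalid: j == k
    [7, 8, 7],  # invalid: i == k
    [2, 0, 1],  # valid
])

# A triplet is valid only if its three indices are pairwise distinct.
i, j, k = triplets.T
is_valid = (i != j) & (j != k) & (i != k)
valid = triplets[is_valid]
```

All three pairwise comparisons are needed: checking only i != j and j != k would still let a row with i == k through.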
- Parameters:
data_home (PathLike | None) – optional, default=None. Download and cache folder for the dataset. By default, all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders.
download_if_missing (bool) – optional, default=True. If False, raise an IOError when the data is not locally available instead of downloading it.
shuffle (bool) – optional, default=True. If True, shuffle the order of the triplet constraints.
random_state (RandomState | None) – optional, default=None. Initialization for the random generator used for shuffling.
return_triplets (bool) – optional, default=False. If True, return a numpy array of triplets instead of a Bunch object.
valid_triplets (bool) – optional, default=True. If True, return only valid triplets, i.e. triplets (i, j, k) where i, j, and k are pairwise distinct.
- Returns:
Bunch – Dictionary-like object, with the following attributes.
- data: ndarray, shape (n_triplets, 3)
Each row corresponds to a triplet constraint. The columns hold the indices of the target, chosen, and other musician.
- judgement_id: np.ndarray, shape (n_triplets,)
ID of the survey query.
- survey_or_game: np.ndarray, shape (n_triplets,)
Letter ‘S’ or ‘G’ indicating whether the comparison originates from the survey or the game.
- user: np.ndarray, shape (n_triplets,)
IDs of the users who answered the triplet question.
- artist_name: np.ndarray, shape (413,)
Names of the artists, corresponding to the triplet indices.
- artist_id: np.ndarray, shape (413,)
IDs of the artists, corresponding to the triplet indices.
- DESCR: string
Description of the dataset.
- triplets: numpy array, shape (n_triplets, 3)
Returned instead of the Bunch when return_triplets=True.
- Return type:
dataset
- Raises:
IOError – If the data is not locally available and download_if_missing=False.
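The following is a rough sketch of how the returned attributes might be used together. The helper names `triplet_to_names` and `load_musician_triplets` are hypothetical and not part of cblearn, and the toy arrays stand in for `data` and `artist_name`, so the example runs without downloading anything.

```python
import numpy as np


def triplet_to_names(triplet, artist_name):
    """Map one (target, chosen, other) index triplet to artist names.

    Hypothetical helper for illustration; not part of cblearn.
    """
    target, chosen, other = (str(artist_name[idx]) for idx in triplet)
    return target, chosen, other


def load_musician_triplets():
    """Fetch the dataset as a Bunch (hypothetical wrapper).

    Needs cblearn installed and, on first use, a network connection.
    """
    from cblearn.datasets import fetch_musician_similarity

    return fetch_musician_similarity(
        shuffle=True, random_state=np.random.RandomState(0)
    )


# Toy stand-ins for the `artist_name` and `data` attributes (made-up values).
toy_names = np.array(["Artist A", "Artist B", "Artist C"])
toy_data = np.array([[0, 2, 1], [1, 0, 2]])

first = triplet_to_names(toy_data[0], toy_names)
```

With the real dataset, the same helper would presumably be called as `triplet_to_names(dataset.data[0], dataset.artist_name)` on the Bunch returned by `load_musician_triplets()`, assuming the triplet indices point into `artist_name` as described above.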