cblearn.datasets.fetch_things_similarity
- cblearn.datasets.fetch_things_similarity(data_home=None, download_if_missing=True, shuffle=True, random_state=None, return_data=False)
Load the things similarity dataset (odd-one-out).
Trials: 146,012
Objects (Things): 1,854
Query: 3 images, odd one out
See Things Similarity dataset for a detailed description.
>>> from cblearn.datasets import fetch_things_similarity
>>> dataset = fetch_things_similarity(shuffle=True)
>>> dataset.word[[0, -1]].tolist()
['aardvark', 'zucchini']
>>> dataset.data.shape
(146012, 3)
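The entries of dataset.data are object indices, so they can be used to index dataset.word directly. The following is a minimal sketch of that lookup; to stay runnable without a download it uses small stand-in arrays (the toy words and queries are assumptions, not values from the real dataset).

```python
import numpy as np

# Hypothetical stand-ins for dataset.word and dataset.data.
word = np.array(['aardvark', 'abacus', 'accordion', 'zucchini'])
data = np.array([[3, 0, 1],
                 [2, 1, 0]])

# Indexing the word array with the index array maps every entry to its word.
query_words = word[data]

# The first column holds the object selected as the odd one out.
odd_words = query_words[:, 0]
print(odd_words.tolist())
```

With the real dataset, the same indexing should apply to dataset.word and dataset.data as returned by fetch_things_similarity.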
- Parameters:
data_home (PathLike | None) – optional, default=None. Specify another download and cache folder for the datasets. By default all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders.
download_if_missing (bool) – optional, default=True. If False, raise an IOError when the data is not locally available instead of trying to download it.
shuffle (bool) – default=True. Shuffle the order of triplet constraints.
random_state (RandomState | None) – optional, default=None. Initialization for the shuffle random generator.
return_data (bool) – default=False. If True, returns a numpy array instead of a Bunch object.
- Returns:
Bunch: Dictionary-like object with the following attributes.
- data: ndarray, shape (n_query, 3)
Each row corresponds to an odd-one-out query; entries are object indices. The first column is the selected odd one.
- word: (n_objects,)
Single word associated with the thing objects.
- synset: (n_objects,)
Wordnet Synset associated with the thing objects.
- wordnet_id: (n_objects,)
Wordnet Id associated with the thing objects.
- thing_id: (n_objects,)
Unique Id string associated with the thing objects.
- DESCR: string
Description of the dataset.
- data: numpy array, shape (n_query, 3)
Only present when return_data=True.
- Return type:
dataset
- Raises:
IOError – If the data is not locally available, but download_if_missing=False
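Because the first column of each row holds the odd one, a row can be read as saying that the two remaining objects are more similar to each other than either is to the odd one. The sketch below turns such rows into (anchor, near, far) triplets under that reading. The helper name, the triplet column order, and the toy queries are all assumptions made for illustration; this is not a cblearn function.

```python
import numpy as np


def odd_one_out_to_triplets(queries):
    """Hypothetical helper (not part of cblearn).

    Each row (odd, first, second) yields two triplets in the assumed
    order (anchor, near, far): (first, second, odd) and (second, first, odd).
    """
    queries = np.asarray(queries)
    odd, first, second = queries[:, 0], queries[:, 1], queries[:, 2]
    return np.concatenate([
        np.column_stack([first, second, odd]),
        np.column_stack([second, first, odd]),
    ])


# Toy queries standing in for dataset.data; column 0 is the odd one out.
toy_queries = np.array([[2, 0, 1],
                        [3, 4, 0]])
triplets = odd_one_out_to_triplets(toy_queries)
print(triplets)
```

Each query produces two triplets, so the result has twice as many rows as the input.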