cblearn.datasets.fetch_imagenet_similarity
- cblearn.datasets.fetch_imagenet_similarity(data_home=None, download_if_missing=True, shuffle=True, random_state=None, version='0.1', return_data=False)
Load the imagenet similarity dataset (rank 2 from 8).
Trials (v0.1 / v0.2): 25,273 / 384,277
Objects (Images) (v0.1 / v0.2): 1,000 / 50,000
Classes: 1,000
Query: rank 2 from 8
See Imagenet Similarity dataset for a detailed description.
>>> dataset = fetch_imagenet_similarity(shuffle=True, version='0.1')
>>> dataset.class_label[[0, -1]].tolist()
['n01440764', 'n15075141']
>>> dataset.n_select, dataset.is_ranked
(2, True)
>>> dataset.data.shape
(25273, 9)
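The nine columns of a query row can be unpacked with plain NumPy slicing. The following is a minimal sketch, not part of cblearn: the helper names `split_query` and `queries_to_triplets` are hypothetical, the toy array is synthetic stand-in data, and it assumes that columns 3 to 8 hold the six unselected objects in no particular order.

```python
import numpy as np


def split_query(data, n_select=2):
    """Split query rows into reference, ranked selection, and remaining objects."""
    data = np.asarray(data)
    reference = data[:, 0]                 # column 0: reference object
    selected = data[:, 1:1 + n_select]     # columns 1..n_select: ranked choices
    unselected = data[:, 1 + n_select:]    # assumed: objects that were not chosen
    return reference, selected, unselected


def queries_to_triplets(data, n_select=2):
    """Expand ranked queries into (reference, closer, farther) index triplets."""
    data = np.asarray(data)
    triplets = []
    for rank in range(1, n_select + 1):
        # Every column to the right of a selected object is treated as less similar.
        for other in range(rank + 1, data.shape[1]):
            triplets.append(data[:, [0, rank, other]])
    return np.concatenate(triplets, axis=0)


# Synthetic stand-in for dataset.data: 4 queries, 9 distinct object indices each.
rng = np.random.default_rng(0)
toy = np.array([rng.permutation(1000)[:9] for _ in range(4)])

reference, selected, unselected = split_query(toy)
triplets = queries_to_triplets(toy)
print(selected.shape, unselected.shape, triplets.shape)
```

Under these assumptions each rank-2-of-8 query yields 7 + 6 = 13 triplets, so the four toy queries expand to 52 rows.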
- Parameters:
data_home (PathLike | None) – optional, default=None. Specify another download and cache folder for the datasets. By default, all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders.
download_if_missing (bool) – optional, default=True. If False, raise an IOError when the data is not locally available instead of downloading it.
shuffle (bool) – optional, default=True. Shuffle the order of the queries.
random_state (RandomState | None) – optional, default=None. Initialization for the random generator used for shuffling.
version (str) – optional, default='0.1'. Version of the dataset. ‘0.1’ contains one object per class, ‘0.2’ contains 50 objects per class.
return_data (bool) – optional, default=False. If True, returns a NumPy array instead of a Bunch object.
- Returns:
Bunch: Dictionary-like object, with the following attributes.
- data: ndarray, shape (n_query, 9)
Each row corresponds to a rank-2-of-8 query; entries are object indices. The first column is the reference, the second column is the most similar object, and the third column is the second most similar object.
- rt_ms: ndarray, shape (n_query,)
Reaction time in milliseconds.
- n_select: int
Number of selected objects per trial.
- is_ranked: bool
Whether the selection is ranked in similarity to the reference.
- session_id: shape (n_query,)
Ids of the survey session for query recording.
- stimulus_id: shape (50000,)
Ids of the images.
- stimulus_filepath: shape (50000,)
Filepaths of images.
- class_id: shape (50000,)
ImageNet class assigned to each image.
- class_label: shape (1000,)
WordNet labels of the classes.
- DESCR: string
Description of the dataset.
- data: NumPy array, shape (n_query, 9)
Only present when return_data=True.
- Return type:
dataset
- Raises:
IOError – If the data is not locally available and download_if_missing=False.