cblearn.datasets.fetch_material_similarity
- cblearn.datasets.fetch_material_similarity(data_home=None, download_if_missing=True, shuffle=True, random_state=None, return_triplets=False)
Load the material similarity dataset (triplets).
Triplets Train/Test: 22801 / 3000
Responses: 92892 / 11800
Objects (Materials): 100
See Material Similarity dataset for a detailed description.
>>> from cblearn.datasets import fetch_material_similarity
>>> dataset = fetch_material_similarity(shuffle=True)
>>> dataset.material_name[[0, -1]].tolist()
['alum-bronze', 'yellow-plastic']
>>> dataset.triplet.shape, dataset.response.shape
((92892, 3), (92892,))
- Parameters:
data_home (PathLike | None) – optional, default=None. Download and cache folder for the datasets. By default, all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders.
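download_if_missing (bool) – optional, default=True. If False, raise an IOError when the data is not locally available instead of trying to download it.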
shuffle (bool) – optional, default=True. Shuffle the order of triplet constraints.
random_state (RandomState | None) – optional, default=None. Initialization of the random generator used for shuffling.
return_triplets (bool) – optional, default=False. If True, return numpy arrays instead of a Bunch object.
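The interplay of shuffle and random_state can be pictured with a minimal sketch. This is not the cblearn implementation; it assumes only that shuffling applies one seeded permutation to the triplets and their responses together. The helper shuffle_together and the small arrays are hypothetical stand-ins for the real data.

```python
import numpy as np

# Hypothetical stand-in data; the real arrays come from fetch_material_similarity.
triplets = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]])
responses = np.array([5, -2, 3, -1])


def shuffle_together(triplets, responses, seed):
    """Apply the same seeded permutation to triplets and responses (illustrative helper)."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(triplets))
    # Indexing both arrays with the same order keeps each row paired with its response.
    return triplets[order], responses[order]


# The same seed gives the same order on every call.
a_t, a_r = shuffle_together(triplets, responses, seed=42)
b_t, b_r = shuffle_together(triplets, responses, seed=42)
```

Passing a fixed random_state to the loader should give a reproducible order in the same way, while random_state=None leaves the order dependent on the global random state.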
- Returns:
Bunch – Dictionary-like object, with the following attributes.
- triplet: ndarray, shape (n_triplets, 3)
Each row corresponds to a triplet constraint. The columns hold the indices of the reference material and of the two other materials.
- response: ndarray, shape (n_triplets,)
Count of subject responses that chose the first other material (positive) or the second other material (negative) as more similar to the reference material.
- test_triplet: ndarray, shape (n_test_triplets, 3)
Triplets of the hold-out test set.
- test_response: ndarray, shape (n_test_triplets,)
Responses of the hold-out test set.
- material_name: ndarray, shape (100,)
Names of the materials.
- DESCR: string
Description of the dataset.
- triplets, response: numpy arrays, shapes (n_triplets, 3) and (n_triplets,)
Only present when return_triplets=True.
- Return type:
dataset
- Raises:
IOError – If the data is not locally available, but download_if_missing=False
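The returned triplet and response arrays can be combined into triplets with a fixed ordering. The sketch below assumes that response is a signed count: positive when subjects chose the first other material, negative when they chose the second. That reading is an interpretation of the attribute description, not confirmed library behaviour. The helper orient_triplets and the small arrays are hypothetical stand-ins for the downloaded data.

```python
import numpy as np


def orient_triplets(triplets, responses):
    """Reorder columns to (reference, more similar, less similar).

    Assumes a negative response means the second other material was chosen.
    """
    triplets = np.asarray(triplets)
    responses = np.asarray(responses)
    oriented = triplets.copy()
    flip = responses < 0
    # Swap the two "other" columns wherever the second one was preferred.
    oriented[flip, 1] = triplets[flip, 2]
    oriented[flip, 2] = triplets[flip, 1]
    return oriented, np.abs(responses)


# Hypothetical stand-in data in the documented shapes (n_triplets, 3) and (n_triplets,).
triplets = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
responses = np.array([4, -3, 0])

oriented, counts = orient_triplets(triplets, responses)
# oriented -> [[0, 1, 2], [3, 5, 4], [6, 7, 8]]
# counts   -> [4, 3, 0]
```

The absolute counts could then serve as weights for each oriented triplet. Rows with a count of zero carry no preference and may be dropped, depending on the downstream method.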