HDNNPDataset
-
class hdnnpy.dataset.hdnnp_dataset.HDNNPDataset(descriptor, property_, dataset=None)
Bases: object
Combine and preprocess descriptor and property dataset.
The types of descriptor and property used for the HDNNP should be fixed at initialization. By default, an instance holds no dataset at initialization and you need to call construct() to build one. If dataset is given, it becomes the instance’s own dataset.
Parameters:
- descriptor (DescriptorDatasetBase) – Descriptor instance you want to use as HDNNP input.
- property_ (PropertyDatasetBase) – Property instance you want to use as HDNNP label.
- dataset (dict [ndarray], optional) – If specified, dataset will be initialized with this.
-
__len__()
Redirect to partial_size.
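As a minimal sketch of this redirection, assume partial_size reports the number of samples held by the instance. The SizedToy class below and its internals are hypothetical and are not taken from hdnnpy:

```python
class SizedToy:
    """Hypothetical stand-in used only to illustrate the redirection."""

    def __init__(self, samples):
        # Assumption: the samples held by this particular instance.
        self._samples = list(samples)

    @property
    def partial_size(self):
        # Number of samples stored in this instance.
        return len(self._samples)

    def __len__(self):
        # len(obj) simply forwards to partial_size.
        return self.partial_size


toy = SizedToy(["a", "b", "c"])
assert len(toy) == toy.partial_size
```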
-
construct(all_elements=None, preprocesses=None, shuffle=True, verbose=True)
Construct an instance’s own dataset.
This method performs the following steps:
- Check compatibility between the descriptor and property datasets.
- Expand the feature dimension of the descriptor dataset according to all_elements, pre-process the descriptor dataset in the given order, and add it to the instance’s own dataset.
- Add the property dataset to the instance’s own dataset.
- Clear the original data in the descriptor and property datasets.
- Shuffle the order of the data.
Parameters:
- all_elements (list [str], optional) – If specified, the feature dimensions of the descriptor dataset are expanded according to this.
- preprocesses (list [PreprocessBase], optional) – If specified, the descriptor dataset is pre-processed with these in the given order.
- shuffle (bool, optional) – If True, the order of the data is shuffled.
- verbose (bool, optional) – If True, print log to stdout.
Raises: AssertionError – If the descriptor and property datasets are incompatible.
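The sequence of steps can be illustrated with a toy function. This is only a sketch under several assumptions and not hdnnpy’s implementation: it uses plain dicts of lists instead of ndarrays, models each preprocess as a callable applied per sample, and omits the expansion by all_elements because its details are not described here. The names toy_construct, inputs and labels are hypothetical.

```python
import random


def toy_construct(descriptor, property_, preprocesses=None, shuffle=True, seed=None):
    """Toy version of the check / preprocess / merge / clear / shuffle steps.

    `descriptor` and `property_` are hypothetical dicts mapping a key to a
    list with one entry per sample.
    """
    # 1. Check compatibility: every list must cover the same number of samples.
    sizes = {len(values) for values in descriptor.values()}
    sizes |= {len(values) for values in property_.values()}
    assert len(sizes) == 1, "descriptor and property datasets are incompatible"
    n_samples = sizes.pop()

    dataset = {}
    # 2. Pre-process descriptor data in the given order, then store it.
    for key, values in descriptor.items():
        for preprocess in preprocesses or []:
            values = [preprocess(v) for v in values]
        dataset[key] = list(values)
    # 3. Add property data.
    for key, values in property_.items():
        dataset[key] = list(values)
    # 4. Clear the original containers.
    descriptor.clear()
    property_.clear()
    # 5. Shuffle every list with one shared permutation so rows stay aligned.
    if shuffle:
        order = list(range(n_samples))
        random.Random(seed).shuffle(order)
        dataset = {k: [v[i] for i in order] for k, v in dataset.items()}
    return dataset


inputs = {"sym_func": [1.0, 2.0, 3.0]}
labels = {"energy": [-1.0, -2.0, -3.0]}
data = toy_construct(inputs, labels, preprocesses=[lambda x: x * 2], seed=0)
```

Using a single permutation for all keys is the important design point: each input stays paired with its label after shuffling.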
-
scatter(max_buf_len=268435456)
Scatter dataset by MPI communication.
Each instance is re-initialized with the dataset it receives.
Parameters: max_buf_len (int, optional) – Each piece of data is divided into chunks of at most this size.
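The chunking idea behind max_buf_len can be sketched without MPI. The helper iter_chunks below is hypothetical, and whether hdnnpy measures the limit in bytes or in array elements is not stated here; the sketch simply counts items of the buffer.

```python
def iter_chunks(buffer, max_buf_len):
    """Yield consecutive pieces of `buffer`, each at most `max_buf_len` long."""
    if max_buf_len <= 0:
        raise ValueError("max_buf_len must be positive")
    for start in range(0, len(buffer), max_buf_len):
        yield buffer[start:start + max_buf_len]


payload = bytes(range(10))
chunks = list(iter_chunks(payload, max_buf_len=4))
# A receiver can rebuild the payload by concatenating the chunks in order.
assert b"".join(chunks) == payload
```

Only the last chunk may be shorter than the limit; all others have exactly max_buf_len items.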
-
take(index)
Return a copied object that holds the indexed or sliced dataset.
Parameters: index (int or slice) – The copied object’s dataset is indexed or sliced by this.
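A minimal sketch of this copy-and-slice behaviour, assuming the dataset is a dict of equally long sequences. The SliceableToy class is hypothetical and uses lists rather than ndarrays:

```python
import copy


class SliceableToy:
    """Hypothetical stand-in: holds a dict of equally long lists."""

    def __init__(self, dataset):
        self._dataset = dataset

    def take(self, index):
        # Copy the object so the original keeps its full dataset.
        new = copy.copy(self)
        # Apply the same int or slice to every entry of the dataset.
        new._dataset = {key: values[index] for key, values in self._dataset.items()}
        return new


full = SliceableToy({"inputs": [0, 1, 2, 3], "labels": [0, 10, 20, 30]})
head = full.take(slice(0, 2))
```

Here head holds only the first two samples, while full is left unchanged.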
-
descriptor
Descriptor dataset instance.
Type: DescriptorDatasetBase
-
property
Property dataset instance.
Type: PropertyDatasetBase