skorch.utils

skorch utilities.

Should not have any dependency on other skorch packages.

class skorch.utils.Ansi(value): An enumeration.

class skorch.utils.FirstStepAccumulator

Store and retrieve the train step data.

This class simply stores the first step value and returns it.

For most uses, skorch.utils.FirstStepAccumulator is what you want, since the optimizer calls the train step exactly once. However, some optimizers, such as LBFGS, make more than one call. If, in that case, you don’t want the first value to be returned (but instead, say, the last value), implement your own accumulator and make sure it is returned by the NeuralNet.get_train_step_accumulator method.

Methods

`get_step`()	Return the stored step.
`store_step`(step)	Store the first step.

get_step(): Return the stored step.

store_step(step): Store the first step.
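The custom accumulator mentioned above could look roughly like the following. This is a minimal sketch, not part of skorch: the class name LastStepAccumulator and the simulated loss values are hypothetical, and the only assumption taken from the text is the store_step/get_step interface.

```python
class LastStepAccumulator:
    """Hypothetical accumulator that keeps the most recent train step."""

    def __init__(self):
        self.step = None

    def store_step(self, step):
        # Overwrite on every call, so the last stored step wins.
        self.step = step

    def get_step(self):
        # Return whatever was stored most recently (None if nothing was stored).
        return self.step


# Simulate an optimizer that evaluates the train step three times.
accumulator = LastStepAccumulator()
for loss in (0.9, 0.5, 0.3):
    accumulator.store_step({"loss": loss})

last_step = accumulator.get_step()
```

To put such a class to use, a NeuralNet subclass would return an instance of it from get_train_step_accumulator.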

class skorch.utils.TeeGenerator(gen): Stores a generator and calls tee on it to create a new generator each time the TeeGenerator is iterated over, which lets you iterate over the given generator more than once.
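The tee-based approach can be illustrated with the standard library alone. The class below is a hypothetical stand-in (TeeGeneratorSketch is not a skorch name) that assumes itertools.tee is the tee being referred to; it is meant to show the idea, not to reproduce the library's code.

```python
from itertools import tee


class TeeGeneratorSketch:
    """Hypothetical stand-in illustrating the tee-based approach."""

    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        # Split the stored iterator in two: keep one copy for later
        # passes and consume the other one now.
        self.gen, it = tee(self.gen)
        yield from it


# A plain generator would be exhausted after the first pass.
squares = TeeGeneratorSketch(x * x for x in range(4))
first_pass = list(squares)
second_pass = list(squares)
```

Note that tee buffers the items it has already produced, so this trades memory for the ability to iterate repeatedly.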

skorch.utils.check_indexing(data)

Check how incoming data should be indexed and return an appropriate indexing function with signature f(data, index).

This is useful for determining upfront how data should be indexed instead of doing it repeatedly for each batch, thus saving some time.
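The benefit of deciding on the indexing function once can be sketched without skorch. The helper pick_indexer below is hypothetical and handles only dicts and plain sequences; it merely mirrors the f(data, index) signature described above.

```python
def pick_indexer(data):
    """Hypothetical helper: inspect data once and return f(data, index)."""
    if isinstance(data, dict):
        # Index every value of the dict with the same index.
        return lambda d, i: {key: val[i] for key, val in d.items()}
    # Fall back to plain indexing for sequences.
    return lambda d, i: d[i]


data = {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}
indexer = pick_indexer(data)  # the type check happens once, up front

# Reuse the same indexer for every batch instead of re-checking the type.
batches = [indexer(data, slice(start, start + 2)) for start in (0, 2)]
```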

skorch.utils.check_is_fitted(estimator, attributes=None, msg=None, all_or_any=all)

Checks whether the net is initialized.

Note: This calls sklearn.utils.validation.check_is_fitted under the hood, using exactly the same arguments and logic. The only difference is that this function has an adapted error message and raises a skorch.exception.NotInitializedError instead of an sklearn.exceptions.NotFittedError.

skorch.utils.data_from_dataset(dataset, X_indexing=None, y_indexing=None)

Try to access X and y attribute from dataset.

Also works when dataset is a subset.

Parameters

dataset (skorch.dataset.Dataset or torch.utils.data.Subset): The incoming dataset should be a skorch.dataset.Dataset or a torch.utils.data.Subset of a skorch.dataset.Dataset.
X_indexing (function/callable or None, default=None): If not None, use this function for indexing into the X data. If None, try to automatically determine how to index data.
y_indexing (function/callable or None, default=None): If not None, use this function for indexing into the y data. If None, try to automatically determine how to index data.

Raises

AttributeError: If X and y could not be accessed from the dataset.

skorch.utils.duplicate_items(*collections)

Search for duplicate items in all collections.

Examples

>>> duplicate_items([1, 2], [3])
set()
>>> duplicate_items({1: 'a', 2: 'a'})
set()
>>> duplicate_items(['a', 'b', 'a'])
{'a'}
>>> duplicate_items([1, 2], {3: 'hi', 4: 'ha'}, (2, 3))
{2, 3}
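The behaviour shown in these examples can be reproduced with a few lines of standard-library Python. The function find_duplicates below is a hypothetical sketch rather than the skorch implementation; it assumes, as the second example suggests, that dictionaries contribute their keys and not their values.

```python
from itertools import chain


def find_duplicates(*collections):
    """Hypothetical sketch: items that occur more than once overall."""
    seen = set()
    duplicates = set()
    # Iterating a dict yields its keys, so only keys are compared.
    for item in chain.from_iterable(collections):
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return duplicates


across_collections = find_duplicates([1, 2], {3: "hi", 4: "ha"}, (2, 3))
within_one_list = find_duplicates(["a", "b", "a"])
```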

skorch.utils.freeze_parameter(param): Convenience function to freeze a passed torch parameter. Used by skorch.callbacks.Freezer.

skorch.utils.get_dim(y): Return the number of dimensions of a torch tensor or numpy array-like object.

skorch.utils.get_map_location(target_device, fallback_device='cpu'): Determine the location to map loaded data (e.g., weights) for a given target device (e.g., ‘cuda’).

skorch.utils.is_skorch_dataset(ds): Checks if the supplied dataset is an instance of skorch.dataset.Dataset, even when it is nested inside torch.utils.data.Subset.

skorch.utils.multi_indexing(data, i, indexing=None)

Perform indexing on multiple data structures.

Currently supported data types:

numpy arrays
torch tensors
pandas NDFrame
a dictionary of the former three
a list/tuple of the former three

i can be an integer or a slice.

Parameters

data: Data of a type mentioned above.
i (int or slice): Slicing index.
indexing (function/callable or None, default=None): If not None, use this function for indexing into the data. If None, try to automatically determine how to index data.

Examples

>>> multi_indexing(np.asarray([1, 2, 3]), 0)
1

>>> multi_indexing(np.asarray([1, 2, 3]), np.s_[:2])
array([1, 2])

>>> multi_indexing(torch.arange(0, 4), np.s_[1:3])
tensor([1, 2])

>>> multi_indexing([[1, 2, 3], [4, 5, 6]], np.s_[:2])
[[1, 2], [4, 5]]

>>> multi_indexing({'a': [1, 2, 3], 'b': [4, 5, 6]}, np.s_[-2:])
{'a': [2, 3], 'b': [5, 6]}

>>> multi_indexing(pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}), [1, 2])
   a  b
1  2  5
2  3  6

skorch.utils.noop(*args, **kwargs)

No-op function that does nothing and returns None.

This is useful for defining scoring callbacks that do not need a target extractor.

skorch.utils.open_file_like(f, mode): Wrapper for opening a file.

skorch.utils.params_for(prefix, kwargs)

Extract parameters that belong to a given sklearn module prefix from kwargs. This is useful to obtain parameters that belong to a submodule.

Examples

>>> kwargs = {'encoder__a': 3, 'encoder__b': 4, 'decoder__a': 5}
>>> params_for('encoder', kwargs)
{'a': 3, 'b': 4}
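The prefix convention in this example follows the sklearn style of separating a component name from its parameter with a double underscore. A minimal sketch of that extraction logic is shown below; extract_params is a hypothetical name, and the code is an illustration of the idea rather than the library's implementation.

```python
def extract_params(prefix, kwargs):
    """Hypothetical sketch: keep keys under 'prefix__' and strip the prefix."""
    if not prefix.endswith("__"):
        prefix += "__"
    return {
        key[len(prefix):]: val
        for key, val in kwargs.items()
        if key.startswith(prefix)
    }


kwargs = {"encoder__a": 3, "encoder__b": 4, "decoder__a": 5}
encoder_params = extract_params("encoder", kwargs)
decoder_params = extract_params("decoder", kwargs)
```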

skorch.utils.to_device(X, device)

Generic function to modify the device type of the tensor(s) or module.

PyTorch distribution objects are left untouched, since they don’t support an API to move between devices.

Parameters

X (input data): Deals with X being a:

torch tensor
tuple of torch tensors
dict of torch tensors
PackedSequence instance
torch.nn.Module

device (str, torch.device): The compute device to be used. If device=None, return the input unmodified. If device=‘auto’, hardware acceleration such as CUDA is used if available, and CPU otherwise.

skorch.utils.to_numpy(X)

Generic function to convert a pytorch tensor to numpy.

This function tries to unpack the tensor(s) from supported data structures (e.g., dicts, lists, etc.) but doesn’t go beyond.

Returns X when it already is a numpy array.

skorch.utils.to_tensor(X, device, accept_sparse=False)

Turn input data to torch tensor.

Parameters

X (input data): Handles the cases:

PackedSequence
numpy array
torch Tensor
scipy sparse CSR matrix
list or tuple of one of the former
dict with values of one of the former

device (str, torch.device): The compute device to be used. If set to ‘cuda’, data in torch tensors will be pushed to cuda tensors before being sent to the module. If set to ‘auto’, hardware acceleration such as CUDA is used if available, and CPU otherwise.

accept_sparse (bool, default=False): Whether to accept scipy sparse matrices as input. If False, passing a sparse matrix raises an error. If True, it is converted to a torch COO tensor.

Returns

output (torch Tensor)

skorch.utils.unfreeze_parameter(param): Convenience function to unfreeze a passed torch parameter. Used by skorch.callbacks.Unfreezer.