skorch.utils

skorch utilities.

Should not have any dependency on other skorch packages.

class skorch.utils.Ansi[source]

An enumeration.

class skorch.utils.FirstStepAccumulator[source]

Store and retrieve the train step data.

This class simply stores the first step value and returns it.

For most uses, skorch.utils.FirstStepAccumulator is what you want, since the optimizer calls the train step exactly once. However, some optimizers, such as LBFGS, make more than one call. If, in that case, you do not want the first value to be returned (but instead, say, the last value), implement your own accumulator and make sure it is returned by the NeuralNet.get_train_step_accumulator method.
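As a rough illustration of such a custom accumulator, the sketch below keeps the last stored value instead of the first. It is plain Python and does not import skorch; `FirstStepSketch` and `LastStepAccumulator` are hypothetical names, and the first-value behaviour is modelled only on the description above.

```python
class FirstStepSketch:
    """Stand-in for the documented behaviour: keep only the first step."""

    def __init__(self):
        self.step = None

    def store_step(self, step):
        # Ignore every call after the first one.
        if self.step is None:
            self.step = step

    def get_step(self):
        return self.step


class LastStepAccumulator:
    """Hypothetical alternative: keep whatever was stored most recently."""

    def __init__(self):
        self.step = None

    def store_step(self, step):
        # Overwrite on every call, so the last value wins.
        self.step = step

    def get_step(self):
        return self.step


# Simulate an optimizer that calls the train step three times.
steps = [{"loss": 3.0}, {"loss": 2.0}, {"loss": 1.0}]
first, last = FirstStepSketch(), LastStepAccumulator()
for step in steps:
    first.store_step(step)
    last.store_step(step)

first_result = first.get_step()
last_result = last.get_step()
```

To put a class like this to use, a NeuralNet subclass would return an instance of it from get_train_step_accumulator, as described above.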

Methods

get_step() Return the stored step.
store_step(step) Store the first step.
get_step()[source]

Return the stored step.

store_step(step)[source]

Store the first step.

class skorch.utils.TeeGenerator(gen)[source]

Stores a generator and calls tee on it to create a new generator each time the TeeGenerator is iterated over. This lets you iterate over the given generator more than once.
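The idea can be sketched with `itertools.tee` from the standard library. This is a minimal illustration of the technique, not skorch's actual source; `TeeSketch` is a hypothetical name.

```python
from itertools import tee


class TeeSketch:
    """Wrap a generator so that it can be iterated over repeatedly."""

    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        # Split the stored iterator in two: keep one copy for later
        # passes and consume the other one now.
        self.gen, current = tee(self.gen)
        yield from current


squares = TeeSketch(x * x for x in range(3))
first_pass = list(squares)
second_pass = list(squares)  # a bare generator would be exhausted here
```

Note that `tee` buffers the values it has already produced, so this trades memory for the ability to iterate again.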

skorch.utils.check_indexing(data)[source]

Check how incoming data should be indexed and return an appropriate indexing function with signature f(data, index).

This is useful for determining upfront how data should be indexed instead of doing it repeatedly for each batch, thus saving some time.
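The pattern of deciding once and reusing the result per batch might look like the following. This is a simplified, pure-Python sketch that only distinguishes dicts, lists of columns, and plain sequences; the names `choose_indexing`, `index_dict`, `index_columns`, and `index_plain` are hypothetical, and the real function supports more data types.

```python
def index_dict(data, i):
    # Index every value of the dict with the same index.
    return {key: val[i] for key, val in data.items()}


def index_columns(data, i):
    # Index every column of a list/tuple of columns.
    return [col[i] for col in data]


def index_plain(data, i):
    return data[i]


def choose_indexing(data):
    """Inspect the data once and return a function f(data, index)."""
    if isinstance(data, dict):
        return index_dict
    if isinstance(data, (list, tuple)) and data and isinstance(data[0], (list, tuple)):
        return index_columns
    return index_plain


data = {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}
indexer = choose_indexing(data)  # type inspection happens only here

# Reuse the chosen function for each batch without re-checking the type.
batches = [indexer(data, slice(start, start + 2)) for start in range(0, 4, 2)]
```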

skorch.utils.check_is_fitted(estimator, attributes, msg=None, all_or_any=<built-in function all>)[source]

Checks whether the net is initialized.

Note: This calls sklearn.utils.validation.check_is_fitted under the hood, using exactly the same arguments and logic. The only difference is that this function has an adapted error message and raises a skorch.exception.NotInitializedError instead of an sklearn.exceptions.NotFittedError.
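The general shape of such a check, attribute lookup combined with a dedicated exception type, can be sketched in plain Python. The sketch does not call sklearn or skorch; `NotInitializedSketchError`, `check_is_fitted_sketch`, and the default message are hypothetical stand-ins.

```python
class NotInitializedSketchError(Exception):
    """Hypothetical stand-in for a 'net is not initialized' error."""


def check_is_fitted_sketch(estimator, attributes, msg=None, all_or_any=all):
    if msg is None:
        # Assumed wording, for illustration only.
        msg = "This {name} instance is not initialized yet."
    if isinstance(attributes, str):
        attributes = [attributes]
    # all_or_any decides whether every attribute or just one must exist.
    if not all_or_any([hasattr(estimator, attr) for attr in attributes]):
        raise NotInitializedSketchError(msg.format(name=type(estimator).__name__))


class Net:
    pass


net = Net()
try:
    check_is_fitted_sketch(net, ["module_"])
    raised_before_init = False
except NotInitializedSketchError:
    raised_before_init = True

net.module_ = object()  # pretend the net has been initialized
check_is_fitted_sketch(net, ["module_"])  # passes silently now
```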

skorch.utils.data_from_dataset(dataset, X_indexing=None, y_indexing=None)[source]

Try to access X and y attribute from dataset.

Also works when dataset is a subset.

Parameters:
dataset : skorch.dataset.Dataset or torch.utils.data.Subset

The incoming dataset should be a skorch.dataset.Dataset or a torch.utils.data.Subset of a skorch.dataset.Dataset.

X_indexing : function/callable or None (default=None)

If not None, use this function for indexing into the X data. If None, try to automatically determine how to index data.

y_indexing : function/callable or None (default=None)

If not None, use this function for indexing into the y data. If None, try to automatically determine how to index data.

skorch.utils.duplicate_items(*collections)[source]

Search for duplicate items in all collections.

Examples

>>> duplicate_items([1, 2], [3])
set()
>>> duplicate_items({1: 'a', 2: 'a'})
set()
>>> duplicate_items(['a', 'b', 'a'])
{'a'}
>>> duplicate_items([1, 2], {3: 'hi', 4: 'ha'}, (2, 3))
{2, 3}
skorch.utils.freeze_parameter(param)[source]

Convenience function to freeze a passed torch parameter. Used by skorch.callbacks.Freezer.

skorch.utils.get_dim(y)[source]

Return the number of dimensions of a torch tensor or numpy array-like object.

skorch.utils.get_map_location(target_device, fallback_device='cpu')[source]

Determine the location to map loaded data (e.g., weights) for a given target device (e.g., ‘cuda’).
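Judging from the signature, the function presumably falls back to `fallback_device` when the target device cannot be used. The sketch below illustrates that assumed logic with plain strings instead of torch devices; `pick_map_location` and its `cuda_available` flag are hypothetical and exist only to keep the example free of a torch dependency.

```python
def pick_map_location(target_device, fallback_device="cpu", cuda_available=False):
    """Return the device name that loaded data should be mapped to."""
    if target_device is None:
        target_device = fallback_device
    # Assumption: a CUDA target is only honoured when CUDA is usable.
    if str(target_device).startswith("cuda") and not cuda_available:
        return fallback_device
    return target_device


on_cpu_machine = pick_map_location("cuda", cuda_available=False)
on_gpu_machine = pick_map_location("cuda", cuda_available=True)
```

A result like this would typically be handed to a loading routine as its map location, so that weights saved on a GPU can still be loaded on a CPU-only machine.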

skorch.utils.is_skorch_dataset(ds)[source]

Checks if the supplied dataset is an instance of skorch.dataset.Dataset even when it is nested inside torch.utils.data.Subset.

skorch.utils.multi_indexing(data, i, indexing=None)[source]

Perform indexing on multiple data structures.

Currently supported data types:

  • numpy arrays
  • torch tensors
  • pandas NDFrame
  • a dictionary of the former three
  • a list/tuple of the former three

i can be an integer or a slice.

Parameters:
data

Data of a type mentioned above.

i : int or slice

Slicing index.

indexing : function/callable or None (default=None)

If not None, use this function for indexing into the data. If None, try to automatically determine how to index data.

Examples

>>> multi_indexing(np.asarray([1, 2, 3]), 0)
1
>>> multi_indexing(np.asarray([1, 2, 3]), np.s_[:2])
array([1, 2])
>>> multi_indexing(torch.arange(0, 4), np.s_[1:3])
tensor([1, 2])
>>> multi_indexing([[1, 2, 3], [4, 5, 6]], np.s_[:2])
[[1, 2], [4, 5]]
>>> multi_indexing({'a': [1, 2, 3], 'b': [4, 5, 6]}, np.s_[-2:])
{'a': [2, 3], 'b': [5, 6]}
>>> multi_indexing(pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}), [1, 2])
   a  b
1  2  5
2  3  6
skorch.utils.noop(*args, **kwargs)[source]

No-op function that does nothing and returns None.

This is useful for defining scoring callbacks that do not need a target extractor.

skorch.utils.open_file_like(f, mode)[source]

Wrapper for opening a file.

skorch.utils.params_for(prefix, kwargs)[source]

Extract parameters that belong to a given sklearn module prefix from kwargs. This is useful to obtain parameters that belong to a submodule.

Examples

>>> kwargs = {'encoder__a': 3, 'encoder__b': 4, 'decoder__a': 5}
>>> params_for('encoder', kwargs)
{'a': 3, 'b': 4}
skorch.utils.to_device(X, device)[source]

Generic function to move module output(s) to a device.

Deals with X being a torch tensor or a tuple of torch tensors.

skorch.utils.to_numpy(X)[source]

Generic function to convert a PyTorch tensor to a numpy array.

Returns X when it already is a numpy array.

skorch.utils.to_tensor(X, device, accept_sparse=False)[source]

Turn input data into a torch tensor.

Parameters:
X : input data
Handles the cases:
  • PackedSequence
  • numpy array
  • torch Tensor
  • scipy sparse CSR matrix
  • list or tuple of one of the former
  • dict with values of one of the former
device : str, torch.device

The compute device to be used. If set to ‘cuda’, data in torch tensors will be pushed to cuda tensors before being sent to the module.

accept_sparse : bool (default=False)

Whether to accept scipy sparse matrices as input. If False, passing a sparse matrix raises an error. If True, it is converted to a torch COO tensor.

Returns:
output : torch Tensor
skorch.utils.unfreeze_parameter(param)[source]

Convenience function to unfreeze a passed torch parameter. Used by skorch.callbacks.Unfreezer.