skorch utilities.

Should not have any dependency on other skorch packages.

class skorch.utils.Ansi(value)[source]

An enumeration.

class skorch.utils.FirstStepAccumulator[source]

Store and retrieve the train step data.

This class simply stores the first step value and returns it.

For most uses, skorch.utils.FirstStepAccumulator is what you want, since the optimizer calls the train step exactly once. However, some optimizers, such as LBFGS, make more than one call. If in that case you don’t want the first value to be returned (but instead, say, the last value), implement your own accumulator and make sure it is returned by the NeuralNet.get_train_step_accumulator method.



Return the stored step.


Store the first step.
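
As a minimal sketch of the custom accumulator described above, the class below keeps the last step instead of the first. The class name LastStepAccumulator and the method names store_step and get_step are assumptions chosen to mirror the "store" and "return" operations listed here; check them against the skorch version in use.

```python
class LastStepAccumulator:
    """Hypothetical accumulator that keeps the most recent train step."""

    def __init__(self):
        self.step = None

    def store_step(self, step):
        # Overwrite on every call, so the last stored step wins.
        self.step = step

    def get_step(self):
        return self.step


# Simulate an optimizer such as LBFGS that evaluates the train step several times.
acc = LastStepAccumulator()
for loss in (0.9, 0.5, 0.3):
    acc.store_step({"loss": loss})

last_step = acc.get_step()
```

To put such a class to use, a NeuralNet subclass would return an instance of it from its get_train_step_accumulator method.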



class skorch.utils.TeeGenerator(gen)[source]

Stores a generator and calls tee on it to create a new generator each time the TeeGenerator is iterated over. This lets you iterate over the given generator more than once.
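
The idea can be illustrated with itertools.tee from the standard library. The class below is a simplified stand-in written for this example (the name TeeGeneratorSketch is hypothetical), not necessarily how skorch implements it.

```python
from itertools import tee


class TeeGeneratorSketch:
    """Illustrative stand-in: wraps a generator so it can be iterated repeatedly."""

    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        # tee splits the stored iterator in two: keep one copy for later
        # passes and consume the other one now.
        self.gen, current = tee(self.gen)
        yield from current


squares = TeeGeneratorSketch(x * x for x in range(4))
first_pass = list(squares)
second_pass = list(squares)  # a plain generator would be exhausted here
```

Note that tee buffers the items it has produced, so this trades memory for the ability to iterate again.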


Check how incoming data should be indexed and return an appropriate indexing function with signature f(data, index).

This is useful for determining upfront how data should be indexed instead of doing it repeatedly for each batch, thus saving some time.
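
The pattern of choosing an indexing function once and reusing it per batch can be sketched in plain Python. The helper pick_indexing below is hypothetical and much simpler than the library function: it only distinguishes dicts, lists/tuples, and everything else.

```python
def pick_indexing(data):
    """Hypothetical helper: choose an indexing function once, by container type."""
    if isinstance(data, dict):
        # Index every value of the dict with the same index.
        return lambda d, i: {key: value[i] for key, value in d.items()}
    if isinstance(data, (list, tuple)):
        # Index every element of the list/tuple with the same index.
        return lambda d, i: [item[i] for item in d]
    # Fall back to plain indexing for anything else.
    return lambda d, i: d[i]


data = {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}
index_fn = pick_indexing(data)  # the type check happens once, up front

# Reuse the same function for every batch instead of re-checking the type.
batches = [index_fn(data, slice(start, start + 2)) for start in (0, 2)]
```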

skorch.utils.check_is_fitted(estimator, attributes=None, msg=None, all_or_any=<built-in function all>)[source]

Checks whether the net is initialized.

Note: This calls sklearn.utils.validation.check_is_fitted under the hood, using exactly the same arguments and logic. The only difference is that this function has an adapted error message and raises a skorch.exception.NotInitializedError instead of an sklearn.exceptions.NotFittedError.

skorch.utils.data_from_dataset(dataset, X_indexing=None, y_indexing=None)[source]

Try to access X and y attribute from dataset.

Also works when dataset is a subset.

dataset : skorch.dataset.Dataset or a subset of one

The incoming dataset should be a skorch.dataset.Dataset or a subset of a skorch.dataset.Dataset.

X_indexing : function/callable or None (default=None)

If not None, use this function for indexing into the X data. If None, try to automatically determine how to index data.

y_indexing : function/callable or None (default=None)

If not None, use this function for indexing into the y data. If None, try to automatically determine how to index data.


If X and y could not be accessed from the dataset.


Search for duplicate items in all collections.


>>> duplicate_items([1, 2], [3])
set()
>>> duplicate_items({1: 'a', 2: 'a'})
set()
>>> duplicate_items(['a', 'b', 'a'])
{'a'}
>>> duplicate_items([1, 2], {3: 'hi', 4: 'ha'}, (2, 3))
{2, 3}
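
The behaviour shown in these examples can be reproduced with a short pure-Python sketch. The function name duplicate_items_sketch is hypothetical, and the sketch assumes that each collection is iterated directly (so a dict contributes its keys, not its values).

```python
from itertools import chain


def duplicate_items_sketch(*collections):
    """Return the set of items that occur more than once across all collections."""
    seen = set()
    duplicates = set()
    # chain.from_iterable walks the items of every collection in turn;
    # iterating a dict yields its keys.
    for item in chain.from_iterable(collections):
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return duplicates


no_overlap = duplicate_items_sketch([1, 2], [3])
within_one = duplicate_items_sketch(["a", "b", "a"])
across_many = duplicate_items_sketch([1, 2], {3: "hi", 4: "ha"}, (2, 3))
```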

Convenience function to freeze a passed torch parameter. Used by skorch.callbacks.Freezer.


Return the number of dimensions of a torch tensor or numpy array-like object.

skorch.utils.get_map_location(target_device, fallback_device='cpu')[source]

Determine the location to map loaded data (e.g., weights) for a given target device (e.g. ‘cuda’).


Checks if the supplied dataset is an instance of skorch.dataset.Dataset even when it is nested inside

skorch.utils.multi_indexing(data, i, indexing=None)[source]

Perform indexing on multiple data structures.

Currently supported data types:

  • numpy arrays

  • torch tensors

  • pandas NDFrame

  • a dictionary of the former three

  • a list/tuple of the former three

i can be an integer or a slice.


Data of a type mentioned above.

i : int or slice

Slicing index.

indexing : function/callable or None (default=None)

If not None, use this function for indexing into the data. If None, try to automatically determine how to index data.


>>> multi_indexing(np.asarray([1, 2, 3]), 0)
>>> multi_indexing(np.asarray([1, 2, 3]), np.s_[:2])
array([1, 2])
>>> multi_indexing(torch.arange(0, 4), np.s_[1:3])
tensor([1, 2])
>>> multi_indexing([[1, 2, 3], [4, 5, 6]], np.s_[:2])
[[1, 2], [4, 5]]
>>> multi_indexing({'a': [1, 2, 3], 'b': [4, 5, 6]}, np.s_[-2:])
{'a': [2, 3], 'b': [5, 6]}
>>> multi_indexing(pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}), [1, 2])
   a  b
1  2  5
2  3  6

skorch.utils.noop(*args, **kwargs)[source]

No-op function that does nothing and returns None.

This is useful for defining scoring callbacks that do not need a target extractor.

skorch.utils.open_file_like(f, mode)[source]

Wrapper for opening a file.
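
One common shape for such a wrapper is a context manager that accepts either a path or an already open file-like object. The sketch below assumes that behaviour; it is a stand-in named open_file_like_sketch, not the library's code, so verify the details against the skorch source before relying on them.

```python
import io
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def open_file_like_sketch(f, mode):
    """Illustrative stand-in: yield an open file for a path or a file-like object."""
    opened_here = isinstance(f, (str, os.PathLike))
    if opened_here:
        f = open(f, mode)
    try:
        yield f
    finally:
        # Only close handles that this function opened itself.
        if opened_here:
            f.close()


# A path is opened and closed by the wrapper.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "params.txt")
    with open_file_like_sketch(path, "w") as fh:
        fh.write("weights")
    path_handle_closed = fh.closed
    with open_file_like_sketch(path, "r") as fh:
        content = fh.read()

# An already open file-like object is passed through and left open.
buffer = io.StringIO()
with open_file_like_sketch(buffer, "w") as fh:
    fh.write("weights")
buffer_left_open = not buffer.closed
```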

skorch.utils.params_for(prefix, kwargs)[source]

Extract parameters that belong to a given sklearn module prefix from kwargs. This is useful to obtain parameters that belong to a submodule.


>>> kwargs = {'encoder__a': 3, 'encoder__b': 4, 'decoder__a': 5}
>>> params_for('encoder', kwargs)
{'a': 3, 'b': 4}
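
The prefix handling in this example can be sketched in a few lines. The function params_for_sketch below is a hypothetical reimplementation that assumes the double underscore is the separator, as in the example above.

```python
def params_for_sketch(prefix, kwargs):
    """Return the entries of kwargs that start with '<prefix>__', prefix stripped."""
    if not prefix.endswith("__"):
        prefix += "__"
    return {
        key[len(prefix):]: value
        for key, value in kwargs.items()
        if key.startswith(prefix)
    }


kwargs = {"encoder__a": 3, "encoder__b": 4, "decoder__a": 5}
encoder_params = params_for_sketch("encoder", kwargs)
decoder_params = params_for_sketch("decoder", kwargs)
```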

skorch.utils.to_device(X, device)[source]

Generic function to modify the device type of the tensor(s) or module.

PyTorch distribution objects are left untouched, since they don’t support an API to move between devices.

X : input data

Deals with X being a:

  • torch tensor

  • tuple of torch tensors

  • dict of torch tensors

  • PackedSequence instance

  • torch.nn.Module

device : str, torch.device

The compute device to be used. If device=None, return the input unmodified.


Generic function to convert a pytorch tensor to numpy.

This function tries to unpack the tensor(s) from supported data structures (e.g., dicts, lists, etc.) but doesn’t go beyond.

Returns X when it already is a numpy array.

skorch.utils.to_tensor(X, device, accept_sparse=False)[source]

Turn input data to torch tensor.

X : input data

Handles the cases:

  • PackedSequence

  • numpy array

  • torch Tensor

  • scipy sparse CSR matrix

  • list or tuple of one of the former

  • dict with values of one of the former

device : str, torch.device

The compute device to be used. If set to ‘cuda’, data in torch tensors will be pushed to cuda tensors before being sent to the module.

accept_sparse : bool (default=False)

Whether to accept scipy sparse matrices as input. If False, passing a sparse matrix raises an error. If True, it is converted to a torch COO tensor.

output : torch Tensor

Convenience function to unfreeze a passed torch parameter. Used by skorch.callbacks.Unfreezer.