Since skorch provides extra functionality on top of pure PyTorch training code, it is expected to add some overhead to the total runtime. For typical workloads, this overhead should be unnoticeable.
In a few situations, skorch’s extra functionality may add significant overhead.
This is especially the case when the amount of data and the neural net are
relatively small. The reason is that typically, most of the time is spent on the
forward and backward passes and on the parameter updates. When those are really
fast, the skorch overhead becomes noticeable.
There are, however, a few things that can be done to reduce the skorch overhead. We will focus on accelerating the training process, where the overhead should be largest. Below, some mitigations are described, including the potential downsides.
First make sure that there is a significant slowdown¶
Neural nets are notoriously slow to train. Therefore, if your training takes a lot of time, that doesn’t automatically mean that the skorch overhead is at fault. Maybe the training would take the same time without skorch. If you have some measurements about training the same model without skorch, first make sure that this points to skorch being the culprit before trying to optimize using the mitigations described below. If it turns out skorch is not the culprit, look into optimizing the performance of PyTorch code in general.
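To establish whether skorch is actually responsible, a simple wall-clock comparison is usually enough. Below is a minimal timing sketch; the two training functions are placeholders (hypothetical names) that you would replace with your actual skorch fit call and an equivalent hand-written PyTorch training loop:

```python
import time

def time_it(fn):
    """Return the wall-clock time in seconds taken by fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

# Placeholder training routines (hypothetical) -- substitute your real
# net.fit(X, y) call and an equivalent pure-PyTorch loop here.
def train_pure_pytorch():
    sum(i * i for i in range(100_000))

def train_with_skorch():
    sum(i * i for i in range(100_000))

t_pytorch = time_it(train_pure_pytorch)
t_skorch = time_it(train_with_skorch)
print(f"relative overhead: {t_skorch / t_pytorch:.2f}x")
```

If the skorch timing is close to the pure-PyTorch timing, the slowdown lies elsewhere and the mitigations below will not help much.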
Many people use skorch for hyper-parameter search. Remember that this implies
fitting the model repeatedly, so a long run time is expected. E.g. if you run
a grid search on two hyper-parameters, each with 10 variants, and 5 splits,
there will actually be 10 x 10 x 5 = 500 fit calls, so expect the process to take
approximately 500 times as long as a single model fit. Increase the verbosity on
the grid search to get a better idea of the progress (e.g. GridSearchCV(..., verbose=2)).
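To see where the factor of 500 comes from, the number of fit calls can be counted directly. This sketch uses hypothetical parameter values purely for illustration:

```python
from itertools import product

# hypothetical search space: two hyper-parameters with 10 candidates each
learning_rates = [10 ** -i for i in range(1, 11)]
hidden_units = list(range(10, 110, 10))
n_splits = 5  # cross-validation splits

# every candidate combination is fit once per split
n_candidates = len(list(product(learning_rates, hidden_units)))
n_fits = n_candidates * n_splits
print(n_fits)  # 100 candidates x 5 splits = 500 fit calls
```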
Turning off verbosity¶
By default, skorch prints a log of the training progress. This is useful for checking training progress, monitoring overfitting, etc. If you don't need these diagnostics, you can turn them off via the verbose parameter. This way, printing is deactivated, saving time on I/O. You can still access the diagnostics through the history attribute after training has finished.
net = NeuralNet(..., verbose=0)  # turn off verbosity
net.fit(X, y)
train_loss = net.history[:, 'train_loss']  # access history as usual
Disabling callbacks altogether¶
If you don’t need any callbacks at all, turning them off can be a potential time
saver. Callbacks are the most significant “extra” that skorch provides over
pure PyTorch, hence they might add a lot of overhead for small workloads. By
turning them off, you lose their functionality, though. It’s up to you to
determine whether that’s a worthwhile trade-off. For instance, in contrast to
just turning down verbosity, you will no longer have access to useful
diagnostics in the history attribute.
# skorch version 0.10 or later:
net = NeuralNet(..., callbacks='disable')
net.fit(X, y)
print(net.history)  # no longer contains useful diagnostics

# skorch version 0.9 or earlier:
net = NeuralNet(...)
net.initialize()
net.callbacks_ = []  # manually remove all callbacks
net.fit(X, y)
Instead of turning off all callbacks, you can also turn off specific callbacks, including default callbacks. This way, you can decide which ones to keep and which ones to get rid of. Typically, callbacks that calculate some kind of metric tend to be slow.
# deactivate callbacks that determine train and valid loss after each epoch
net = NeuralNet(..., callbacks__train_loss=None, callbacks__valid_loss=None)
net.fit(X, y)
print(net.history)  # no longer contains 'train_loss' and 'valid_loss' entries
Prepare the Dataset¶
skorch can deal with a number of different input data types. This is very
convenient, as it removes the necessity for the user to deal with them, but it
also adds a small overhead. Therefore, if you can prepare your data so that it’s
already contained in a PyTorch Dataset (such as a TensorDataset), this
check can be skipped.
X, y = ...  # let's assume that X and y are numpy arrays
net = NeuralNet(...)

# normal way: let skorch figure out how to create the Dataset
net.fit(X, y)

# faster way: prepare the Dataset yourself
import torch
from torch.utils.data import TensorDataset

Xt = torch.from_numpy(X)
yt = torch.from_numpy(y)
tensor_ds = TensorDataset(Xt, yt)
net.fit(tensor_ds, None)
Still too slow¶
If your skorch code is still slow despite trying all of these tips, and you have made sure that the slowdown is indeed caused by skorch, what can you do? In this case, please search our issue tracker for solutions or open a new issue. Provide as much context as possible and, if available, a minimal code example. We will try to help you figure out what the problem is.