
Forward-Forward

Instead of storing gradients at each layer, perform the following at each layer:

Note that gradient descent is still used within each layer, but backpropagation is not run through the whole network.
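
A minimal sketch of what a per-layer update could look like. The note does not spell the step out, so this assumes the formulation from Hinton's Forward-Forward paper: each layer computes a "goodness" (sum of squared activations) and is trained to push it above a threshold for positive data and below it for negative data. The names `FFLayer`, `local_step`, `theta`, `lr`, the bias-free ReLU layer, and the toy weights and inputs are all hypothetical choices for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def normalize(v):
    # Pass only the direction of the activity vector to the next layer.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


class FFLayer:
    """Bias-free ReLU layer trained only on its own local objective (assumed design)."""

    def __init__(self, weights, theta=1.0, lr=0.1):
        self.w = [list(row) for row in weights]
        self.theta = theta  # goodness threshold (assumed value)
        self.lr = lr        # learning rate (assumed value)

    def forward(self, x):
        return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in self.w]

    def goodness(self, x):
        return sum(h * h for h in self.forward(x))

    def local_step(self, x, positive):
        h = self.forward(x)
        g = sum(v * v for v in h)
        sign = 1.0 if positive else -1.0
        # loss = log(1 + exp(-sign * (g - theta))), so
        # d(loss)/dg = -sign * sigmoid(-sign * (g - theta)).
        dloss_dg = -sign * sigmoid(-sign * (g - self.theta))
        for j, hj in enumerate(h):
            if hj > 0.0:  # ReLU passes gradient only for active units
                dpre = dloss_dg * 2.0 * hj  # dg/dh_j = 2 * h_j
                for i, xi in enumerate(x):
                    self.w[j][i] -= self.lr * dpre * xi
        return h  # activations computed before the update


def train_step(layers, x, positive):
    h = x
    for layer in layers:
        # Each layer updates itself, then hands plain numbers upward;
        # no gradient ever flows back from a later layer.
        h = normalize(layer.local_step(h, positive))


# Toy data and starting weights, chosen by hand for this sketch.
layers = [FFLayer([[0.6, 0.2], [0.1, 0.5]]), FFLayer([[0.3, 0.4], [0.5, 0.1]])]
pos, neg = [1.0, 0.0], [0.0, 1.0]
for _ in range(50):
    train_step(layers, pos, positive=True)
    train_step(layers, neg, positive=False)
```

The gradient in `local_step` is derived by hand for a single layer, which is the point: the only derivative needed is that of the layer's own loss with respect to its own weights, so nothing has to be kept around for a backward pass over the whole network.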

1. sources

Created: 2025-11-02 Sun 18:54