gradient descent

Definition: Gradient descent is an optimization algorithm used to minimize the loss function in machine learning models, including neural networks.

Purpose: It aims to find the optimal set of model parameters (weights and biases) that minimize the error or loss.

How It Works:

  1. Initialization: Start with an initial set of parameters.
  2. Compute Gradients: Calculate the gradient of the loss function with respect to each parameter. This involves computing the partial derivatives of the loss function.
  3. Update Parameters: Adjust the parameters in the direction opposite to the gradient to reduce the loss. The update rule is typically: $$ \theta := \theta - \eta \nabla L(\theta) $$ where $\theta$ represents the parameters, $\eta$ is the learning rate, and $\nabla L(\theta)$ is the gradient of the loss function with respect to the parameters.
  4. Iteration: Repeat the process until the parameters converge to a minimum of the loss, or for a fixed number of iterations (a minimal code sketch of this loop follows the list).
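
The sketch below walks through the four steps above on a concrete case. It is a minimal illustration, assuming a mean-squared-error loss for linear regression; the data, the variable names (`theta`, `learning_rate`, `grad`), and the step count are all assumptions, not anything fixed by the notes above.

```python
import numpy as np

# Minimal sketch: gradient descent on an assumed mean-squared-error loss
# for linear regression. Data and hyperparameters are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

theta = np.zeros(3)                          # 1. initialization
learning_rate = 0.1                          # eta in the update rule

for step in range(200):                      # 4. iterate for a fixed number of steps
    error = X @ theta - y
    loss = np.mean(error ** 2)               # MSE loss L(theta)
    grad = 2 / len(y) * X.T @ error          # 2. gradient of the loss w.r.t. theta
    theta = theta - learning_rate * grad     # 3. update: theta := theta - eta * grad

print(theta)  # ends up close to true_w
```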

Variants:

Common variants differ mainly in how much data is used to estimate the gradient at each step: batch gradient descent uses the entire training set, stochastic gradient descent (SGD) uses a single example, and mini-batch gradient descent uses a small random subset. Extensions such as momentum and Adam further modify the update using information accumulated from past gradients.
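
As a rough sketch of the mini-batch idea (again assuming the same hypothetical linear-regression setup, with `batch_size` as an illustrative choice), each step estimates the gradient from a random subset of the data instead of the full set:

```python
import numpy as np

# Illustrative mini-batch gradient descent on an assumed MSE loss.
# batch_size = 1 would correspond to SGD; batch_size = len(y) to batch GD.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

theta = np.zeros(3)
learning_rate = 0.1
batch_size = 16

for step in range(500):
    idx = rng.choice(len(y), size=batch_size, replace=False)   # sample a mini-batch
    Xb, yb = X[idx], y[idx]
    grad = 2 / batch_size * Xb.T @ (Xb @ theta - yb)           # gradient on the batch only
    theta -= learning_rate * grad
```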

Relationship Between Gradient Descent and Backpropagation

The two play complementary roles: backpropagation is the algorithm that efficiently computes the gradient of the loss with respect to every parameter in a neural network, by applying the chain rule backwards through the layers, while gradient descent is the optimization algorithm that uses those gradients to update the parameters.
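
A hedged sketch of that division of labour is below: the backward pass computes the gradients by hand-coded chain rule, and the gradient-descent step then applies them. The tiny one-hidden-layer network, the tanh activation, and all names here are assumptions made for illustration (biases are omitted for brevity).

```python
import numpy as np

# Sketch: backpropagation computes gradients, gradient descent applies them.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
y = rng.normal(size=(64, 1))

W1 = rng.normal(scale=0.1, size=(4, 8))   # input -> hidden weights
W2 = rng.normal(scale=0.1, size=(8, 1))   # hidden -> output weights
eta = 0.05                                # learning rate

for step in range(300):
    # Forward pass
    h = np.tanh(X @ W1)                   # hidden activations
    y_hat = h @ W2                        # predictions
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass (backpropagation): chain rule from the loss back to each weight
    d_yhat = 2 * (y_hat - y) / len(y)     # dL/d y_hat
    dW2 = h.T @ d_yhat                    # dL/dW2
    dh = d_yhat @ W2.T                    # dL/dh
    dW1 = X.T @ (dh * (1 - h ** 2))       # dL/dW1, through the tanh derivative

    # Gradient descent step: apply the gradients that backprop produced
    W2 -= eta * dW2
    W1 -= eta * dW1
```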
