Gradient Descent

Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. Starting from an initial point x₀, it repeatedly applies the update xₖ₊₁ = xₖ - αₖ∇f(xₖ), where αₖ > 0 is the step size (learning rate) and ∇f(xₖ) is the gradient at the current iterate. Common variants include batch gradient descent (gradient computed over the entire dataset), stochastic gradient descent (SGD, which uses a single sample per update), mini-batch gradient descent, and momentum-based methods. The convergence rate depends on the properties of f: for smooth, strongly convex functions the method converges linearly. Gradient descent is widely used in machine learning, deep learning, and large-scale optimization.
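The update rule above can be sketched in a few lines of Python; the objective f(x) = (x - 3)², its gradient 2(x - 3), the starting point, and the fixed step size are all illustrative choices, not part of the definition:

```python
def gradient_descent(grad, x0, alpha=0.1, steps=100):
    """Iterate x_{k+1} = x_k - alpha * grad(x_k) with a fixed step size."""
    x = x0
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2*(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # converges toward the minimizer x = 3
```

In practice αₖ is often decayed over iterations or chosen by a line search rather than held fixed, since too large a step can diverge and too small a step converges slowly.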
