gradient
Why does the gradient point in the direction of steepest ascent? Imagine a function \(w=f(x,y)\). The partial derivatives \(\frac{\partial w}{\partial x}\) and \(\frac{\partial w}{\partial y}\) tell you how much you gain per unit step along the \(x\) and \(y\) axes respectively. If you take a small step along a vector \(u=(a,b)\), that is, \(a\) along \(x\) and \(b\) along \(y\), the gain is, to first order, \(a \frac{\partial w}{\partial x} + b \frac{\partial w}{\partial y}\). This is exactly the dot product \(u \cdot \nabla{f} = ||u||\cdot || \nabla{f} || \cos{\theta}\), where \(\theta\) is the angle between \(u\) and \(\nabla{f}\). For a fixed step length \(||u||\), this is maximized when \(\theta=0\), i.e. when \(u\) points in the same direction as \(\nabla{f}\).
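To check this numerically, here is a minimal sketch that scans unit vectors around the circle and picks the one with the largest first-order gain \(u \cdot \nabla{f}\). The function \(f(x,y)=x^2+3y\), the evaluation point, and all helper names are assumptions chosen purely for illustration.

```python
import math


def grad_f(x, y):
    # Analytic gradient of the illustrative function f(x, y) = x**2 + 3*y.
    return (2.0 * x, 3.0)


def directional_gain(point, angle):
    # First-order gain u . grad f for the unit vector u = (cos angle, sin angle).
    gx, gy = grad_f(*point)
    return math.cos(angle) * gx + math.sin(angle) * gy


def best_direction(point, samples=3600):
    # Scan evenly spaced unit vectors and keep the angle with the largest gain.
    angles = [2.0 * math.pi * k / samples for k in range(samples)]
    return max(angles, key=lambda t: directional_gain(point, t))


point = (1.0, 2.0)  # hypothetical evaluation point
best = best_direction(point)

# Angle of the gradient itself, mapped into [0, 2*pi) for comparison.
gx, gy = grad_f(*point)
grad_angle = math.atan2(gy, gx) % (2.0 * math.pi)

print(f"best sampled angle: {best:.4f}")
print(f"gradient angle:     {grad_angle:.4f}")
```

The two printed angles should agree up to the sampling resolution, and the gain in that direction equals \(||\nabla{f}||\), since \(\cos{\theta}=1\) and \(||u||=1\).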