tensor
I was introduced to tensor products before tensors themselves, so here's a description of tensors that I felt was missing from Wikipedia but makes the most sense to me.
Basically, Wikipedia gives the following definition for tensors: A tensor of type \((p,q)\) is an assignment of a multidimensional array \[ T^{i_1,...,i_p}_{j_1,...,j_q}[\mathbf{f}] \] to each basis \(\mathbf{f} = [\mathbf{e}_1,...,\mathbf{e}_n]\) of an \(n\)-dimensional vector space such that, if we apply the change of basis \[ \mathbf{f} \rightarrow \mathbf{f}\cdot R = (\mathbf{e}_iR_1^i,...,\mathbf{e}_iR_n^i) \] then the multidimensional array obeys the transformation law \[ T^{i'_1,...,i'_p}_{j'_1,...,j'_q}[\mathbf{f}\cdot R] = (R^{-1})^{i'_1}_{i_1}\cdots (R^{-1})^{i'_p}_{i_p}\, T^{i_1,...,i_p}_{j_1,...,j_q}\, (R)^{j_1}_{j'_1}\cdots (R)^{j_q}_{j'_q} \]
The first thing that confused me was what the phrase "assignment to every basis" meant. I realized it just means that a tensor can be described in any basis. And between two bases, the above transformation law must hold.
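To make "an assignment to each basis" concrete, here's a small numerical check (a sketch using numpy; the component values and the matrix R below are made-up, not anything from the definition). A type \((1,0)\) tensor is just a vector, and its array of components in the basis \(\mathbf{f}\cdot R\) is \(R^{-1}\) times its array of components in \(\mathbf{f}\):

    import numpy as np

    # components of a type-(1,0) tensor (an ordinary vector) in the basis f
    T_f = np.array([3.0, -1.0])

    # an arbitrary invertible change-of-basis matrix R
    R = np.array([[1.0, 2.0],
                  [0.0, 1.0]])
    R_inv = np.linalg.inv(R)

    # transformation law with one upper index: T[i'] in basis f.R is sum_i R_inv[i', i] * T[i]
    T_fR = R_inv @ T_f

    # sanity check: the same geometric vector is reassembled from either basis.
    # Taking the old basis vectors as the identity columns, the new basis vectors
    # are the columns of R, so (identity) @ T_f == R @ T_fR.
    assert np.allclose(T_f, R @ T_fR)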
The Einstein summation (see also numpy einsum) also confused me. What made it click was remembering that there are tensors which can be written as the tensor product of vectors. So consider the tensor: \[ T \in \underbrace{V \otimes \cdots \otimes V}_{p} \otimes \underbrace{V^* \otimes \cdots \otimes V^*}_{q} \]
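As an aside on notation: numpy's einsum uses the same convention that a repeated index is summed over. A minimal illustration, with made-up values:

    import numpy as np

    A = np.arange(4.0).reshape(2, 2)
    x = np.array([1.0, 2.0])

    # "ij,j->i" reads like index notation: y^i = A^i_j x^j, with the repeated j summed over
    y = np.einsum("ij,j->i", A, x)
    assert np.allclose(y, A @ x)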
Or, let's take a more specific example: \[ T = v\otimes w \] where \(v,w\in V\). Here, \(v\) and \(w\) can be described as coordinates with respect to some basis. Then the transformation law just says that when we transform the basis with \(R\), we need to transform the vectors in the tensor product:
\[ T[\mathbf{f}\cdot R] = (R^{-1}v) \otimes (R^{-1}w) \]
Here \((R^{-1})\) is used because the components of vectors in \(V\) are contravariant: they transform with the inverse of the change of basis (see covariant and contravariant).
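Here is a quick numerical check of this (a sketch with numpy; v, w, and R are made-up values): transforming each factor and then taking the tensor product gives the same components as applying the type-\((2,0)\) transformation law directly to the components of \(v\otimes w\).

    import numpy as np

    v = np.array([1.0, 2.0])
    w = np.array([3.0, 5.0])
    R = np.array([[2.0, 1.0],
                  [1.0, 1.0]])
    R_inv = np.linalg.inv(R)

    # components of T = v (x) w in the old basis: T[i1, i2] = v[i1] * w[i2]
    T = np.outer(v, w)

    # transform each factor first, then take the tensor product
    lhs = np.outer(R_inv @ v, R_inv @ w)

    # apply the transformation law directly to the components of T
    rhs = np.einsum("ai,bj,ij->ab", R_inv, R_inv, T)

    assert np.allclose(lhs, rhs)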
The Einstein summation just picks out the components in the tensor product to apply the transformation to.
To see this, consider a concrete example. Let \[T =\begin{bmatrix} v_1w_1\\ v_1w_2\\ v_2w_1\\ v_2w_2 \end{bmatrix} \]
We can think of this as a tensor \(T^{i_1,i_2}\), where the first index divides the vector in two (top, bottom), and the second index indexes within each half. You can think of the first index as indexing \(v\) and the second index as indexing \(w\) (see also tensor product).
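In numpy (an illustrative sketch, reusing the made-up v and w from above, with zero-based indices), the 4-component column is np.kron(v, w), and reshaping it splits out the two indices:

    import numpy as np

    v = np.array([1.0, 2.0])
    w = np.array([3.0, 5.0])

    # the flat 4-vector (v1*w1, v1*w2, v2*w1, v2*w2)
    T_flat = np.kron(v, w)

    # reshaping to (2, 2) separates the two indices: T[i1, i2] = v[i1] * w[i2]
    T = T_flat.reshape(2, 2)
    assert np.allclose(T, np.outer(v, w))

    # the first index selects a half of the flat vector, the second indexes within it
    assert np.allclose(T[0, :], T_flat[:2])
    assert np.allclose(T[1, :], T_flat[2:])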
Then say that we apply the change of basis \(R\). The components of the tensor are given by the Einstein summation: \[ T^{i'_1,i'_2} = (R^{-1})^{i'_1}_{i_1}(R^{-1})^{i'_2}_{i_2}T^{i_1,i_2} \]
Let's fix \(i'_1\) to be 1 and, for the moment, set aside the factor acting on the first index (written as \(\cdots\)). Then we see that \[ T^{1,i'_2} = \cdots(R^{-1})^{i'_2}_{i_2}T^{1,i_2} \] is just applying \((R^{-1})\) to the first half of the vector. In general, each \((R^{-1})^{i'}_{i}\) term can be thought of as applying \((R^{-1})\) to index \(i\).
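This can be checked numerically too (a sketch with numpy, zero-based indices, and the same made-up v, w, R as before): contracting \(R^{-1}\) against only the second index multiplies each half of the flat vector by \(R^{-1}\).

    import numpy as np

    v = np.array([1.0, 2.0])
    w = np.array([3.0, 5.0])
    R_inv = np.linalg.inv(np.array([[2.0, 1.0],
                                    [1.0, 1.0]]))

    T = np.outer(v, w)  # T[i1, i2] = v[i1] * w[i2]

    # apply R^{-1} to the second index only
    partial = np.einsum("bj,ij->ib", R_inv, T)

    # each half of the flat vector (each row of T) is just multiplied by R^{-1}
    assert np.allclose(partial[0], R_inv @ T[0])
    assert np.allclose(partial[1], R_inv @ T[1])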
This can also be seen if you try writing a for-loop to compute the new tensor components (zero-based indices, numpy-style):

    def compute_T(i1_new, i2_new):
        # T[i'_1, i'_2] = sum over i_1, i_2 of R_inv[i'_1, i_1] * R_inv[i'_2, i_2] * T[i_1, i_2]
        total = 0.0
        for i_1 in range(2):
            for i_2 in range(2):
                total += R_inv[i1_new, i_1] * R_inv[i2_new, i_2] * T[i_1, i_2]
        return total
You can swap around the for loops and see that the same thing happens for the other index.
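For completeness, the whole double loop is a single np.einsum call (a sketch, with the same made-up values as above):

    import numpy as np

    R_inv = np.linalg.inv(np.array([[2.0, 1.0],
                                    [1.0, 1.0]]))
    T = np.outer([1.0, 2.0], [3.0, 5.0])

    # T_new[a, b] = sum over i, j of R_inv[a, i] * R_inv[b, j] * T[i, j]
    T_new = np.einsum("ai,bj,ij->ab", R_inv, R_inv, T)

    # matches the explicit double loop, for every pair of output indices
    for a in range(2):
        for b in range(2):
            expected = sum(R_inv[a, i] * R_inv[b, j] * T[i, j]
                           for i in range(2) for j in range(2))
            assert np.allclose(T_new[a, b], expected)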