This post gives a formal proof of an inequality involving the Frobenius norm, which is a matrix norm defined as the square root of the sum of the absolute squares of the matrix elements. The inequality states that the full contraction of a symmetric rank-2 3D tensor with itself is greater than or equal to one third of the square of its trace. The proof uses matrix notation and properties together with the Cauchy-Schwarz inequality for the Frobenius inner product, and every step is explained in detail. The post also discusses the application of the inequality in physics (it is used by Landau and Lifshitz in general relativity) and some Mathematica experiments with the eigenvalues of the corresponding matrix.
The inequality in question is

$$\varkappa_\alpha^\beta \varkappa_\beta^\alpha \;\geq\; \frac{1}{3}\left(\varkappa_\alpha^\alpha\right)^2, \tag{1}$$

where $\varkappa_\alpha^\beta$ is a symmetric rank-2 3D tensor.
Inequality (1) is found in Landau and Lifshitz, The Classical Theory of Fields (the second volume of their Course of Theoretical Physics), Section 97, where it helps to prove that the metric determinant goes to zero in finite time, i.e., that there is a necessary singularity of the metric in a synchronous reference frame. Landau and Lifshitz write in a footnote that inequality (1) can be "easily" seen to be true by transforming the tensor $\varkappa_\alpha^\beta$ to diagonal form.
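Presumably the argument they have in mind is this: in the principal axes of $\varkappa_\alpha^\beta$ the contraction reduces to a sum over the eigenvalues $\lambda_1, \lambda_2, \lambda_3$, and

$$\varkappa_\alpha^\beta \varkappa_\beta^\alpha = \lambda_1^2 + \lambda_2^2 + \lambda_3^2 \;\geq\; \frac{1}{3}\left(\lambda_1 + \lambda_2 + \lambda_3\right)^2 = \frac{1}{3}\left(\varkappa_\alpha^\alpha\right)^2,$$

because $3\left(\lambda_1^2 + \lambda_2^2 + \lambda_3^2\right) - \left(\lambda_1 + \lambda_2 + \lambda_3\right)^2 = \left(\lambda_1 - \lambda_2\right)^2 + \left(\lambda_2 - \lambda_3\right)^2 + \left(\lambda_3 - \lambda_1\right)^2 \geq 0$. The proof below takes a different route that does not require diagonalization.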
Proof
To prove the inequality

$$\varkappa_\alpha^\beta \varkappa_\beta^\alpha \;\geq\; \frac{1}{3}\left(\varkappa_\alpha^\alpha\right)^2$$

for any symmetric rank-2 3D tensor $\varkappa_\alpha^\beta$:

- Write $\varkappa_\alpha^\beta$ as a matrix $A$ with elements $a_{ij}$.
- Then $\varkappa_\alpha^\beta \varkappa_\beta^\alpha = \operatorname{tr}(AA^T) = \|A\|_F^2$ is the sum of squares of all elements of $A$, and $\varkappa_\alpha^\alpha = \operatorname{tr}(A)$ is the trace of $A$.
- Use the Cauchy-Schwarz inequality for the Frobenius inner product of two matrices, which states that $\left|\langle A, B \rangle_F\right| = \left|\operatorname{tr}(AB^T)\right| \leq \|A\|_F \, \|B\|_F$, where $\|A\|_F = \sqrt{\operatorname{tr}(AA^T)}$ is the Frobenius norm.
- Choose $B = I$, where $I$ is the identity matrix. Then we have $\left|\operatorname{tr}(AI^T)\right| \leq \|A\|_F \, \|I\|_F$, which simplifies to $\left|\operatorname{tr}(A)\right| \leq \|A\|_F \, \|I\|_F$.
- Use the fact that $\|I\|_F = \sqrt{3}$ for the 3×3 identity matrix. Squaring both sides, we get $\left(\operatorname{tr}(A)\right)^2 \leq 3\, \|A\|_F^2$.
- Substitute back $\varkappa_\alpha^\beta$ for $A$ and simplify. We get $\left(\varkappa_\alpha^\alpha\right)^2 \leq 3\, \varkappa_\alpha^\beta \varkappa_\beta^\alpha$.
- Divide both sides by 3 and rearrange to get the desired inequality.

Therefore, we have proved that

$$\varkappa_\alpha^\beta \varkappa_\beta^\alpha \;\geq\; \frac{1}{3}\left(\varkappa_\alpha^\alpha\right)^2.$$
It is easy to see that this proof can be extended to symmetric rank-2 tensors of any dimension $n$: the only change is that $\|I\|_F = \sqrt{n}$, so the constant $1/3$ becomes $1/n$. In addition, the Cauchy-Schwarz inequality for the Frobenius inner product holds for any matrices, including complex ones (with $\langle A, B \rangle_F = \operatorname{tr}(AB^\dagger)$).
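As a quick numerical sanity check (my own addition, not part of the proof), here is a minimal Mathematica sketch that tests inequality (1) on random symmetric 3×3 matrices; the function name checkInequality is an ad hoc choice for this illustration.

```mathematica
(* Test inequality (1) on random symmetric 3x3 matrices;
   checkInequality is an ad hoc name, not from any library. *)
checkInequality[] := Module[{m, A},
   m = RandomReal[{-10, 10}, {3, 3}];
   A = (m + Transpose[m])/2;     (* symmetrize, as in the L&L setting *)
   Tr[A . A] >= Tr[A]^2/3];      (* kappa^b_a kappa^a_b >= (1/3)(kappa^a_a)^2 *)

And @@ Table[checkInequality[], {10^4}]   (* expected output: True *)
```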
Explanation of the proof
This proof is based on the idea of using matrix notation and matrix properties to simplify the tensor expression. It also uses a clever trick: applying a well-known inequality for matrices to get a lower bound for the tensor contraction. The following is a step-by-step explanation of the terse proof given in the previous section.
- In this step, we write the tensor $\varkappa_\alpha^\beta$ as a matrix $A$ with elements $a_{ij}$. So, instead of a tensor, which is a geometric and physical object, we get a matrix, which is an algebraic object (linear algebra). A matrix is easier to work with because we can use matrix operations and rules; for example, the trace of a matrix is equal to the sum of its diagonal elements. We do not have to worry about covariance and contravariance of indices because there are no implicit bases and basis vectors: matrix elements are viewed as scalars and not as vector components. Especially important in our case is the fact that we can replace the tensor contraction $\varkappa_\alpha^\beta \varkappa_\beta^\alpha$ with the matrix expression $\operatorname{tr}(AA^T)$ (the two agree because $\varkappa_\alpha^\beta$ is symmetric). A tensor product is a way of combining two vector spaces into a new vector space that captures the properties of bilinear maps. The tensor product symbol is $\otimes$, and the tensor product of two vectors $u$ and $v$ is written as $u \otimes v$. For example, if $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$, then $u \otimes v$ can be represented as the $3 \times 3$ matrix with elements $(u \otimes v)_{ij} = u_i v_j$. The tensor product of two vector spaces $V$ and $W$, denoted by $V \otimes W$, is the vector space that consists of all linear combinations of elementary tensors of the form $v \otimes w$, where $v \in V$ and $w \in W$. For example, if $V$ and $W$ are both two-dimensional vector spaces with bases $\{e_1, e_2\}$ and $\{f_1, f_2\}$, respectively, then $V \otimes W$ is a four-dimensional vector space with basis $\{e_1 \otimes f_1, e_1 \otimes f_2, e_2 \otimes f_1, e_2 \otimes f_2\}$. The tensor product of two vector spaces has the property that any bilinear map from $V \times W$ to another vector space $Z$ can be uniquely factorized through a linear map from $V \otimes W$ to $Z$. This is called the universal property of the tensor product and can be expressed as $h = \tilde{h} \circ \otimes$, where $h \colon V \times W \to Z$ is the bilinear map, $\tilde{h} \colon V \otimes W \to Z$ is the linear map, and $\otimes \colon V \times W \to V \otimes W$ is the natural map that sends the pair $(v, w)$ to $v \otimes w$.
- In this step, we rewrite the tensor contractions in terms of matrix elements. We use the Einstein summation convention, which means that repeated indices are summed over. For example, $\varkappa_\alpha^\beta \varkappa_\beta^\alpha = a_{ij} a_{ji}$ means that we multiply each element of $A$ by the corresponding element of $A$ transposed and then add them all up; for our symmetric $A$ this is equivalent to taking the sum of squares of all elements of $A$. Similarly, $\varkappa_\alpha^\alpha = a_{ii}$ means that we add up all the diagonal elements of $A$, which is equivalent to taking the trace of $A$. Both $\varkappa_\alpha^\beta \varkappa_\beta^\alpha$ and $\varkappa_\alpha^\alpha$ are scalars in their tensor and their matrix forms. Having scalars on both sides makes it possible to compare their magnitudes with equalities or inequalities.
- In this step, we use a powerful inequality for matrices called the Cauchy-Schwarz inequality. It says that if we have two matrices $A$ and $B$, then their inner product (which is like a dot product, but for matrices) cannot be larger than their norms (which are like lengths, but for matrices) multiplied together. The inner product and norm here are the Frobenius ones, defined via traces of products of matrices: $\langle A, B \rangle_F = \operatorname{tr}(AB^T)$ and $\|A\|_F = \sqrt{\operatorname{tr}(AA^T)}$. The inequality can therefore be written in two equivalent ways: using inner products, $\left|\langle A, B \rangle_F\right| \leq \|A\|_F \|B\|_F$, or using traces, $\left|\operatorname{tr}(AB^T)\right| \leq \sqrt{\operatorname{tr}(AA^T)} \sqrt{\operatorname{tr}(BB^T)}$ (see the Mathematica sketch after this list).
- In this step, we choose a special matrix $B$ to apply the inequality to. We pick $B = I$, where $I$ is the identity matrix, which has 1s on the diagonal and 0s everywhere else. This makes things simpler, because when we multiply any matrix by $I$, we get back the same matrix. So when we take the inner product of $A$ and $I$, we just get back the trace of $A$. Then we have $\left|\operatorname{tr}(A)\right| \leq \|A\|_F \, \|I\|_F$.
- In this step, we square both sides of the inequality to get rid of the absolute value sign. We also use another fact about identity matrices: their Frobenius norm is equal to the square root of their size (the number of rows or columns). So for a 3×3 identity matrix, the norm is equal to $\sqrt{3}$. Then we have $\left(\operatorname{tr}(A)\right)^2 \leq 3\, \|A\|_F^2$.
- In this step, we go back to tensor notation by replacing $A$ with $\varkappa_\alpha^\beta$ and simplifying. We use another fact about traces: they are invariant under transposition (flipping rows and columns), so for our symmetric $A$ we have $\operatorname{tr}(AA^T) = \operatorname{tr}(AA) = \varkappa_\alpha^\beta \varkappa_\beta^\alpha$. Then we have $\left(\varkappa_\alpha^\alpha\right)^2 \leq 3\, \varkappa_\alpha^\beta \varkappa_\beta^\alpha$.
- In this final step, we divide both sides by 3 and rearrange them to get our desired result.
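The following minimal Mathematica sketch (my own illustration; frobInner and frobNorm are ad hoc helper names) shows the outer product of two vectors and the Cauchy-Schwarz step with $B = I$:

```mathematica
(* Outer product of two vectors: the matrix u⊗v with elements u_i v_j *)
u = {1, 2, 3}; v = {4, 5, 6};
Outer[Times, u, v]

(* Frobenius inner product and norm; frobInner/frobNorm are ad hoc names *)
frobInner[x_, y_] := Tr[x . Transpose[y]];
frobNorm[x_] := Sqrt[frobInner[x, x]];

A = RandomReal[{-1, 1}, {3, 3}];
id = IdentityMatrix[3];

frobInner[A, id] == Tr[A]                (* <A, I>_F is just the trace of A *)
frobNorm[id] == Sqrt[3]                  (* ||I||_F = Sqrt[3] in 3D *)
Abs[Tr[A]] <= frobNorm[A] frobNorm[id]   (* the Cauchy-Schwarz instance used *)
```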
So what does this proof tell us? It tells us that no matter what symmetric rank-2 3D tensor $\varkappa_\alpha^\beta$ we pick, inequality (1) holds. It is instructive to look at the characteristic equation of the corresponding matrix $A$,

$$\det(A - \lambda I) = 0.$$

This is a cubic polynomial (in $\lambda$), so it has three roots.
The first root is real, and the other two roots are complex conjugates. These roots are, in fact, the eigenvalues of $A$, so we can find them immediately, without resorting to the characteristic polynomial, with the function Eigenvalues.
These exact symbolic eigenvalues, however, do not allow us to estimate the elements of $A$. To get concrete numbers, we can intersect the equation for each eigenvalue with random hyperplanes, i.e., random linear constraints on the elements of $A$ (a sketch of the technique is given after the NSolve results below).
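A minimal Mathematica sketch of this step (the full computation is in the linked notebook):

```mathematica
(* A generic 3x3 matrix with symbolic elements a[i, j] *)
A = Array[a, {3, 3}];

(* The characteristic polynomial is a cubic in x *)
CharacteristicPolynomial[A, x]

(* Exact symbolic eigenvalues: three large Cardano-type expressions *)
eigs = Eigenvalues[A];
LeafCount /@ eigs    (* shows how unwieldy the exact roots are *)
```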
Using NSolve for each eigenvalue with these random hyperplanes, we find that the first and third eigenvalues each have 2 real solutions, and the second eigenvalue returns an empty set (no real solutions). The explicit numerical solutions for the first and the third eigenvalue are given in the linked Mathematica notebook.
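The exact system from the notebook is not reproduced here; the following hypothetical sketch only illustrates the technique of cutting the nine-dimensional space of matrix elements down to isolated points with random hyperplanes. The target eigenvalue lambda0 = 1 is an arbitrary illustrative choice.

```mathematica
(* Hypothetical illustration: require that lambda0 is an eigenvalue of A
   (one cubic equation in the nine elements a[i, j]) and intersect with
   eight random hyperplanes to get isolated solutions for NSolve. *)
A = Array[a, {3, 3}];
vars = Flatten[A];
lambda0 = 1;
hyperplanes =
  Table[RandomReal[{-1, 1}, 9] . vars == RandomReal[{-1, 1}], {8}];
NSolve[Join[{Det[A - lambda0 IdentityMatrix[3]] == 0}, hyperplanes],
  vars, Reals]   (* generically one or three real solutions *)
```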
In the application in the Landau and Lifshitz book, the tensor is $\varkappa_{\alpha\beta} = \partial \gamma_{\alpha\beta} / \partial t$, the time derivative of the spatial metric tensor $\gamma_{\alpha\beta}$ in the synchronous reference frame, so it is symmetric and inequality (1) applies to it.
A PDF file with the Mathematica calculations can be found at my Google drive.