More on Lie Groups

\(e^x\), But Not Quite

I know that I already introduced a bit of material on Lie theory in my last blog post, but the more I thought about it the more it really bothered me how much I was leaving out. If I have any hopes of introducing the cool intersections between geometry and theoretical physics, then I'll definitely need to go into some heavy linear algebra derived from — you guessed it — Lie theory.

As I've noted before, I find it easiest to think of groups as mathematical representations of how symmetry occurs. An intuitive example of this was the dihedral group \(D_n\), which consists of the symmetries of a regular (i.e. equilateral and equiangular) \(n\)-sided polygon. Since we defined a Lie group to be a group with a smooth manifold structure, it should make sense that Lie groups typically describe smooth symmetries / transformations. Moreover, we showed that the Lie algebra \(\mathfrak{g}\) associated to a Lie group \(G\) was really the tangent space of \(G\) at the identity element; thus, our Lie algebra \(\mathfrak{g}\) represents infinitesimal transformations of a space, as well as the linearization of our group \(G\) near the identity!

Given all these motivations, it should be pretty clear how Lie theory ties in so closely to geometry and higher-level mathematics. Hence, we will continue to explore the connections between a Lie group \(G\) and the Lie algebra \(\mathfrak{g}\) associated to it.

To start off, if we know that two Lie groups \(G\) and \(H\) have a similar structure, what can we say about their associated Lie algebras \(\mathfrak{g}\) and \(\mathfrak{h}\)? Let's take a look at the following theorem:

Let \(G\) and \(H\) be Lie groups and suppose \(\phi: G \to H\) is a Lie group homomorphism. Then its differential at the identity $$ d\phi:\mathfrak{g} \to \mathfrak{h} $$ is a Lie algebra homomorphism. That is, $$ d\phi[X,Y] = [d\phi(X), d\phi(Y)] \hspace{3em}\forall X, Y \in \mathfrak{g} $$

The theorem above tells us pretty much what we would expect, given our knowledge of a map's differential and the tangent space. However, what happens when our homomorphism in question is the inclusion map

$$ \iota : H \hookrightarrow{} G $$

for \(H \subset G\)? From our study of manifolds, we know that \(H\) is considered a submanifold of \(G\) when \(\iota\) is an immersion (that is, the differential is injective). Moreover, we want \(H\) to inherit the group structure from \(G\) — therefore, we say that \(H\) is a Lie subgroup if

  1. \(H\) is a subgroup of \(G\).

  2. \(H\) is an immersed submanifold of \(G\).

In a similar fashion, given a Lie algebra \(\mathfrak{g}\), we say that \(\mathfrak{h} \subset \mathfrak{g}\) is a Lie subalgebra if it is closed under the Lie bracket \([\cdot, \cdot]\); that is, \([X, Y] \in \mathfrak{h}\) for all \(X, Y \in \mathfrak{h}\).

Hopefully the reader can see where this is going; if not, we are simply trying to find a correspondence between Lie subalgebras \(\mathfrak{h}\) and their associated Lie subgroups \(H\). To establish this connection, I first need to establish a few definitions:

  • Distribution: Given a manifold \(M\), we say that a distribution \(\mathcal{D}\) is a sub-bundle of \(TM\) — that is, for each \(p \in M\), \(\mathcal{D}_p\) is a linear subspace of \(T_pM\).

  • Involutive: Given a manifold \(M\) and distribution \(\mathcal{D}\) on \(M\), we say \(\mathcal{D}\) is involutive if, given \(X, Y \in \mathcal{D}\) (that is, at each \(p\in M\) we have \(X_p, Y_p \in \mathcal{D}_p\)) the Lie bracket \([X, Y] \in \mathcal{D}\).

  • Integral Manifold: Given a manifold \(M\) and a distribution \(\mathcal{D}\), if \(N \subset M\) is a submanifold of \(M\) such that \(T_pN = \mathcal{D}_p\) for every \(p \in N\), then we say \(N\) is an integral manifold of \(\mathcal{D}\).

For the sake of brevity, we will skip the proof of Frobenius' Theorem, which states that, given a smooth, involutive distribution \(\mathcal{D}\) on \(M\), there exists a maximal connected integral manifold of \(\mathcal{D}\) through each point of \(M\). However, we will use Frobenius' theorem to establish a one-to-one correspondence between Lie subalgebras and connected Lie subgroups. For a proof of Frobenius' Theorem, I suggest Foundations of Differentiable Manifolds and Lie Groups by Warner, Theorem 1.64.

Let \(G\) be a Lie group with Lie algebra \(\mathfrak{g}\), and let \(\widetilde{\mathfrak{h}} \subset \mathfrak{g}\) be a Lie subalgebra. Then there exists a unique connected Lie subgroup \(\iota: H \hookrightarrow G\) whose Lie algebra \(\mathfrak{h}\) satisfies \(d\iota(\mathfrak{h}) = \widetilde{\mathfrak{h}}\).

Though the theorem above seems like an intuitive fact, it actually takes a great deal of work to show that there is a one-to-one correspondence between connected Lie subgroups of \(G\) and subalgebras of its Lie algebra \(\mathfrak{g}\). Indeed, mathematics graduate students may spend weeks building up the machinery for Frobenius' theorem, and then even longer to get to where we are now. Fortunately, it sets us up well for our next theorem (oh yeah, we're just getting started):

Let \(G\) and \(H\) be Lie groups whose associated Lie algebras are \(\mathfrak{g}\) and \(\mathfrak{h}\), respectively. If \(G\) is simply connected and \(\Phi : \mathfrak{g} \to \mathfrak{h}\) is a Lie algebra homomorphism, then there exists a unique Lie group homomorphism \(\phi : G \to H\) with \(\phi_* = \Phi\).

At this point, you're likely a little tired of theorems and may be curious as to where this is all going. Fortunately, all of the connections between homomorphisms, Lie subgroups, and Lie subalgebras give us the tools we need to recover information about an underlying Lie group \(G\) from its Lie algebra \(\mathfrak{g}\). Since our Lie algebra \(\mathfrak{g}\) corresponds to the tangent space \(T_eG\) of the group we wish to find, we treat our elements \(X \in \mathfrak{g}\) much like we would differential equations in order to recover the local information of \(G\). But that raises the question: how do we solve differential equations on an arbitrary Lie algebra \(\mathfrak{g}\)? If you've learned anything from geometry so far, you shouldn't be surprised to know we just evaluate on \(\mathbb{R}^n\) and map back to our set of interest 🙂.

In particular, we know that \(\mathbb{R}\) is a Lie group (under usual addition) with identity \(0\). If we let \(\mathfrak{r} = T_0\mathbb{R}\) denote the associated Lie algebra, it should be pretty easy to see that

$$ \frac{d}{dt}\vert_{t=0} $$

generates our Lie algebra \(\mathfrak{r}\). For any \(X \in \mathfrak{g}\), we may define a simple Lie algebra homomorphism \(\Phi_X : \mathfrak{r} \to \mathfrak{g}\) by

$$ \Phi_X(\frac{d}{dt}\vert_{t=0}) = X $$

By our previous theorem (note that \(\mathbb{R}\) is simply connected), this implies that there exists a Lie group homomorphism \(\phi_X : \mathbb{R} \to G\) with

$$ \phi'_X(0) = (\phi_X)_*(\frac{d}{dt}\vert_{t=0}) = \Phi_X(\frac{d}{dt}\vert_{t=0}) = X $$

Those familiar with dynamical systems may recognize that \(\phi_X\) is actually what is known as an integral curve. In mathematics (specifically dynamical systems on manifolds), integral curves represent solutions to systems of ordinary differential equations. One may extend this to a global flow as well, using the map \(\gamma : \mathbb{R} \times G \to G\) defined by

$$ \gamma(t, g) = L_g(\phi_X(t)) = g\,\phi_X(t) $$

For the time being we focus on the map \(\textrm{exp} : \mathfrak{g} \to G\) defined by

$$ \textrm{exp}(X) = \phi_X(1) $$

Thus, given a tangent vector \(X \in \mathfrak{g}\), we use our structure on \(\mathbb{R}\) to move a unit step along our integral curve from the identity \(e\). Since we think of the Lie algebra \(\mathfrak{g}\) as the linearization of \(G\) near the identity, it makes sense to think of \(\mathrm{exp} : \mathfrak{g} \to G\) as the "de-linearization" of \(\mathfrak{g}\).

A visual depiction of how the exponential map 'delinearizes' the tangent space

But why do we call this map the exponential map? That is, how does our map \(\textrm{exp} : \mathfrak{g} \to G\) remotely resemble our map \(e^x\)? For starters, for each fixed \(t \in \mathbb{R}\) the scaling map \( \mu_t(s) = ts\) is a Lie group homomorphism from \((\mathbb{R}, +)\) to itself, so \(\phi_X \circ \mu_t\) is again a Lie group homomorphism from \(\mathbb{R}\) to \(G\). By the chain rule,

$$ (\phi_X \circ \mu_t)'(0) = \phi_X'(\mu_t(0))\mu_t'(0) = t \phi_X'(0) = tX $$

Since \(\phi_X \circ \mu_t\) is a Lie group homomorphism with initial velocity \(tX\), uniqueness gives \(\phi_{tX} = \phi_X \circ \mu_t\). Evaluating at \(s = 1\), we have that \(\textrm{exp}(tX) = \phi_{tX}(1) = \phi_X(t)\), and consequently \(\frac{d}{dt}\textrm{exp}(tX)\vert_{t=0} = X\). But \(\phi_X\) is a Lie group homomorphism from \(\mathbb{R}\) to \(G\), so we may expand

$$ \begin{align} \textrm{exp}((t_1 + t_2)X) &= \phi_X(t_1 + t_2) = \phi_X(t_1)\phi_X(t_2) = \textrm{exp}(t_1X)\textrm{exp}(t_2X) \\ \textrm{exp}(-tX) &= (\textrm{exp}(tX))^{-1} \end{align} $$

Putting everything together, our map \(\textrm{exp}: \mathfrak{g} \to G\) should begin to feel quite familiar; it isn't quite the same as \(e^x\), but provides enough analogies to allow similar properties.

The Lie Group \(GL(n, \mathbb{R})\)

We now take a look at a bit of linear algebra to see the first applications of Lie algebras and the exponential map. Suppose we want to form a group out of all \(m \times n\) matrices over \(\mathbb{R}\) under matrix multiplication. In order to satisfy the first group axiom, we must have that \(A (BC) = (AB) C\) whenever these products are defined; we know from elementary linear algebra that this is the case. However, our identity element becomes a bit trickier: since a matrix \(A \in \mathbb{R}^{m \times n}\) has \(m\) rows and \(n\) columns, we have

$$ AI_n = A \\ I_mA = A $$

Therefore, our identity is consistent with both left and right multiplication if and only if \(m = n\). Consequently, we turn our attention to square matrices \(A \in \mathbb{R}^{n \times n}\). The last group axiom we need to fulfill is the existence of an inverse; that is, for each \(A \in \mathbb{R}^{n\times n}\) there needs to be a matrix \(A^{-1}\) with \(A^{-1}A = AA^{-1} = I_n\). Those familiar with linear algebra know that such an element does not always exist! For example, there exists no inverse to the matrix

$$ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} $$

A useful theorem in linear algebra tells us that a square matrix \(A \in \mathbb{R}^{n \times n}\) is invertible if and only if \(\det(A) \neq 0\). Therefore, we define the general linear group (denoted \(GL(n,\mathbb{R})\)) to be

$$ GL(n,\mathbb{R}) = \{ A \in \mathbb{R}^{n \times n} \mid \det(A) \neq 0 \} $$

Now, the determinant \(\det : \mathbb{R}^{n \times n} \to \mathbb{R}\) is ultimately a polynomial in the entries \(a_{ij}\) of \(A\) since

$$ \det A = \sum_{\sigma \in S_n} \textrm{sgn}(\sigma) a_{1, \sigma(1)}\dots a_{n,\sigma(n)} $$

Therefore, \(\det : \mathbb{R}^{n\times n} \to \mathbb{R}\) is a continuous (and smooth) map; but then \(GL(n, \mathbb{R})\) is in fact the preimage of the open set \(\mathbb{R} - \{0\}\), so \(GL(n,\mathbb{R})\) must be an open set of \(\mathbb{R}^{n \times n}\) by definition of continuity. There is a clear linear isomorphism \(\mathbb{R}^{n\times n} \simeq \mathbb{R}^{n^2}\), so \(GL(n, \mathbb{R})\) can be identified with an open set of \(\mathbb{R}^{n^2}\) and is thus a manifold of dimension \(n^2\)! Pretty cool, huh? When I first went into linear algebra, I always thought of matrices as these discrete boxes that acted like \(2D\) arrays in programming for storing data. It turns out this was an incredibly unhelpful visualization and could not be further from the truth. Indeed, invertible matrices represent invertible linear maps on a space (i.e. \(\mathbb{R}^n\)), and the group \(GL(n, \mathbb{R})\) simply looks like an open set in the much larger space \(\mathbb{R}^{n^2}\).
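The permutation formula for the determinant can be evaluated directly for small matrices. Here is a minimal sketch in plain Python; the helper names `permutation_sign` and `det_leibniz` are my own, and since the sum has \(n!\) terms this is only meant as an illustration, not as a practical way to compute determinants.

```python
from itertools import permutations


def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det_leibniz(A):
    """Determinant of a square matrix (list of rows) as a sum over all permutations."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        product = 1
        for i in range(n):
            product *= A[i][sigma[i]]   # the factor a_{i, sigma(i)}
        total += permutation_sign(sigma) * product
    return total


singular = [[1, 1], [1, 1]]   # the non-invertible example from earlier
scaling = [[2, 0], [0, 2]]    # twice the identity

print(det_leibniz(singular))  # 0, so this matrix is not in GL(2, R)
print(det_leibniz(scaling))   # 4, so this matrix is in GL(2, R)
```

Because the result is built only from sums and products of the entries, the code also makes the "determinant is a polynomial" remark tangible.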

Moving forward, recall that we may define matrix multiplication at each entry by

$$ (AB)_{ij} = \sum_{k=1}^n a_{ik}b_{kj} $$

Since this is a polynomial in the entries of \(A\) and \(B\), we have that matrix multiplication is a smooth group operation. In a similar fashion, Cramer's rule tells us that

$$ (A^{-1})_{ij} = \frac{1}{\det A}(-1)^{i + j} ((j,i)-\textrm{minor}\ \textrm{of}\ A) $$

Since \(\det A \neq 0\), we have that the inverse map \(\iota(A) = A^{-1}\) is smooth. Therefore, \(GL(n,\mathbb{R})\) forms a Lie group.

Let \(\mathfrak{gl}(n,\mathbb{R})\) denote the Lie algebra of the Lie group \(GL(n, \mathbb{R})\). Then \(\mathfrak{gl}(n,\mathbb{R}) = \mathbb{R}^{n \times n}\).

Now fix some \(A \in \mathfrak{gl}(n, \mathbb{R})\). As before, we seek to construct a continuous group homomorphism \(\phi_A : \mathbb{R} \to GL(n, \mathbb{R})\); in fact, this will be done by using the formal power series

$$ \phi_A(t) = \sum_{k=0}^\infty \frac{(tA)^k}{k!} $$

If we let \(z : \mathbb{R}^{n \times n} \to \mathbb{R}^{n^2}\) be the obvious isomorphism (i.e. the entry in the \((i,j)\) position gets sent to the \((n(i-1) + j)^{\mathrm{th}}\) basis element), it follows that

$$ z \circ \phi_A(t) = \sum_{k=0}^\infty \frac{z(A^k)t^k}{k!} $$

Now each coordinate of this power series has an infinite radius of absolute convergence, since the entries of \(A^k\) grow at most geometrically in \(k\). Hence \(\phi_A\) is well defined for all \(t\), and we have that

$$ \mathrm{exp}(A) = \phi_A(1) = \sum_{k=0}^\infty \frac{A^k}{k!} $$

This should hopefully look familiar considering it is the matrix exponential from linear algebra (for the sake of notation, we define \(A^0 = I\) for all \(A \in \mathbb{R}^{n \times n}\)). For those who have not seen the matrix exponential, it is enough to know at the moment that this is the exponential map for the Lie algebra \(\mathfrak{gl}(n, \mathbb{R}) = \mathbb{R}^{n\times n}\); thus, this map "de-linearizes" the set of all matrices by sending each matrix to an invertible matrix (since the inverse of \(e^A\) is simply \(e^{-A}\)).
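To make the power series concrete, here is a small sketch that truncates the sum after a fixed number of terms. The helper names `mat_mul` and `mat_exp` and the cutoff of 30 terms are my own choices for illustration; a serious implementation would use a more careful algorithm than a raw partial sum.

```python
def mat_mul(A, B):
    """Multiply two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=30):
    """Approximate exp(A) by the partial sum of A^k / k! for k < terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    term = [row[:] for row in result]                               # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# A nilpotent matrix: N^2 = 0, so the series stops early and exp(N) = I + N.
N = [[0.0, 1.0], [0.0, 0.0]]
exp_N = mat_exp(N)

# exp(A) is always invertible, with inverse exp(-A).
A = [[0.5, 1.0], [-1.0, 0.25]]
minus_A = [[-x for x in row] for row in A]
product = mat_mul(mat_exp(A), mat_exp(minus_A))

print(exp_N)     # [[1.0, 1.0], [0.0, 1.0]]
print(product)   # numerically close to the 2 x 2 identity
```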

We clearly wouldn't be doing our job of exploring \(GL(n, \mathbb{R})\)'s properties if we didn't at least explore the matrix exponential map in more detail. To start off, recall from linear algebra that every square matrix \(A \in \mathfrak{gl}(n,\mathbb{R})\) has a Jordan normal form \(C = BAB^{-1}\), where in general \(B \in GL(n, \mathbb{C})\) and \(C\) has complex entries, since the eigenvalues of a real matrix may be complex. Now our matrix \(C\) is upper triangular, so \(\det(C)\) is simply the product of its diagonal entries. If we represent \(C\) as

$$ C = \begin{pmatrix} c_1 & \ & * \\ \ & \ddots & \ \\ 0 & \ & c_n \end{pmatrix} $$

then the powers of \(C\) are again upper triangular with diagonal entries \(c_i^k\), so

$$ \textrm{exp}(C) = \sum_{k=0}^\infty \frac{C^k}{k!} = \sum_{k=0}^\infty \frac{1}{k!} \begin{pmatrix} c_1^k & \ & * \\ \ & \ddots & \ \\ 0 & \ & c_n^k \end{pmatrix} = \begin{pmatrix} e^{c_1} & \ & * \\ \ & \ddots & \ \\ 0 & \ & e^{c_n} \end{pmatrix} $$

Putting everything together, we wind up with the following formula:

$$ \begin{align} \det(\mathrm{exp}(A)) &= \det( B \sum_{k=0}^\infty \frac{A^k}{k!} B^{-1} ) \\&= \det(\sum_{k=0}^\infty \frac{(BAB^{-1})^k}{k!}) \\&= \det(\mathrm{exp}(BAB^{-1})) \\&= \prod_{i=1}^n e^{c_i} \\&= e^{\textrm{tr}(C)} \\&= e^{\textrm{tr}(BAB^{-1})} \\&= e^{\textrm{tr}(AB^{-1}B)} \\&= e^{\textrm{tr}(A)} \end{align} $$

Here, \(\textrm{tr}(A)\) denotes the trace of our matrix, which is the sum of its diagonal entries. In the chain of equalities above, we use the simple fact that \(\mathrm{tr}(XY) = \mathrm{tr}(YX)\) for all \(X, Y \in \mathbb{R}^{n \times n}\) since

$$ \textrm{tr}(XY) = \sum_{i} (XY)_{ii} = \sum_{i}\sum_k x_{ik}y_{ki} = \sum_k\sum_i y_{ki}x_{ik} = \sum_k (YX)_{kk} = \textrm{tr}(YX) $$
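The identity \(\det(\mathrm{exp}(A)) = e^{\mathrm{tr}(A)}\) is easy to sanity-check numerically. The sketch below is my own illustration: it uses a truncated power series for the exponential (the helpers `mat_mul`, `mat_exp`, `det2`, and `trace` are hypothetical names) and a sample matrix I picked whose eigenvalues are complex, so the real Jordan form would not be available for it.

```python
import math


def mat_mul(A, B):
    """Multiply two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=30):
    """Approximate exp(A) by the partial sum of A^k / k! for k < terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))


# Sample matrix with trace 0.75 and complex eigenvalues.
A = [[0.5, 1.0], [-1.0, 0.25]]

lhs = det2(mat_exp(A))      # det(exp(A))
rhs = math.exp(trace(A))    # e^{tr(A)}

print(lhs, rhs)             # the two values agree up to rounding error
```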

Now it is easy to see that the determinant serves as a Lie group homomorphism from \(GL(n,\mathbb{R})\) to the multiplicative group \(\mathbb{R}^\times = \mathbb{R} - \{0\}\) since \(\det(AB) = \det(A)\det(B)\) and \(\det(A^{-1}) = \frac{1}{\det(A)}\) for all \(A, B \in GL(n, \mathbb{R})\). The first theorem of this article tells us that the differential of \(\det : GL(n, \mathbb{R}) \to \mathbb{R}^\times\) is therefore a Lie algebra homomorphism between \(\mathfrak{gl}(n, \mathbb{R})\) and \(T_1\mathbb{R}^\times \cong \mathbb{R}\). But what is the differential of the determinant? For many who have only seen basic calculus, this seems like a silly question since \(\det\) is an operator on matrices. However, we know that \(GL(n, \mathbb{R})\) is indeed a manifold, so we are able to use our pushforward.

Fix \(X \in \mathfrak{gl}(n, \mathbb{R})\) and let \(c: \mathbb{R} \to GL(n, \mathbb{R})\) be a curve with \(c(0) = I\) and \(c'(0) = X\) — in particular, we pick \(c(t) = e^{tX}\) to satisfy these conditions. Then

$$ \det\nolimits_{*,I}(X) = \frac{d}{dt}\det(e^{tX}) \vert_{t=0} = \frac{d}{dt}e^{t\,\textrm{tr}X}\vert_{t =0} = (\textrm{tr}X)e^{t\,\textrm{tr}X} \vert_{t=0} = \textrm{tr}X $$
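Since the pushforward only depends on the velocity of the curve at \(t = 0\), one can also test this with a simpler curve. The sketch below is my own check, not part of the derivation: it swaps in the straight line \(c(t) = I + tX\) (which also has \(c(0) = I\) and \(c'(0) = X\), and stays invertible for small \(t\)) and estimates the derivative with a central difference. The names `det2`, `line_through_identity`, and the step size `h` are assumptions made for the example.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def line_through_identity(X, t):
    """The curve c(t) = I + t X, which satisfies c(0) = I and c'(0) = X."""
    return [[(1.0 if i == j else 0.0) + t * X[i][j] for j in range(2)]
            for i in range(2)]


X = [[1.0, 2.0], [3.0, 4.0]]   # an arbitrary tangent vector at the identity
h = 1e-6                       # finite-difference step (an assumption)

# Central difference approximation of d/dt det(c(t)) at t = 0.
derivative = (det2(line_through_identity(X, h))
              - det2(line_through_identity(X, -h))) / (2 * h)
trace_X = X[0][0] + X[1][1]

print(derivative, trace_X)     # both are approximately 5
```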

The Lie Group \(SL(n, \mathbb{R})\)

In the previous section we studied \(GL(n, \mathbb{R})\), which was shown to be an important archetype for Lie groups in our study of geometry. Indeed, the general linear group represents the space of all invertible linear transformations and gives us a much more concrete reasoning for how Lie groups represent continuous symmetry. However, an arbitrary linear transformation need not preserve important geometric information such as length, area, and volume. For example, consider the linear transformation

$$ A = 2I_n = \begin{pmatrix} 2 & \dots & 0 \\ \ & \ddots & \ \\ 0 & \dots & 2 \end{pmatrix} $$

For any vector \(\overrightarrow{v}\), we clearly have that \( \| A\overrightarrow{v} \| = 2 \| \overrightarrow{v} \| \), so our linear transformation does not preserve length. It winds up being the case that a linear transformation fails to preserve volume whenever \(|\det(A)| \neq 1\). To see this, we consider the following example: in \(\mathbb{R}^2\) our basis vectors \(e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\) and \(e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}\) span the unit square.

The unit square

Now if we apply some linear transformation \(B \in GL(2, \mathbb{R})\) to \(e_1\) and \(e_2\) so that \(v_1 = Be_1\) and \(v_2 = Be_2\), the image of these basis vectors \(v_1, v_2\) forms a parallelogram. If we represent our linear transformation \(B\) verbosely

$$ B = \begin{pmatrix} a & c \\ b & d \end{pmatrix} $$

then we can represent our basis vectors as \(v_1 = \begin{pmatrix} a \\ b \end{pmatrix}\), \(v_2 = \begin{pmatrix} c \\ d \end{pmatrix}\), and our parallelogram by

The unit square transformed under a change of basis

Through a series of cuts, it is fairly simple to show that the area of our parallelogram is equal to the area of the rectangle of lengths \(a\) and \(d\) with the rectangle of lengths \(c\) and \(b\) cut out.

A visual depiction of how we may find the area of the new parallelogram under change of basis

Therefore, we must have that the (signed) area of our parallelogram is given by \( ad - bc = \det(B)\). Hence, the area and orientation of our unit square are preserved under a linear transformation if and only if the determinant is equal to \(1\). This idea extends to higher dimensions; that is, volume and orientation are both preserved by a linear map \(B \in GL(n, \mathbb{R})\) exactly when \(\det(B) = 1\). We therefore define the special linear group, denoted \(SL(n, \mathbb{R})\), to be the set

$$ SL(n, \mathbb{R}) = \{ A \in GL(n, \mathbb{R}) \mid \det(A) = 1 \} $$
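The relationship between the determinant and the area of the transformed unit square can be checked without any cutting and pasting. The following sketch is my own illustration: it computes the signed area of the parallelogram with vertices \(0, v_1, v_1 + v_2, v_2\) using the shoelace formula (the function names and sample matrices are hypothetical choices) and compares it with \(ad - bc\).

```python
def shoelace_area(vertices):
    """Signed area of a polygon whose vertices are listed in order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def image_of_unit_square(B):
    """Vertices 0, v1, v1 + v2, v2 where v1, v2 are the columns of B."""
    (a, c), (b, d) = B
    v1, v2 = (a, b), (c, d)
    return [(0.0, 0.0), v1, (v1[0] + v2[0], v1[1] + v2[1]), v2]


def det2(B):
    """Determinant ad - bc of a 2 x 2 matrix."""
    return B[0][0] * B[1][1] - B[0][1] * B[1][0]


stretch = [[3.0, 1.0], [1.0, 2.0]]   # det = 5, area is scaled by 5
shear = [[1.0, 1.0], [0.0, 1.0]]     # det = 1, lies in SL(2, R)
flip = [[0.0, 1.0], [1.0, 0.0]]      # det = -1, reverses orientation

for B in (stretch, shear, flip):
    print(shoelace_area(image_of_unit_square(B)), det2(B))
```

The shear is a nice reminder that an element of \(SL(2, \mathbb{R})\) preserves area and orientation without having to preserve lengths.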

If we consider the map \(g : \mathbb{R}^{n \times n} \to \mathbb{R}\) given by \(g(A) = \det(A) - 1\), it is clear that \(SL(n, \mathbb{R})\) is the zero set of \(g\). This has a pushforward at the identity of \(g_{*, I}(X) = \mathrm{tr}\,(X) - 0 = \mathrm{tr}\,(X)\), so we define the set \(\mathfrak{sl}(n, \mathbb{R})\) to be the zero set

$$ \mathfrak{sl}(n, \mathbb{R}) = \{ A \in \mathbb{R}^{n \times n} \mid \mathrm{tr}\,(A) = 0 \} $$

Now given any matrix \(A \in \mathfrak{sl}(n, \mathbb{R})\), it's easy to show that \(\textrm{exp}\,(A) \in SL(n, \mathbb{R})\) since

$$ \det(\textrm{exp}\,(A)) = e^{\mathrm{tr}(A)} = e^{0} = 1 $$

by our formula from the last section. Ultimately, our notation should give away the simple fact that \(\mathfrak{sl}(n, \mathbb{R})\) is the Lie algebra associated to the Lie group \(SL(n, \mathbb{R})\). Showing that \(SL(n, \mathbb{R})\) is a subgroup of \(GL(n, \mathbb{R})\) is an easy result due to the identities \(\det(AB) = \det(A)\det(B)\) and \(\det(A^{-1}) = 1/ \det(A)\); however, showing that \(SL(n, \mathbb{R})\) is an immersed submanifold is a little bit more difficult. We will not go into the details, but the lion's share of the proof is due to the fact that \(d(\det)_A\) is surjective for \(A\) non-singular, so that \(1\) is a regular value and \(\det^{-1}(1) = SL(n, \mathbb{R})\). When put together, these facts show that \(SL(n, \mathbb{R})\) is a Lie subgroup of the Lie group \(GL(n, \mathbb{R})\). To see that \(\mathfrak{sl}(n, \mathbb{R})\) is a Lie subalgebra of the Lie algebra \(\mathfrak{gl}(n, \mathbb{R})\), one must simply use the fact that \(\textrm{tr}\,(AB) = \textrm{tr}\,(BA)\) to show that

$$ \mathrm{tr}([A, B]) = \mathrm{tr}(AB - BA) = 0 $$
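As a quick sanity check of this closure property, the sketch below computes a commutator of two traceless integer matrices and confirms that its trace vanishes. The helper names and the sample matrices are my own; integer entries are used so that the comparison is exact.

```python
def mat_mul(A, B):
    """Multiply two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(A, B):
    """The Lie bracket [A, B] = AB - BA on gl(n, R)."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))


# Two traceless matrices, i.e. elements of sl(2, R).
A = [[1, 2], [3, -1]]
B = [[0, 5], [-2, 0]]

bracket = commutator(A, B)
print(bracket)          # [[-19, 10], [4, 19]]
print(trace(bracket))   # 0, so [A, B] is again in sl(2, R)
```

In fact the trace of a commutator vanishes for any pair of square matrices, which is why the bracket of \(\mathfrak{gl}(n, \mathbb{R})\) always lands in \(\mathfrak{sl}(n, \mathbb{R})\).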

It turns out we may make a much more general statement about Lie subgroups of \(GL(n, \mathbb{R})\) which will come in handy for the following sections.

Let \(\psi : \mathfrak{gl}(n, \mathbb{R}) \to \mathfrak{gl}(n, \mathbb{R})\) be a bounded linear map such that \(\psi^{-1} = \psi\) and \(\psi(AB) = \psi(A)\psi(B)\) whenever \(AB = BA\). Then $$ G = \{ A \in GL(n, \mathbb{R}) \mid A^{-1} = \psi(A) \} $$ is a submanifold and closed Lie subgroup of \(GL(n, \mathbb{R})\), with the Lie subalgebra $$ \mathfrak{g} = \{A \in \mathfrak{gl}(n, \mathbb{R}) \mid \psi(A) = -A\} $$ of \(\mathfrak{gl}(n, \mathbb{R})\) as its Lie algebra.

The Lie Groups \(O(n)\) and \(SO(n)\)

In my previous blog post, I defined what was known as an isometry between Riemannian manifolds — in truth, isometries between Riemannian manifolds are much more complicated than regular isometries since

  1. Our inner product varies from point to point
  2. We are not actually looking at our original space (i.e. the manifold itself in this case), but instead at the tangent bundle
  3. The Riemannian metric requires a partition of unity to be made global

The reader should be happy to hear then that a linear isometry is much simpler. Given two normed vector spaces \((V, \|\cdot\|_V), (W, \|\cdot\|_W)\) and a linear map \(f: V \to W\), we call \(f\) a linear isometry if

$$ \| f(v) \|_W = \|v\|_V $$

for all \(v \in V\).

Whenever we are in an inner-product space \(V\), we are able to define a norm \(\|\cdot\| : V \to \mathbb{R}\) simply by \(\|v\|^2 = \langle v, v \rangle \). In this case, since linear maps can be represented as matrices over \(V\), we have that a linear isometry is a matrix \(A\) satisfying

$$ \langle x, y \rangle = \langle Ax, Ay\rangle $$

for all \(x, y \in V\). Now, over \(\mathbb{R}^n\) with column vectors, our inner-product \(\langle x, y \rangle\) can be given by \(x^Ty\) (where \(T\) denotes the matrix transposition). Therefore, our equation above simplifies to

$$ x^Ty = (Ax)^T(Ay) = x^T(A^TA)y $$

Since this holds for all \(x\) and \(y\), we must have \(A^TA = I_n\), that is, \(A^{-1} = A^T\). Hence, we define the orthogonal group, denoted \(O(n)\), to be the set

$$ O(n) = \{ A \in GL(n, \mathbb{R}) \mid A^{-1} = A^T \} $$

In a similar fashion, we define the special orthogonal group, denoted \(SO(n)\), to be the set

$$ SO(n) = O(n) \cap SL(n, \mathbb{R}) = \{ A \in GL(n, \mathbb{R}) \mid A^{-1} = A^T,\ \det(A) = 1 \} $$

We sometimes refer to \(SO(n)\) as the rotation group since rotations are linear maps which preserve the origin, distance, and orientation. Consider the set

$$ \mathfrak{o}(n) = \{A \in \mathfrak{gl}(n, \mathbb{R}) \mid A^T = -A\} $$

known as the set of skew-symmetric matrices. This is not a group under matrix multiplication since the product of two skew-symmetric matrices need not be skew-symmetric; however, it is easy to show that it forms a linear subspace of \(\mathfrak{gl}(n, \mathbb{R})\). When we consider the image of \(\mathfrak{o}(n)\) under the exponential map, since \(A\) and \(A^T = -A\) commute we have

$$ I = \textrm{exp}\,(0) = \textrm{exp}\,(A + A^T) = \textrm{exp}\,(A)\textrm{exp}\,(A^T) $$

so that \(\textrm{exp}\,(A^T) = (\textrm{exp}\,(A))^{-1}\). Since transposition commutes with the power series, \(\textrm{exp}\,(A^T) = (\textrm{exp}\,(A))^T\) as well. Thus, \(\mathrm{exp}\) maps skew-symmetric matrices to orthogonal matrices. Moreover, since every skew-symmetric matrix has \(0\)'s along its diagonal, it must also have a null trace. Ultimately, this tells us \(\det(\mathrm{exp}\,(A)) = e^{\textrm{tr}(A)} = e^0 = 1\) for all \(A \in \mathfrak{o}(n)\), so the map \(\mathrm{exp} : \mathfrak{o}(n) \to SO(n)\) is well-defined.

"But what about the exponential map back to the orthogonal group?" you might ask. Well since \(SO(n)\) is a Lie subgroup of \(O(n)\), we can also write \(\mathrm{exp}\, : \mathfrak{o}(n) \to O(n)\) since we know that every skew-symmetric matrix is mapped to an orthogonal matrix. However, in this case our map fails to be surjective since \(O(n)\) is actually composed of two connected components: orthogonal matrices with determinant \(1\) and orthogonal matrices with determinant \(-1\); this is due to the fact that for any orthogonal matrix \(A \in O(n)\), we have \(AA^T = I_n\), so

$$ 1 = \det(I_n) = \det(AA^T) = \det(A)\det(A^T) = \det(A)^2 $$

For this reason, you will often see \(\mathfrak{so}(n) = \mathfrak{o}(n)\) since both \(O(n)\) and \(SO(n)\) share the same Lie algebra. In addition, since we may define \(\psi(A) = A^T\) to satisfy \(\psi = \psi^{-1}\) and \(\psi(AB) = \psi(A)\psi(B)\) whenever \(AB = BA\), our theorem above tells us that, indeed, \(O(n)\) and \(SO(n)\) are Lie subgroups of \(GL(n, \mathbb{R})\) with \(\mathfrak{o}(n)\) as their Lie subalgebra of \(\mathfrak{gl}(n, \mathbb{R})\).
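To see the map \(\mathrm{exp} : \mathfrak{o}(n) \to SO(n)\) in action for \(n = 3\), here is a small numerical sketch. Everything in it is my own illustration: the exponential is a truncated power series, the helper names are hypothetical, and the skew-symmetric matrix is an arbitrary sample.

```python
def mat_mul(A, B):
    """Multiply two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=30):
    """Approximate exp(A) by the partial sum of A^k / k! for k < terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


# A sample skew-symmetric matrix: A^T = -A, so the diagonal is zero.
A = [[0.0, -0.3, 0.2],
     [0.3, 0.0, -0.5],
     [-0.2, 0.5, 0.0]]

Q = mat_exp(A)
gram = mat_mul(Q, transpose(Q))   # should be close to the identity
det_Q = det3(Q)                   # should be close to 1

print(gram)
print(det_Q)
```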

It may be helpful to the reader to consider an actual example, so consider the case where \(n = 2\). Every skew-symmetric \(2\times 2\) matrix is of the form

$$ A = \begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix} = \theta \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} $$

for some number \(\theta\). We will refer to the matrix \( \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\) as \(J\) henceforth. The reader can easily compute \(J^2 = -I_2\), so if \(A = \theta J\) is a skew-symmetric \(2\times 2\) matrix then \(A^2 = -\theta^2 I_2\). Multiplying by \(A\) again, since \(A^2\) is just the identity scaled, we get \(A^3 = -\theta^3 J\). Lastly, it is easy to see that

$$ A^4 = A^2A^2 = (-\theta^2 I_2)(-\theta^2 I_2) = \theta^4 I_2 $$

The reader may notice that our matrix \(J\) behaves very similarly to the complex number \(i\). That is,

$$ \begin{align} J \hspace{1em} &= \hspace{1em} J, \hspace{3em} i \hspace{1em} = \hspace{1em} i \\ J^2 \hspace{1em} &= \hspace{1em} -I_2, \hspace{3em} i^2 \hspace{1em} = \hspace{1em} -1 \\ J^3 \hspace{1em} &= \hspace{1em} -J, \hspace{3em} i^3 \hspace{1em} = \hspace{1em} -i \\ J^4 \hspace{1em} &= \hspace{1em} I_2, \hspace{3em} i^4 \hspace{1em} = \hspace{1em} 1 \end{align} $$

Thus, every skew-symmetric matrix \(A = \theta J \in \mathfrak{o}(2)\) corresponds to some pure imaginary number \(i\theta\). Hopefully those familiar with Euler's identity see where this is going. Just like we do in first year calculus, we expand our map \(\mathrm{exp}\)

$$ \begin{align} \mathrm{exp}\,(A) &= I_2 + \frac{\theta}{1!}J - \frac{\theta^2}{2!}I_2 - \frac{\theta^3}{3!}J + \frac{\theta^4}{4!}I_2 + \frac{\theta^5}{5!}J - \frac{\theta^6}{6!}I_2 - \frac{\theta^7}{7!}J + \dots \\&= \left( 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \dots \right)I_2 + \left( \frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \dots \right)J \\&= \cos \theta\, I_2 + \sin\theta\, J \\&= \begin{pmatrix} \cos\theta & 0 \\ 0 & \cos \theta \end{pmatrix} + \begin{pmatrix} 0 & -\sin\theta \\ \sin\theta & 0 \end{pmatrix} \\&= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \end{align} $$

Since \(\mathrm{exp}\,(A)\) is an element of \(SO(2)\), the reader can hopefully see why the special orthogonal group is referred to as "the rotation group".
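The closed form \(\mathrm{exp}(\theta J) = \cos\theta\, I_2 + \sin\theta\, J\) can be compared against a direct partial sum of the series. This is a minimal sketch under my own assumptions (the helper names, the angle \(\theta = \pi/3\), and the 30-term cutoff are all arbitrary choices).

```python
import math


def mat_mul(A, B):
    """Multiply two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=30):
    """Approximate exp(A) by the partial sum of A^k / k! for k < terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


theta = math.pi / 3
A = [[0.0, -theta], [theta, 0.0]]          # A = theta * J

series = mat_exp(A)                        # partial sum of the power series
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

# Largest entrywise gap between the series and the rotation matrix.
max_gap = max(abs(series[i][j] - rotation[i][j])
              for i in range(2) for j in range(2))
print(max_gap)                             # tiny, limited only by rounding
```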

The Lie algebra exponential map \(\mathrm{exp} : \mathfrak{o}(n) \to SO(n)\) is surjective.

By the theorem above, every rotation matrix \(A \in SO(2)\) can be represented as \(A = \mathrm{exp}\,(\theta J)\) (where \(J\) is the matrix above corresponding to the imaginary number \(i\)). Therefore, there is a clear isomorphism \(\mathrm{exp}\,(\theta J) \mapsto e^{i\theta}\), which shows that \(SO(2)\) is indeed diffeomorphic to the circle \(S^1\)!

The Lie Groups \(U(n)\) and \(SU(n)\)

Those familiar with higher-level linear algebra should know that our orthogonal matrices in \(O(n)\) only have the desired properties of preserving inner products and norms over the real numbers \(\mathbb{R}\). Indeed, multiplication over \(\mathbb{C}\) is algebraically quite different than multiplication over \(\mathbb{R}^2\). For this reason, when we consider complex matrices we don't talk about the transpose but instead the conjugate transpose \(A^*\). If \(a_{ij}\) is the entry of \(A\) in the \(i^{\mathrm{th}}\)-row, \(j^{\mathrm{th}}\)-column, then the entries of \(A^*\) are given by

$$ (A^*)_{ij} = \overline{a_{ji}} $$

For example, we have

$$ \begin{pmatrix} 3 & 3 - 2i \\ 3 + i & 2 \end{pmatrix}^* = \begin{pmatrix} 3 & 3 - i \\ 3 + 2i & 2 \end{pmatrix} $$
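Python's built-in complex numbers make this operation easy to reproduce. The function name `conjugate_transpose` below is my own; the matrix is the example from above.

```python
def conjugate_transpose(A):
    """Return A^*, whose (i, j) entry is the complex conjugate of the (j, i) entry of A."""
    rows, cols = len(A), len(A[0])
    return [[A[j][i].conjugate() for j in range(rows)] for i in range(cols)]


A = [[3 + 0j, 3 - 2j],
     [3 + 1j, 2 + 0j]]

A_star = conjugate_transpose(A)
print(A_star)

# Applying the operation twice returns the original matrix.
print(conjugate_transpose(A_star) == A)   # True
```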

As the reader may have guessed, this leads to a complex analogue of our Lie group \(O(n)\) above by essentially replacing matrix transpose \(A^T\) with conjugate transpose \(A^*\). We call this group the unitary group, denoted \(U(n)\), and define it by

$$ U(n) = \{ A \in GL(n, \mathbb{C}) \mid A^{-1} = A^* \} $$

As we have done for all other matrix groups thus far, we may also define the special unitary group, denoted \(SU(n)\), to be those elements of \(U(n)\) with determinant 1. In other words,

$$ SU(n) = U(n) \cap SL(n, \mathbb{C}) $$

Now since \( (A^*)^* = A\), we may use our theorem above to show that \(U(n)\) is a Lie subgroup of \(GL(n, \mathbb{C})\) with Lie subalgebra

$$ \mathfrak{u}(n) = \{ A \in \mathfrak{gl}(n, \mathbb{C}) \mid A^* = -A \} $$

The reader may note that \(\mathfrak{u}(n)\) is essentially \(\mathfrak{o}(n)\) with transpose replaced by conjugate transpose (these matrices are referred to as skew-Hermitian instead of skew-symmetric). However, there is one subtle difference between our Lie algebras \(\mathfrak{so}(n)\) and \(\mathfrak{su}(n)\): our groups \(O(n)\) and \(SO(n)\) shared the same Lie algebra, while \(U(n)\) and \(SU(n)\) do not. This is because every skew-symmetric matrix had null trace, and thus the exponential map carried it back to an orthogonal matrix of determinant 1 in \(SO(n)\). On the other hand, not every skew-Hermitian matrix has null trace. Consider the following example

$$ A = \begin{pmatrix} -i & 1 + i \\ -1 + i & 0 \end{pmatrix} $$ $$ -A = \begin{pmatrix} i & -1 - i \\ 1 - i & 0 \end{pmatrix} = A^* $$

This is clearly a skew-Hermitian matrix, but it has a trace of \(\mathrm{tr}\,(A) = -i\). Therefore, the Lie algebra of \(SU(n)\) must be forcibly restricted to those skew-Hermitian matrices with null trace:

$$ \mathfrak{su}(n) = \mathfrak{u}(n) \cap \mathfrak{sl}(n, \mathbb{C}) $$
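The example matrix can be pushed through the exponential map numerically to see where it lands. The sketch below is my own illustration with hypothetical helper names and a truncated power series; it checks that \(\mathrm{exp}(A)\) is unitary while its determinant is \(e^{\mathrm{tr}(A)}\), which has modulus 1 but is not equal to 1.

```python
import cmath


def mat_mul(A, B):
    """Multiply two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=30):
    """Approximate exp(A) by the partial sum of A^k / k! for k < terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def conjugate_transpose(A):
    """Return A^*, the entrywise conjugate of the transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


A = [[-1j, 1 + 1j],
     [-1 + 1j, 0j]]
minus_A = [[-x for x in row] for row in A]

is_skew_hermitian = conjugate_transpose(A) == minus_A
trace_A = A[0][0] + A[1][1]               # nonzero, so A is not in su(2)

U = mat_exp(A)
gram = mat_mul(U, conjugate_transpose(U)) # close to the identity, so U is unitary
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]

print(is_skew_hermitian, trace_A)
print(abs(det_U), det_U, cmath.exp(trace_A))
```

Since the determinant is not 1, the image lies in \(U(2)\) but outside \(SU(2)\), which is exactly why the traceless condition is needed.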

The Symplectic and Compact Symplectic Groups

This brings us to our final (and most important with regard to future blog posts) Lie subgroup of the general linear group. I could jump straight into the definitions, but I'd prefer to lay down some ground work so that the reader understands why our last Lie subgroups are important.

In classical mechanics, we define a physical system to take place on some smooth manifold \(M\). The manifold simply represents possible positions of an object, and is frequently called the configuration space. Now once we introduce dynamics to the system, we have that the paths of objects are represented as integral curves of some vector field \(X : M \to TM\). In order to solve for these integral curves, we must find some way of determining our vector field at each point. To this end, one defines the Lagrangian \(L :TM \to \mathbb{R}\) to be the kinetic energy minus the potential energy (this function is often generalized in higher level mathematics); the physical paths are then the ones that extremize the action, that is, the integral of \(L\) along the path.

For example, if we have a force field that acts centrally, given by \(m \frac{d^2x}{dt^2} = \frac{kx}{|x|^3}\) on \(M = \mathbb{R} \backslash \{ 0 \}\), our potential energy is \(k / |x|\) and thus we may write the Lagrangian explicitly:

$$ L(x, v) = \textrm{kinetic}\ \textrm{energy} \ - \ \textrm{potential}\ \textrm{energy} = \frac{1}{2}mv^2 - \frac{k}{|x|} $$

As indicated, the Lagrangian classically represents the kinetic energy minus the potential energy (as one often sees in high school level physics). In Hamiltonian mechanics, one goes a step further by defining a bundle morphism \(TM \to T^*M\) known as the Legendre transformation. The space \(T^*M\) is typically referred to as the phase space of our system, and can easily be shown to also be a smooth manifold by sending basis elements to basis elements.

For example, if we have a smooth function \(f(x, v)\) on \(TM\) then

$$ df = \left( \frac{\partial f}{\partial x} \right)\, dx + \left( \frac{\partial f}{\partial v} \right)\,dv $$

By letting \((y, p) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial v} \right) \) denote the coordinates on \(T^*M\), our product rule gives us that

$$ d(pv) = v\,dp + p\,dv $$

Subtracting the two equations, we get

$$ y\,dx + p\,dv - v\,dp - p\,dv = \frac{\partial f}{\partial x} \,dx - v\,dp = d\left( f - pv \right) $$

The function \(g(x, p) = f(x, v) - pv\) is known as the Legendre transform of \(f\); geometrically, since \(p = \frac{\partial f}{\partial v}\), in one variable this is the vertical intercept of the tangent line to \(f\) at a given point \(v\). When our smooth function \(f\) is in fact \(L\) we denote the function \(g(x,p)\) by \(H(x, p)\) and call it the Hamiltonian of our system. Since the Lagrangian is given in classical mechanics by

$$ L(x, v) = \frac{1}{2}mv^2 - P(x) $$

(where \(P(x)\) denotes the potential function) our conjugate variable for \(v\) is \(p = \frac{\partial L}{\partial v} = mv\), which many should recognize as momentum! Therefore, our Hamiltonian can be explicitly written as

$$ H(x, p) = L(x,v) - (mv)v = -\frac{1}{2}mv^2 - P(x) = - ( \textrm{kinetic}\ \textrm{energy}\ + \ \textrm{potential}\ \textrm{energy}) $$

Depending on the author, one may see the Hamiltonian defined with the opposite sign as \(pv - L\); in either case, up to sign our Hamiltonian is the total energy of our system. In other words, our Hamiltonian assigns each state of the phase space its energy!
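To make the Legendre transform concrete, here is a small numerical sketch in plain Python. It uses the convention \(H = L - pv\) from above together with the potential \(P(x) = k / |x|\) from the central force example, and compares the result against \(-\frac{1}{2}mv^2 - P(x)\). The values of `m`, `k`, `x` and `v` are arbitrary numbers I picked for illustration.

```python
# Illustrative constants (assumptions, not taken from the article).
m, k = 2.0, 3.0

def potential(x):
    """P(x) = k / |x|, the potential from the central force example."""
    return k / abs(x)

def lagrangian(x, v):
    """L(x, v) = (1/2) m v^2 - P(x)."""
    return 0.5 * m * v**2 - potential(x)

def momentum(v):
    """Conjugate variable p = dL/dv = m v."""
    return m * v

def hamiltonian(x, p):
    """H(x, p) = L(x, v) - p v, where v is recovered from p = m v."""
    v = p / m
    return lagrangian(x, v) - p * v

x, v = 1.5, 0.5
p = momentum(v)
H = hamiltonian(x, p)
expected = -0.5 * m * v**2 - potential(x)

print("p =", p)
print("H(x, p) =", H)
print("-(1/2) m v^2 - P(x) =", expected)
```

The only step that needs care in general is inverting \(p = \partial L / \partial v\) to recover \(v\); here it is just division by \(m\).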

I'll go ahead and quit beating around the bush in order to explain what a symplectic form is and get back to Lie groups as planned. Jumping straight to the punch line, we don't just want to look at scalar values of our energy, but instead want to turn the 1-form \(dH\) into a vector field; this will allow us to incorporate dynamics into our phase space \(T^*M\). To do so we need, at each point, an invertible linear map from tangent vectors to cotangent vectors, i.e. an invertible element of \(\textrm{Hom}\,(TM, T^*M)\) (here \(TM\) and \(T^*M\) stand for the tangent and cotangent bundles of whichever manifold carries the dynamics). Since \(\textrm{Hom}\,(TM, T^*M) \cong T^*M \otimes T^*M\), we are in fact looking at bilinear forms \(\omega \in T^*M \otimes T^*M\), where \(\omega\) corresponds to the map \(X \mapsto \omega(X, \cdot)\). In general, we define a symplectic form to be a section of \(T^*M \otimes T^*M\) that is

  1. Non-Degenerate: If \(X \in TM\) satisfies \(\omega(X, Y) = 0\) for all \(Y \in TM\), then \(X = 0\).
  2. Skew-Symmetric: For all \(X, Y \in TM\), we have that \(\omega(X, Y) = -\omega(Y, X)\).
  3. Closed: \(d\omega = 0\).

In physics, a symplectic form \(\omega\) lets us associate to every Hamiltonian \(H\) a vector field \(V_H\), determined by the differential \(dH\) through \(dH = \omega(V_H, \cdot)\) (i.e. \(dH(X) = \omega(V_H, X)\) for all \(X \in TM\)); non-degeneracy guarantees that \(V_H\) exists and is unique. To summarize, symplectic manifolds are the setting in which Hamiltonian dynamics takes place (whew).
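Here is a rough sketch of how the equation \(dH = \omega(V_H, \cdot)\) pins down a vector field, in plain Python. Everything specific in it is an assumption on my part: the phase space is \(\mathbb{R}^2\) with coordinates \((x, p)\), the form is \(\omega(u, w) = u^T J w\) with \(J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\), and the Hamiltonian is the harmonic oscillator \(H = \frac{p^2}{2m} + \frac{1}{2}kx^2\) written with the \(pv - L\) sign convention. Under those assumptions, solving for \(V_H\) by hand gives \(V_H = \left( \frac{\partial H}{\partial p}, -\frac{\partial H}{\partial x} \right)\), and the code checks the defining equation on a few test vectors.

```python
# Illustrative constants for the assumed harmonic oscillator.
m, k = 1.0, 4.0

def grad_H(x, p):
    """Components (dH/dx, dH/dp) of dH for H = p^2 / (2m) + k x^2 / 2."""
    return (k * x, p / m)

def omega(u, w):
    """Assumed form on R^2: omega(u, w) = u^T J w with J = [[0, 1], [-1, 0]]."""
    return u[0] * w[1] - u[1] * w[0]

def hamiltonian_vector_field(x, p):
    """V_H = (dH/dp, -dH/dx), the solution of dH = omega(V_H, .) for this omega."""
    dHdx, dHdp = grad_H(x, p)
    return (dHdp, -dHdx)

x, p = 0.5, 2.0
dH = grad_H(x, p)
V = hamiltonian_vector_field(x, p)

# Compare dH(X) with omega(V_H, X) on a few test vectors X.
pairs = []
for X in [(1.0, 0.0), (0.0, 1.0), (3.0, -2.0)]:
    dH_of_X = dH[0] * X[0] + dH[1] * X[1]
    pairs.append((dH_of_X, omega(V, X)))

print("V_H =", V)
print("(dH(X), omega(V_H, X)) pairs:", pairs)
```

Reading off the components of `V` gives \(\dot{x} = \partial H / \partial p\) and \(\dot{p} = -\partial H / \partial x\), which are Hamilton's equations in this convention.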

If we consider the simplest case, when our manifold \(M\) is Euclidean space \(\mathbb{R}^n\), then every tangent space and cotangent space of \(M\) is also \(\mathbb{R}^n\) (since Euclidean space is its own dual). Therefore our phase space is precisely \(T^*M = \mathbb{R}^n \oplus \mathbb{R}^n = \mathbb{R}^{2n}\), which tells us that our symplectic form must be expressed by some skew-symmetric, non-degenerate \(2\)-form on \(\mathbb{R}^{2n}\). The standard symplectic form on \(\mathbb{R}^{2n}\) is represented by the \(2n \times 2n\) matrix

$$ J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix} $$

where \(I_n\) denotes the \(n \times n\) identity matrix. In fact, this matrix should look quite similar to the matrix \(J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\) described earlier in the section The Lie Groups \(O(n)\) and \(SO(n)\). The standard symplectic form \(\Omega\) on \(\mathbb{R}^{2n}\) then satisfies

$$ \Omega(x, y) = \langle -Jx, y \rangle = (-Jx)^Ty = x^TJy $$

by skew-symmetry of \(J\) (that is, \(J^T = -J\)). If we have some matrix \(A \in GL(2n, \mathbb{C})\) that satisfies \(A^TJA = J\), then

$$ \Omega(x, y) = x^TJy = x^T(A^TJA)y = (-JAx)^T(Ay) = \Omega(Ax, Ay) $$

In other words, matrices which satisfy \(J = A^TJA\) are exactly those linear maps which preserve symplectic structure! If we rewrite \(J = A^TJA\) as \(A^{-1} = J^{-1}A^TJ\), then it is an easy exercise to show that the function \(\psi(A) = J^{-1}A^TJ\) satisfies \(\psi^{-1} = \psi\) and \(\psi(AB) = \psi(A)\psi(B)\) whenever \(AB = BA\). Again using our theorem above, we see that the set

$$ Sp(n, \mathbb{C}) = \{ A \in GL(2n, \mathbb{C}) \mid A^{-1} = J^{-1}A^TJ \} $$

is a Lie subgroup of \(GL(2n, \mathbb{C})\) with Lie subalgebra

$$ \mathfrak{sp}(n, \mathbb{C}) = \{ A \in \mathfrak{gl}(2n, \mathbb{C}) \mid -A = J^{-1}A^TJ \} $$

of \(\mathfrak{gl}(2n, \mathbb{C})\). The Lie group \(Sp(n, \mathbb{C})\) is known as the symplectic group and physically represents transformations which preserve Hamilton's equations in classical mechanics.
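Both defining conditions are easy to test on small examples. The sketch below, in plain Python with integer matrices so that equality is exact, checks \(A^TJA = J\) for the group and the equivalent form \(X^TJ + JX = 0\) of \(-X = J^{-1}X^TJ\) for the Lie algebra. The particular matrices `A`, `D` and `X` and the vectors `x`, `y` are examples I chose for illustration; with floating point entries one would compare up to a tolerance instead.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def apply(a, x):
    """Matrix-vector product a x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def standard_J(n):
    """The 2n x 2n block matrix [[0, I_n], [-I_n, 0]]."""
    J = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        J[i][n + i] = 1
        J[n + i][i] = -1
    return J

def is_symplectic(A):
    """Group condition: A^T J A = J."""
    J = standard_J(len(A) // 2)
    return matmul(transpose(A), matmul(J, A)) == J

def in_symplectic_algebra(X):
    """Lie algebra condition -X = J^{-1} X^T J, rewritten as X^T J + J X = 0."""
    J = standard_J(len(X) // 2)
    XtJ, JX = matmul(transpose(X), J), matmul(J, X)
    size = len(X)
    return all(XtJ[i][j] + JX[i][j] == 0 for i in range(size) for j in range(size))

def omega(x, y):
    """Standard form Omega(x, y) = x^T J y."""
    Jy = apply(standard_J(len(x) // 2), y)
    return sum(x[i] * Jy[i] for i in range(len(x)))

A = [[2, 1], [1, 1]]    # determinant 1, satisfies A^T J A = J
D = [[2, 0], [0, 1]]    # determinant 2, fails the condition
X = [[1, 2], [3, -1]]   # trace zero, satisfies the Lie algebra condition
x, y = [1, 2], [3, -1]

print(is_symplectic(A), is_symplectic(D), in_symplectic_algebra(X))
print(omega(x, y), omega(apply(A, x), apply(A, y)))
```

For \(n = 1\) the condition \(A^TJA = J\) reduces to \(\det A = 1\), which is why `A` passes and `D` does not.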

We additionally define the Lie group \(Sp(n)\) to be the set

$$ Sp(n) = Sp(n, \mathbb{C}) \cap U(2n) = \{ A \in U(2n) \mid A^* = J^{-1}A^TJ \} $$

more commonly known as the compact symplectic group. Before you say it, I know what you're thinking: the notation used is incredibly confusing between the two sets, but I'm not the one that came up with it. Unfortunately, I've dragged this article on long enough, so I won't go into the details regarding why \(Sp(n)\) is in fact a smooth manifold and thus a Lie group. Additionally, we will leave it to a future article to show that \(Sp(n)\) is in fact the hyperunitary group \(U(n, \mathbf{H})\) over the quaternions.
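As a quick sanity check of the definition for \(n = 1\), here is a sketch in plain Python. It relies on a fact I'm assuming rather than proving here: every matrix of the form \(\begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}\) with \(|a|^2 + |b|^2 = 1\) is unitary and satisfies \(A^* = J^{-1}A^TJ\). The helper names and the sample values of `a` and `b` are illustrative, and comparisons use a small tolerance since the entries are floating point.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def conj_transpose(a):
    return [[entry.conjugate() for entry in row] for row in transpose(a)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(len(a)) for j in range(len(a[0])))

J = [[0, 1], [-1, 0]]
J_inv = transpose(J)      # for the standard J, the inverse equals the transpose
I2 = [[1, 0], [0, 1]]

def in_compact_symplectic(A):
    """Check A in U(2) and A* = J^{-1} A^T J (the n = 1 case only)."""
    unitary = close(matmul(conj_transpose(A), A), I2)
    symplectic = close(conj_transpose(A), matmul(J_inv, matmul(transpose(A), J)))
    return unitary and symplectic

a, b = 0.6 + 0j, 0.8j     # sample values with |a|^2 + |b|^2 = 1
A = [[a, b], [-b.conjugate(), a.conjugate()]]

# Symplectic (determinant 1) but not unitary, so it should be rejected.
S = [[2 + 0j, 0j], [0j, 0.5 + 0j]]

print(in_compact_symplectic(A), in_compact_symplectic(S))
```

The matrix `S` lies in \(Sp(1, \mathbb{C})\) but not in \(U(2)\), which illustrates why the intersection in the definition is a genuine restriction.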

If you've made it this far, I'd like to express my thanks — my articles never turn out as short as I hope and generally take quite a bit of effort to assemble. However, the fact that readers are as eager about these subjects as I am always lifts spirits on my end. Cheers! 😁