The Geometric Viewpoint

geometric and topological excursions by and for undergraduates



Topology & Infinite-Dimensional Linear Algebra

For the student wishing to see interplay between the three major branches of mathematics (analysis, algebra, topology), Hilbert Space is a great place to explore!  Hilbert Space is a tool that gives us the ability to do linear algebra in infinite dimensions.  The very fact that infinity is involved should tell us that we will need analysis, and wherever there’s analysis, there’s also topology.  Oftentimes, interplay between analysis, algebra, and topology is not glimpsed at the undergraduate level; such connections are designated as “grad school material”.  Hilbert Space will offer us a chance to see these connections at work.

Rather than give a host of definitions that define Hilbert Space and then give an example, it will perhaps be useful to work in the reverse order.  Consider the set of all complex-valued sequences.  An element of this set might look like this: (4 + i, 3 - i, 1 - 7i, 5, 0, 5 - 9i, 2i,\ldots).  We look at a special subset:  let \ell_2 be the subset consisting of sequences that are square summable; that is, the sequences (x_i) satisfying

\sum\limits_{i=1}^\infty {\vert x_{i} \vert}^2 < \infty.

It shouldn’t be entirely clear why we are interested in sequences satisfying this seemingly arbitrary condition, but shortly we will see its importance.  Notice the similarity between the dot product and the infinite sum on the left – the sum looks a lot like the dot product of a vector in \mathbb{C}^n with itself.  The set \ell_2 is an example of Hilbert Space; it is just the natural extension of \mathbb{C}^n.  We will work a lot with \ell_2, but first let’s make sure we really understand this space.  We set out on defining Hilbert Space – a fairly tall order as we shall see!
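To get a feel for the square summable condition, here is a small numerical sketch in Python. The two sequences (x_i = 1/i and x_i = 1/\sqrt{i}) and the helper name `partial_sum_of_squares` are my own choices for illustration, not part of the discussion above. The first sequence is square summable (its partial sums stay below \pi^2/6), while the second is not (its partial sums are the harmonic numbers, which grow without bound).

```python
import math


def partial_sum_of_squares(term, n_terms):
    """Sum |x_i|^2 for i = 1..n_terms, where term(i) returns the i-th entry."""
    return sum(abs(term(i)) ** 2 for i in range(1, n_terms + 1))


def harmonic_term(i):
    """x_i = 1/i, a square summable sequence."""
    return 1 / i


def slow_term(i):
    """x_i = 1/sqrt(i), which is NOT square summable."""
    return 1 / math.sqrt(i)


for n in (10, 1_000, 100_000):
    print(n, partial_sum_of_squares(harmonic_term, n), partial_sum_of_squares(slow_term, n))
```

Of course, a finite computation can only suggest convergence or divergence; it proves nothing on its own.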

David Hilbert first introduced the concept of Hilbert Space


This may sound intimidating; it shouldn’t be. A Hilbert Space is just a very special type of vector space. Recall from linear algebra that a (real or complex) vector space V is a set that is closed under addition and scalar multiplication (by real or complex numbers). We call a subset B of V a basis if V = span(B) and if B is linearly independent. In this case we define the dimension of V by saying dim(V) = \vert B \vert . Notice that there is nothing about this definition which requires B to be a finite set. Indeed, while finite dimensional vector spaces are the primary object of consideration in linear algebra, so-called infinite dimensional vector spaces are the central object in a subject called operator theory, and Hilbert Space is to operator theory what \mathbb{R}^n and \mathbb{C}^n are to linear algebra. We need a few preliminary definitions in order to define a Hilbert Space. We will work over \mathbb{C} (it is no more difficult to do so than to work over \mathbb{R}). We first define an inner product on a vector space V.  An inner product is just a generalization of the dot product on \mathbb{R}^n or \mathbb{C}^n.  Recall the importance of the dot product: it gives us a notion of length, angle, and orthogonality.  So an inner product on an arbitrary vector space is a way of giving the space some geometry. An inner product is denoted \langle \cdot , \cdot \rangle, and we replace the dots with vectors to indicate that we’re taking the inner product of those two vectors.  Of course, there is a more rigorous, axiomatic definition.  For thoroughness, we state this definition, but it can be safely ignored without loss of understanding later on.  An inner product on a vector space V is a function from V \times V to \mathbb{C} that satisfies the following four rules:

  1. \langle a + b, c \rangle = \langle a , c \rangle + \langle b , c \rangle for every a, b, c \in V.
  2. \langle \lambda a , b \rangle = \lambda \langle a , b \rangle for every a, b \in V, \lambda \in \mathbb{C}.
  3. \langle a , b \rangle = \overline{\langle b , a \rangle} for every a, b \in V .
  4. \langle a , a \rangle is real and greater than 0 if a \neq 0.

Note that if we were working over \mathbb{R}, property (3) would just say that the inner product is symmetric.  We call a vector space with an associated inner product an inner product space. Recall that the definition of the dot product on \mathbb{C}^n is x \cdot y = \sum\limits_{i=1}^n x_{i}\overline{y_{i}}, where x_{i} and y_{i} are the components of x and y, respectively. The dot product satisfies all the properties above, and so it is an inner product on \mathbb{C}^n.  Once one has verified that the dot product on \mathbb{C}^n is an inner product, it is not too hard to convince oneself that the extension of the dot product to \ell_2 is an inner product as well.  We define an inner product on \ell_2 by

\langle x , y \rangle = \sum\limits_{i=1}^\infty x_i \overline{y_i}
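As a quick sanity check of this formula, here is a minimal Python sketch. It assumes we only feed it sequences with finitely many nonzero entries, stored as lists with the trailing zeroes left off; the function name `inner` and the sample vectors are hypothetical choices of mine.

```python
def inner(x, y):
    """<x, y> = sum of x_i * conj(y_i).

    Sequences are lists with implicit trailing zeroes, so stopping at the
    shorter list (which is what zip does) loses nothing.
    """
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


x = [4 + 1j, 3 - 1j]
y = [1j, 2]

print(inner(x, y))  # <x, y>
print(inner(y, x))  # the complex conjugate of <x, y>, as rule (3) demands
print(inner(x, x))  # real and positive, as rule (4) demands
```

Working it by hand: \langle x , y \rangle = (4+i)(-i) + (3-i)(2) = 7 - 6i, and \langle x , x \rangle = 17 + 10 = 27.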


An example of a norm, just the usual distance function on the plane


The square summable condition we imposed on \ell_2 suddenly makes sense.  If we tried to compute the above inner product on sequences that are not square summable, we might end up with a divergent series on the right side of the equation – and we don’t want that! We define the norm of an element v in an inner product space V to be \Vert v \Vert = \langle v , v \rangle^{\frac{1}{2}}. We will denote the norm on \ell_2 by {\Vert \cdot \Vert}_2.  Notice that if one applies this definition to \mathbb{R}^n, the norm of a point is just its distance from the origin.  So we think of a norm as a function that assigns lengths to vectors in our vector space.  In general, a norm is any function \Vert \cdot \Vert: V \to \mathbb{R} that satisfies the following three axioms:

  1. \Vert x \Vert \geq 0 for all x \in V, with equality if and only if x = 0.
  2. \Vert \lambda x \Vert = \vert \lambda \vert \Vert x \Vert for all x \in V, \lambda \in \mathbb{C}.
  3. \Vert x + y \Vert \leq \Vert x \Vert + \Vert y \Vert for all x,y \in V.

One can verify that any inner product induces a norm. Although we defined the norm in terms of an inner product, we say that any function satisfying (1), (2), and (3) is a norm, whether or not it is given in terms of an inner product. So, any inner product defines a norm, but not every norm is given by an inner product. For example, it is impossible to define an inner product on \mathbb{R}^2 such that the induced norm is \Vert (x,y) \Vert = max\{ \vert x \vert , \vert y \vert \} .
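One standard way to see why the max norm cannot come from an inner product (a tool not introduced above, so treat it as an aside) is the parallelogram law: every norm induced by an inner product satisfies \Vert x + y \Vert^2 + \Vert x - y \Vert^2 = 2\Vert x \Vert^2 + 2\Vert y \Vert^2. The Python sketch below, with function names of my own choosing, checks this identity on the vectors (1,0) and (0,1) for both the usual norm and the max norm.

```python
def max_norm(v):
    """The norm ||(x, y)|| = max{|x|, |y|}."""
    return max(abs(c) for c in v)


def euclid_norm(v):
    """The usual norm, induced by the dot product."""
    return sum(abs(c) ** 2 for c in v) ** 0.5


def parallelogram_gap(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2.

    This is zero whenever the norm comes from an inner product.
    """
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2


x, y = (1, 0), (0, 1)
print(parallelogram_gap(euclid_norm, x, y))  # zero up to floating-point rounding
print(parallelogram_gap(max_norm, x, y))     # nonzero, so no inner product induces it
```

For the max norm the left side is 1 + 1 = 2 while the right side is 2 + 2 = 4, so the identity fails.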


Sequences of rational numbers can “converge” to irrational numbers, so the rationals are not complete

We need one more definition before we can define a Hilbert Space. We need the concept of completeness. This is a fundamental property of the real numbers – completeness is what allows us to do real analysis. Essentially a space is complete if there are no “gaps” in it. For example, \mathbb{Q} is not complete because the sequence 3, 3.1, 3.14, 3.141, 3.1415, \ldots should converge, but it doesn’t (in \mathbb{Q}). Such a gap does not exist in \mathbb{R}, so we say the reals are complete.

We are now in a position to define a Hilbert Space: a Hilbert Space is a vector space equipped with an inner product that is complete with respect to the norm induced by that inner product. A similar structure is a Banach Space, which is a complete vector space equipped with a norm. So any Hilbert Space is a Banach Space, but the converse is not true. We can immediately get our hands on some Hilbert Spaces: \mathbb{R}^n and \mathbb{C}^n are both finite-dimensional Hilbert Spaces. These are not particularly interesting Hilbert Spaces because they are finite-dimensional. But we are also ready to consider an infinite-dimensional Hilbert Space.  As we stated before, \ell_2 is a Hilbert Space.  It is not difficult to show that \ell_2 is a vector space, and we’ve already defined an inner product on it.  Showing that \ell_2 is complete does take a bit of work, but it’s doable.

We can also readily see that \ell_2 has no finite basis.  Indeed, the sequences e_{i} = (0, 0,\ldots, 1, 0, 0,\ldots), where the 1 appears in the i^{th} entry, form an infinite linearly independent set.  A word of caution: the e_{i} are not a basis in the purely algebraic sense defined above, since finite linear combinations of them only produce sequences with finitely many nonzero entries.  Instead, every element of \ell_2 is a limit of such finite combinations, and the e_{i} are mutually orthogonal unit vectors; a collection like this is called an orthonormal basis, and it is the kind of basis one uses in Hilbert Space.  Of course, there are many other examples of Hilbert Spaces, but a somewhat remarkable fact is that every infinite-dimensional Hilbert Space that has a countable (indexed by \mathbb{N}) orthonormal basis is isomorphic to \ell_2! For this reason, mathematicians sometimes refer to “the” Hilbert Space, as if there is only one. The upshot is that we can work exclusively in \ell_2 without sacrificing the generality obtained by referring to a general Hilbert Space.
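The “gap” in \mathbb{Q} can be seen with exact rational arithmetic. The sketch below swaps in a different example from the decimal expansion used above: it runs Newton's iteration x \mapsto (x + 2/x)/2 starting at 1, which is my own choice of illustration. Every term is a rational number and the squares approach 2, yet no rational number squares to exactly 2, so the sequence has nowhere to land inside \mathbb{Q}.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Rational approximations to sqrt(2) via x -> (x + 2/x) / 2, starting at 1."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms


approximations = newton_sqrt2(5)
for q in approximations:
    # each q is an exact fraction; q*q - 2 shrinks quickly but never hits 0
    print(q, float(q * q - 2))
```

The first few terms are 1, 3/2, 17/12, 577/408, and the error q^2 - 2 is never zero.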

A rotation about the origin is a linear operator on the plane


Since \ell_2 is a vector space, the natural thing to do is think about linear transformations of the space.  We define a linear operator on \ell_2 in the same way a linear transformation is defined in linear algebra. A function T: \ell_2 \to \ell_2 is a linear operator if


  1. T(x+y) = T(x) + T(y) for every x,y \in \ell_2.
  2. T(\lambda x) = \lambda T(x) for every x \in \ell_2 and \lambda \in \mathbb{C}.

It should be noted that not everything one may have learned about linear transformations in linear algebra is true for linear operators on \ell_2. For example, consider the shift operators S and S^* on \ell_2 defined by S(x_1,x_2,x_3,\ldots) = (0,x_1,x_2,\ldots) and S^*(x_1,x_2,x_3,\ldots) = (x_2, x_3,x_4,\ldots). It is easily verified that these are both linear operators, and that S is injective but not surjective, S^* is surjective but not injective, and S^*S = I but SS^* \neq I. In linear algebra, one learns that for a linear transformation from a finite-dimensional space to itself, injectivity, surjectivity, and invertibility are all equivalent, and a one-sided inverse is automatically a two-sided inverse; in Hilbert Space this is not the case.  An important part of operator theory is determining what kinds of operators on \ell_2 behave like linear transformations on a finite-dimensional vector space.
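The behaviour of the two shifts is easy to play with on a computer. In the Python sketch below, a sequence with finitely many nonzero entries is stored as a list with the trailing zeroes left off; that representation, and the function names `shift` and `backward_shift`, are assumptions of this sketch.

```python
def shift(x):
    """S: (x1, x2, x3, ...) -> (0, x1, x2, ...)."""
    return [0] + list(x)


def backward_shift(x):
    """S*: (x1, x2, x3, ...) -> (x2, x3, x4, ...)."""
    return list(x[1:])


x = [5, 7, 11]
print(backward_shift(shift(x)))  # S*S x gives back x
print(shift(backward_shift(x)))  # SS* x has lost the first entry of x
```

Here S^*S returns [5, 7, 11] unchanged, while SS^* returns [0, 7, 11]: the first coordinate is destroyed by S^* and S cannot restore it.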

We call a linear operator T on \ell_2 bounded if there is a constant M such that {\Vert T(x) \Vert}_2 \leq M for every x in the unit ball \mathbb{B} = \{ x \in \ell_2 : {\Vert x \Vert}_2 \leq 1 \}. The norm {\Vert T \Vert}_{op} of a bounded linear operator is defined to be the smallest M that works in the preceding definition. Equivalently, {\Vert T \Vert}_{op} is the supremum of {\Vert T(x) \Vert}_2, where x ranges over the unit ball in \ell_2. An interesting fact about linear operators on \ell_2 is that they are continuous if and only if they are bounded (an exercise!). We define B(\ell_2) to be the set of all bounded (continuous) linear operators on \ell_2.

B(\ell_2) is an interesting space in and of itself: equipped with the norm defined in the preceding paragraph, B(\ell_2) is a Banach Space (a complete normed vector space) with respect to pointwise addition and multiplication by complex scalars. We have seen a little bit of analysis and algebra, so it’s time to introduce some topology to the mix. Our goal is to define and understand some topologies on B(\ell_2). This should be a bit surprising. After all, what does it mean for a subset of linear operators to be “open”? There are many topologies that can be put on this set, but we will consider the three most common ones: the norm topology, the strong operator topology, and the weak operator topology.

In describing a topology on a space, it is difficult to pin down exactly what the topology is; this is because in most interesting spaces, there are so many open sets that it’s impossible to list all of them. Instead we can define a topology by describing what properties our set has when equipped with this topology. For example, we might say that a topology on a set A is “the smallest topology such that our space A has property X”. We can also define a topology in terms of a base which, roughly speaking, is a collection of open sets that generates the rest of the open sets (by taking unions).

Before we embark on defining different topologies on B(\ell_2), let’s stop and think about why we may need different topologies on the same set, and whether it is of real importance. A question one may ask is, “What properties of B(\ell_2) will change when the topology changes?” As we will see shortly, the definition of convergent sequence changes drastically. It may not seem obvious that changing the topology will have a big effect on convergence of sequences; after all, the definition of convergence that one meets in a real analysis course does not explicitly mention open sets! But let’s examine this a little closer. In real analysis, convergence is usually defined with respect to a metric. In more general topological spaces, there may be no metric (although in Hilbert Space the metric is induced by the norm), so the definition from real analysis may not be applicable. Nevertheless, we can define convergence in terms of open sets as follows (and this definition works in every topological space, including metric spaces):

Take an open interval about 1 on the vertical axis and eventually, all but finitely many points are in this interval

Let (x_n) be a sequence in a topological space X.  We say x_n \to x if for every open set U \subset X containing x, we have x_n \in U for sufficiently large n. It is clear, now, that convergence depends on the definition of open set. As we introduce new topologies on B(\ell_2), we can describe what convergence “means” with respect to this new topology.

The norm topology on B(\ell_2) is the topology induced by the operator norm. To explain what this means, let us consider that any normed space has a corresponding topology induced by its norm. Think about \mathbb{R} for example. The norm of a point in \mathbb{R} is just its absolute value. Think about the subsets of \mathbb{R} defined by O(x,\epsilon) = \{ y \in \mathbb{R} : \vert x - y \vert < \epsilon \}, where x \in \mathbb{R} and \epsilon is a positive real number. Each set is an open interval centered at x and of radius \epsilon. The open sets in \mathbb{R} are open intervals and unions of open intervals, so the collection \{O(x,\epsilon) : x \in \mathbb{R}, \epsilon > 0 \} is a base for the usual topology on \mathbb{R}.

The open set O(1.5, 1.5)

Now we return to our set B(\ell_2), where the norm topology will be defined analogously. The collection of all subsets of B(\ell_2) of the form O(T, \epsilon) = \{ S \in B(\ell_2) : {\Vert S - T \Vert}_{op} < \epsilon \} is a base for the norm topology on B(\ell_2).  Remember that a norm is a way of defining length or distance in our space.  So the set O(T, \epsilon) is the set of all operators that have distance from T less than \epsilon.  It probably seems odd to think about the distance between two functions – this is why we need careful and precise definitions of norms and, in particular, the operator norm.  The norm topology is a very important topology on B(\ell_2) indeed – it is the topology which makes B(\ell_2) a Banach Space.

Before looking at any special properties of the norm topology, we introduce the next topology on B(\ell_2) because the interesting thing to do is to compare the different topologies. Next we consider the Strong Operator Topology (SOT). The SOT is defined to be the smallest topology containing all sets of the form U(T,x,\epsilon) = \{ S \in B(\ell_2) : {\Vert S(x) - T(x) \Vert}_2 < \epsilon \} where T is any bounded linear operator, x is any element of \ell_2, and \epsilon is any positive real number. Equivalently, SOT is the smallest topology on B(\ell_2) such that the evaluation maps T \mapsto T(x) are continuous for every choice of x \in \ell_2. A word of caution: The sets U(T,x,\epsilon) may seem very similar to the sets we defined in the previous paragraph, but they’re not. Notice that we use the \ell_2 norm here, and the operator norm in the preceding paragraph. It is important to keep track of which norm function is being used and what quantity is inside the norm – of course, it wouldn’t make sense to take the operator norm of the quantity S(x) - T(x)! We can describe some of the properties of the SOT. The SOT is (somewhat paradoxically) weaker than the norm topology; that is, there are more open sets in the norm topology than there are in the SOT. The SOT is perhaps a more natural choice of topology on B(\ell_2). To explain this, I first pose a question: what does it mean for a sequence of bounded linear operators (T_n) to converge to an operator T? Well, it depends on which topology you use! In the SOT, the sequence (T_n) converges to T if for every x \in \ell_2, the sequence (T_n(x)) converges to T(x). That is, convergence in SOT means that a sequence of operators converges pointwise to some operator. This indeed seems like a very natural definition of convergence. Contrast this with convergence in the norm topology: the sequence (T_n) converges to T in the norm topology if {\Vert T - T_n \Vert}_{op} \to 0.
In a way, this definition of convergence seems more complicated and less natural than convergence in SOT.  This already is an advantage of using SOT instead of the norm topology. We mentioned that SOT is weaker than the norm topology. It isn’t too difficult to show that convergence in the norm topology implies convergence in SOT (for the reader who wants a challenge: assume {\Vert T_n - T \Vert}_{op} \to 0. Pick x \in \ell_2 and show T_n(x) \to T(x)). Using the given definitions of convergence in each topology, we can show that the converse is not true. Consider the following sequence of operators (P_n) on \ell_2:

P_n(x) = \sum\limits_{i=1}^n \langle x , e_i \rangle e_i,

where \{e_i\} is the basis we defined earlier. The operators P_n are called projections, because they take an input x and project x onto the linear span of the first n basis vectors. It is not hard to see that as n \to \infty, we have P_n(x) \to x, i.e. P_n \to I in SOT (I is the identity operator). However, the claim is that P_n \nrightarrow I in the norm topology.  We will just sketch a proof here.  Let m,n \in \mathbb{N} with m > n.  Then find a lower bound on {\Vert P_m - P_n \Vert}_{op} (a bound of 1 is easy to obtain by picking the unit vector that has a 1 in the m^{th} entry and else all zeroes).  What this tells us is that every element of the sequence is at least distance 1 from every other element of the sequence, and clearly no sequence with this property can converge.
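Both halves of this argument can be illustrated numerically. The Python sketch below makes several assumptions of its own: sequences are cut off at a fixed length N (a stand-in for infinitely many entries), the test vector is x = (1, 1/2, 1/3, \ldots), and the helper names are hypothetical. It uses the fact that P_n simply keeps the first n coordinates of its input and replaces the rest with zeroes.

```python
import math

N = 2000  # truncation length: an assumption of this sketch, not part of the theory


def project(n, x):
    """P_n: keep the first n coordinates, zero out the rest."""
    return [c if i < n else 0.0 for i, c in enumerate(x)]


def l2_norm(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))


def diff(x, y):
    return [a - b for a, b in zip(x, y)]


x = [1 / k for k in range(1, N + 1)]  # truncation of (1, 1/2, 1/3, ...)

# Pointwise (SOT) behaviour: the distance from P_n(x) to x shrinks as n grows.
for n in (10, 100, 1000):
    print(n, l2_norm(diff(project(n, x), x)))

# Operator-norm behaviour: (P_m - P_n) applied to e_m has length 1 when m > n.
m, n = 50, 20
e_m = [1.0 if i == m - 1 else 0.0 for i in range(N)]
print(l2_norm(diff(project(m, e_m), project(n, e_m))))
```

The first loop shows the tail of x becoming small, while the last line shows a unit vector that P_m - P_n does not shrink at all, which is exactly the lower bound of 1 on the operator norm.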

The third and final topology we introduce on the space B(\ell_2) is the Weak Operator Topology (WOT). The WOT is the smallest topology on B(\ell_2) containing all sets of the form U(T,x,y,\epsilon) = \{ S \in B(\ell_2) : \vert \langle T(x) - S(x) , y \rangle \vert < \epsilon \}, where T \in B(\ell_2), x, y \in \ell_2 and \epsilon is a positive real number. Equivalently, the WOT is the smallest topology such that the map (called a linear functional) T \mapsto \langle T(x) , y \rangle is continuous for any choice of x, y \in \ell_2.  So a sequence of operators (T_n) converges to an operator T in WOT if \langle T_n(x) , y \rangle \to \langle T(x) , y \rangle for every choice of x, y \in \ell_2.   In order for the names given to these topologies to make sense, we had better hope WOT is weaker than SOT. Happily, this is indeed the case. Again, it is not hard to prove that SOT convergence implies WOT convergence (the only part that may be difficult is to show that an inner product is continuous). Again we can show the converse does not hold. Recall the definition of the shift operator S on \ell_2. Define S_n by composing S with itself n times. That is, S_n(x_1, x_2, x_3,\ldots) = (0, 0,\ldots,0, x_1, x_2, x_3,\ldots) where the sequence on the right starts with n zeroes. First we can show that S_n \nrightarrow 0 in SOT. Consider x = (1,0,0,\ldots). Then the sequence (S_n(x)) does not converge to the 0 sequence. To see this, take any m,n \in \mathbb{N} with m \neq n. Then S_m(x) - S_n(x) is a sequence with 1 in the (m+1)^{th} entry and -1 in the (n+1)^{th} entry, hence we have {\Vert S_m(x) - S_n(x) \Vert}^2_2 = 2. Since this is true for any choice of m and n, the distance between any two distinct points in the sequence (S_n(x)) is \sqrt{2}.  So the sequence does not converge to any limit, including 0. It is harder (but not too difficult) to show that S_n \to 0 in WOT. The reader who wishes to provide a proof of this statement may want to read about linear functionals and the Riesz Representation Theorem.
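A numerical experiment makes the contrast between SOT and WOT visible, though it is no substitute for a proof. In the Python sketch below, the vector y = (1, 1/2, 1/3, \ldots), its cutoff at 1000 entries, and all function names are assumptions made for illustration. With x = (1,0,0,\ldots), the length of S_n(x) stays at 1 for every n, while the inner product \langle S_n(x) , y \rangle just picks out the (n+1)^{th} entry of y, which tends to 0.

```python
import math


def shift_power(n, x):
    """S_n: prepend n zeroes to a sequence stored without its trailing zeroes."""
    return [0.0] * n + list(x)


def inner(x, y):
    """<x, y> = sum of x_i * conj(y_i); missing entries are treated as 0."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def l2_norm(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))


e1 = [1.0]                           # x = (1, 0, 0, ...)
y = [1 / k for k in range(1, 1001)]  # first 1000 entries of (1, 1/2, 1/3, ...)

for n in (1, 10, 100):
    sn_x = shift_power(n, e1)
    # the norm stays at 1 (no SOT convergence to 0); the inner product shrinks
    print(n, l2_norm(sn_x), abs(inner(sn_x, y)))
```

Because S_n(x) has only one nonzero entry, cutting y off at 1000 entries does not change these inner products as long as n is below 1000.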


Of course, there is much more to say about Hilbert Space (and even about B(\ell_2)) than I could fit into this post.  Hilbert Space could easily be the sole focus of a semester- or even year-long course.  In his Mathematics: A Very Short Introduction, mathematician Timothy Gowers wrote, “The notion of a Hilbert Space sheds light on so much of modern mathematics, from number theory to quantum mechanics, that if you do not know at least the rudiments of Hilbert Space theory then you cannot claim to be a well-educated mathematician.”  Hopefully the reader leaves with an appreciation for the fact that Hilbert Space is a (relatively) easy space to understand and that algebra, analysis, and topology are all lurking around in Hilbert Space.

The author Dan Medici was a student in Scott Taylor’s Fall 2014 Topology class at Colby College.
