Arithmetic of Quaternion Algebra 2012 - Wai Kiu Chan


Arithmetic of Quaternion Algebra 2012

1 Quaternion Algebras

In this section, $F$ is a field of characteristic $\neq 2$. Unless stated otherwise, all algebras considered here are finite dimensional algebras over $F$. If $1_A$ (or simply $1$) is the identity of an $F$-algebra $A$, then the map $\alpha \mapsto \alpha 1_A$ is a monomorphism of $F$-algebras. This map identifies $F$ as a subalgebra of $A$.

If R is a ring, then R× denotes the group of units in R.

1.1 Basic Definitions

Definition 1.1 A quaternion algebra $H$ over $F$ is a 4-dimensional algebra over $F$ with a basis $\{1, i, j, k\}$ such that

\[ i^2 = a, \quad j^2 = b, \quad ij = k = -ji \]

for some $a, b \in F^\times$.

In this definition, notice that $k^2 = -ab$. The basis $\{1, i, j, k\}$ is called a standard basis for $H$ and we write $H = \left(\frac{a,b}{F}\right)$. Note that there are infinitely many standard bases for $H$, and hence there could be another pair of nonzero elements $c, d \in F$, different from the pair $a, b$, such that $\left(\frac{c,d}{F}\right) = \left(\frac{a,b}{F}\right)$. For instance, $\left(\frac{a,b}{F}\right) = \left(\frac{ax^2,\, by^2}{F}\right)$ for any $x, y \in F^\times$, and $\left(\frac{a,b}{F}\right) = \left(\frac{a,-ab}{F}\right)$.

The notation $H = \left(\frac{a,b}{F}\right)$ is functorial in $F$, that is, if $K$ is a field extension of $F$, then

\[ \left(\frac{a,b}{F}\right) \otimes_F K \cong \left(\frac{a,b}{K}\right) \]

as $K$-algebras.

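Example 1.2 In $M_2(F)$, let

\[ i = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad j = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad k = ij = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]

Then $i^2 = j^2 = 1$ and $\{1, i, j, k\}$ is a basis for $M_2(F)$. Therefore, $M_2(F) = \left(\frac{1,1}{F}\right) = \left(\frac{1,-1}{F}\right)$.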

Example 1.3 Another familiar example of quaternion algebras is Hamilton's quaternions $\mathbb{H}$. It is a quaternion algebra over $\mathbb{R}$ with a basis $\{1, i, j, k\}$ such that

\[ i^2 = -1, \quad j^2 = -1, \quad ij = k = -ji. \]

This shows that $\mathbb{H} = \left(\frac{-1,-1}{\mathbb{R}}\right)$. A simple calculation shows that any two elements from $\{i, j, k\}$ are anti-commutative. Moreover, $ij = k$, $jk = i$ and $ki = j$.
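The multiplication rules of Definition 1.1 can be checked mechanically. The following is a minimal sketch, assuming elements of $\left(\frac{a,b}{F}\right)$ are stored as coordinate 4-tuples on a standard basis $\{1, i, j, k\}$; the name `qmul` and this tuple convention are illustrative choices, not notation from these notes.

```python
def qmul(x, y, a, b):
    """Product of x and y in the quaternion algebra (a,b/F).

    x and y are 4-tuples of coordinates on the standard basis {1, i, j, k},
    where i^2 = a, j^2 = b and ij = k = -ji (hence k^2 = -ab).
    """
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 + a * x1 * y1 + b * x2 * y2 - a * b * x3 * y3,  # coefficient of 1
        x0 * y1 + x1 * y0 - b * x2 * y3 + b * x3 * y2,          # coefficient of i
        x0 * y2 + x2 * y0 + a * x1 * y3 - a * x3 * y1,          # coefficient of j
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,                  # coefficient of k
    )

# Standard basis vectors as coordinate tuples.
ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
```

With $a = b = -1$ this reproduces the relations of Example 1.3, for instance `qmul(J, K, -1, -1)` returns the coordinates of $i$.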


Theorem 1.4 Let $a, b \in F^\times$. Then $\left(\frac{a,b}{F}\right)$ exists.

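Proof. Fix $\alpha, \beta$ in an algebraic closure $E$ of $F$ such that $\alpha^2 = a$ and $\beta^2 = -b$. Consider the two matrices

\[ i = \begin{pmatrix} \alpha & 0 \\ 0 & -\alpha \end{pmatrix}, \quad j = \begin{pmatrix} 0 & \beta \\ -\beta & 0 \end{pmatrix}. \]

Direct computations show that

\[ i^2 = a, \quad j^2 = b, \quad ij = \begin{pmatrix} 0 & \alpha\beta \\ \alpha\beta & 0 \end{pmatrix} = -ji. \]

Since $\{I_2, i, j, ij\}$ is clearly linearly independent over $E$, it is also linearly independent over $F$. Therefore the $F$-span of $\{I_2, i, j, ij\}$ is a 4-dimensional algebra $H$ over $F$, and $H = \left(\frac{a,b}{F}\right)$. □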

Theorem 1.5 A quaternion algebra over $F$ is central simple, that is, its center is $F$ and it does not have any nonzero proper two-sided ideal.

Proof. Let $H$ be a quaternion algebra over $F$, and $\{1, i, j, k\}$ be a standard basis of $H$ over $F$. Consider an element $x = \alpha + \beta i + \gamma j + \delta k$ in the center of $H$, where $\alpha, \beta, \gamma, \delta \in F$. Then

\[ 0 = xj - jx = 2k(\beta + \delta j). \]

Since $k$ is invertible in $H$, we must have $\beta = \delta = 0$. Similarly, $\gamma = 0$. Hence $x$ is in $F$.

Next, we need to show that a nonzero two-sided ideal $\mathfrak{a}$ is $H$ itself. It is sufficient to show that $\mathfrak{a}$ contains a nonzero element of $F$. Take a nonzero element $y = a + bi + cj + dk$ in $\mathfrak{a}$, where $a, b, c, d \in F$. We may assume that one of $b$, $c$ and $d$ is nonzero. By replacing $y$ by one of $iy$, $jy$ and $ky$, we may further assume that $a \neq 0$. Since $yj - jy = 2k(b + dj) \in \mathfrak{a}$ and $2k$ is invertible in $H$, we see that $b + dj$, and hence $bi + dk = i(b + dj)$ as well, are in $\mathfrak{a}$. This shows that $a + cj$ is in $\mathfrak{a}$. By the same token, $a + bi$ and $a + dk$ are also in $\mathfrak{a}$. As a result, $-2a = y - (a + bi) - (a + cj) - (a + dk)$ is a nonzero element of $F$ lying in $\mathfrak{a}$. □

So, we may study quaternion algebras using the theory of central simple algebras. Below are two well-known theorems concerning the structure of central simple algebras.

Theorem 1.6 (Wedderburn's Structure Theorem) Let $A$ be a finite dimensional simple algebra over $F$. Then $A$ is isomorphic to $M_n(D)$, where $D \cong \mathrm{End}_A(N)$ is a division algebra over $F$ with $N$ a nonzero minimal right ideal of $A$. The integer $n$ and the isomorphism class of the division algebra $D$ are uniquely determined by $A$.

Theorem 1.7 (Skolem-Noether Theorem) Let $A$ be a finite dimensional central simple algebra over $F$ and let $B$ be a finite dimensional simple algebra over $F$. If $\phi, \psi$ are algebra homomorphisms from $B$ to $A$, then there exists an invertible element $c \in A$ such that $\phi(b) = c^{-1}\psi(b)c$ for all $b \in B$. In particular, all nonzero endomorphisms of $A$ are inner automorphisms.


Theorem 1.8 Let $H$ be a quaternion algebra over $F$.

(a) Either $H$ is a division algebra or $H \cong M_2(F)$.

(b) Let $E$ be a subfield of $H$ which is a quadratic extension of $F$, and $\tau$ be the nontrivial automorphism of $E/F$. Then there exists $j \in H^\times$ such that $j^2 \in F^\times$, $H = E + Ej$, and $jx = \tau(x)j$ for all $x \in E$.

Proof. Part (a) is a direct consequence of Wedderburn's structure theorem. For part (b), since the characteristic of $F$ is not 2, we can write $E = F(i)$ so that $i^2 \in F^\times$. Let $\tau$ be the nontrivial automorphism of $E/F$. By the Skolem-Noether Theorem, $-i = \tau(i) = jij^{-1}$ for some invertible element $j \in H$. Clearly $j \notin E$ and $1, i, j$ are linearly independent over $F$. If $ij = \alpha + \beta i + \gamma j$ with $\alpha, \beta, \gamma \in F$, then $(i - \gamma)j = \alpha + \beta i$. But $i - \gamma \neq 0$; thus $j \in F(i) = E$, which is impossible. Therefore, $\{1, i, j, ij\}$ is a basis for $H$. Note that $ij = -ji$ and so $j^2 i j^{-2} = i$. Therefore, $j^2$ is in the center of $H$, which is $F$, and this means that $j^2 = b \in F$. Clearly, $H = E + Ej$. □

Definition 1.9 Let $\{1, i, j, k\}$ be a standard basis for a quaternion algebra $H$. The elements in the subspace $H_0$ spanned by $i$, $j$ and $k$ are called the pure quaternions of $H$.

The next proposition shows that $H_0$ does not depend on the choice of the standard basis for $H$.

Proposition 1.10 A nonzero element $x \in H$ is a pure quaternion if and only if $x \notin F$ and $x^2 \in F$.

Proof. Let $\{1, i, j, k\}$ be a standard basis for $H = \left(\frac{a,b}{F}\right)$. Let $x$ be a nonzero element in $H$. We can write $x = a_0 + a_1 i + a_2 j + a_3 k$ with $a_\ell \in F$ for all $\ell$. Then

\[ x^2 = (a_0^2 + a a_1^2 + b a_2^2 - ab a_3^2) + 2a_0(a_1 i + a_2 j + a_3 k). \]

If $x$ is in the $F$-space spanned by $i$, $j$ and $k$, then $a_0 = 0$ and hence $x \notin F$ but $x^2 \in F$. Conversely, suppose that $x \notin F$ and $x^2 \in F$. Then one of $a_1$, $a_2$ and $a_3$ is nonzero, and hence $a_0 = 0$. Thus $x$ is a pure quaternion. □

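Thus each $x \in H$ has a unique decomposition $x = a + \alpha$, where $a \in F$ and $\alpha \in H_0$. The conjugate of $x$, denoted $\bar{x}$, is defined by $\bar{x} = a - \alpha$. For any $x, y \in H$,

(i) $\overline{x + y} = \bar{x} + \bar{y}$;

(ii) $\overline{xy} = \bar{y}\,\bar{x}$;

(iii) $\bar{\bar{x}} = x$;

(iv) $\overline{rx} = r\bar{x}$ for all $r \in F$;

(v) $\bar{x} = x$ if and only if $x \in F$.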


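In particular, the conjugation is an involution on $H$ (or, equivalently, an algebra isomorphism from $H$ to its opposite algebra $H^{\mathrm{op}}$). In $M_2(F)$,

\[ \overline{\begin{pmatrix} a & b \\ c & d \end{pmatrix}} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]

Equivalently, if $M \in M_2(F)$, then $\bar{M} = \mathrm{adj}(M)$, the adjoint of $M$.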

Definition 1.11 For $x \in H$, the (reduced) norm and (reduced) trace of $x$ are the elements $\mathrm{nr}(x) = x\bar{x}$ and $\mathrm{tr}(x) = x + \bar{x}$, respectively.

A direct computation shows that both $\mathrm{nr}(x)$ and $\mathrm{tr}(x)$ are elements of $F$. The norm is multiplicative, that is, $\mathrm{nr}(xy) = \mathrm{nr}(x)\mathrm{nr}(y)$ for all $x, y \in H$. The invertible elements of $H$ are precisely those with nonzero norm. The trace, however, is linear, as $\mathrm{tr}(ax + by) = a\,\mathrm{tr}(x) + b\,\mathrm{tr}(y)$ for all $a, b \in F$. For $M_2(F)$, the norm of an element is just its determinant.
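The claims about conjugation, norm and trace can be spot-checked numerically. This is a sketch under the assumption that elements are coordinate 4-tuples on a standard basis of $\left(\frac{a,b}{F}\right)$ with $F = \mathbb{Q}$; the helper names `qmul`, `conj`, `nr` and `tr` are illustrative.

```python
from fractions import Fraction

def qmul(x, y, a, b):
    """Product in (a,b/F) of coordinate 4-tuples on the basis {1, i, j, k}."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 + a * x1 * y1 + b * x2 * y2 - a * b * x3 * y3,
        x0 * y1 + x1 * y0 - b * x2 * y3 + b * x3 * y2,
        x0 * y2 + x2 * y0 + a * x1 * y3 - a * x3 * y1,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )

def conj(x):
    """Conjugate: keep the scalar part, negate the pure part."""
    return (x[0], -x[1], -x[2], -x[3])

def nr(x, a, b):
    """Reduced norm x * conj(x); the product is a scalar, so return its 1-coordinate."""
    return qmul(x, conj(x), a, b)[0]

def tr(x):
    """Reduced trace x + conj(x), which equals twice the scalar part."""
    return 2 * x[0]

# Sample data over Q (arbitrary choices made for this illustration).
a, b = Fraction(2), Fraction(-5)
x = (Fraction(1), Fraction(2), Fraction(3), Fraction(4))
y = (Fraction(-1), Fraction(0), Fraction(2), Fraction(7))
```

For these sample values one finds `nr(qmul(x, y, a, b), a, b) == nr(x, a, b) * nr(y, a, b)`, in line with the multiplicativity of the norm.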

1.2 The Matrix Algebras

In this subsection, we discuss when a quaternion algebra H over F is isomorphic to M2(F ).

Definition 1.12 A nonzero element x in H is said to be isotropic if nr(x) = 0.

Theorem 1.13 For $H = \left(\frac{a,b}{F}\right)$, the following are equivalent:

(a) $H \cong \left(\frac{1,1}{F}\right) \cong M_2(F)$.

(b) $H$ is not a division algebra.

(c) $H$ has an isotropic element.

(d) $H_0$ has an isotropic element.

(e) The equation $ax^2 + by^2 = 1$ has a solution $(x, y) \in F \times F$.

(f) If $E = F(\sqrt{b})$, then $a \in N_{E/F}(E)$.

Proof. We have seen in Theorem 1.8 that (a) is equivalent to (b).

Suppose that $H$ is not a division algebra. Then it has a nonzero element $x$ which is not invertible. So, $\mathrm{nr}(x) = 0$. This proves (b) $\Rightarrow$ (c).

For (c) $\Rightarrow$ (d), let $x$ be an isotropic element in $H$. Let $\{1, i, j, k\}$ be a standard basis for $H$ and write

\[ x = a_0 + a_1 i + a_2 j + a_3 k. \]

We may assume that $a_0 \neq 0$. So, at least one of $a_1$, $a_2$ and $a_3$ is also nonzero. Without loss of generality, we assume that $a_1 \neq 0$. From $\mathrm{nr}(x) = 0$, we obtain $a_0^2 - b a_2^2 = a(a_1^2 - b a_3^2)$. Let

\[ y = b(a_0 a_3 + a_1 a_2)\, i + a(a_1^2 - b a_3^2)\, j + (a_0 a_1 + b a_2 a_3)\, k. \]

Check that $\mathrm{nr}(y) = 0$. If $y \neq 0$, then we are done. If $y = 0$, then $-a a_1^2 + ab a_3^2 = 0$ and thus

\[ \mathrm{nr}(a_1 i + a_3 k) = 0. \]

Since $a_1 \neq 0$, $a_1 i + a_3 k$ is an isotropic element in $H_0$.

Suppose that $H_0$ has an isotropic element $a_1 i + a_2 j + a_3 k$. Then $-a a_1^2 - b a_2^2 + ab a_3^2 = 0$, and so at least two of $a_1$, $a_2$ and $a_3$ are nonzero. If $a_3 \neq 0$, then

\[ a\left(\frac{a_2}{a a_3}\right)^2 + b\left(\frac{a_1}{b a_3}\right)^2 = 1. \]

If $a_3 = 0$, then

\[ a\left(\frac{1 + a}{2a}\right)^2 + b\left(\frac{a_2(1 - a)}{2a a_1}\right)^2 = 1. \]

This proves (d) $\Rightarrow$ (e).

For (e) $\Rightarrow$ (f), we assume that $a x_0^2 + b y_0^2 = 1$. If $x_0 = 0$, then $\sqrt{b} \in F$ and $E = F$, in which case (f) is certainly true. If $x_0 \neq 0$, then one can check that $N_{E/F}(x_0^{-1} + x_0^{-1} y_0 \sqrt{b}) = a$.

Lastly, we suppose that $a$ is a norm from $F(\sqrt{b})$. If $b = c^2$ for some $c \in F$, then $(c + j)(c - j) = 0$ and $H$ is not a division algebra. So, we may assume that $\sqrt{b} \notin F$. Then $a = x_1^2 - b y_1^2$ for some $x_1, y_1 \in F$. Since $\mathrm{nr}(x_1 + i + y_1 j) = 0$, $H$ again is not a division algebra. This proves (f) $\Rightarrow$ (b). □
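The two explicit formulas used for (d) $\Rightarrow$ (e) lend themselves to a direct check. The sketch below works over $\mathbb{Q}$ with exact fractions; the function name `conic_point` and the sample inputs are assumptions made for illustration only.

```python
from fractions import Fraction

def conic_point(a, b, a1, a2, a3):
    """Given an isotropic pure quaternion a1*i + a2*j + a3*k in (a,b/Q),
    return (x, y) with a*x^2 + b*y^2 = 1, following the two cases of the proof."""
    a, b, a1, a2, a3 = (Fraction(t) for t in (a, b, a1, a2, a3))
    if (a1, a2, a3) == (0, 0, 0):
        raise ValueError("an isotropic element is nonzero by definition")
    if -a * a1**2 - b * a2**2 + a * b * a3**2 != 0:
        raise ValueError("the given pure quaternion is not isotropic")
    if a3 != 0:
        # Case a3 != 0 of the proof.
        return a2 / (a * a3), a1 / (b * a3)
    # Case a3 == 0; then a*a1^2 + b*a2^2 = 0 forces a1 != 0.
    return (1 + a) / (2 * a), a2 * (1 - a) / (2 * a * a1)

def on_conic(a, b, point):
    """True if a*x^2 + b*y^2 == 1 for point = (x, y)."""
    x, y = point
    return a * x**2 + b * y**2 == 1
```

For example, in $\left(\frac{1,1}{\mathbb{Q}}\right)$ the element $3i + 4j + 5k$ is isotropic, and `conic_point(1, 1, 3, 4, 5)` returns the point $(4/5, 3/5)$ on $x^2 + y^2 = 1$.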

Definition 1.14 Let $K/F$ be a field extension. A quaternion algebra $H$ over $F$ splits over $K$ if $H \otimes_F K \cong M_2(K)$. We say that a quaternion algebra over $F$ splits if it splits over $F$.

Corollary 1.15 If $F$ is algebraically closed, then every quaternion algebra over $F$ splits.

Proof. This is clear since every element of $F$ is a square in $F$. □

Corollary 1.16 The quaternion algebras $\left(\frac{1,a}{F}\right)$ and $\left(\frac{a,-a}{F}\right)$ split.

Proof. Apply (d) and (e) of Theorem 1.13. □

Example 1.17 Theorem 1.13 is very useful in constructing quaternion algebras that do not split. For example, let $p$ be a prime $\equiv -1 \bmod 4$. Then the congruence $-x^2 + py^2 \equiv z^2 \bmod 4$ does not have any solution with $\gcd(x, y, z) = 1$. Therefore, the equation $-x^2 + py^2 = 1$ does not have any rational solution. Thus, by Theorem 1.13(e), the quaternion algebra $\left(\frac{-1,p}{\mathbb{Q}}\right)$ does not split. However, if $p \equiv 1 \bmod 4$, then $p$ is a sum of two integer squares, which means that $p$ is a norm from $\mathbb{Q}(\sqrt{-1})$. So, when $p \equiv 1 \bmod 4$, the algebra $\left(\frac{-1,p}{\mathbb{Q}}\right)$ splits. Thus we have shown: let $p$ be an odd prime. Then $\left(\frac{-1,p}{\mathbb{Q}}\right)$ splits if and only if $p \equiv 1 \bmod 4$.
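Both halves of Example 1.17 can be illustrated by a small brute-force computation. This is only a sketch: the function names are hypothetical, and the search over residues modulo 4 uses the observation that $\gcd(x, y, z) = 1$ forces at least one of $x, y, z$ to be odd.

```python
from fractions import Fraction
from itertools import product
from math import isqrt

def primitive_solutions_mod4(p):
    """Residue triples (x, y, z) mod 4, not all even, with -x^2 + p*y^2 = z^2 (mod 4)."""
    return [
        (x, y, z)
        for x, y, z in product(range(4), repeat=3)
        if (x % 2 or y % 2 or z % 2) and (-x * x + p * y * y - z * z) % 4 == 0
    ]

def two_squares(p):
    """Return (s, t) with s^2 + t^2 == p, or None if no such pair exists."""
    for s in range(isqrt(p) + 1):
        t = isqrt(p - s * s)
        if s * s + t * t == p:
            return s, t
    return None

def rational_point(p):
    """Return a rational (x, y) with -x^2 + p*y^2 == 1 built from p = s^2 + t^2, or None."""
    pair = two_squares(p)
    if pair is None or pair[1] == 0:
        return None
    s, t = pair
    return Fraction(s, t), Fraction(1, t)
```

For $p = 3, 7, 11$ the list of primitive solutions modulo 4 is empty, while for $p = 5 = 1^2 + 2^2$ the point $(1/2, 1/2)$ satisfies $-x^2 + 5y^2 = 1$.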

Proposition 1.18 Let $H$ be a quaternion division algebra over $F$. If $E$ is a subfield of $H$ which is a quadratic extension of $F$, then $H$ splits over $E$.

Proof. As in the proof of Theorem 1.8, there exists a standard basis $\{1, i, j, k\}$ for $H$ with $E = F(i)$ and $i^2 = a \in F$. Thus $H = \left(\frac{a,b}{F}\right)$ and hence, since $a$ is a square in $E$, $H \otimes_F E = \left(\frac{a,b}{E}\right) = \left(\frac{1,b}{E}\right) = M_2(E)$. □


1.3 Quaternion Algebras over Finite Fields

Theorem 1.19 (Wedderburn's Little Theorem) Let $A$ be a finite division ring. Then $A$ is a field.

Proof. Let $F$ be the center of $A$. Then $F$ is a finite field of order $q$, a prime power $\geq 2$. Let $n = \dim_F A$. We shall show that $n = 1$. Assume the contrary that $n > 1$. The finite group $A^\times$ acts on itself by conjugation. It follows from the class equation that

\[ |A^\times| = q^n - 1 = q - 1 + \sum_a [A^\times : C_A(a)^\times], \]

where $C_A(a)$ is the centralizer of $a$, and the $a$ in the summation runs over a (nonempty) set of representatives of non-singleton conjugacy classes of $A^\times$. Let $r(a) = \dim_F C_A(a)$. Then $1 \leq r(a) < n$, and the transitivity of dimensions shows that $r(a) \mid n$. Rewriting the class equation, we have

\[ (\ast) \qquad q^n - 1 = q - 1 + \sum_a \frac{q^n - 1}{q^{r(a)} - 1}. \]

Let $r$ be one of the $r(a)$ in the summation. Since $r \mid n$, we have the following factorization in $\mathbb{Z}[x]$:

\[ x^n - 1 = \Phi_n(x)(x^r - 1)h(x), \quad h(x) \in \mathbb{Z}[x], \]

where $\Phi_n(x)$ is the $n$-th cyclotomic polynomial. This equation implies that $(q^n - 1)/(q^r - 1)$ is always an integer divisible by $\Phi_n(q)$. It follows from $(\ast)$ that $\Phi_n(q)$ divides $q - 1$ as well. In particular,

\[ q - 1 \geq |\Phi_n(q)| = \prod |q - \zeta|, \]

where $\zeta$ ranges over all the primitive $n$-th roots of unity. This is impossible since $n > 1$ and $q \geq 2$ clearly imply that $|q - \zeta| > q - 1 \geq 1$ for each $\zeta$. □
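The two arithmetic facts about $\Phi_n(q)$ used in the proof can be tested for small values. The sketch below computes cyclotomic polynomials from $x^n - 1 = \prod_{d \mid n} \Phi_d(x)$ by exact polynomial division; the function names and the coefficient-list representation (lowest degree first) are assumptions of this illustration.

```python
def poly_divide(num, den):
    """Exact quotient of integer polynomials (coefficient lists, lowest degree first).

    The divisor must be monic and must divide the dividend exactly."""
    num = list(num)
    shift = len(den) - 1
    quot = [0] * (len(num) - shift)
    for k in range(len(quot) - 1, -1, -1):
        c = num[k + shift]            # leading coefficient still to be cancelled
        quot[k] = c
        for t, d in enumerate(den):
            num[k + t] -= c * d
    if any(num):
        raise ValueError("division is not exact")
    return quot

def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial Phi_n."""
    poly = [-1] + [0] * (n - 1) + [1]  # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide(poly, cyclotomic(d))
    return poly

def evaluate(poly, q):
    """Value of the polynomial at the integer q."""
    return sum(c * q**k for k, c in enumerate(poly))
```

For instance `cyclotomic(6)` gives the coefficients of $x^2 - x + 1$, and for every $n > 1$ and $q \geq 2$ tried, `evaluate(cyclotomic(n), q)` exceeds $q - 1$, which is the inequality that produces the contradiction.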

Corollary 1.20 If $A$ is a central simple algebra over a finite field $F$, then $A \cong M_n(F)$ for some $n \geq 1$. In particular, every quaternion algebra over a finite field splits.

2 Quaternion Algebras as Quadratic Spaces

Let $H$ be a quaternion algebra over $F$. The norm map is a quadratic form on $H$, that is, it satisfies:

(i) $\mathrm{nr}(\alpha x) = \alpha^2 \mathrm{nr}(x)$ for all $\alpha \in F$,

(ii) the function $B : H \times H \to F$ defined by

\[ B(x, y) := \tfrac{1}{2}\bigl(\mathrm{nr}(x + y) - \mathrm{nr}(x) - \mathrm{nr}(y)\bigr) = \tfrac{1}{2}\mathrm{tr}(x\bar{y}) \]

is a symmetric bilinear form on $H$.

In this section, we will review some results from the algebraic theory of quadratic forms that are useful for later discussion.


2.1 Quadratic Spaces

A quadratic space over a field $F$ is a pair $(V, Q)$, where $V$ is a finite dimensional vector space over $F$ and $Q : V \to F$ satisfies:

(a) $Q(ax) = a^2 Q(x)$ for all $a \in F$ and all $x \in V$;

(b) the function $B(x, y) = \tfrac{1}{2}(Q(x + y) - Q(x) - Q(y))$ is a symmetric bilinear form on $V$.

The function $Q$ is called a quadratic form on $V$. Note that $B$ determines $Q$ by $B(x, x) = Q(x)$ for all $x \in V$. So we also use $(V, B)$ to denote the quadratic space $(V, Q)$. A nonzero vector $v$ in a quadratic space $(V, Q)$ over a field $F$ is isotropic if $Q(v) = 0$; otherwise $v$ is called anisotropic. The space $V$ is said to be isotropic if it has an isotropic vector.

Two subsets $X$ and $Y$ of $V$ are said to be orthogonal if $B(x, y) = 0$ for all $x \in X$ and $y \in Y$. The set of vectors in $V$ which are orthogonal to every vector in $X$ is denoted by $X^\perp$. The space $V$ is called nondegenerate if $V^\perp = 0$, that is, there is no nonzero vector in $V$ which is orthogonal to all vectors in $V$. A basis of $V$ is called an orthogonal basis if its vectors are orthogonal to each other.

Let $\mathfrak{B} = \{v_1, \ldots, v_n\}$ be a basis for $V$. The symmetric matrix $M_{\mathfrak{B}} = (B(v_i, v_j))$ is called the matrix of $V$ with respect to $\mathfrak{B}$. The following is an easy consequence of linear algebra.

Lemma 2.1 If $\mathfrak{B}$ and $\mathfrak{B}'$ are two bases for $V$, then there exists a matrix $T$ in $GL_n(F)$ such that $M_{\mathfrak{B}'} = T M_{\mathfrak{B}} T^t$.

Let $V^*$ be the dual space of $V$, the vector space of all linear maps $V \to F$. If $\mathfrak{B} = \{v_1, \ldots, v_n\}$ is a basis for $V$, then $\mathfrak{B}^* = \{v_1^*, \ldots, v_n^*\}$ denotes its dual basis for $V^*$, where

\[ v_i^*(v_j) = \delta_{ij} \quad \text{(Kronecker's delta)}. \]

The function $\hat{B} : V \to V^*$ defined by $\hat{B}(v)(u) = B(v, u)$ is obviously a linear transformation.

Lemma 2.2 If $\mathfrak{B}$ is a basis for $V$ and $\mathfrak{B}^*$ is its dual basis for $V^*$, then the matrix of the linear transformation $\hat{B}$ with respect to $\mathfrak{B}$ and $\mathfrak{B}^*$ is $M_{\mathfrak{B}}$.

Proof. From $\hat{B}(v_i)(v_j) = B(v_i, v_j)$ it follows that

\[ \hat{B}(v_i) = \sum_{j=1}^{n} B(v_i, v_j)\, v_j^*, \]

which is what needed to be shown. □

Lemma 2.3 If $W$ is a subspace of $V$, then $W^\perp = \ker(\pi \circ \hat{B})$, where $\pi : V^* \to W^*$ is the linear map induced by restricting functions in $V^*$ to $W$.

Proof. A vector $v$ of $V$ is in $W^\perp$ if and only if $B(v, w) = 0$ for all $w \in W$. This means $\hat{B}(v)|_W = 0$, that is, $v \in \ker(\pi \circ \hat{B})$. □

Corollary 2.4 A quadratic space $(V, B)$ is nondegenerate if and only if $\hat{B}$ is an isomorphism or, equivalently, when the matrix $M_{\mathfrak{B}}$ is invertible for one particular basis $\mathfrak{B}$ for $V$.

Proof. This follows from the previous two lemmas. □

Corollary 2.5 If $W$ is a nondegenerate subspace of $V$, then $V = W \perp W^\perp$ (orthogonal sum).

Proof. Clearly, $W$ and $W^\perp$ are orthogonal to each other. So it remains to show that $V = W \oplus W^\perp$. As $W$ is nondegenerate, $W \cap W^\perp = 0$. Let $v \in V$ and $f = \hat{B}(v)|_W$. Then because $W$ is nondegenerate, there exists $w \in W$ with $\hat{B}(w)|_W = f$. Hence for all $z \in W$,

\[ B(v, z) = \hat{B}(v)(z) = f(z) = \hat{B}(w)(z) = B(w, z). \]

So, $v - w \in W^\perp$ and we can write $v = w + (v - w)$. Thereby $W \perp W^\perp = V$. □

Theorem 2.6 Every quadratic space has an orthogonal basis.

Proof. Let $(V, B)$ be a quadratic space. We argue by induction on $\dim V$. If $B = 0$, then the assertion is clear. So, we assume that $B \neq 0$. Then there exist vectors $u, v \in V$ such that $B(u, v) \neq 0$. Since $2B(u, v) = Q(u + v) - Q(u) - Q(v)$, it follows that there must be a $w \in V$ with $Q(w) \neq 0$. Then the one-dimensional subspace $W = Fw$ is nondegenerate, and by the last corollary $V = W \perp W^\perp$. An application of the induction hypothesis to $W^\perp$ completes the proof. □

Let $\{e_1, \ldots, e_n\}$ be an orthogonal basis for $V$, and let $Q(e_i) = a_i$ for all $i$. For any $v = \sum x_i e_i \in V$, we have

\[ Q(v) = a_1 x_1^2 + \cdots + a_n x_n^2. \]

In this case, we shall write $V \cong \langle a_1, \ldots, a_n \rangle$.

Let $(V, Q)$ and $(V', Q')$ be quadratic spaces over $F$. A linear map $\sigma : V \to V'$ is an isometry if

(a) $\sigma$ is a vector space isomorphism;

(b) $Q'(\sigma(x)) = Q(x)$ for all $x \in V$.

Two quadratic spaces are isometric if there is an isometry from one to the other. The set of all isometries from $V$ to $V$ itself forms a group which is called the orthogonal group of $V$, denoted $O(V)$. Suppose that $x$ is an anisotropic vector in $V$. Then the function $\tau_x : V \to V$ defined by

\[ \tau_x(y) = y - \frac{2B(y, x)}{Q(x)}\, x \]

is called the symmetry with respect to $x$, which is an element in $O(V)$.

Theorem 2.7 (Witt's Cancellation Theorem) If $V$, $V_1$ and $V_2$ are nondegenerate quadratic spaces over $F$ such that $V \perp V_1 \cong V \perp V_2$, then $V_1 \cong V_2$.


Proof. Since $V$ is the orthogonal sum of 1-dimensional subspaces, it suffices to consider the case where $\dim_F(V) = 1$; thus $V = Fx$. Under an isometry $Fx \perp V_1 \to Fx \perp V_2$, $x$ is sent to a vector $y \in Fx \perp V_2$. Let $u = (x + y)/2$ and $v = (x - y)/2$. Then $B(u, v) = 0$ and $Q(x) = Q(u) + Q(v)$. This implies that either $Q(u)$ or $Q(v)$ is nonzero. If $Q(u) \neq 0$, then $-\tau_u(x) = y$; otherwise $\tau_v(x) = y$. Therefore, there is an isometry $\Sigma : Fx \perp V_1 \to Fx \perp V_2$ such that $\Sigma(x) = x$. It is easy to see that $\Sigma$ must send $V_1$ to $V_2$. □
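The key step of the proof, moving $x$ to $y$ by $-\tau_u$ or $\tau_v$, can be tried out on a diagonal form over $\mathbb{Q}$. The following is a sketch only: vectors are coordinate tuples with respect to an orthogonal basis, `diag` lists the values $a_1, \ldots, a_n$ of $\langle a_1, \ldots, a_n \rangle$, and all function names are hypothetical.

```python
from fractions import Fraction

def bilinear(x, y, diag):
    """B(x, y) for the diagonal quadratic form <a_1, ..., a_n> given by diag."""
    return sum(Fraction(a) * s * t for a, s, t in zip(diag, x, y))

def symmetry(x, y, diag):
    """tau_x(y) = y - (2 B(y, x) / Q(x)) x for an anisotropic vector x."""
    q = bilinear(x, x, diag)
    if q == 0:
        raise ValueError("x must be anisotropic")
    c = 2 * bilinear(y, x, diag) / q
    return tuple(t - c * s for s, t in zip(x, y))

def isometry_sending(x, y, diag):
    """Return an isometry mapping x to y, assuming Q(x) == Q(y) != 0.

    Follows the proof: use -tau_u if Q(u) != 0, otherwise tau_v,
    where u = (x + y)/2 and v = (x - y)/2."""
    u = tuple((Fraction(s) + t) / 2 for s, t in zip(x, y))
    v = tuple((Fraction(s) - t) / 2 for s, t in zip(x, y))
    if bilinear(u, u, diag) != 0:
        return lambda z: tuple(-c for c in symmetry(u, z, diag))
    return lambda z: symmetry(v, z, diag)
```

For the form $\langle 1, 1, -1 \rangle$ with $x = (1, 0, 0)$ and $y = (0, 1, 0)$, the map returned by `isometry_sending(x, y, diag)` sends $x$ to $y$ and preserves $Q$ on any test vector.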

2.2 The Norm Form

Let $H$ be a quaternion algebra over $F$. Recall that the reduced norm nr is a quadratic map on $H$. So, $H$ equipped with nr is a 4-dimensional quadratic space over $F$. If $\{1, i, j, k\}$ is a standard basis for $H = \left(\frac{a,b}{F}\right)$, then

\[ \mathrm{nr}(x + yi + zj + wk) = x^2 - ay^2 - bz^2 + abw^2, \]

and so $\{1, i, j, k\}$ is an orthogonal basis of $H$. Moreover, since $ab \neq 0$, $H$ is nondegenerate as a quadratic space. The subspace of pure quaternions $H_0$ equipped with nr is a nondegenerate 3-dimensional quadratic space over $F$. Note that for any $x \in H_0$, $\bar{x} = -x$, and thus for all $x, y \in H_0$ we have

\[ \mathrm{nr}(x) = -x^2 \quad \text{and} \quad B(x, y) = -\tfrac{1}{2}(xy + yx). \]

From now on, when we say that $H$ or $H_0$ is a quadratic space, it will be understood that the associated quadratic form is the reduced norm. The main theorem of this subsection is the following classification theorem of quaternion algebras in terms of quadratic spaces.

Theorem 2.8 Let $H$ and $H'$ be quaternion algebras over $F$. Then $H$ and $H'$ are isomorphic if and only if the quadratic spaces $H_0$ and $H'_0$ are isometric.

Proof. Let nr and nr$'$ be the reduced norms of $H$ and $H'$, respectively. Suppose first that there exists an algebra isomorphism $\phi : H \to H'$. Let $x$ be a nonzero vector in $H_0$. Then $x \notin F$ but $x^2 \in F$. So, $\phi(x) \notin F$ and $\phi(x)^2 = \phi(x^2) \in F$. Therefore, $\phi(x) \in H'_0$, which shows that $\phi$ restricts to a vector space isomorphism from $H_0$ onto $H'_0$. For any $x \in H_0$, $\mathrm{nr}'(\phi(x)) = -\phi(x)^2 = \phi(-x^2) = \phi(\mathrm{nr}(x)) = \mathrm{nr}(x)$. Thus $H_0$ and $H'_0$ are isometric.

Now suppose that $\sigma : H_0 \to H'_0$ is an isometry. Let $\{1, i, j, ij\}$ be a standard basis for $H = \left(\frac{a,b}{F}\right)$. Since $\mathrm{nr}(i) = -i^2 = -a$, we have $\sigma(i)^2 = -\mathrm{nr}'(\sigma(i)) = a$. Similarly, we have $\sigma(j)^2 = b$. The elements $i$ and $j$ are orthogonal in $H_0$. Therefore, $\sigma(i)$ and $\sigma(j)$ are also orthogonal. So, $\sigma(i)\sigma(j) = -\sigma(j)\sigma(i)$. Then one can easily check that $\sigma(i)$ does not commute with $\sigma(i)\sigma(j)$, which means that $\sigma(i)\sigma(j) \notin F$. Also, $(\sigma(i)\sigma(j))^2 = -ab \in F$. Thus $\sigma(i)\sigma(j) \in H'_0$. If $r\sigma(i) + s\sigma(j) + t\sigma(i)\sigma(j) = 0$ for some $r, s, t \in F$, then left multiplication of this equation by $\sigma(i)$ forces $r = 0$. By a similar token, $s = t = 0$. Thus $\{1, \sigma(i), \sigma(j), \sigma(i)\sigma(j)\}$ is a standard basis for $H'$, so that $H' = \left(\frac{a,b}{F}\right)$. □


3 Quaternion Algebras over Local Fields

In this section, we give a more thorough discussion of quaternion algebras over a local field. A local field, by our definition, is the completion of a number field with respect to a nontrivial valuation. The complex numbers $\mathbb{C}$ and the real numbers $\mathbb{R}$ are examples of local fields. By Corollary 1.15, every quaternion algebra over $\mathbb{C}$ must split. Over $\mathbb{R}$, the only quaternion algebras are $\left(\frac{1,1}{\mathbb{R}}\right)$, $\left(\frac{1,-1}{\mathbb{R}}\right)$ and $\left(\frac{-1,-1}{\mathbb{R}}\right) = \mathbb{H}$, since $\mathbb{R}^\times$ has only two square classes, represented by $1$ and $-1$ respectively. The first two are isomorphic to $M_2(\mathbb{R})$.

3.1 Local Fields

Let $F$ be a number field, which is just a finite extension of $\mathbb{Q}$. By definition, every element of $F$ is algebraic over $\mathbb{Q}$. In other words, every element $\alpha$ of $F$ is a root of a monic polynomial over $\mathbb{Q}$. If $\alpha$ is a root of a monic polynomial over $\mathbb{Z}$, then we say that $\alpha$ is an algebraic integer in $F$. Let $\mathfrak{o}$ be the set of all algebraic integers in $F$. Then $\mathfrak{o}$ is a ring and we call it the ring of integers of $F$. It is well-known that the field of fractions of $\mathfrak{o}$ is $F$, and $\mathfrak{o}$ is a Dedekind domain, that is, it satisfies the following three properties:

(a) it is Noetherian, which means that every ascending chain of ideals must become stationary after a finite number of steps;

(b) it is integrally closed, which means that an element of $F$ that is a root of a monic polynomial over $\mathfrak{o}$ is already in $\mathfrak{o}$;

(c) all its nonzero prime ideals are maximal.

A nontrivial consequence of these properties is that every nonzero ideal $\mathfrak{a}$ of $\mathfrak{o}$ is a product of prime ideals, and these prime ideals, counted with multiplicities, are uniquely determined by $\mathfrak{a}$.

A (multiplicative) valuation v on F is a function v : F → R such that

(1) v(x) ≥ 0 for all x ∈ F , and v(x) = 0 if and only if x = 0.

(2) v(xy) = v(x)v(y) for all x, y ∈ F .

(3) v(x+ y) ≤ v(x) + v(y) for all x, y ∈ F .

There is always the trivial valuation where $v(x) = 1$ for all $x \neq 0$. We assume throughout that all valuations in the subsequent discussion are nontrivial. Two valuations $v$ and $v'$ are equivalent if there exists $c \in \mathbb{R}^+$ such that $v'(x) = v(x)^c$ for all $x \in F$. A place of $F$ is an equivalence class of valuations on $F$. The set of all places of $F$ is denoted by $\Omega_F$, or simply $\Omega$ if $F$ is understood from the discussion. If $v$ is a valuation on $F$, we also use $v$ to denote the place containing $v$.

A valuation $v$ is called nonarchimedean if it satisfies in addition the ultra triangle inequality

\[ (3)' \qquad v(x + y) \leq \max\{v(x), v(y)\} \]

for all $x, y \in F$, with equality when $v(x) \neq v(y)$. A place $v$ is called a finite place if it contains a nonarchimedean valuation. Otherwise it is called an infinite place. The set of all finite places of $F$ and the set of infinite places of $F$ are denoted by $\Omega_f$ and $\Omega_\infty$, respectively.

Let $v$ be a place of $F$. The function $d(x, y) = v(x - y)$ defines a metric on $F$. The completion of $F$ with respect to this metric is denoted by $F_v$, which is a locally compact complete metric space. The valuation $v$ extends uniquely to a valuation on $F_v$ which we also denote by $v$. It turns out that $F_v$ is a field, and its addition, subtraction, multiplication and taking inverse are all continuous operations with respect to the metric topology. Associated to $v$ is an embedding $\sigma_v : F \to F_v$, and we can identify $F$ as a subfield of $F_v$ through $\sigma_v$. We usually make no distinction between $F$ and $\sigma_v(F)$. So, when we write $F \subseteq F_v$, it is understood that $F$ is embedded in $F_v$ through $\sigma_v$. In this way, we identify each element $a \in F$ with its image $\sigma_v(a)$ in $F_v$.

If $v$ is a finite place, then $F_v$ is called a $p$-adic field. The set $\{x \in F_v : v(x) \leq 1\}$ is a subring of $F_v$, which is called the ring of integers in $F_v$ and is denoted by $\mathfrak{o}_v$. It is a local ring with maximal ideal

\[ \mathfrak{p}_v \ (\text{or simply } \mathfrak{p}) = \{x \in F_v : v(x) < 1\}. \]

Every nonarchimedean valuation of $F$ comes from a nonzero prime ideal of $\mathfrak{o}$ in the following way. Suppose that $\mathfrak{p}$ is a nonzero prime ideal of $\mathfrak{o}$. For any nonzero element $x$ in $\mathfrak{o}$, the ideal $x\mathfrak{o}$ has an ideal factorization

\[ x\mathfrak{o} = \mathfrak{p}^n \mathfrak{a}, \]

where $\mathfrak{p} \nmid \mathfrak{a}$. Let $N(\mathfrak{p}) := |\mathfrak{o}/\mathfrak{p}|$ be the norm of $\mathfrak{p}$ and set $v(x) = N(\mathfrak{p})^{-n}$. Then $v$ extends to a nonarchimedean valuation on $F$. It turns out that this way of constructing nonarchimedean valuations yields a bijection between $\Omega_f$ and the nonzero prime ideals of $\mathfrak{o}$. The residue field of $F_v$ is the field $\mathfrak{o}_v/\mathfrak{p}_v$, which is isomorphic to $\mathfrak{o}/\mathfrak{p}$, a finite extension of the field $\mathbb{Z}/p\mathbb{Z}$ of $p$ elements. A consequence of this is that $\mathfrak{o}_v$ is a compact subset in $F_v$, which makes $F_v$ locally compact. The ring $\mathfrak{o}_v$ is a PID, and its maximal ideal $\mathfrak{p}_v$ is generated by any element $\pi \in \mathfrak{p} \setminus \mathfrak{p}^2$. In general, a generator of $\mathfrak{p}_v$ is called a uniformizer of $F_v$, which is an element in $\mathfrak{p}_v$ of the largest valuation. The nonzero ideals of $\mathfrak{o}_v$ are of the form $\mathfrak{p}_v^n$, $n \in \mathbb{N}$. Let $\pi$ be a uniformizer of $F_v$. Then every element of $F_v^\times$ can be written as $\pi^m \varepsilon$ with $m \in \mathbb{Z}$ and $\varepsilon$ a unit in $\mathfrak{o}_v$.

Suppose that $[F : \mathbb{Q}] = n$. Then there are $n$ different embeddings of $F$ into $\mathbb{C}$. An embedding $\sigma : F \to \mathbb{C}$ is called a real embedding if $\sigma(F) \subseteq \mathbb{R}$; otherwise it is called a complex embedding. Since complex embeddings occur in pairs, $n = r + 2s$ where $r$ is the number of real embeddings and $2s$ is the number of complex embeddings of $F$. If $\sigma$ is an embedding of $F$ into $\mathbb{C}$, then $v(a) = |\sigma(a)|$ is a valuation on $F$. Here $|\ |$ is the complex modulus. Note that a complex embedding and its complex conjugate yield the same valuation. So the total number of inequivalent valuations obtained in this way is $r + s$, and they correspond to all the infinite places of $F$. An infinite place $v$ of $F$ is real if it corresponds to a real embedding; otherwise it is called a complex place. When $v$ is a real (resp. complex) place, $F_v$ is isomorphic to $\mathbb{R}$ (resp. $\mathbb{C}$).


Example 3.1 There is only one infinite place on $\mathbb{Q}$, which contains the usual absolute value on $\mathbb{R}$. This is clearly a real place. Let $p$ be a prime number. For any nonzero rational number $x$, we can write $x = p^n z$, where $z$ is a rational number for which $p$ divides neither the numerator nor the denominator. Then $v(x) = p^{-n}$ is called the $p$-adic valuation on $\mathbb{Q}$. The completion of $\mathbb{Q}$ with respect to this valuation is called the field of $p$-adic numbers, denoted $\mathbb{Q}_p$. The ring of integers in $\mathbb{Q}_p$ is $\mathbb{Z}_p$, the ring of $p$-adic integers.
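The $p$-adic valuation of Example 3.1 is easy to compute for rational numbers. The sketch below is illustrative only; the function names `ord_p` and `v_p` are assumptions, with `ord_p` returning the exponent $n$ in $x = p^n z$ and `v_p` returning $p^{-n}$.

```python
from fractions import Fraction

def ord_p(x, p):
    """Exponent n in x = p^n * z, where p divides neither the numerator
    nor the denominator of z. Requires x != 0."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p is not defined at 0")
    n = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        n += 1
    while den % p == 0:
        den //= p
        n -= 1
    return n

def v_p(x, p):
    """Multiplicative p-adic valuation on Q: v(0) = 0 and v(x) = p^(-ord_p(x))."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-ord_p(x, p))
```

For example `v_p(12, 2)` is $1/4$ and `v_p(Fraction(5, 9), 3)` is $9$; multiplicativity and the inequality $(3)'$ can be checked on sample values in the same way.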

In these lecture notes, a local field always means the completion of a number field with respect to a valuation. Let $K$ be a local field and $E/K$ be a finite extension. It turns out that $E$ is also a local field. If $v$ is a valuation on $K$, then $v$ extends uniquely to a valuation $w$ on $E$ by

\[ w(x) = v(N_{E/K}(x))^{\frac{1}{[E:K]}}, \]

where $N_{E/K}$ is the norm from $E$ to $K$.

Definition 3.2 A finite extension of $p$-adic fields $E/K$ is called unramified if a uniformizer in $K$ is also a uniformizer in $E$. Otherwise $E/K$ is ramified.

Let $\mathfrak{o}_K$ and $\mathfrak{o}_E$ be the rings of integers of $K$ and $E$, respectively, and let $\pi$ be a uniformizer of $K$. Then $E/K$ is unramified if and only if $\pi$ generates the maximal ideal of $\mathfrak{o}_E$. In general, if $\mathfrak{p}_E$ is the maximal ideal of $\mathfrak{o}_E$, then $\pi\mathfrak{o}_E = \mathfrak{p}_E^e$ for some positive integer $e$. This $e$ is called the ramification index of $E/K$. Note that $E/K$ is unramified exactly when $e = 1$. The residue field $\mathfrak{o}_E/\mathfrak{p}_E$ is a finite extension of the residue field $\mathfrak{o}_K/\mathfrak{p}_K$. Its degree of extension $[\mathfrak{o}_E/\mathfrak{p}_E : \mathfrak{o}_K/\mathfrak{p}_K]$ is called the residue degree of $E/K$, usually denoted by $f$.

Theorem 3.3 (Local Fundamental Identity) If $E/K$ is a finite extension of $p$-adic fields, then $ef = [E : K]$.

The next theorem is an important result from local class field theory.

Theorem 3.4 (Local Norm Index) Let $E/K$ be a finite abelian extension of local fields. Then

\[ [K^\times : N_{E/K}(E^\times)] = [E : K]. \]

Corollary 3.5 Let $E/K$ be a quadratic extension of $p$-adic fields. Then $E/K$ is unramified if and only if $N_{E/K}(E^\times)$ contains all the units of the ring of integers in $K$.

Proof. Suppose that $E/K$ is unramified. If $\pi := N_{E/K}(x)$ is a uniformizer of $K$, then $w(x) = v(N_{E/K}(x))^{\frac{1}{2}} = v(\pi)^{\frac{1}{2}}$. But since $\pi$ is also a uniformizer of $E$, we must have

\[ 1 > v(\pi) = w(\pi) \geq w(x) = v(\pi)^{\frac{1}{2}}, \]

which is impossible. So, $N_{E/K}(E^\times)$ does not contain any uniformizer of $K$. Notice that $N_{E/K}(E^\times)$ contains $K^{\times 2}$, and that every coset in $K^\times/K^{\times 2}$ is represented by an element of $K$ of the form $\pi^\delta \varepsilon$, where $\delta \in \{0, 1\}$ and $\varepsilon \in \mathfrak{o}_K^\times$. So, $[K^\times : \mathfrak{o}_K^\times K^{\times 2}] = 2$. It follows from the Local Norm Index Theorem that $N_{E/K}(E^\times)$ is equal to $\mathfrak{o}_K^\times K^{\times 2}$ in this case.

The converse now is obvious. □

We can say more about unramified quadratic extensions of $p$-adic fields. Inside one algebraic closure of a $p$-adic field $K$, there is only one unramified quadratic extension of $K$. This extension is given by $K(\sqrt{u})$, where $u$ is some specifically chosen unit of $\mathfrak{o}_K$. If $K$ is nondyadic, that is, when the residue field of $K$ has odd characteristic, then $u$ can be chosen to be any nonsquare unit in $\mathfrak{o}_K$. When $K$ is dyadic, $u$ is chosen from one specific square class of units. In the special case $K = \mathbb{Q}_2$, we can choose $u$ to be $5$.

3.2 Quaternion Algebras over p-adic fields

Let $F$ be a $p$-adic field, with ring of integers $\mathfrak{o}$, uniformizer $\pi$, $\mathfrak{p} = \pi\mathfrak{o}$ the unique maximal ideal and $\bar{F} = \mathfrak{o}/\mathfrak{p}$ the residue class field. We fix a valuation $v$ on $F$.

Let $H$ be a quaternion division algebra over $F$. Define

\[ w : H \to \mathbb{R} \]

by $w(x) = v(\mathrm{nr}(x))$. Note that $w(\pi) = v(\pi)^2$.

Lemma 3.6 The function $w$ is a nonarchimedean valuation on $H$.

Proof. We need to show that $w$ satisfies the following three properties:

(a) $w(x) \geq 0$ for all $x \in H$, and $w(x) = 0$ if and only if $x = 0$;

(b) $w(xy) = w(x)w(y)$ for all $x, y \in H$;

(c) $w(x + y) \leq \max\{w(x), w(y)\}$, with equality when $w(x) \neq w(y)$.

Properties (a) and (b) follow immediately from the definition of $v$ and the multiplicative property of nr. For (c), let $E$ be a quadratic field extension of $F$ inside $H$. The restriction of nr to $E$ is the norm $N_{E/F}$ of the extension $E/F$. Now $v \circ N_{E/F}$ is a nonarchimedean valuation on the $p$-adic field $E$. Thus $w$ restricted to such a quadratic extension satisfies (c). So for $x, y \in H^\times$, applying this to a quadratic extension containing $xy^{-1}$ gives

\[ w(x + y)w(y)^{-1} = w(xy^{-1} + 1) \leq \max\{w(xy^{-1}), w(1)\} \]

with equality if $w(xy^{-1}) \neq w(1)$. Note that $w(1) = 1$ and that $w(xy^{-1}) = w(x)w(y)^{-1}$ from (b). Hence $w$ satisfies (c). □

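Corollary 3.7 Let $\mathcal{O} = \{x \in H : w(x) \leq 1\}$ and $\mathfrak{P} = \{x \in H : w(x) < 1\}$.

(a) $\mathcal{O}$ is a ring and $\mathfrak{P}$ is a two-sided ideal of $\mathcal{O}$. The unit group of $\mathcal{O}$ is precisely the set $\mathcal{O} \setminus \mathfrak{P}$.

(b) $\mathrm{nr}(\mathcal{O}) \subseteq \mathfrak{o}$ and $\mathrm{nr}(\mathfrak{P}) \subseteq \mathfrak{p}$.

(c) Let $z \in \mathfrak{P}$ be such that $w(z)$ is maximal. Then $\mathfrak{P} = z\mathcal{O} = \mathcal{O}z$.

(d) $\mathfrak{P}^2 \subseteq \pi\mathcal{O} \subseteq \mathfrak{P}$.

(e) The map $x + \mathfrak{P} \mapsto zx + \mathfrak{P}^2$ is an $\bar{F}$-vector space isomorphism from $\mathcal{O}/\mathfrak{P}$ to $\mathfrak{P}/\mathfrak{P}^2$.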

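Proof. Parts (a) and (b) are straightforward. For (c), if $y \in \mathfrak{P}$, then $w(y) \leq w(z)$. Hence $w(z^{-1}y) = w(yz^{-1}) \leq 1$, which means that $y$ is in $z\mathcal{O}$ and $\mathcal{O}z$.

For (d), it is clear that $\pi\mathcal{O} \subseteq \mathfrak{P}$. Note that $w(\pi^{-1}z^2) = [v(\pi)^{-1}v(\mathrm{nr}(z))]^2 \leq 1$. So, $z^2 \in \pi\mathcal{O}$ and hence $\mathfrak{P}^2 \subseteq \pi\mathcal{O}$.

For (e), the map $x + \mathfrak{P} \mapsto zx + \mathfrak{P}^2$ is an $\bar{F}$-linear map. By (c), it is surjective. Suppose that $zx \in \mathfrak{P}^2$. Then $x \in \mathfrak{P}$, which shows that the map is injective. □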

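Proposition 3.8 The quotient $\mathcal{O}/\mathfrak{P}$ is a finite field.

Proof. By part (a) of Corollary 3.7, $\mathcal{O}/\mathfrak{P}$ is a division ring. We proceed to show that $\mathcal{O}/\mathfrak{P}$ is a finite division ring. Then Wedderburn's Little Theorem says that every finite division ring is a field, whence the proposition.

For any $x \in H$, there exists $m \in \mathbb{Z}$ such that $\pi^m x \in \mathcal{O}$. It follows that $H = F\mathcal{O}$. We choose a basis $\{x_1, x_2, x_3, x_4\}$ of $H$ such that $x_i \in \mathcal{O}$ for all $i$. Since $H$, equipped with the quadratic map $2\mathrm{nr}$, is a nondegenerate quadratic space over $F$, there exist $x_1^*, x_2^*, x_3^*, x_4^* \in H$ such that

\[ B(x_i, x_j^*) = \delta_{ij} \quad \text{(Kronecker's delta)}, \]

where $B$ is the symmetric bilinear form associated to $2\mathrm{nr}$. The $x_i^*$ are clearly linearly independent over $F$. If $x \in \mathcal{O}$ and $x = \sum_i a_i x_i^*$, then since $2\mathrm{nr}(y) \in 2\mathfrak{o}$ for all $y \in \mathcal{O}$, we have $a_i = B(x, x_i) \in \mathfrak{o}$ for all $i$. Thus

\[ \sum_{i=1}^{4} \mathfrak{o}x_i \subseteq \mathcal{O} \subseteq \sum_{i=1}^{4} \mathfrak{o}x_i^*, \]

and $\mathcal{O}$ is a (necessarily) free $\mathfrak{o}$-module of rank 4. It follows that $\mathcal{O}/\pi\mathcal{O}$ is a 4-dimensional vector space over the finite field $\bar{F}$. So, $\mathcal{O}/\pi\mathcal{O}$ is a finite set. Since $\pi\mathcal{O} \subseteq \mathfrak{P}$, $\mathcal{O}/\mathfrak{P}$ is also a finite set. □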

The p-adic field F has a unique unramified quadratic extension $K = F(\sqrt{u})$, where u is from a specific square class of units in o. Let $\overline{K}$ be the residue field of K. Then $\overline{K}$ can be regarded as a quadratic extension of $\overline{F}$. By Corollary 3.5, the norm group $N_{K/F}(K^\times)$ is a subgroup of index 2 in $F^\times$ which contains all the units in o, and the nontrivial element in the quotient group $F^\times/N_{K/F}(K^\times)$ is represented by a uniformizer π. Thus the quaternion algebra $\left(\frac{u,\pi}{F}\right)$ is a division algebra.

Theorem 3.9 Up to isomorphism, $\left(\frac{u,\pi}{F}\right)$ is the only quaternion division algebra over F.

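Proof. Let H be a quaternion division algebra over F. The first step is to show that an unramified quadratic extension of F embeds in H. We have shown that O/P is a finite extension of $\overline{F}$. By Corollary 3.7(e), $O/P$ and $P/P^2$ have the same dimension as $\overline{F}$-vector spaces. Since $P^2 \subseteq \pi O$ and $O/\pi O$ has dimension 4, we get $2\dim_{\overline{F}}(O/P) = \dim_{\overline{F}}(O/P^2) \ge 4$. Therefore $\dim_{\overline{F}}(O/P) > 1$; in particular, $O/P \ne \overline{F}$.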

Now choose α ∈ O such that $O/P = \overline{F}(\alpha + P)$. Then α ∉ F and hence K = F(α) is a quadratic extension of F. Since $\overline{K}/\overline{F}$ is a nontrivial extension, the residue degree f of K/F is at least 2. It then follows from the Local Fundamental Identity that f must be exactly 2 and the ramification index of K/F is 1. So K/F is unramified, and hence there exists i ∈ K such that $i^2 = u$.

The two square roots ±i of u give two embeddings of K into H. By the Skolem-Noether Theorem, there is a $j \in H^\times$ such that $-i = jij^{-1}$. Thus 1, i, j, ij is a basis of H (verify!). Since $j^2$ commutes with both i and j, $j^2$ is in the center of H and hence $j^2 \in F$. This implies that 1, i, j, ij is a standard basis of H.

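Let $j^2 = \pi^m \varepsilon$, where $\varepsilon \in o^\times$. We may assume that m = 0 or 1. Since every ε in $o^\times$ is a norm of an element in K, $\left(\frac{u,\varepsilon}{F}\right)$ splits. Thus m = 1, and there exist a, b ∈ F such that $a^2 - ub^2 = \varepsilon$. It remains to show that $H = \left(\frac{u,\pi}{F}\right)$. It suffices to show that $H_0$ has an orthogonal basis $e_1, e_2, e_3$ such that $\mathrm{nr}(e_1) = -u$, $\mathrm{nr}(e_2) = -\pi$ and $\mathrm{nr}(e_3) = \pi u$.

Since $H = \left(\frac{u,\pi\varepsilon}{F}\right)$, $H_0$ has an orthogonal basis $f_1, f_2, f_3$ such that $\mathrm{nr}(f_1) = -u$, $\mathrm{nr}(f_2) = -\pi\varepsilon$ and $\mathrm{nr}(f_3) = \pi\varepsilon u$. Now let $e_1 = f_1$, $e_2 = \varepsilon^{-1}(af_2 + bf_3)$ and $e_3 = \varepsilon^{-1}(ubf_2 + af_3)$. It is direct to check that $e_1, e_2, e_3$ is the desired orthogonal basis of $H_0$. □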
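The "direct check" at the end of the proof of Theorem 3.9 can be written out as follows. This is a sketch of the computation rather than part of the original notes; it uses only the orthogonality of $f_2, f_3$, the relation $a^2 - ub^2 = \varepsilon$, and the assumption that the bilinear form satisfies $B(x,x) = c\,\mathrm{nr}(x)$ for a fixed nonzero constant c (c = 1 or 2 depending on the normalization in use).

```latex
\begin{align*}
\mathrm{nr}(e_2) &= \varepsilon^{-2}\bigl(a^2\,\mathrm{nr}(f_2) + b^2\,\mathrm{nr}(f_3)\bigr)
   = \varepsilon^{-2}\bigl(-\pi\varepsilon a^2 + \pi\varepsilon u b^2\bigr)
   = -\pi\varepsilon^{-1}(a^2 - ub^2) = -\pi, \\
\mathrm{nr}(e_3) &= \varepsilon^{-2}\bigl(u^2b^2\,\mathrm{nr}(f_2) + a^2\,\mathrm{nr}(f_3)\bigr)
   = \varepsilon^{-2}\bigl(-\pi\varepsilon u^2 b^2 + \pi\varepsilon u a^2\bigr)
   = \pi u\,\varepsilon^{-1}(a^2 - ub^2) = \pi u, \\
B(e_2, e_3) &= \varepsilon^{-2}\bigl(uab\,B(f_2,f_2) + ab\,B(f_3,f_3)\bigr)
   = c\,\varepsilon^{-2}ab\bigl(-u\pi\varepsilon + \pi\varepsilon u\bigr) = 0.
\end{align*}
```

Since $e_1 = f_1$ is orthogonal to $f_2$ and $f_3$, it is also orthogonal to $e_2$ and $e_3$.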

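Theorem 3.10 Let L/F be a quadratic field extension. Then $\left(\frac{u,\pi}{F}\right)$ splits over L.

Proof. If L is an unramified quadratic extension of F, then $L \cong F(\sqrt{u})$ and hence $\left(\frac{u,\pi}{L}\right)$ splits.

Now suppose that L/F is ramified. Let $K = F(\sqrt{u})$ and set $M = L(\sqrt{u})$. Comparing residue fields, we have

$$[\overline{M} : \overline{F}] = [\overline{M} : \overline{L}][\overline{L} : \overline{F}] = [\overline{M} : \overline{L}]$$

since L/F is ramified. On the other hand,

$$[\overline{M} : \overline{F}] = [\overline{M} : \overline{K}][\overline{K} : \overline{F}] = 2[\overline{M} : \overline{K}].$$

So $[\overline{M} : \overline{L}] = 2$ and M/L is unramified. Let ρ be a uniformizer for L such that $\pi = \rho^2 t$, where t is a unit of the ring of integers in L. Then

$$\left(\frac{u,\pi}{L}\right) = \left(\frac{u,\rho^2 t}{L}\right) = \left(\frac{u,t}{L}\right).$$

But t is a norm of an element in M since M/L is unramified. Thus $\left(\frac{u,t}{L}\right)$ splits. □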

We can say a bit more when F is a nondyadic p-adic field. The following is a well known result in the theory of local fields.

Theorem 3.11 (Hensel's Lemma) Let f(x) be a monic polynomial in o[x]. Suppose that f(x) mod p admits a factorization $\overline{g}(x)\overline{h}(x)$, where $\overline{g}(x)$ and $\overline{h}(x)$ are relatively prime polynomials in $\overline{F}[x]$. Then f(x) admits a factorization g(x)h(x) in o[x], where $g(x) \bmod p = \overline{g}(x)$ and $h(x) \bmod p = \overline{h}(x)$.


Corollary 3.12 Let F be a nondyadic p-adic field. Then $F^\times/F^{\times 2}$ is a group of order 4 whose elements are represented by 1, u, π and πu, where u is a nonsquare unit in o.

Proof. By Hensel's Lemma, an element $c \in o^\times$ is a square if and only if c is a square modulo p. Since the residue field o/p is a finite field of odd characteristic, it has exactly two square classes. So $o^\times$ also has exactly two square classes. The Corollary now follows immediately. □
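For a concrete instance of Corollary 3.12, take F to be the field of p-adic numbers for an odd prime p, with π = p. The following is a minimal sketch, not part of the notes: it assumes integer inputs, tests squares modulo p with Euler's criterion, and the function names are illustrative only.

```python
def ord_p(n: int, p: int) -> int:
    """Exponent of the prime p in the nonzero integer n."""
    if n == 0:
        raise ValueError("ord_p(0) is infinite")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def is_square_mod_p(c: int, p: int) -> bool:
    """Euler's criterion for a unit c modulo an odd prime p."""
    return pow(c % p, (p - 1) // 2, p) == 1


def square_class(n: int, p: int) -> tuple:
    """Return (e, s) such that n lies in the square class of pi^e * u^s,
    where pi = p and u is a nonsquare unit; e and s are 0 or 1."""
    k = ord_p(n, p)
    unit = n // p**k  # exact division: the unit part of n
    return (k % 2, 0 if is_square_mod_p(unit, p) else 1)


# Representatives 1, u, pi, pi*u for p = 5, where u = 2 is a nonsquare mod 5.
classes = {square_class(n, 5) for n in (1, 2, 5, 10)}
```

The pair (e, s) is additive modulo 2 under multiplication, which reflects the fact that the group of square classes is isomorphic to a product of two cyclic groups of order 2.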

Theorem 3.13 Let F be a nondyadic p-adic field. Let $H = \left(\frac{a,b}{F}\right)$, where a, b ∈ o.

(a) If $a, b \in o^\times$, then H splits.

(b) If $a \in o^\times$ and $b \in p \setminus p^2$, then H splits if and only if a is a square.

(c) If $a, b \in p \setminus p^2$, then H splits if and only if $-a^{-1}b$ is a square.

Proof. (a) If a is a square, then H clearly splits, so we may assume that a = u. Then b is a norm of an element in $F(\sqrt{a})$. Hence H splits.

(b) Once again, we may assume that a = u, which is not a square mod p. Since $F(\sqrt{u})/F$ is unramified, b is not a norm from $F(\sqrt{u})$. Therefore, H does not split in this case.

(c) Note that $\left(\frac{a,b}{F}\right) = \left(\frac{a,-a^{-1}b}{F}\right)$ and $-a^{-1}b$ is a unit in o. Therefore, $\left(\frac{a,b}{F}\right)$ splits if and only if $-a^{-1}b$ is a square, by part (b). □
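The three cases of Theorem 3.13 translate directly into a splitting test. The sketch below is an illustration under stated assumptions, not the notes' own method: it takes F to be the p-adic numbers for an odd prime p, accepts nonzero integers a and b, and first strips even powers of p (which does not change the algebra, since a and b only matter up to squares). The helper names are hypothetical.

```python
def ord_p(n: int, p: int) -> int:
    """Exponent of the prime p in the nonzero integer n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def is_square_mod_p(c: int, p: int) -> bool:
    """Euler's criterion for a unit c modulo an odd prime p."""
    return pow(c % p, (p - 1) // 2, p) == 1


def splits_at_odd_p(a: int, b: int, p: int) -> bool:
    """Decide whether (a, b / Q_p) splits for an odd prime p, via Theorem 3.13."""
    ka, kb = ord_p(a, p), ord_p(b, p)
    ua, ub = a // p**ka, b // p**kb      # unit parts of a and b
    ea, eb = ka % 2, kb % 2              # orders modulo squares
    if ea == 0 and eb == 0:              # case (a): both units
        return True
    if ea == 0:                          # case (b): a unit, b of order 1
        return is_square_mod_p(ua, p)
    if eb == 0:                          # case (b) with the roles of a and b swapped
        return is_square_mod_p(ub, p)
    # case (c): -a^{-1} b is a square iff -ua*ub is a square mod p
    return is_square_mod_p(-ua * ub, p)
```

For example, `splits_at_odd_p(-1, 3, 3)` is `False` because -1 is not a square modulo 3, while `splits_at_odd_p(-1, 5, 5)` is `True` because -1 is a square modulo 5.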

4 Quaternion Algebras over Number Fields

In this section, F is a number field and o is the ring of integers of F .

4.1 Local to Global

Let K be a finite extension of F. The restriction of a valuation on K to F is a valuation on F. For every place v of F, there are only finitely many places w of K such that the restriction of any valuation in w to F is a valuation in v. Those w are said to be lying above v and we write w | v. Moreover,

$$(\sharp) \qquad K \otimes_F F_v \cong \prod_{w \mid v} K_w.$$

This decomposition gives

$$N_{K/F}(a) = \prod_{w \mid v} N_{w/v}(a)$$

and

$$T_{K/F}(a) = \sum_{w \mid v} T_{w/v}(a).$$

Here $N_{w/v}$ and $T_{w/v}$ are the norm and the trace of the extension $K_w/F_v$. Recall that the valuation on $K_w$ that extends v on $F_v$ is defined by $w(x) = v(N_{w/v}(x))^{1/[K_w : F_v]}$. It also follows from $(\sharp)$ that

$$\sum_{w \mid v} [K_w : F_v] = [K : F].$$

If $e_w$ and $f_w$ are the ramification index and residue degree of $K_w/F_v$, then the Local Fundamental Identity implies:

$$\sum_{w \mid v} e_w f_w = [K : F] \quad \text{(Fundamental Identity)}.$$

We now mention some results from class field theory. An element a ∈ F is called a global norm (of the extension K/F) if $a \in N_{K/F}(K)$. For any $v \in \Omega_F$, a is called a local norm at v if $a \in N_{w/v}(K_w)$ for all w | v.

Theorem 4.1 (Hasse's Norm Theorem) Let K/F be a cyclic extension of number fields, and let a ∈ F. Then a is a global norm if and only if it is a local norm at every $v \in \Omega_F$.

We shall be interested in the special case when K/F is a quadratic extension. In this case, $K = F(\sqrt{\delta})$ for some $\delta \notin F^2$. An element a ∈ F is a global norm if and only if the diophantine equation $x^2 - \delta y^2 = a$ has a solution over F. For every $v \in \Omega_F$, by $(\sharp)$ there are either one or two places w lying above v. The latter occurs exactly when δ is a square in $F_v$, and we say that v splits in K. If there is only one w lying above v, then $K_w/F_v$ is a quadratic extension and a is a local norm at v if and only if $x^2 - \delta y^2 = a$ has a solution over $F_v$. If there are two places of K lying above v, then every a ∈ F is a local norm at v. At the same time, the equation $x^2 - \delta y^2 = a$ always has a solution over $F_v$. Hence Hasse's Norm Theorem in the special case can be rephrased as: $x^2 - \delta y^2 = a$ has a solution over F if and only if $x^2 - \delta y^2 = a$ has a solution over $F_v$ for every $v \in \Omega_F$.

Theorem 4.2 (Global Square Theorem) Let δ be an element in a number field F. Then δ is a square in F if and only if δ is a square in $F_v$ for almost all $v \in \Omega_F$.

Recall that a nonzero vector x in a quadratic space (V, Q) is called an isotropic vector if Q(x) = 0. The space V is called isotropic if it has an isotropic vector, and is called nondegenerate if it does not have any nonzero vector that is orthogonal to all vectors in V. If v is a place of F, then $V_v$ denotes the quadratic space $F_v \otimes_F V$ with quadratic form $Q_v(a \otimes x) = a^2 Q(x)$ for all $a \in F_v$ and x ∈ V. We often abuse the notation and use the same Q, instead of $Q_v$, to denote the quadratic form on $V_v$.

The next theorem is one of the most important theorems in the algebraic theory of quadratic forms.

Theorem 4.3 (Hasse-Minkowski Theorem)

(a) Let V be a nondegenerate quadratic space over a number field F. Then V is isotropic if and only if $V_v$ is isotropic for all places v of F.

(b) Let V and W be nondegenerate quadratic spaces over a number field F. Then V and W are isometric if and only if $V_v$ and $W_v$ are isometric for all places v of F.

Let us look at part (a) of the Hasse-Minkowski Theorem. We may assume, by scaling the quadratic form on V suitably, that there exists $v_1 \in V$ with $Q(v_1) = 1$. We can extend $v_1$ to an orthogonal basis $v_1, \ldots, v_n$ of V. If n = 2, then there exists a $\delta \in F^\times$ such that every Q(v) is of the form $x^2 + \delta y^2$ with x, y ∈ F. So V is isotropic if and only if −δ is a square in F. Thus part (a) in this case is just the Global Square Theorem. If n = 3, then the corresponding quadratic form on V is of the form $x^2 + \delta y^2 + \gamma z^2$. We may assume that −δ is not a square in F. Then V is isotropic if and only if −γ is a norm from the quadratic extension $F(\sqrt{-\delta})$, and part (a) in this case is just Hasse's Norm Theorem. The rest of the proof for part (a) is a (nontrivial) induction on the dimension of V; see page 187 in O'Meara's book.

For part (b), we first observe that in the case n = 1 the theorem is equivalent to the Global Square Theorem. Suppose that n > 1. It suffices to look at the "if" part of the statement. We may assume that V has a vector x such that Q(x) = 1. Then $V \cong \langle 1 \rangle \perp V'$ for some subspace V′ of dimension n − 1. At each v, there is a vector $w_v \in W_v$ such that $Q(w_v) = 1$, since $W_v \cong V_v$. This means that the space $\langle -1 \rangle \perp W_v$ is isotropic for every v and, by part (a), the space $\langle -1 \rangle \perp W$ is isotropic. Thus there exists a vector w ∈ W with Q(w) = 1. So $W \cong \langle 1 \rangle \perp W'$ for some subspace W′ of dimension n − 1. Now, by Witt's Cancellation Theorem, $V'_v \cong W'_v$ for all places v. It follows from an induction on the dimension that $V' \cong W'$, whence $V \cong W$.

4.2 Classification

Let v be a place of F. For any $a, b \in F_v^\times$, define the Hilbert Symbol

$$(a, b)_v = \begin{cases} 1 & \text{if } ax^2 + by^2 = 1 \text{ has a solution in } F_v; \\ -1 & \text{otherwise.} \end{cases}$$

By Theorem 1.13, $(a, b)_v = 1$ if and only if the quaternion algebra $\left(\frac{a,b}{F_v}\right)$ splits. Now, suppose that a and b are in $F^\times$. For almost all finite places v, a and b are units of $o_v$. Therefore, by Theorem 3.13, $(a, b)_v = 1$ for almost all v.

Theorem 4.4 (Hilbert's Reciprocity Law) Let $a, b \in F^\times$. Then

$$\prod_v (a, b)_v = 1,$$

where the product is taken over all places of F.


Let H be a quaternion algebra over F. For any place v of F, let $H_v$ denote the quaternion algebra $F_v \otimes_F H$ over $F_v$. If v is a complex place, then $H_v$ necessarily splits. However, if v is a real place or a finite place, then $H_v$ either splits or is isomorphic to the unique quaternion division algebra over $F_v$. Note that $H_v$ splits for almost all places v.

Theorem 4.5 Let H be a quaternion algebra over a number field F. Then H splits over F if and only if $H_v$ splits over $F_v$ for all places v of F.

Proof. By Theorem 1.13, H splits over F if and only if $H_0$ is isotropic. The theorem now follows immediately from the Hasse-Minkowski Theorem. □

Definition 4.6 Let H be a quaternion algebra over a number field F. Then H is said to be ramified at a place v if $H_v$ is a division algebra. Otherwise, H splits at v. The set of places at which H is ramified is denoted by Ram(H).

Proposition 4.7 The set Ram(H) is a finite set containing an even number of places.

Proof. This is a consequence of Hilbert's Reciprocity Law. □
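Over F = Q the set Ram(H) can be computed explicitly for $H = \left(\frac{a,b}{\mathbf{Q}}\right)$ with nonzero integers a and b. The sketch below is an illustration and not part of the notes: it relies on the classical closed formulas for the Hilbert symbol over the p-adic numbers (as found, for instance, in Serre's "A Course in Arithmetic"), which the notes do not state, and it uses the hypothetical convention that p = 0 stands for the real place.

```python
def split_unit(n: int, p: int):
    """Write n = p^k * u with u prime to p; return (k, u)."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k, n


def hilbert_symbol(a: int, b: int, p: int) -> int:
    """(a, b)_p for nonzero integers a, b; p is a prime, or 0 for the real place."""
    if p == 0:
        return -1 if (a < 0 and b < 0) else 1
    alpha, u = split_unit(a, p)
    beta, v = split_unit(b, p)
    if p == 2:
        eps = lambda x: ((x - 1) // 2) % 2        # 0 iff x = 1 mod 4
        omega = lambda x: ((x * x - 1) // 8) % 2  # 0 iff x = +-1 mod 8
        exponent = eps(u) * eps(v) + alpha * omega(v) + beta * omega(u)
        return -1 if exponent % 2 else 1
    legendre = lambda x: 1 if pow(x % p, (p - 1) // 2, p) == 1 else -1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(u) ** beta * legendre(v) ** alpha


def odd_prime_divisors(n: int):
    """Odd primes dividing the nonzero integer n, by trial division."""
    n = abs(n)
    while n % 2 == 0:
        n //= 2
    primes, d = [], 3
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 2
    if n > 1:
        primes.append(n)
    return primes


def ramified_places(a: int, b: int):
    """Places of Q (0 = real place) at which (a, b / Q) is ramified."""
    # By Theorem 3.13(a), odd primes not dividing ab cannot be ramified.
    candidates = [0, 2] + odd_prime_divisors(a * b)
    return [p for p in candidates if hilbert_symbol(a, b, p) == -1]
```

For Hamilton's quaternions over Q, `ramified_places(-1, -1)` returns `[0, 2]`: the algebra is ramified exactly at the real place and at 2, an even number of places as Proposition 4.7 requires.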

Theorem 4.8 Let H and H′ be quaternion algebras over a number field F. Then $H \cong H'$ if and only if Ram(H) = Ram(H′).

Proof. By Theorem 2.8, H and H′ are isomorphic if and only if $H_0$ and $H'_0$ are isometric as quadratic spaces. By the Hasse-Minkowski Theorem, $H_0$ and $H'_0$ are isometric if and only if $(H_0)_v$ and $(H'_0)_v$ are isometric for all places v, which is the same as saying that $H_v$ and $H'_v$ are isomorphic for all places v. But $H_v$ and $H'_v$ are isomorphic if and only if either they both split or they are both ramified. Thus H and H′ are isomorphic if and only if they are ramified at the same set of places. □

5 Orders in Quaternion Algebras

5.1 Orders

Throughout this subsection, F is either a number field or a p-adic field. Its ring of integers o is a Dedekind domain. In particular, F is the field of fractions of o, and o is an integrally closed Noetherian ring in which every nonzero prime ideal is maximal.

Let $I_F$ be the set of nonzero finitely generated o-submodules of F. The elements in $I_F$ are called the fractional ideals of F. The nonzero ideals of o are elements of $I_F$, and they are called the integral ideals of F. Let $\mathfrak{a}, \mathfrak{b}$ be two fractional ideals. Their product $\mathfrak{a}\mathfrak{b}$ is the o-module generated by the products ab with $a \in \mathfrak{a}$ and $b \in \mathfrak{b}$. The inverse of $\mathfrak{a}$ is defined to be $\mathfrak{a}^{-1} = \{x \in F : x\mathfrak{a} \subseteq o\}$. It turns out that $\mathfrak{a}\mathfrak{b}$ and $\mathfrak{a}^{-1}$ are also fractional ideals. In fact, $I_F$ is an abelian group under the multiplication of fractional ideals just defined. The identity element is o, and $\mathfrak{a}^{-1}$ is indeed the inverse of $\mathfrak{a}$, that is, $\mathfrak{a}\mathfrak{a}^{-1} = o$. An important result about Dedekind domains is that $I_F$ is the free abelian group on the set of nonzero prime ideals of o. In other words, every fractional ideal $\mathfrak{a}$ has a unique prime ideal factorization

$$\mathfrak{a} = p_1^{a_1} \cdots p_t^{a_t},$$

where each $p_i$ is a nonzero prime ideal of o and each $a_i$ is a nonzero integer.

Let $P_F$ be the set of principal fractional ideals αo, $\alpha \in F^\times$. Then $P_F$ is a subgroup of $I_F$, and the quotient $I_F/P_F$ is called the ideal class group of F. This group is a finite group for those fields F we are considering here. The order of this group is called the class number of F. When F is a p-adic field, its class number is always 1.

Definition 5.1 Let V be a finite dimensional vector space over F. An o-lattice in V is a finitely generated o-module contained in V. An o-lattice L in V is said to be complete if FL = V.

From now on, unless stated otherwise, every vector space is finite dimensional over F and every lattice in V is an o-lattice. Since o is a Dedekind domain, every lattice L in a vector space V can be written as $L = ox_1 \oplus \cdots \oplus ox_{k-1} \oplus \mathfrak{a}x_k$ for some $x_1, \ldots, x_k \in V$ and a fractional ideal $\mathfrak{a}$. If L is complete, then $x_1, \ldots, x_k$ is necessarily a basis of V.

Theorem 5.2 (Invariant Factor Theorem) Let L and M be two complete lattices in a vector space V over F. Then there is a basis $x_1, \ldots, x_n$ of V such that

$$L = \mathfrak{a}_1 x_1 + \cdots + \mathfrak{a}_n x_n, \qquad M = \mathfrak{a}_1\mathfrak{r}_1 x_1 + \cdots + \mathfrak{a}_n\mathfrak{r}_n x_n,$$

where $\mathfrak{a}_1, \ldots, \mathfrak{a}_n, \mathfrak{r}_1, \ldots, \mathfrak{r}_n$ are fractional ideals of F with $\mathfrak{r}_1 \supseteq \mathfrak{r}_2 \supseteq \cdots \supseteq \mathfrak{r}_n$. The $\mathfrak{r}_i$ determined in this way are unique.

The fractional ideals $\mathfrak{r}_1, \ldots, \mathfrak{r}_n$ of the last theorem are called the invariant factors of M in L. It is clear that M ⊆ L if and only if all the $\mathfrak{r}_i$ are integral ideals.

Corollary 5.3 Let L be a complete lattice in a vector space V and M be an o-module contained in V. Then M is a complete lattice if and only if there exists a nonzero a ∈ o such that $aL \subseteq M \subseteq a^{-1}L$.

Proof. Suppose that there is a nonzero a ∈ o such that $aL \subseteq M \subseteq a^{-1}L$. Since $a^{-1}L$ is a finitely generated o-module and o is Noetherian, M is also finitely generated. Moreover, since aL ⊆ M, M contains a basis of V. Thus M is a complete lattice.

Conversely, suppose that M is a complete lattice. By the Invariant Factor Theorem, there exists a nonzero a ∈ o such that aL ⊆ M and aM ⊆ L. □

Definition 5.4 Let H be a quaternion algebra over F. An o-ideal in H is a complete o-lattice in H. An order in H is an o-ideal which is also a ring. A maximal order is an order which is maximal with respect to inclusion.


Henceforth, H is always a quaternion algebra over F. We first demonstrate the existence of an order in H. Unless stated otherwise, an ideal in H is always an o-ideal. If I is an ideal in H, then the left order of I and the right order of I are defined respectively by

$$O_\ell(I) = \{\alpha \in H : \alpha I \subseteq I\}, \qquad O_r(I) = \{\alpha \in H : I\alpha \subseteq I\}.$$

Lemma 5.5 If I is an ideal in H, then $O_\ell(I)$ and $O_r(I)$ are orders in H.

Proof. We shall show only that $O_\ell(I)$ is an order; the argument for $O_r(I)$ will be the same. Clearly, $O_\ell(I)$ is a subring and an o-submodule of H.

Since I is an ideal in H, there exists a nonzero s ∈ o such that s · 1 ∈ I. Therefore, $O_\ell(I)(s \cdot 1) \subseteq I$; whence $O_\ell(I) \subseteq s^{-1}I$. This shows that $O_\ell(I)$ is finitely generated as an o-module. So $O_\ell(I)$ is a lattice.

Now, for any y ∈ H, yI is a lattice in H. Therefore, there exists a nonzero a ∈ o such that ayI ⊆ I. Then $ay \in O_\ell(I)$ and hence $FO_\ell(I) = H$. This completes the proof that $O_\ell(I)$ is an order in H. □

Let O be an order in H. Since O is a finitely generated o-module and o is Noetherian, every element of O is integral over o. More generally, suppose α ∈ H is integral over o. Since $\alpha^2 - \mathrm{tr}(\alpha)\alpha + \mathrm{nr}(\alpha) = 0$, it follows that tr(α) and nr(α) are in o.

Lemma 5.6 Let O be a subring of H. Then O is an order in H if and only if O contains o, FO = H and O is integral over o.

Proof. It is clear that if O is an order in H, then O has all the properties stated in the lemma.

For the converse, let $x_1, x_2, x_3, x_4$ be a basis of H such that $x_i \in O$ for all i. It can be checked readily that H, equipped with the symmetric bilinear form $(x, y) \mapsto \mathrm{tr}(xy)$ given by the (reduced) trace, is a nondegenerate quadratic space. Therefore, $d = \det(\mathrm{tr}(x_i x_j)) \ne 0$. Let L be the ideal spanned by the $x_i$. Then L ⊆ O. Suppose that α ∈ O so that

$$\alpha = \sum_{i=1}^{4} b_i x_i, \qquad b_i \in F \text{ for all } i.$$

For each j, $\alpha x_j \in O$ and so

$$\mathrm{tr}(\alpha x_j) = \sum_{i=1}^{4} b_i\,\mathrm{tr}(x_i x_j) \in o.$$

Solving this linear system by Cramer's rule gives $b_i \in d^{-1}o$, and so $O \subseteq d^{-1}L$. So O is a finitely generated o-module, which implies that O is an order. □

Corollary 5.7 Every order in H is contained in a maximal order.

Proof. Apply Zorn's Lemma and the characterization of orders given in the last lemma. □


The set of all elements in H that are integral over o is not necessarily an order. For example, let $H = \left(\frac{-1,-1}{\mathbf{Q}}\right)$ with standard basis 1, i, j, ij. Then α = i and β = (3i + 4j)/5 are integral over Z, but neither αβ nor α + β is integral over Z.

If O is an order in H and $\alpha \in H^\times$, then $\alpha O \alpha^{-1}$ is also an order in H. So the conjugate of a maximal order in H is also a maximal order. However, there could be more than one conjugacy class of maximal orders in a quaternion algebra.
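The example with α = i and β = (3i + 4j)/5 can be checked by direct computation. The following sketch is an illustration only: it represents an element of the rational Hamilton quaternions as a 4-tuple of `Fraction` coordinates, and it uses the criterion stated earlier that an element is integral over Z exactly when its reduced trace and reduced norm are integers. All names are hypothetical.

```python
from fractions import Fraction


def quat(*coords):
    """Element x0 + x1*i + x2*j + x3*k of (-1, -1 / Q) as a tuple of Fractions."""
    return tuple(Fraction(c) for c in coords)


def qmul(x, y):
    """Product in Hamilton's quaternions (i^2 = j^2 = -1, ij = k = -ji)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)


def qadd(x, y):
    return tuple(s + t for s, t in zip(x, y))


def tr(x):
    """Reduced trace: twice the scalar part."""
    return 2 * x[0]


def nr(x):
    """Reduced norm: sum of the squares of the coordinates."""
    return sum(c * c for c in x)


def is_integral(x):
    """Integral over Z iff both tr(x) and nr(x) are integers."""
    return tr(x).denominator == 1 and nr(x).denominator == 1


alpha = quat(0, 1, 0, 0)
beta = quat(0, Fraction(3, 5), Fraction(4, 5), 0)
product = qmul(alpha, beta)   # -3/5 + (4/5)k, with trace -6/5
total = qadd(alpha, beta)     # (8/5)i + (4/5)j, with norm 16/5
```

Here `is_integral(alpha)` and `is_integral(beta)` hold, while `is_integral(product)` and `is_integral(total)` both fail.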

5.2 Localizations I

In this subsection, F is a number field and o is the ring of integers in F. The symbol p always denotes a finite place of F or its associated prime ideal of o. For any $p \in \Omega_f$, let $o_{(p)}$ be the localization of o with respect to the multiplicative set o \ p. In other words,

$$o_{(p)} = \{a/b \in F : a \in o,\ b \in o \setminus p\}.$$

It is a local ring with maximal ideal

$$p = \{a/b \in F : a \in p,\ b \in o \setminus p\}.$$

Let x be a nonzero element in F. The exponent of p appearing in the prime ideal factorization of the fractional ideal xo is denoted by $\mathrm{ord}_p(x)$. It is our convention that $\mathrm{ord}_p(0) = \infty$. We claim that

$$o_{(p)} = \{\alpha \in F : \mathrm{ord}_p(\alpha) \ge 0\}$$

and so

$$p = \{\alpha \in F : \mathrm{ord}_p(\alpha) > 0\}.$$

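It is clear that $\mathrm{ord}_p(\alpha) \ge 0$ for all $\alpha \in o_{(p)}$. Conversely, suppose that $\mathrm{ord}_p(\alpha) \ge 0$ and write α = a/b, where a, b ∈ o. Then $\mathrm{ord}_p(a) \ge \mathrm{ord}_p(b)$. Suppose that $\mathrm{ord}_p(b) = n$, or equivalently, $bo = p^n\mathfrak{b}$ for some integral ideal $\mathfrak{b}$ not divisible by p. Let $\mathfrak{a}$ be an integral ideal belonging to the ideal class containing $p^{-1}$. So $\mathfrak{a}p$ is a principal ideal to for some t ∈ o.

If $p \nmid \mathfrak{a}$, then $\mathfrak{a} \not\subseteq p$. Then there exists $x \in \mathfrak{a}$ with x ∉ p. Therefore, $xo = \mathfrak{a}\mathfrak{c}$, where $p \nmid \mathfrak{c}$. This implies

$$x^n b\,o = (\mathfrak{a}p)^n \mathfrak{c}^n \mathfrak{b} = t^n \mathfrak{c}^n \mathfrak{b}.$$

Set $b' := x^n b/t^n$, which is an element in o \ p. Using the same argument we can show that the element $a' := x^n a/t^n$ is in o. Then $\alpha = a/b = a'/b' \in o_{(p)}$.

If $p \mid \mathfrak{a}$, then $\mathfrak{a} \subseteq p$. Choose an integral ideal $\mathfrak{i}$ such that $\mathfrak{i}\mathfrak{a}$ is principal, generated by δ ∈ o. Then $\mathfrak{i}\mathfrak{a} \not\subseteq \delta p$. Fix an $\varepsilon \in \mathfrak{i}$ such that $\varepsilon\mathfrak{a} \not\subseteq \delta p$ and set γ = ε/δ. Then $\gamma\mathfrak{a} \subseteq o$ and $\gamma\mathfrak{a} \not\subseteq p$. So $\gamma\mathfrak{a}$ is an integral ideal in the ideal class containing $p^{-1}$ and $p \nmid \gamma\mathfrak{a}$. We can then replace $\mathfrak{a}$ by $\gamma\mathfrak{a}$ in the last paragraph.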

Let I be a nonzero ideal of $o_{(p)}$. Among all the elements in I, choose one, say x, such that $\mathrm{ord}_p(x)$ is the smallest. For any a ∈ I, $\mathrm{ord}_p(a) \ge \mathrm{ord}_p(x)$, which implies that $\mathrm{ord}_p(ax^{-1}) \ge 0$, that is, $ax^{-1} \in o_{(p)}$. Thus $a \in xo_{(p)}$ and hence $I = xo_{(p)}$. This shows that $o_{(p)}$ is a PID and its nonzero ideals are $\pi^n o_{(p)} = p^n$, n ≥ 0, where π is any element with $\mathrm{ord}_p(\pi) = 1$. The rings $o_{(p)}$ are subrings of F and o can be recovered from them as

$$o = \bigcap_{p \in \Omega_f} o_{(p)}.$$
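For F = Q and o = Z the description of the localizations by $\mathrm{ord}_p$ is easy to experiment with. The sketch below is only an illustration with hypothetical function names: it computes $\mathrm{ord}_p$ of a rational number, tests membership in the localization of Z at p, and recovers Z as the intersection of these local rings by checking the primes that divide the denominator (no other prime can fail).

```python
from fractions import Fraction


def ord_p(x: Fraction, p: int) -> int:
    """Exponent of the prime p in the factorization of the nonzero rational x."""
    if x == 0:
        raise ValueError("ord_p(0) is infinite by convention")
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def in_localization(x: Fraction, p: int) -> bool:
    """Membership in Z_(p) = {x in Q : ord_p(x) >= 0}."""
    return x == 0 or ord_p(x, p) >= 0


def prime_divisors(n: int):
    """Primes dividing the positive integer n, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def in_all_localizations(x: Fraction) -> bool:
    """x lies in every Z_(p) iff it passes the test at the primes dividing its denominator."""
    return all(in_localization(x, p) for p in prime_divisors(x.denominator))
```

For instance, 3/4 lies in the localizations at 3 and at 5 but not in the localization at 2, so it is not in the intersection, in agreement with 3/4 not being an integer.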

Lemma 5.8 Let V be a finite-dimensional vector space over F and let L be an o-lattice in V. Then

$$L = \bigcap_{p \in \Omega_f} o_{(p)}L.$$

Proof. It is clear that L is contained in the intersection. For the converse, let $x_1, \ldots, x_k$ be a generating set for L as an o-module; thus it is also a generating set for $o_{(p)}L$ as an $o_{(p)}$-module for each p. Suppose that x is in the intersection. Let

$$J = \{y \in o : yx \in L\}.$$

Then J is an integral ideal of o. Fix a p in $\Omega_f$. We can write $x = \sum_{i=1}^{k} a_i x_i$ with $a_i = b_i/c_i$, where $b_i, c_i \in o$ and $c_i \notin p$ for all i. Let $c = c_1 \cdots c_k$ so that c ∉ p. However, c ∈ J. Thus J is an integral ideal of o which does not lie in any nonzero prime ideal of o. This shows that J = o. Thus 1 ∈ J and x ∈ L. □

This result will be applied to the case when V is a quaternion algebra H over F and L is an o-ideal in H.

Lemma 5.9 Let I be an o-ideal in a quaternion algebra H over F. For each prime ideal $p \in \Omega_f$, let I(p) be an $o_{(p)}$-ideal in H such that $I(p) = o_{(p)}I$ for almost all p. Then

$$J = \bigcap_{p \in \Omega_f} I(p)$$

is an o-ideal in H such that $o_{(p)}J = I(p)$ for all p.

Proof. Let $x_1, x_2, x_3, x_4$ be a basis of H in I, and let L be the o-ideal $ox_1 + \cdots + ox_4$. Then L ⊆ I, and there exists a nonzero r ∈ o such that rI ⊆ L. For almost all p, r is a unit in $o_{(p)}$. Therefore, for almost all p,

$$o_{(p)}L = o_{(p)}I = I(p).$$

As a result, we can find a nonzero a ∈ o such that

$$aI(p) \subseteq o_{(p)}L \subseteq a^{-1}I(p) \quad \text{for all } p.$$

Then

$$J = \bigcap_{p \in \Omega_f} I(p) \subseteq a^{-1}\bigcap_{p \in \Omega_f} o_{(p)}L = a^{-1}L$$

by Lemma 5.8. Thus J is an o-lattice in H. By the same token, aL ⊆ J; thus J is an o-ideal in H.

Now, for each $p \in \Omega_f$, $o_{(p)}J \subseteq o_{(p)}I(p) = I(p)$. For the reverse inclusion, let $j_1, \ldots, j_k$ be a generating set for J as an o-module. Let x ∈ I(p) so that $x = \sum_i a_i j_i$ with $a_i \in F$. Choose a nonzero $s_1 \in o$ so that $s_1 a_1 \in o$. Suppose that $s_1 o$ has the following prime ideal factorization

$$s_1 o = p^{n_0} q_1^{n_1} \cdots q_t^{n_t}$$

where $n_i \ge 1$ for 0 ≤ i ≤ t. By the Chinese Remainder Theorem, there exists $d_1 \in o$ such that

$$d_1 \equiv \begin{cases} s_1 a_1 + s_1 \bmod p^{n_0+1}, \\ s_1 \bmod q_i^{n_i+1} & \text{for } 1 \le i \le t. \end{cases}$$

Then $b_1 = d_1/s_1$ is such that $b_1 - a_1 \in o_{(p)}$ and $b_1 \in o_{(q)}$ for all prime ideals q ≠ p. Repeat the same for each $a_i$ to obtain a $b_i$, and let $y = \sum b_i j_i$. Then $y \in o_{(q)}J \subseteq I(q)$ for all q ≠ p. Also, $y - x = \sum (b_i - a_i)j_i \in o_{(p)}J \subseteq I(p)$. Thus y ∈ I(p) and so y ∈ J. Hence $x = y - (y - x) \in o_{(p)}J$. □

Note that if O is an o-order in H, then $o_{(p)}O$ is an $o_{(p)}$-order in H and the above lemma holds with "ideals" replaced by "orders".

Lemma 5.10 Let O be an o-order in a quaternion algebra H over F. Then O is a maximal o-order if and only if $o_{(p)}O$ is a maximal $o_{(p)}$-order for all $p \in \Omega_f$.

Proof. Suppose that O is maximal but $o_{(p)}O \subseteq \Lambda(p)$ for some $o_{(p)}$-order Λ(p). Define an order O′ by

$$o_{(q)}O' = \begin{cases} o_{(q)}O & \text{if } q \ne p; \\ \Lambda(p) & \text{if } q = p. \end{cases}$$

Then O ⊆ O′; so O = O′ and hence $o_{(p)}O = \Lambda(p)$.

Conversely, suppose that each $o_{(p)}O$ is maximal and O is contained in a maximal o-order O′. Then clearly $o_{(p)}O \subseteq o_{(p)}O'$ for all p. By maximality, we have

$$o_{(p)}O = o_{(p)}O' \quad \text{for all } p.$$

The result then follows from Lemma 5.8. □

In later discussion, we will identify o(p)O with o(p)⊗o O.

5.3 Localizations II

In this subsection, we shall interpret the local-global results obtained in the last subsection in the context of ideals and orders over the p-adics. If H is a quaternion algebra over F, then $H_p$ denotes the quaternion algebra $F_p \otimes_F H$. For any lattice L in H, $L_p$ denotes the $o_p$-lattice $o_p \otimes_o L$ in $H_p$. If O is an order in H, then $O_p$ is an $o_p$-order in $H_p$. Note that $O_p = o_p \otimes_o O = o_p \otimes_{o_{(p)}} (o_{(p)}O)$.


Lemma 5.11 There is a bijection between $o_{(p)}$-ideals (resp. orders) in a quaternion algebra H over F and the $o_p$-ideals (resp. orders) in the quaternion algebra $H_p$ over $F_p$ given by the map

$$I \longmapsto o_p \otimes_{o_{(p)}} I$$

which has the inverse $J \mapsto J \cap H$.

Proof. Since $o_{(p)}$ is a PID, I is free as an $o_{(p)}$-module. Let $x_1, x_2, x_3, x_4$ be a basis of I over $o_{(p)}$. Then in $F_p \otimes_F H$, $(o_p \otimes_{o_{(p)}} I) \cap H$ consists of the $o_p \cap F = o_{(p)}$ linear combinations of the $x_i$. Thus $(o_p \otimes_{o_{(p)}} I) \cap H = I$.

Now, suppose that J is an $o_p$-ideal in $H_p$ and that $y_1, y_2, y_3, y_4$ is a basis of J over $o_p$. Let $z_1, z_2, z_3, z_4$ be a basis of H over F so that $z_i = \sum_j b_{ij} y_j$ for all i. Then $B = (b_{ij})$ is an invertible matrix in $M_4(F_p)$. Since F is dense in $F_p$, we can choose $c_{ij} \in F$ such that the entries of $C = (c_{ij})$ are close enough to those of $B^{-1}$ to make CB a unit in $M_4(o_p)$.

Now let $z'_i = \sum_j c_{ij} z_j = \sum_{j,k} c_{ij} b_{jk} y_k$. Then $z'_1, z'_2, z'_3, z'_4$ is a basis of J over $o_p$, which is also a basis of H over F. Thus J ∩ H consists of the $o_p \cap F = o_{(p)}$ linear combinations of the $z'_i$, and so is an $o_{(p)}$-ideal in H such that $o_p \otimes_{o_{(p)}} (J \cap H) = J$. □

Fix an o-ideal I in H. Let $\mathcal{I}$ be the set of o-ideals in H, and let $\mathcal{T}$ be the set of all sequences $(L_p)$ such that $L_p$ is an $o_p$-ideal in $H_p$ for all $p \in \Omega_f$ and $L_p = I_p$ for almost all p.

Lemma 5.12 The map $J \mapsto (J_p)$ is a bijection from $\mathcal{I}$ to $\mathcal{T}$.

Proof. If J is an o-ideal in H, then there exist nonzero a, b ∈ F such that aJ ⊆ I ⊆ bJ. For almost all p, a and b are units in $o_p$, so that $J_p = I_p$ for almost all p.

Now, suppose that a sequence $(L_p)$ in $\mathcal{T}$ is given. Let $J(p) = H \cap L_p$, which is an $o_{(p)}$-ideal in H by Lemma 5.11. Furthermore, $J(p) = o_{(p)}I$ for almost all p. Then $J = \bigcap_p J(p)$ is an o-ideal in H by Lemma 5.9, and $J_p = L_p$ for all p. Thus the map $J \mapsto (J_p)$ is surjective. Now if ideals J and J′ have the same image under this map, then $o_{(p)}J = o_{(p)}J'$ for all p. Then, by Lemma 5.8, J = J′ and the map is injective. □

Corollary 5.13 Let O be an o-order in the quaternion algebra H over F. Then O is maximal if and only if $O_p$ is a maximal $o_p$-order in $H_p$ for all $p \in \Omega_f$.

Proof. Exercise. □

5.4 Discriminants

In this subsection, F is the field of fractions of a Dedekind domain o, and H is a quaternion algebra over F.

Definition 5.14 Let O be an o-order in H. The discriminant of O, denoted d(O), is the fractional ideal of o generated by the elements $\det(\mathrm{tr}(x_i x_j))$, where $x_1, x_2, x_3, x_4 \in O$.

Since O is an o-order, it must contain a basis of H over F. Thus d(O) is nonzero (recall that H together with $\mathrm{tr}(x^2)$ as the quadratic form is a nondegenerate quadratic space over F). Also, since every element in O is integral over o, d(O) is an integral ideal.


Proposition 5.15 If an o-order O in H is free with a basis $u_1, u_2, u_3, u_4$ over o, then d(O) is the principal ideal generated by $\det(\mathrm{tr}(u_i u_j))$.

Proof. Let $x_1, x_2, x_3, x_4 \in O$ so that $x_i = \sum_k a_{ik} u_k$ with $a_{ik} \in o$ for all i, k. Then

$$\det(\mathrm{tr}(x_i x_j)) = \det(a_{ik})\,\det(\mathrm{tr}(u_i u_j))\,\det(a_{ik})^t$$

and the result follows. □

Example 5.16 Let $O = M_2(o)$. It has a basis $\{E_{ij} : 1 \le i, j \le 2\}$, where $E_{ij}$ is the matrix with 1 in the (i, j)-entry and 0 elsewhere. Using this basis one can easily compute d(O) = o.
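The computation in Example 5.16 can be carried out mechanically for o = Z. The sketch below is illustrative, with hypothetical helper names: it builds the matrix of traces $\mathrm{tr}(E_{ij}E_{kl})$, where the reduced trace on 2-by-2 matrices is the usual matrix trace, and evaluates its determinant by cofactor expansion.

```python
from itertools import product


def matmul2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(x):
    return x[0][0] + x[1][1]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def elementary(i, j):
    """The matrix unit with 1 in position (i, j) and 0 elsewhere (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(2)] for r in range(2)]


basis = [elementary(i, j) for i, j in product(range(2), repeat=2)]  # E11, E12, E21, E22
gram = [[trace(matmul2(x, y)) for y in basis] for x in basis]
disc = det(gram)
```

The determinant is a unit of Z, so the discriminant ideal it generates is Z itself.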

Example 5.17 Let H be the quaternion algebra $\left(\frac{-1,-1}{\mathbf{Q}}\right)$ and O be the Z-order Z + Zi + Zj + Zij. Then d(O) = 16Z. Let O′ be the Z-order O + Zα, where α = (1 + i + j + ij)/2. Then O ⊆ O′, and d(O′) = 4Z.
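Both discriminants in Example 5.17 can be verified numerically via Proposition 5.15. The sketch below is an illustration with hypothetical names: it uses the Z-basis 1, i, j, ij for O and the Z-basis 1, i, j, α for O′ (valid because ij = 2α − 1 − i − j), with quaternions stored as 4-tuples of rational coordinates.

```python
from fractions import Fraction


def qmul(x, y):
    """Product in Hamilton's quaternions (i^2 = j^2 = -1, ij = k = -ji)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def gram_det(basis):
    """det(tr(x_i x_j)), where the reduced trace is twice the scalar part."""
    return det([[2 * qmul(x, y)[0] for y in basis] for x in basis])


one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
half = Fraction(1, 2)
alpha = (half, half, half, half)       # (1 + i + j + ij) / 2

disc_O = gram_det([one, i, j, k])          # generates d(O)
disc_O_prime = gram_det([one, i, j, alpha])  # generates d(O')
```

The two determinants are −16 and −4, which generate the ideals 16Z and 4Z.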

Lemma 5.18 Suppose that o is a PID. If $O_1$ and $O_2$ are two o-orders in H with $O_1 \subseteq O_2$, then $d(O_2) \mid d(O_1)$, and $O_1 = O_2$ if and only if $d(O_1) = d(O_2)$.

Proof. The first assertion is clear. For the second assertion, suppose that $d(O_1) = d(O_2)$. Let $u_1, u_2, u_3, u_4$ be an o-basis of $O_1$, and let $v_1, v_2, v_3, v_4$ be an o-basis of $O_2$. Since $O_1 \subseteq O_2$, the matrix T that expresses the $u_i$ in terms of the $v_j$ has entries in o. But

$$\det(T)^2 \det(\mathrm{tr}(v_i v_j)) = \det(\mathrm{tr}(u_i u_j));$$

thus $T \in GL_4(o)$ and hence $O_1 = O_2$. □

Now, assume that F is a number field and o is its ring of integers. Let O be an o-order in H. Then it can readily be shown that $d(o_{(p)}O) = o_{(p)}d(O)$ for all $p \in \Omega_f$. Each $o_{(p)}$ is a PID and we can compute $d(o_{(p)}O)$ using a basis of $o_{(p)}O$. Then, by Lemma 5.8,

$$d(O) = \bigcap_{p \in \Omega_f} d(o_{(p)}O).$$

Theorem 5.19 Suppose that F is a number field and o is its ring of integers. Let $O_1$ and $O_2$ be o-orders in H with $O_1 \subseteq O_2$. Then $d(O_2) \mid d(O_1)$, and $d(O_1) = d(O_2)$ if and only if $O_1 = O_2$. In particular, O is maximal if d(O) = o.

Proof. The first assertion is clear. Suppose that $d(O_1) = d(O_2)$. Then $d(o_{(p)}O_1) = d(o_{(p)}O_2)$ for all $p \in \Omega_f$. It follows from Lemma 5.18 that $o_{(p)}O_1 = o_{(p)}O_2$ for all p, and so $O_1 = O_2$ by Lemma 5.8. □

Let us continue to assume that F is a number field and o is its ring of integers. Suppose that

$$d(O) = \prod_{i=1}^{r} p_i^{n_i}$$

is the prime ideal factorization of d(O). If p is not one of those $p_i$, then $d(o_{(p)}O) = o_{(p)}d(O) = o_{(p)}$. For i = 1, . . . , r, $d(o_{(p_i)}O) = p_i^{n_i}o_{(p_i)}$. Thus

$$d(O) = \prod_{p \in \Omega_f} \bigl(d(o_{(p)}O) \cap o\bigr).$$

Over the p-adics, one can show that $d(O_p) = d(O)_p$ for all p. Since the unique prime ideal in $o_p$ is $po_p$, one can, with an abuse of notation, rewrite the above product as

$$d(O) = \prod_{p \in \Omega_f} d(O_p).$$

5.5 Orders in M2(F )

In this subsection, we discuss the special case when $H = M_2(F)$. Here F is the field of fractions of a Dedekind domain o. Let V be a 2-dimensional vector space over F. We fix a basis $e_1, e_2$ of V so that $M_2(F)$ is identified with End(V). The o-lattice $oe_1 + oe_2$ is denoted by $L_0$.

If L is a complete o-lattice in V, define

$$\mathrm{End}(L) = \{\sigma \in \mathrm{End}(V) : \sigma(L) \subseteq L\}.$$

In particular, $\mathrm{End}(L_0)$ is identified with the o-order $M_2(o)$. It is clear that End(L) is a subring of End(V) for any L. Moreover, End(L) = End(aL) for all $a \in F^\times$.

For any complete o-lattice L in V, there exists a nonzero a ∈ o such that $aL_0 \subseteq L \subseteq a^{-1}L_0$. It follows that

$$a^2\,\mathrm{End}(L_0) \subseteq \mathrm{End}(L) \subseteq a^{-2}\,\mathrm{End}(L_0).$$

Thus End(L) is an o-order in H.

Lemma 5.20 M2(o) is a maximal o-order in M2(F ).

Proof. This is clear because the discriminant of $M_2(o)$ is o. □

Lemma 5.21 Let O be an o-order in End(V). Then there exists a complete o-lattice L in V such that O ⊆ End(L).

Proof. Let $L = \{\ell \in L_0 : O\ell \subseteq L_0\}$. Then L is an o-submodule of $L_0$. In particular, L is finitely generated. Also, if 0 ≠ a in o is such that $a\,\mathrm{End}(L_0) \subseteq O \subseteq a^{-1}\mathrm{End}(L_0)$, then for all $\ell \in L_0$, we have

$$Oa\ell \subseteq \mathrm{End}(L_0)\ell \subseteq L_0.$$

Thus $aL_0 \subseteq L$ and L is a complete o-lattice in V.

Let α ∈ O. For any ℓ ∈ L, $O\alpha\ell \subseteq O\ell \subseteq L_0$. Therefore, αℓ ∈ L and O ⊆ End(L). □


Corollary 5.22 Suppose that o is a PID. Then the maximal o-orders in $M_2(F)$ are precisely the orders End(L). Every maximal o-order in $M_2(F)$ is conjugate to $M_2(o)$.

Proof. Let O be a maximal o-order in $M_2(F)$. By Lemma 5.21, there exists a complete o-lattice L in V such that O ⊆ End(L), and hence O = End(L) by maximality. Conversely, if L is a complete o-lattice in V, then $L = of_1 + of_2$ for some basis $f_1, f_2$ of V. Let σ ∈ End(V) be defined by $\sigma(e_i) = f_i$ for i = 1, 2. Then $L = \sigma(L_0)$ and hence $\mathrm{End}(L) = \sigma\,\mathrm{End}(L_0)\,\sigma^{-1} = \sigma M_2(o)\sigma^{-1}$. This shows that End(L) is a maximal o-order, and that every maximal o-order in $M_2(F)$ is conjugate to $M_2(o)$. □

When o is not a PID, not every L is free as an o-module. However, since o is a Dedekind domain, there exists a basis x, y of V and a fractional ideal $\mathfrak{a}$ such that $L = ox + \mathfrak{a}y$. Thus End(L) is a conjugate of

$$M_2(o, \mathfrak{a}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, d \in o,\ b \in \mathfrak{a}^{-1},\ c \in \mathfrak{a} \right\}.$$

Now suppose that F is a p-adic field with ring of integers o. Let π be a uniformizer for F and let q be the size of the residue field o/p. Then for any integer n ≥ 0, $[o : p^n] = q^n$.

Lemma 5.23 Let O and O′ be maximal o-orders in $M_2(F)$. Then

$$\frac{O}{O \cap O'} \cong \frac{O'}{O \cap O'}$$

as o-modules.

Proof. Suppose that O = End(L) and O′ = End(L′), where L and L′ are complete o-lattices in V. By the Invariant Factor Theorem and scaling L′ by an element in $F^\times$ if necessary, we can find a basis e, f of L and n ≥ 0 such that $e, \pi^n f$ is a basis of L′. Using e, f, we can identify O with $M_2(o)$ and O′ with $xOx^{-1}$, where x is the matrix $\begin{pmatrix} 1 & 0 \\ 0 & \pi^n \end{pmatrix}$. Then $O' = \begin{pmatrix} o & p^n \\ p^{-n} & o \end{pmatrix}$. Thus

$$\frac{O}{O \cap O'} \cong \frac{o}{p^n} \cong \frac{O'}{O \cap O'}.$$

□

Definition 5.24 Let O and O′ be maximal o-orders in $M_2(F)$. The distance between O and O′ is defined to be $\log_q [O : O \cap O']$. The orders O and O′ are neighbors if the distance between them is 1.

Lemma 5.25 Suppose that the distance between two maximal o-orders O and O′ in $M_2(F)$ is n. Then $O' = xOx^{-1}$ for some $x \in GL_2(F)$ with $\mathrm{ord}_p(\det(x)) = n$.

Proof. This is clear from the proof of Lemma 5.23. □


5.6 Orders in the Local Case

In this subsection, we deal with the case when F is a p-adic field and o is its ring ofintegers. So, o is a PID with prime ideal p = πo. Let H is the unique quaternion algebra(u,πF

), where u is a nonsquare unit so that F (

√u)/F is an unramified quadratic extension.

If v is a valuation on F and w = v nr, then w is a valuation on H. Let O be the associatedvaluation ring x ∈ H : w(x) ≤ 1. Note that O is also equal to x ∈ H : nr(x) ∈ o.

Theorem 5.26 The valuation ring O is the unique o-maximal order in H and has discrim-inant d(O) = π2o = p2.

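Proof. For any $x \in H$, there exists a nonzero $r \in \mathfrak{o}$ such that $rx \in \mathcal{O}$. So, $F\mathcal{O} = H$. If $x \in \mathcal{O}$, then $\bar{x} \in \mathcal{O}$ and so $\mathrm{tr}(x) = x + \bar{x} \in \mathcal{O} \cap F = \mathfrak{o}$. But $\mathrm{nr}(x) \in \mathfrak{o}$ since $x \in \mathcal{O}$. Thus $x$ is integral over $\mathfrak{o}$. This shows that $\mathcal{O}$ is an $\mathfrak{o}$-order in $H$.

If $x \in H$ is integral over $\mathfrak{o}$, then $\mathrm{nr}(x) \in \mathfrak{o}$ and hence $w(x) \le 1$. Thus $x \in \mathcal{O}$, which means that $\mathcal{O}$ is precisely the set of elements in $H$ that are integral over $\mathfrak{o}$. Hence $\mathcal{O}$ is the unique maximal $\mathfrak{o}$-order in $H$.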

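Let $1, i, j, ij$ be a standard basis of $H$ such that $i^2 = u$ and $j^2 = \pi$. Let $K$ be $F(\sqrt{u})$. Then $H = K + Kj$ and $\mathrm{nr}|_K$ is the norm $N_{K/F}$. Since $K/F$ is unramified, $\pi$ is also a uniformizer for $K$. Let $\mathfrak{o}_K$ be the ring of integers of $K$. Then

\[ \mathfrak{o}_K = \{x \in K : \mathrm{nr}(x) \in \mathfrak{o}\}. \]

Now let $\alpha = x + yj \in H$ with $x, y \in K$. Then $\mathrm{nr}(\alpha) = \mathrm{nr}(x) - \mathrm{nr}(y)\pi$. Since $\mathrm{nr}(x)$ and $\mathrm{nr}(y)$, when nonzero, are of the form $\pi^{2m}z$ with $z \in \mathfrak{o}^\times$, the two terms $\mathrm{nr}(x)$ and $\mathrm{nr}(y)\pi$ have valuations of different parity. Hence $\mathrm{nr}(\alpha) \in \mathfrak{o}$ if and only if $\mathrm{nr}(x)$ and $\mathrm{nr}(y)$ are in $\mathfrak{o}$. Thus $\alpha \in \mathcal{O}$ if and only if $\alpha \in \mathfrak{o}_K + \mathfrak{o}_K j$. Hence $\mathcal{O} = \mathfrak{o}_K + \mathfrak{o}_K j$.

Now, let $y \in \mathfrak{o}_K$ be such that $1, y$ is a basis of $\mathfrak{o}_K$ over $\mathfrak{o}$. Then $1, y, j, yj$ is a basis of $\mathcal{O}$ over $\mathfrak{o}$. Note that $jy = \bar{y}j$ and $\mathrm{tr}(\alpha j) = 0$ for all $\alpha \in K$. From this it follows that

\[ d(\mathcal{O}) = (y - \bar{y})^4 \pi^2 \mathfrak{o}. \]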

Since $K/F$ is unramified, the residue field of $K$ is a quadratic extension of the residue field of $F$. Thus, if $y - \bar{y}$ is not a unit, then the images of $y$ and $\bar{y}$ in the residue field of $K$ would be the same, which means that the image of $y$ is in the residue field of $F$. However, $\mathfrak{o}_K$ is equal to $\mathfrak{o}[y]$, and so the residue fields of $K$ and $F$ would be the same, which is impossible. Therefore $y - \bar{y}$ is a unit and $d(\mathcal{O}) = \pi^2\mathfrak{o}$. □

5.7 Orders in the Global Case

In this subsection, $F$ is a number field and $\mathfrak{o}$ is the ring of integers in $F$. Let $H$ be a quaternion algebra over $F$. Recall that $\mathrm{Ram}(H)$ is the set of places at which $H$ is ramified. It is a finite set with an even number of elements.

Definition 5.27 The discriminant of $H$, denoted $\Delta(H)$, is the product of all the finite places at which $H$ is ramified.


Theorem 5.28 Let $\mathcal{O}$ be an $\mathfrak{o}$-order in a quaternion algebra $H$ over a number field $F$. Then $\mathcal{O}$ is a maximal $\mathfrak{o}$-order if and only if $d(\mathcal{O}) = \Delta(H)^2$. In particular, all maximal $\mathfrak{o}$-orders in $H$ have the same discriminant.

Proof. By Lemma 5.13, $\mathcal{O}$ is maximal if and only if $\mathcal{O}_{\mathfrak{p}}$ is maximal for every finite place $\mathfrak{p}$. By Example 5.16 and Theorem 5.26, the discriminant of a maximal $\mathfrak{o}_{\mathfrak{p}}$-order in $H_{\mathfrak{p}}$ is either $\mathfrak{o}_{\mathfrak{p}}$ or $\mathfrak{p}^2$ according to whether $H_{\mathfrak{p}}$ splits or is ramified. Furthermore, orders with these discriminants are necessarily maximal by Theorem 5.19. The result now follows from the fact that $d(\mathcal{O})$ is the product of all $d(\mathcal{O}_{\mathfrak{p}})$. □

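Example 5.29 Let $H$ be the quaternion algebra $\left(\frac{-1,-1}{\mathbb{Q}}\right)$. Then $H_p$ splits for all odd primes $p$; see Theorem 3.13. Since $|\mathrm{Ram}(H)|$ is even and $H \otimes_{\mathbb{Q}} \mathbb{R} = \mathbb{H}$, $H$ is ramified at 2. Thus $\Delta(H) = 2\mathbb{Z}$. The discriminant of the $\mathbb{Z}$-order $\mathcal{O}'$ in Example 5.17 is $4\mathbb{Z}$. Thus, by Theorem 5.28, $\mathcal{O}'$ is a maximal $\mathbb{Z}$-order in $H$.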
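The value $4\mathbb{Z}$ can be checked numerically. The following sketch assumes that the discriminant of an order with $\mathbb{Z}$-basis $e_1, \dots, e_4$ is the ideal generated by $\det(\mathrm{tr}(e_a e_b))$, and that the order of Example 5.17 is the one spanned by $1, i, j, (1+i+j+k)/2$ (the Hurwitz order of Section 7). Both are assumptions about parts of the notes not shown here; the helper names are hypothetical.

```python
from fractions import Fraction as Fr

def qmul(x, y):
    """Product in (-1,-1/Q); coordinates with respect to 1, i, j, k."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def tr(x):
    """Reduced trace x + conj(x)."""
    return 2 * x[0]

def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fr(v) for v in row] for row in m]
    n, d = len(m), Fr(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fr(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return d

def disc(basis):
    """Determinant of the trace form on the given basis."""
    return det([[tr(qmul(x, y)) for y in basis] for x in basis])

half = Fr(1, 2)
lipschitz = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
hurwitz = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (half, half, half, half)]
print(disc(lipschitz), disc(hurwitz))
```

The order $\mathbb{Z}[i, j, k]$ gives $-16$, so it is not maximal, while the larger order gives $-4$, which generates $4\mathbb{Z} = \Delta(H)^2$.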

6 Conjugacy Classes of Maximal Orders

In this section $F$ is a number field. We keep all the relevant notations used in the previous section. When the ring of integers $\mathfrak{o}$ of $F$ is a PID, all the maximal orders in $M_2(F)$ are conjugate to $M_2(\mathfrak{o})$. This does not hold in general, and in this section we give a formula for the number of conjugacy classes of maximal orders in a quaternion algebra over $F$.

6.1 Idele Group of a Quaternion Algebra

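Let $H$ be a quaternion algebra over $F$, and let $\mathcal{O}$ be a maximal $\mathfrak{o}$-order in $H$. The idele group of $H$ is the set

\[ H_A^\times = \Big\{ (x_v) \in \prod_{v \in \Omega} H_v^\times : x_v \in \mathcal{O}_v^\times \text{ for almost all finite places } v \Big\}. \]

The elements of $H_A^\times$ are called ideles of $H$. Clearly $H_A^\times$ is a subgroup of the direct product $\prod_{v \in \Omega} H_v^\times$. For each $v \in \Omega$, there is an embedding $\sigma_v : H \to H_v = F_v \otimes_F H$, where $\sigma_v(x) = 1 \otimes x$ for all $x \in H$. Using this embedding, we can identify $H^\times$ as a subgroup of $H_v^\times$ and we will not make any distinction between $x$ and $\sigma_v(x)$. If $x \in H^\times$, there exists $r \ne 0$ in $\mathfrak{o}$ such that $rx$ and $rx^{-1}$ are both in $\mathcal{O}$. Since $r \in \mathfrak{o}_v^\times$ for almost all $v$, both $x$ and $x^{-1}$ lie in $\mathcal{O}_v$, that is, $x \in \mathcal{O}_v^\times$, for almost all $v$; hence $H^\times$ can be identified as a subgroup of $H_A^\times$.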

Let $x = (x_v)$ be an idele of $H$. Define an order $x\mathcal{O}x^{-1}$ in $H$ by specifying its local completion at a finite place $v$ as

\[ (x\mathcal{O}x^{-1})_v = x_v\mathcal{O}_v x_v^{-1}. \]

This definition is meaningful by Lemma 5.12 since $x_v \in \mathcal{O}_v^\times$ for almost all finite places $v$, so that $(x\mathcal{O}x^{-1})_v = \mathcal{O}_v$ for almost all $v$. Since the conjugate of a maximal order in $H_v$ is again maximal, the order $x\mathcal{O}x^{-1}$ is a maximal order in $H$ by Corollary 5.13.


Lemma 6.1 Let $\mathcal{O}'$ be another maximal order in $H$. Then $\mathcal{O}' = x\mathcal{O}x^{-1}$ for some $x \in H_A^\times$.

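Proof. Let $S$ be the set of finite places $v$ for which $\mathcal{O}_v \ne \mathcal{O}'_v$. Then $S$ is a finite set and $S \cap \mathrm{Ram}(H) = \emptyset$. If $v \in S$, then $H_v \cong M_2(F_v)$. Since $\mathfrak{o}_v$ is a PID, $\mathcal{O}'_v$ is conjugate to $\mathcal{O}_v$ and hence there exists $h_v \in H_v^\times$ such that $h_v\mathcal{O}_v h_v^{-1} = \mathcal{O}'_v$. Now define an idele $x \in H_A^\times$ by

\[ x_v = \begin{cases} 1 & \text{if } v \notin S; \\ h_v & \text{if } v \in S. \end{cases} \]

It is clear that $x\mathcal{O}x^{-1} = \mathcal{O}'$. □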

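Up to this point, we see that $H_A^\times$ acts on the set of all maximal orders in $H$ by conjugation and the action is transitive. For every $v \in \Omega_f$, let $N(\mathcal{O}_v)$ be the normalizer of $\mathcal{O}_v$, that is

\[ N(\mathcal{O}_v) = \{x_v \in H_v^\times : x_v\mathcal{O}_v x_v^{-1} = \mathcal{O}_v\}. \]

It is easy to see that $N(\mathcal{O}_v)$ is a subgroup of $H_v^\times$. Moreover, $\mathcal{O}_v^\times \subseteq N(\mathcal{O}_v)$. Let

\[ N(\mathcal{O})_A = \{x = (x_v) \in H_A^\times : x_v \in N(\mathcal{O}_v) \text{ for all } v \in \Omega_f\}. \]

Clearly $N(\mathcal{O})_A$ is a subgroup of $H_A^\times$. In fact, $N(\mathcal{O})_A$ is the stabilizer of $\mathcal{O}$ in $H_A^\times$.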

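Proposition 6.2 The set of conjugacy classes of maximal orders in $H$ is in bijection with the double coset space $H^\times \backslash H_A^\times / N(\mathcal{O})_A$.

Proof. Let $C$ be the set of conjugacy classes of maximal orders in $H$. We shall set up a bijection $\Phi$ from the double coset space $H^\times \backslash H_A^\times / N(\mathcal{O})_A$ to $C$.

Suppose that $H^\times x N(\mathcal{O})_A = H^\times y N(\mathcal{O})_A$. Then $x = hyn$, where $h \in H^\times$ and $n \in N(\mathcal{O})_A$. So $x\mathcal{O}x^{-1} = h(y\mathcal{O}y^{-1})h^{-1}$ and thus $x\mathcal{O}x^{-1}$ and $y\mathcal{O}y^{-1}$ are in the same conjugacy class. This means that we can define a function $\Phi : H^\times \backslash H_A^\times / N(\mathcal{O})_A \to C$ such that

\[ \Phi(H^\times x N(\mathcal{O})_A) = \text{the conjugacy class that contains } x\mathcal{O}x^{-1}. \]

Lemma 6.1 implies that $\Phi$ is surjective.

Now, let $x, y \in H_A^\times$ be such that $x\mathcal{O}x^{-1}$ and $y\mathcal{O}y^{-1}$ are in the same conjugacy class. Then $x\mathcal{O}x^{-1} = hy\mathcal{O}(hy)^{-1}$ for some $h \in H^\times$. So, $x^{-1}hy \in N(\mathcal{O})_A$ and hence $y \in H^\times x N(\mathcal{O})_A$. This shows that $\Phi$ is injective. □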

6.2 Theorem on Norms

We continue to assume that $H$ is a quaternion algebra over the number field $F$. Let $v$ be an infinite place of $F$. If $H_v$ splits, then clearly $\mathrm{nr}(H_v^\times) = F_v^\times$. If $H_v$ is ramified, then $v$ must be a real place and $H_v$ is Hamilton's quaternions $\mathbb{H}$. If $1, i, j, ij$ is a standard basis of $\mathbb{H} = \left(\frac{-1,-1}{\mathbb{R}}\right)$, then

\[ \mathrm{nr}(x_1 + x_2 i + x_3 j + x_4 ij) = x_1^2 + x_2^2 + x_3^2 + x_4^2 \]

and thus $\mathrm{nr}(H_v^\times) = \mathbb{R}^{\times 2}$.


Lemma 6.3 If $v$ is a finite place of $F$, then $\mathrm{nr}(H_v^\times) = F_v^\times$.

Proof. This is clear if $H_v = M_2(F_v)$. Thus we assume that $H_v$ is the unique quaternion division algebra $\left(\frac{\pi,u}{F_v}\right)$ over $F_v$, where $\pi$ is a uniformizer for $F_v$ and $u$ is a nonsquare unit of $\mathfrak{o}_v$ so that $F_v(\sqrt{u})/F_v$ is unramified. The restriction of the reduced norm to $F_v(\sqrt{u})$ is the usual norm $N$ of the field extension $F_v(\sqrt{u})/F_v$. Since $[F_v^\times : N(F_v(\sqrt{u})^\times)] = 2$ and $N(F_v(\sqrt{u})^\times)$ contains all the units of $\mathfrak{o}_v$, it remains to show that $\mathrm{nr}(H_v^\times)$ contains a uniformizer of $F_v$. But this is clear: if $1, i, j, ij$ is a standard basis for $\left(\frac{\pi,u}{F_v}\right)$, so that $i^2 = \pi$, then $\mathrm{nr}(i) = -\pi$. □

Let $\mathrm{Ram}_\infty(H)$ be the set of infinite places at which $H$ is ramified. It is necessary that $F_v \cong \mathbb{R}$ for all $v \in \mathrm{Ram}_\infty(H)$. Let

\[ F_H^\times = \{a \in F^\times : a \text{ is positive in } F_v \text{ for all } v \in \mathrm{Ram}_\infty(H)\}. \]

Proposition 6.4 (Theorem on Norms) Let $H$ be a quaternion algebra over a number field $F$. Then $\mathrm{nr}(H^\times) = F_H^\times$.

Proof. It is clear that $\mathrm{nr}(H^\times) \subseteq F_H^\times$. Let $a \in F_H^\times$. Then $a \in \mathrm{nr}(H_v^\times)$ for all $v \in \Omega$ by Lemma 6.3. Since $H$, when equipped with the reduced norm, is a nondegenerate quadratic space over $F$, we can apply the Hasse-Minkowski Theorem to the present situation and deduce that $a \in \mathrm{nr}(H^\times)$. □

The idele group of $F$ is the set

\[ J_F = \Big\{ x = (x_v) \in \prod_{v \in \Omega} F_v^\times : x_v \in \mathfrak{o}_v^\times \text{ for almost all finite places } v \Big\}. \]

The elements of $J_F$ are called the ideles of $F$. It is clear that $J_F$ is indeed an abelian group under the operation $(x_v)(y_v) = (x_v y_v)$. For any $a \in F^\times$, $a \in \mathfrak{o}_v^\times$ for almost all finite places $v$. Therefore, $a$ can be regarded as the idele whose $v$-component is $a$ itself (of course, here we identify $a$ with $\sigma_v(a)$ where $\sigma_v : F \to F_v$ is the embedding associated with the place $v$). Thus we can identify $F^\times$ as a subgroup of $J_F$. Since $\mathrm{nr}(\mathcal{O}_v^\times) \subseteq \mathfrak{o}_v^\times$ for all finite places $v$, we can define the reduced norm $\mathrm{nr} : H_A^\times \to J_F$ by $\mathrm{nr}((x_v)) = (\mathrm{nr}(x_v))$.

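For each $v \in \Omega$, let

\[ N_v = \begin{cases} \mathrm{nr}(N(\mathcal{O}_v)) & \text{if } v \in \Omega_f; \\ \mathrm{nr}(H_v^\times) & \text{if } v \in \Omega_\infty. \end{cases} \]

Since $Z(H_v) = F_v$, $N_v$ contains $F_v^{\times 2}$. Let

\[ J_F(\mathcal{O}) = \{x = (x_v) \in J_F : x_v \in N_v \text{ for all } v \in \Omega\}. \]

It is clear that $J_F(\mathcal{O})$ is the image of $N(\mathcal{O})_A$ under the reduced norm, and $J_F(\mathcal{O}) \supseteq J_F^2$.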

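Proposition 6.5 Let $v$ be a finite place of $F$. Then

\[ N_v = \begin{cases} \mathfrak{o}_v^\times F_v^{\times 2} & \text{if } H_v \text{ splits}; \\ F_v^\times & \text{otherwise}. \end{cases} \]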


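Proof. Suppose that $H_v$ is a division algebra. Then $\mathcal{O}_v$ is the unique maximal order in $H_v$. Thus $x\mathcal{O}_v x^{-1} = \mathcal{O}_v$ for all $x \in H_v^\times$. So $N(\mathcal{O}_v) = H_v^\times$, and hence $N_v = F_v^\times$ by Lemma 6.3.

Now suppose that $H_v = M_2(F_v) = \mathrm{End}(V)$ where $V$ is a 2-dimensional vector space over $F_v$. We may assume that $\mathcal{O}_v = M_2(\mathfrak{o}_v) = \mathrm{End}(L)$, where $L$ is a complete $\mathfrak{o}_v$-lattice in $V$. In this case, $\mathcal{O}_v^\times$ contains all invertible matrices in $M_2(\mathfrak{o}_v)$. Therefore, $\mathfrak{o}_v^\times F_v^{\times 2} \subseteq N_v$.

For the other inclusion, let $\sigma$ be an element in $N(\mathcal{O}_v)$. Then

\[ \mathrm{End}(L) = \sigma\,\mathrm{End}(L)\,\sigma^{-1} = \mathrm{End}(\sigma(L)). \]

Since $\mathrm{End}(\sigma(L)) = \mathrm{End}(a\,\sigma(L))$ for all $a \in F_v^\times$, we may also assume that $\sigma(L) \subseteq L$ and, by choosing $a$ suitably, that there exist a basis $e, f$ of $L$ and $\alpha \in \mathfrak{o}_v$ such that $e, \alpha f$ is a basis of $\sigma(L)$. Let $\tau \in \mathrm{End}(V)$ be the map that sends $e$ to $e$ and $f$ to $\alpha f$. Then $\tau\sigma^{-1}$ is a unit in $\mathrm{End}(\sigma(L))$. Thus $\mathrm{nr}(\sigma) = \alpha u$ for some $u \in \mathfrak{o}_v^\times$. We claim that $\alpha$ is also in $\mathfrak{o}_v^\times$, and this will conclude the proof of this case. For, let $\beta \in \mathrm{End}(V)$ be the element which switches $e$ and $f$. Then $\beta \in \mathrm{End}(L)^\times$ and hence $\beta$ is also in $\mathrm{End}(\sigma(L))^\times$. But then $f, \alpha e$ is also a basis of $\sigma(L)$, whence $\alpha \in \mathfrak{o}_v^\times$. □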

6.3 Strong Approximation

Let $v$ be a place of $F$, and let $B$ be a basis of $H_v$ over $F_v$. Using $B$, $H_v$ can be identified with $F_v^4$, and we can make $H_v$ into a topological space by transporting the product topology from $F_v^4$. It is not hard to see that different bases for $H_v$ produce the same topology on $H_v$. Moreover, with respect to this topology, $H_v$ becomes a locally compact topological ring, that is, the addition and multiplication in $H_v$ are continuous operations. We impose the subspace topology on $H_v^\times$. Let $x \in H_v^\times$. Then $x^{-1} = \bar{x}/\mathrm{nr}(x)$, where $\bar{x}$ is the conjugate of $x$ in $H_v$. Let $1, i, j, ij$ be a standard basis for $H_v = \left(\frac{\alpha,\beta}{F_v}\right)$. If $x = a_0 + a_1 i + a_2 j + a_3 ij$, then $\mathrm{nr}(x) = a_0^2 - \alpha a_1^2 - \beta a_2^2 + \alpha\beta a_3^2$ and $\bar{x} = a_0 - a_1 i - a_2 j - a_3 ij$. Hence $x \mapsto x^{-1}$ is a continuous map, whence $H_v^\times$ is in fact a locally compact topological group.

Now suppose that $v$ is a finite place of $F$ and $\mathfrak{p}_v$ is the associated prime ideal. Let $L$ be the $\mathfrak{o}_v$-lattice spanned by $B$. Since $L \cong \mathfrak{o}_v^4$ and $\mathfrak{o}_v$ is both compact and open, $L$ itself is also both compact and open. The collection of compact-open sets $\mathfrak{p}_v^n L$ is a fundamental system of compact neighborhoods of 0. For every $x \in H_v^\times$, the set $x + \mathfrak{p}_v^n L$ is contained in $H_v^\times$ for all sufficiently large integers $n$.

Let $\mathcal{O}$ be a maximal order in $H$. For each finite place $v$ of $F$, the group $\mathcal{O}_v^\times$ is a compact-open subgroup of $H_v^\times$. This is because $\mathrm{nr} : \mathcal{O}_v \to \mathfrak{o}_v$ is a continuous map and $\mathcal{O}_v^\times = \mathrm{nr}^{-1}(\mathfrak{o}_v^\times)$ is both closed and open in the compact space $\mathcal{O}_v$. We now make $H_A^\times$ into a topological group by specifying a fundamental system of neighborhoods of the identity in $H_A^\times$ consisting of the sets of the form

\[ \prod_{v \in \Omega} U_v, \]

where each $U_v$ is an open neighborhood of 1 in $H_v^\times$ with $U_v = \mathcal{O}_v^\times$ for almost all finite places $v$. We call this topology the restricted product topology of $H_A^\times$. If $S$ is a finite subset of $\Omega$ containing $\Omega_\infty$, let

\[ H_A^\times(S) = \prod_{v \in S} H_v^\times \times \prod_{v \notin S} \mathcal{O}_v^\times, \]

which is a subset of $H_A^\times$. Then the restricted product topology on $H_A^\times(S)$ coincides with the product topology. Since $\mathcal{O}_v^\times$ is compact for all $v \in \Omega_f$, $H_A^\times(S)$ is locally compact. Since $H_A^\times$ is the union of all these $H_A^\times(S)$, $H_A^\times$ is a locally compact topological group.

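Example 6.6 The restricted product topology on $H_A^\times$ is not the subspace topology induced by the product topology on $\prod_{v \in \Omega} H_v^\times$. The set

\[ U = \prod_{v \in \Omega_\infty} H_v^\times \times \prod_{v \in \Omega_f} \mathcal{O}_v^\times \]

is open in $H_A^\times$. If the restricted product topology were the subspace topology induced from $\prod_{v \in \Omega} H_v^\times$, then $U$ would have to contain a set of the form

\[ W_S = H_A^\times \cap \Big( \prod_{v \in S} W_v \times \prod_{v \notin S} H_v^\times \Big), \]

where $S$ is a finite subset of $\Omega$ and $W_v$ is an open subset of $H_v^\times$ for each $v \in S$. But it is clear that $U$ does not contain any such $W_S$.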

For each $v \in \Omega$, let $H_v^1 = \{x_v \in H_v^\times : \mathrm{nr}(x_v) = 1\}$. Define

\[ H_A^1 = \{x = (x_v) \in H_A^\times : x_v \in H_v^1 \text{ for all } v \in \Omega\}. \]

It is clear that $H_A^1$ is the kernel of the reduced norm $\mathrm{nr} : H_A^\times \to J_F$. Thus $H_A^1$ is a normal subgroup of $H_A^\times$. Moreover, it contains the commutator subgroup of $H_A^\times$. We give $H_A^1$ the subspace topology induced by the restricted product topology on $H_A^\times$. For any finite subset $S$ of $\Omega$, let

\[ H_S^1 = \{x = (x_v) \in H_A^1 : x_v = 1 \text{ for all } v \notin S\}. \]

Theorem 6.7 (Strong Approximation Theorem for $H^1$) Let $H$ be a quaternion algebra over a number field $F$, and let $S$ be a finite subset of places of $F$ containing $\Omega_\infty$ such that $H_v$ splits for at least one place in $S$. Then $H^1 H_S^1$ is dense in $H_A^1$.

Here is a consequence (in fact, an equivalent version) of the Strong Approximation Theorem which will be used. Fix a positive integer $N$ and a complete $\mathfrak{o}$-lattice $L$ in $H$. Let $T$ be a finite subset of $\Omega_f$ which is disjoint from $S$. Suppose that $x_v \in H_v^1$ is given for each $v \in T$. Let

\[ U_v = \{z \in H_v : z \equiv x_v \bmod \mathfrak{p}_v^N L_v\}, \]

and set

\[ U = H_A^1 \cap \Big( \prod_{v \in S} H_v^\times \times \prod_{v \in T} U_v \times \prod_{v \notin T \cup S} \mathcal{O}_v^\times \Big), \]

which is an open neighborhood of $x$ in $H_A^1$, where $x$ is the idele whose $v$-th component is $x_v$ for all $v \in T$ and 1 elsewhere. So, there exist $h \in H^1$ and $y \in H_S^1$ such that $hy \in U$. This implies

(1) $h \equiv x_v \bmod \mathfrak{p}_v^N L_v$ for all $v \in T$;

(2) $h \in \mathcal{O}_v^\times$ for all $v \notin T \cup S$.

We shall apply the Strong Approximation Theorem to the case where $S = \Omega_\infty$. Thus it is useful to introduce the following standard notion to cover the circumstances under which the Strong Approximation Theorem will be applied.

Definition 6.8 A quaternion algebra $H$ over a number field $F$ is said to satisfy the Eichler condition if there is at least one infinite place of $F$ at which $H$ splits.

There is a restricted product topology on $J_F$ which is defined similarly to the one on $H_A^\times$. For each $v \in \Omega$, $F_v^\times$ is locally compact; and for each $v \in \Omega_f$, $\mathfrak{o}_v^\times$ is compact. The restricted product topology on $J_F$ is defined by specifying a fundamental system of neighborhoods of the identity in $J_F$ consisting of the sets of the form $\prod_{v \in \Omega} U_v$, where each $U_v$ is an open neighborhood of 1 in $F_v^\times$ with $U_v = \mathfrak{o}_v^\times$ for almost all finite places $v$. With this topology, $J_F$ is a locally compact topological group.

6.4 Type Number

If $x, y$ are elements of $H_A^\times$, then $xH_A^1 = H_A^1 x$, and the fact that $H_A^1$ contains the commutator subgroup of $H_A^\times$ implies that $xyH_A^1 = yxH_A^1$; hence the set $xyH_A^1$ is independent of the order of $H_A^1$, $x$ and $y$. From this it follows that the set $H_A^1 H^\times N(\mathcal{O})_A$ is independent of the order of $H_A^1$, $H^\times$ and $N(\mathcal{O})_A$, and that this set is actually the group generated by $H_A^1$, $H^\times$ and $N(\mathcal{O})_A$. This group is a normal subgroup of $H_A^\times$ and we can form the quotient group $H_A^\times / H_A^1 H^\times N(\mathcal{O})_A$.

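Lemma 6.9 Let $v$ be a finite place of $F$, and $\mathcal{O}'_v$ be a maximal order in $H_v$. If $x \in H_v^\times$, there exists an open neighborhood $U$ of $x$ such that $y\mathcal{O}'_v y^{-1} = x\mathcal{O}'_v x^{-1}$ for all $y \in U$.

Proof. It suffices to prove the lemma for $x = 1$. For every $y \in U = 1 + \mathfrak{p}_v\mathcal{O}'_v \subseteq \mathcal{O}'_v$, $\mathrm{nr}(y)$ is a unit. So, $y^{-1} \in \mathcal{O}'_v$ and $y\mathcal{O}'_v y^{-1} = \mathcal{O}'_v$. □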

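Lemma 6.10 If $H$ satisfies the Eichler condition, then

\[ (\ast) \qquad H^\times x N(\mathcal{O})_A \longmapsto H^\times x H_A^1 N(\mathcal{O})_A \]

is a well-defined bijection from $H^\times \backslash H_A^\times / N(\mathcal{O})_A$ to $H_A^\times / H_A^1 H^\times N(\mathcal{O})_A$.

Proof. Suppose that $H^\times x N(\mathcal{O})_A = H^\times y N(\mathcal{O})_A$. Then there exist $h \in H^\times$ and $n \in N(\mathcal{O})_A$ such that $y = hxn$. Thus

\[ H^\times y H_A^1 N(\mathcal{O})_A = H^\times xn H_A^1 N(\mathcal{O})_A = H^\times x H_A^1 n N(\mathcal{O})_A = H^\times x H_A^1 N(\mathcal{O})_A. \]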


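This shows that $(\ast)$ is a well-defined function. Clearly it is surjective.

Now suppose that $H^\times x H_A^1 N(\mathcal{O})_A = H^\times y H_A^1 N(\mathcal{O})_A$. Then there exist $h \in H^\times$, $\alpha \in H_A^1$ and $n \in N(\mathcal{O})_A$ such that

\[ hx = \alpha y n. \]

For the sake of convenience, we let $[z]\mathcal{O}$ be the order $z\mathcal{O}z^{-1}$ for $z \in H_A^\times$, and $[z_v]\mathcal{O}_v$ be the order $z_v\mathcal{O}_v z_v^{-1}$ for all $z_v \in H_v^\times$ with $v \in \Omega_f$. Let

\[ T = \{v \in \Omega_f : [hx_v]\mathcal{O}_v \ne [y_v]\mathcal{O}_v\} \]

and

\[ J = \{v \in \Omega_f \setminus T : [y_v]\mathcal{O}_v \ne \mathcal{O}_v\}. \]

Both $T$ and $J$ are finite subsets of $\Omega_f$.

Let $W$ be an open set of $H_A^1$ of the form

\[ \prod_{v \in \Omega_\infty} H_v^1 \times \prod_{v \in T \cup J} U_v \times \prod_{v \in \Omega_f \setminus (T \cup J)} \mathcal{O}_v^1, \]

where $U_v$ is an open neighborhood of $\alpha_v$ (resp. 1) in $H_v^1$ if $v \in T$ (resp. $v \in J$). By the Strong Approximation Theorem, with the $U_v$ suitably chosen by means of Lemma 6.9, there exists $\sigma \in H^1$ such that

(1) $[\sigma]\mathcal{O}_v = \mathcal{O}_v$ for all finite places $v$ outside $T \cup J$;

(2) $[\sigma y_v]\mathcal{O}_v = [\alpha_v y_v]\mathcal{O}_v$ for all $v \in T$;

(3) $[\sigma y_v]\mathcal{O}_v = [y_v]\mathcal{O}_v$ for all $v \in J$.

If $v$ is a finite place outside $T \cup J$, then

\[ [\sigma y_v]\mathcal{O}_v = [\sigma]\mathcal{O}_v = \mathcal{O}_v = [hx_v]\mathcal{O}_v. \]

If $v \in T$, then

\[ [\sigma y_v]\mathcal{O}_v = [\alpha_v y_v]\mathcal{O}_v = [hx_v]\mathcal{O}_v, \]

while if $v \in J$ we have

\[ [\sigma y_v]\mathcal{O}_v = [y_v]\mathcal{O}_v = [hx_v]\mathcal{O}_v. \]

Therefore, $y^{-1}\sigma^{-1}hx \in N(\mathcal{O})_A$ and thus $H^\times x N(\mathcal{O})_A = H^\times y N(\mathcal{O})_A$. This proves that $(\ast)$ is injective. □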

Recall that $J_F(\mathcal{O})$ is the image of $N(\mathcal{O})_A$ under the reduced norm. Let

\[ \theta : H_A^\times \to J_F / F^\times J_F(\mathcal{O}) \]

be the homomorphism induced by the reduced norm.


Theorem 6.11 Let $H$ be a quaternion algebra over a number field $F$. Suppose that $H$ satisfies the Eichler condition. Then the number of conjugacy classes of maximal orders in $H$ is equal to the group index $[J_F : F^\times J_F(\mathcal{O})]$.

Proof. By Proposition 6.2 and Lemma 6.10, it suffices to show that the homomorphism $\theta$ is surjective with $H^\times H_A^1 N(\mathcal{O})_A$ as the kernel.

Let $a$ be an element in $J_F$. By the Weak Approximation Theorem for $F$, there exists $\alpha \in F^\times$ such that $a_v\alpha$ is positive in $F_v$ for every $v \in \mathrm{Ram}_\infty(H)$. So we can assume that $a_v > 0$ for every $v \in \mathrm{Ram}_\infty(H)$. Thus for every $v \in \Omega_\infty$, $a_v = \mathrm{nr}(x_v)$ for some $x_v \in H_v^\times$. For almost all finite places $v$, $H_v$ splits and $a_v$ is a unit in $\mathfrak{o}_v$. At any one of these $v$, the maximal order $\mathcal{O}_v$ is isomorphic to $M_2(\mathfrak{o}_v)$; hence $\mathrm{nr}(\mathcal{O}_v^\times) = \mathfrak{o}_v^\times$. Therefore, $a_v = \mathrm{nr}(x_v)$ for some $x_v \in \mathcal{O}_v^\times$. For the remaining finitely many places $v$, it follows from Lemma 6.3 that $a_v = \mathrm{nr}(x_v)$ for some $x_v \in H_v^\times$. Then $x = (x_v)$ is an element in $H_A^\times$ and $\mathrm{nr}(x) = a$. This proves that $\theta$ is surjective.

It is clear that $H^\times H_A^1 N(\mathcal{O})_A$ is a part of the kernel of $\theta$. Now, suppose that $\mathrm{nr}(x) \in F^\times J_F(\mathcal{O})$. Then there exists $n \in N(\mathcal{O})_A$ such that $\mathrm{nr}(xn) \in F^\times$. Since $\mathrm{nr}(H_v^\times) = F_v^{\times 2}$ whenever $v \in \mathrm{Ram}_\infty(H)$, it follows that $\mathrm{nr}(xn) \in F_H^\times$. By Proposition 6.4, there exists $h \in H^\times$ with $\mathrm{nr}(hxn) = 1$. As a result, $hxn \in H_A^1$ and hence the kernel of $\theta$ is precisely $H^\times H_A^1 N(\mathcal{O})_A$. □

Definition 6.12 The type number of a quaternion algebra $H$ over a number field is the number of conjugacy classes of maximal orders in $H$.

For our convenience, we let $h$ be the set $\mathrm{Ram}_\infty(H)$. Let

\[ J_F^h = \{x \in J_F : x_v > 0 \text{ for all } v \in h \text{ and } x_v \in \mathfrak{o}_v^\times \text{ for all } v \in \Omega_f\}. \]

Let $I_F$ be the group of fractional ideals of $F$, let $P_F$ be the subgroup of principal fractional ideals, and let $P_F^h$ be the subgroup of principal fractional ideals that are generated by $a \in F^\times$ with $a$ positive in $F_v$ for all $v \in h$. Let $P_F^+$ be the set of principal fractional ideals that are generated by totally positive elements in $F$. Then $P_F^2 \subseteq P_F^+$, and so $P_F/P_F^+$ is an elementary 2-group whose order is at most $2^r$, where $r$ is the number of real places of $F$. Thus the quotient $I_F/P_F^+$ is a finite abelian group, called the narrow class group of $F$, and its order is the narrow class number of $F$. Since $P_F^+ \subseteq P_F^h \subseteq P_F$, the quotient $I_F/P_F^h$ is also a finite abelian group and its order divides the narrow class number of $F$.

For any $x \in J_F$, let $(x)$ be the fractional ideal

\[ (x) = \prod_{\mathfrak{p} \in \Omega_f} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(x_{\mathfrak{p}})}. \]

The class of $(x)$ in the quotient $I_F/P_F^h$ is denoted by $[x]$. Given an $x \in J_F$, there exists $a \in F^\times$ such that $ax_v > 0$ for all $v \in h$. If $b$ is another element in $F^\times$ such that $bx_v > 0$ for all $v \in h$, then $[ax] = [bx]$ in $I_F/P_F^h$. Hence we have a well-defined homomorphism

\[ xF^\times \in J_F/F^\times \longmapsto [ax] \in I_F/P_F^h. \]

It is clear that this homomorphism is surjective.


Lemma 6.13 $[J_F : F^\times J_F^h] = |I_F/P_F^h|$.

Proof. It suffices to show that the kernel of the above homomorphism is $F^\times J_F^h/F^\times$. Take an idele $x \in J_F^h$. Then $x_v \in \mathfrak{o}_v^\times$ for all finite places $v$, whence $(x)$ is trivial. Therefore, $F^\times J_F^h/F^\times$ is in the kernel. Conversely, suppose that $x$ is in the kernel. Let $a \in F^\times$ be chosen so that $ax_v > 0$ for all $v \in h$. Then there exists $b \in F^\times$ such that $b > 0$ in $F_v$ for all $v \in h$ and $(ax) = (b)$. Let $\beta = axb^{-1}$. Then $\beta_v \in \mathfrak{o}_v^\times$ for all finite places $v$, and $\beta_v > 0$ for all $v \in h$. Hence $\beta \in J_F^h$ and $x \in F^\times J_F^h$. Therefore the kernel of the above homomorphism is $F^\times J_F^h/F^\times$ and the lemma is proved. □

Corollary 6.14 Let $H$ be a quaternion algebra over a number field $F$. If $H$ satisfies the Eichler condition, then its type number is finite; it divides the narrow class number of $F$.

Proof. Let $\mathcal{O}$ be a maximal order in a quaternion algebra $H$ over $F$. Then $J_F(\mathcal{O})$ contains $J_F^h$, hence $[J_F : F^\times J_F(\mathcal{O})]$ divides $[J_F : F^\times J_F^h]$ and the latter divides the narrow class number of $F$. □

Corollary 6.15 Let $H$ be a quaternion algebra over a number field $F$. If $H$ satisfies the Eichler condition, then its type number is a power of 2.

Proof. Since $J_F(\mathcal{O})$ contains $J_F^2$, the finite group $J_F/F^\times J_F(\mathcal{O})$ is an elementary abelian 2-group. □

Corollary 6.16 Let H be a quaternion algebra over Q which splits at the infinite place.Then the type number of H is 1.

Remark 6.17 The type number of a quaternion algebra which does not satisfy the Eichler condition is also finite.

7 Sum of Three Squares

This section contains Venkov's proof of the following theorem of Gauss. However, the proof we shall present is the modernized version by Rehm.

Theorem 7.1 Let $m > 1$ be a squarefree integer such that $m \equiv 1, 2 \bmod 4$, $h(m)$ be the class number of the quadratic field $\mathbb{Q}(\sqrt{-m})$, and $\psi(m)$ be the number of integer solutions to the equation $x^2 + y^2 + z^2 = m$. Then $\psi(m) = 12h(m)$.
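The statement can be tested numerically for small $m$. The sketch below is not part of the proof; it counts $\psi(m)$ by brute force and computes $h(m)$ by counting reduced primitive positive definite binary quadratic forms of discriminant $-4m$, a classical equivalent description of the class number that is assumed here rather than taken from the notes. The function names are hypothetical.

```python
from math import gcd, isqrt

def psi(m):
    """Number of (x, y, z) in Z^3 with x^2 + y^2 + z^2 = m."""
    r = isqrt(m)
    count = 0
    for x in range(-r, r + 1):
        for y in range(-r, r + 1):
            z2 = m - x * x - y * y
            if z2 < 0:
                continue
            z = isqrt(z2)
            if z * z == z2:
                count += 1 if z == 0 else 2   # z and -z
    return count

def class_number(D):
    """Count reduced primitive forms ax^2 + bxy + cy^2 of discriminant D < 0."""
    h = 0
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):        # -a < b <= a
            num = b * b - D
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a or (c == a and b < 0):   # reduction conditions
                continue
            if gcd(gcd(a, b), c) == 1:
                h += 1
        a += 1
    return h

for m in (2, 5, 6, 10, 13, 14, 17, 21):
    print(m, psi(m), 12 * class_number(-4 * m))
```

For each listed $m$ (squarefree and congruent to 1 or 2 mod 4) the two printed counts agree.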

7.1 The Hurwitz Quaternions

Throughout this section, we let $H$ be the quaternion algebra $\left(\frac{-1,-1}{\mathbb{Q}}\right)$. The reduced norm on $H$ is the sum of four squares, and its restriction to $H_0$ is the sum of three squares. Let $1, i, j, k$ be a standard basis of $H$ such that $i^2 = j^2 = -1$. Since it is an orthonormal basis of $H$, we can identify $H$, as a quadratic space over $\mathbb{Q}$, with the space $\mathbb{Q}^4$ in such a way that $1, i, j, k$ becomes the canonical basis of $\mathbb{Q}^4$.


Lemma 7.2 Let $u$ be a nonzero element in $H$. Then $u$, $iu$, $ju$, and $ku$ are mutually orthogonal.

Proof. Let $p$ and $q$ be two different elements from $1, i, j, k$. Then

\[ \mathrm{tr}(pu\,\overline{qu}) = \mathrm{tr}(p\,\mathrm{nr}(u)\,\bar{q}) = \mathrm{nr}(u)\,\mathrm{tr}(p\bar{q}) = 0. \]

□

So, when $u \ne 0$, the set $\mathbb{Z}[i, j, k]u$, which is a complete $\mathbb{Z}$-lattice on $H$, produces a grid of 4-dimensional cubes in $\mathbb{Q}^4$. The side of any one of these cubes is $\sqrt{\mathrm{nr}(u)}$. Let $x \in H$. Then $x$ must be in one of these cubes. Let $su$, $s \in \mathbb{Z}[i, j, k]$, be a corner of this cube that is closest to $x$. Then

\[ x - su = (a_0 + a_1 i + a_2 j + a_3 k)u, \qquad |a_l| \le \tfrac{1}{2} \text{ for all } l. \]

So, $\mathrm{nr}(x - su) < \mathrm{nr}(u)$ unless $x$ happens to be the midpoint of the cube, in which case $|a_l| = \tfrac{1}{2}$ for all $l$ and so $\mathrm{nr}(x - su) = \mathrm{nr}(u)$. This shows that the order $\mathbb{Z}[i, j, k]$ does not have a division algorithm (with respect to nr).

Now, let $\mathcal{O}$ be the order $\mathbb{Z}[i, j, k, \delta]$, where $\delta = \frac{1+i+j+k}{2}$. We have seen in Example 5.29 that $\mathcal{O}$ is a maximal order in $H$. This order $\mathcal{O}$ is called the Hurwitz order of quaternions. Note that as a set, $\mathcal{O}$ is obtained by adding all the midpoints of the cubes formed by $\mathbb{Z}[i, j, k]$. Moreover, $\mathrm{nr}(\delta) = \mathrm{nr}(i) = \mathrm{nr}(j) = \mathrm{nr}(k) = 1$. Therefore, the elements in $\mathcal{O}$ are vertices of a grid of 4-dimensional rhombohedrons in $H$. If $u$ is a nonzero Hurwitz quaternion, then the principal left ideal $\mathcal{O}u$ of $\mathcal{O}$ produces a grid of 4-dimensional rhombohedrons in $H$ with side length $\sqrt{\mathrm{nr}(u)}$. Let $x$ be an arbitrary element in $H$. Then $x$ falls into one of these rhombohedrons. Let $su$ be one of the closest corners. It is easy to see that $\mathrm{nr}(x - su) < \mathrm{nr}(u)$ and so $\mathcal{O}$ has a division algorithm.

Proposition 7.3 Let $x \in H$ and $u$ be a nonzero element of the Hurwitz order $\mathcal{O}$. Then there exist $s, r \in \mathcal{O}$ such that $x = su + r$ with $\mathrm{nr}(r) < \mathrm{nr}(u)$.
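The division step can be made explicit. The sketch below is one possible realization, not the argument of the notes: it computes $q = xu^{-1}$, rounds $q$ both to the nearest point with integer coordinates and to the nearest point with half-odd coordinates, and keeps the closer one. One of the two lies within reduced norm $1/2$ of $q$, so the remainder $r = x - su$ has $\mathrm{nr}(r) \le \mathrm{nr}(u)/2$. Quaternions are 4-tuples in the basis $1, i, j, k$; all function names are hypothetical.

```python
from fractions import Fraction as Fr
from math import floor

def qmul(x, y):
    """Product in (-1,-1/Q); coordinates with respect to 1, i, j, k."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(x):
    return (x[0], -x[1], -x[2], -x[3])

def nr(x):
    """Reduced norm: the sum of four squares."""
    return sum(c * c for c in x)

def nearest_hurwitz(q):
    """Closest element of the Hurwitz order to the rational quaternion q."""
    ints = tuple(Fr(floor(c + Fr(1, 2))) for c in q)      # all integers
    halves = tuple(Fr(floor(c)) + Fr(1, 2) for c in q)    # all half-odd
    dist = lambda s: nr(tuple(a - b for a, b in zip(q, s)))
    return min((ints, halves), key=dist)

def divide(x, u):
    """Return (s, r) with x = s*u + r, s in the Hurwitz order, nr(r) < nr(u)."""
    n = nr(u)
    q = tuple(Fr(c) / n for c in qmul(x, conj(u)))        # q = x * u^{-1}
    s = nearest_hurwitz(q)
    r = tuple(a - b for a, b in zip(x, qmul(s, u)))
    return s, r

# The midpoint case that defeats Z[i, j, k]: x = 1 + i + j + k, u = 2.
print(divide((1, 1, 1, 1), (2, 0, 0, 0)))
```

In the midpoint example the quotient is $\delta$ itself and the remainder is 0, which is exactly the case where $\mathbb{Z}[i, j, k]$ fails.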

It is clear that in the above discussion one can consider left multiplication by $u$ and obtain an analogous division algorithm.

A nonzero ideal $I$ in $H$ is said to be a left fractional $\mathcal{O}$-ideal if its left order is $\mathcal{O}$. In other words, $I$ is a left fractional $\mathcal{O}$-ideal if

\[ \mathcal{O} = \{x \in H : xI \subseteq I\}. \]

Corollary 7.4 Every left fractional $\mathcal{O}$-ideal in $H$ is principal, that is, it is of the form $\mathcal{O}u$ for some $u \in H$.

Proof. Let $A$ be a nonzero left fractional $\mathcal{O}$-ideal in $H$. Since $A$ is a finitely generated $\mathbb{Z}$-module, there exists a nonzero integer $m$ such that all the elements in $mA$ are integral over $\mathbb{Z}$. So, we may assume at the outset that all the elements in $A$ are integral over $\mathbb{Z}$. In particular, the set of reduced norms of nonzero elements in $A$ contains only positive integers and hence it must have a minimum. Let $0 \ne u \in A$ be chosen such that $\mathrm{nr}(u)$ is this minimum.

For any $x \in A$, by Proposition 7.3 there exists $s \in \mathcal{O}$ such that $\mathrm{nr}(x - su) < \mathrm{nr}(u)$. Since $x - su$ is in $A$, it follows that $x - su$ must be zero. This shows that $A = \mathcal{O}u$. □

Lemma 7.5 The unit group of $\mathcal{O}$ is generated by $i$, $j$, $\delta$, and it contains exactly 24 elements.

Proof. This can be verified directly, using the fact that $x \in \mathcal{O}^\times$ if and only if $\mathrm{nr}(x) = 1$. □
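The direct verification is small enough to carry out by machine. The following sketch enumerates the Hurwitz quaternions of reduced norm 1 (coordinates either all integers or all half-odd) and compares them with the multiplicative closure of $i$, $j$, $\delta$. Since every element involved has finite order, closure under multiplication already gives the generated group. The helper names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product

def qmul(x, y):
    """Product in (-1,-1/Q); coordinates with respect to 1, i, j, k."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

ONE = (Fr(1), Fr(0), Fr(0), Fr(0))
I = (Fr(0), Fr(1), Fr(0), Fr(0))
J = (Fr(0), Fr(0), Fr(1), Fr(0))
DELTA = (Fr(1, 2),) * 4

def norm_one_elements():
    """All Hurwitz quaternions of reduced norm 1."""
    units = set()
    for doubled in product(range(-2, 3), repeat=4):     # twice the coordinates
        same_parity = len({c % 2 for c in doubled}) == 1
        if same_parity and sum(c * c for c in doubled) == 4:
            units.add(tuple(Fr(c, 2) for c in doubled))
    return units

def generated_by(gens):
    """Multiplicative closure of gens, starting from 1."""
    seen, frontier = {ONE}, [ONE]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = qmul(x, g)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

units = norm_one_elements()
print(len(units), generated_by([I, J, DELTA]) == units)
```

The 24 units split as the 8 elements $\pm 1, \pm i, \pm j, \pm k$ and the 16 elements $(\pm 1 \pm i \pm j \pm k)/2$.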

Let $m > 1$ be a positive integer, and $K$ be the quadratic field $\mathbb{Q}(\sqrt{-m})$. If $\mu \in H_0$ is such that $\mathrm{nr}(\mu) = m$, which is equivalent to $\mu^2 = -m$, then we have an embedding $\varphi_\mu$ from $K$ to $K_\mu := \mathbb{Q}(\mu) \subseteq H$ which maps $\sqrt{-m}$ to $\mu$.

Now, suppose that $m$ is a positive integer such that $m \equiv 1, 2 \bmod 4$. The set of integer solutions of the diophantine equation $x^2 + y^2 + z^2 = m$ is in bijective correspondence with the set

\[ R_m \ (\text{or simply } R) = \{\mu \in \mathcal{O} : \mu^2 = -m\}, \]

which we call the set of roots of $m$. So, $|R| = \psi(m)$, which is always positive by a theorem of Legendre.

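Let $\mu \in R$ and let $\mathfrak{o}_\mu = \mathcal{O} \cap K_\mu$. Then $\mathfrak{o}_\mu$ is an order in $K_\mu$, and all its elements are integral over $\mathbb{Z}$. Thus $\mathfrak{o}_\mu$ is contained in the ring of integers of $K_\mu$. However, since $m \equiv 1, 2 \bmod 4$, it is not hard to show that the ring of integers of $K_\mu$ is $\mathbb{Z}[\mu]$, and $\mu$ is clearly in $\mathfrak{o}_\mu$. Thus $\mathfrak{o}_\mu$ is precisely the ring of integers of $K_\mu$.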

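Proposition 7.6 If $\mathfrak{a}$ is a fractional ideal of $\mathfrak{o}_\mu$, then $\mathcal{O}\mathfrak{a} \cap K_\mu = \mathfrak{a}$.

Proof. We may assume that $\mathfrak{a} \subseteq \mathfrak{o}_\mu$. It is clear that $\mathcal{O}\mathfrak{a}$ is a left $\mathcal{O}$-ideal in $H$. As $1 \in \mathcal{O}$ and $\mathfrak{a} \subseteq K_\mu$, we have $\mathcal{O}\mathfrak{a} \cap K_\mu \supseteq \mathfrak{a}$.

For the other inclusion, note that $\mathfrak{a}\mathfrak{a}^{-1} = \mathfrak{o}_\mu$. Clearly, $(\mathcal{O}\mathfrak{a} \cap K_\mu)\mathfrak{o}_\mu = \mathcal{O}\mathfrak{a} \cap K_\mu$, hence

\begin{align*}
\mathcal{O}\mathfrak{a} \cap K_\mu &= (\mathcal{O}\mathfrak{a} \cap K_\mu)\mathfrak{a}^{-1}\mathfrak{a} \\
&\subseteq (\mathcal{O}\mathfrak{a}\mathfrak{a}^{-1} \cap K_\mu\mathfrak{a}^{-1})\mathfrak{a} \\
&= (\mathcal{O}\mathfrak{o}_\mu \cap K_\mu)\mathfrak{a} \\
&= (\mathcal{O} \cap K_\mu)\mathfrak{a} \\
&= \mathfrak{o}_\mu\mathfrak{a} = \mathfrak{a}.
\end{align*}

□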

Proposition 7.7 Let $\mu \in H$ with $\mu \notin \mathbb{Q}$. Then the centralizer of $K_\mu$ in $H$ is $K_\mu$ itself.

Proof. By the Noether-Skolem Theorem, there exists $t \in H$ such that $1, \mu, t, \mu t$ form a standard basis of $H$. Then a direct computation shows that the centralizer of $K_\mu$ must be $K_\mu$ itself. □

Corollary 7.8 If $\mu, \eta \in H$ and $\mu^2 = \eta^2 = -m$, then $\{\alpha \in H : \alpha\mu = \eta\alpha\}$ is a one dimensional right $K_\mu$-vector space.

Proof. Again, by the Noether-Skolem Theorem, the two embeddings $\varphi_\mu$ and $\varphi_\eta$ are conjugate inside $H$, that is, there exists $x \in H^\times$ such that $\varphi_\mu(a) = x^{-1}\varphi_\eta(a)x$ for all $a \in K$. In particular, $x\mu = \eta x$, and $\alpha\mu = \eta\alpha$ if and only if $x^{-1}\alpha$ centralizes $K_\mu$. By Proposition 7.7, this holds if and only if $x^{-1}\alpha \in K_\mu$, or equivalently, $\alpha \in xK_\mu$. □

7.2 Class Groups and Root Bundles

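We continue to assume that $m > 1$ is a positive integer congruent to 1 or 2 mod 4. Let $I_K$ be the group of fractional ideals of $K = \mathbb{Q}(\sqrt{-m})$. For any $\mathfrak{a} \in I_K$, let $\mathfrak{a}_\mu = \varphi_\mu(\mathfrak{a}) \subseteq K_\mu$. By Corollary 7.4, there exists $\kappa = \kappa(\mathfrak{a}, \mu) \in H$ such that

\[ \mathcal{O}\mathfrak{a}_\mu = \mathcal{O}\kappa. \]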

Note that $\kappa$ is determined by $\mathfrak{a}$ and $\mu$ up to left multiplication by units of $\mathcal{O}$. So, $\kappa\mu\kappa^{-1}$ is determined up to inner automorphisms induced by units of $\mathcal{O}$.

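Moreover,

\[ \mathcal{O}\kappa\mu\kappa^{-1} = \mathcal{O}\mathfrak{a}_\mu\mu\kappa^{-1} = \mathcal{O}\mu\mathfrak{a}_\mu\kappa^{-1} \subseteq \mathcal{O}\mathfrak{a}_\mu\kappa^{-1} = \mathcal{O}\kappa\kappa^{-1} = \mathcal{O}. \]

Therefore, $\kappa\mu\kappa^{-1} \in \mathcal{O}$ and hence $\kappa\mu\kappa^{-1}$ is also a root.

We call the set $B(\mu) := \{\varepsilon\mu\varepsilon^{-1} : \varepsilon \in \mathcal{O}^\times\}$ the bundle of the root $\mu$. For any $\mathfrak{a} \in I_K$, the root bundle $B(\kappa\mu\kappa^{-1})$ does not depend on the $\kappa$ we choose for $\mathfrak{a}$. Therefore, if we let

\[ W = \{B(\mu) : \mu \in R\} \]

be the set of all root bundles, then we have a map

\[ \Delta : I_K \times W \to W \]

defined by

\[ \Delta(\mathfrak{a}, B(\mu)) = B(\kappa\mu\kappa^{-1}), \]

where $\kappa = \kappa(\mathfrak{a}, \mu)$ is chosen such that $\mathcal{O}\mathfrak{a}_\mu = \mathcal{O}\kappa$.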

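Lemma 7.9 The map $\Delta$ defines an action of the group $I_K$ on the set $W$.

Proof. Let $\mathfrak{a}, \mathfrak{b}$ be two fractional ideals of $K$. Let $\lambda = \kappa(\mathfrak{b}, \mu)\mu\kappa(\mathfrak{b}, \mu)^{-1}$; that is,

\[ B(\lambda) = \Delta(\mathfrak{b}, B(\mu)). \]

Then $\varphi_\lambda$ is $\varphi_\mu$ followed by the inner automorphism given by $\kappa(\mathfrak{b}, \mu)$. Therefore, it must be that $\mathfrak{a}_\lambda = \kappa(\mathfrak{b}, \mu)\mathfrak{a}_\mu\kappa(\mathfrak{b}, \mu)^{-1}$ and

\begin{align*}
\mathcal{O}\kappa(\mathfrak{a}, \lambda) &= \mathcal{O}\mathfrak{a}_\lambda \\
&= \mathcal{O}\kappa(\mathfrak{b}, \mu)\mathfrak{a}_\mu\kappa(\mathfrak{b}, \mu)^{-1} \\
&= \mathcal{O}\mathfrak{b}_\mu\mathfrak{a}_\mu\kappa(\mathfrak{b}, \mu)^{-1};
\end{align*}

that is, $\mathcal{O}\kappa(\mathfrak{a}, \lambda)\kappa(\mathfrak{b}, \mu) = \mathcal{O}(\mathfrak{b}\mathfrak{a})_\mu = \mathcal{O}(\mathfrak{a}\mathfrak{b})_\mu$. So, we may choose $\kappa(\mathfrak{a}\mathfrak{b}, \mu)$ to be the product $\kappa(\mathfrak{a}, \lambda)\kappa(\mathfrak{b}, \mu)$, and find that

\[ \kappa(\mathfrak{a}, \lambda)\lambda\kappa(\mathfrak{a}, \lambda)^{-1} = \kappa(\mathfrak{a}\mathfrak{b}, \mu)\mu\kappa(\mathfrak{a}\mathfrak{b}, \mu)^{-1}. \]

This shows that $\Delta(\mathfrak{a}, \Delta(\mathfrak{b}, B(\mu))) = \Delta(\mathfrak{a}\mathfrak{b}, B(\mu))$, which proves the lemma. □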

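Let $\mu, \nu \in R$, and let

\[ T_{\mu,\nu} = \{\lambda \in \mathcal{O} : \lambda\mu = \nu\lambda\}. \]

By Corollary 7.8, $T_{\mu,\nu}$ is the intersection of $\mathcal{O}$ with a two dimensional $\mathbb{Q}$-vector space; so it is a rank 2 $\mathbb{Z}$-submodule of $\mathcal{O}$. In particular, it is nonzero and hence $\mathcal{O}T_{\mu,\nu}$ is a complete $\mathbb{Z}$-lattice in $H$. So, $\mathcal{O}T_{\mu,\nu}$ is an ideal in $H$. It is easy to see that $\mathcal{O}T_{\mu,\nu}$ is in fact a left fractional ideal of $\mathcal{O}$.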

Lemma 7.10 For any roots $\mu, \nu$, we have $\mathcal{O}T_{\mu,\nu} = \mathcal{O}$.

Proof. By Corollary 7.4, there exists $\rho \in \mathcal{O}$ such that $\mathcal{O}T_{\mu,\nu} = \mathcal{O}\rho$. We claim that $\rho$ is a unit of $\mathcal{O}$. It suffices to show that $\mathrm{nr}(\rho)$ is not divisible by any prime in $\mathbb{Z}$. Let us suppose that $m \equiv 1 \bmod 4$ for the moment.

We first show that $2 \nmid \mathrm{nr}(\rho)$. It is enough to exhibit an element $\omega \in T_{\mu,\nu}$ such that $2 \nmid \mathrm{nr}(\omega)$. Since $\mu^2 = \nu^2 = -m$, $\alpha\mu + \nu\alpha \in T_{\mu,\nu}$ for all $\alpha \in \mathcal{O}$. Let $\mu = x_1 i + x_2 j + x_3 k$ and $\nu = y_1 i + y_2 j + y_3 k$ with $x_l, y_l \in \mathbb{Z}$ for all $l$. Since $m = x_1^2 + x_2^2 + x_3^2 \equiv 1 \bmod 4$, we may assume that $x_1 \equiv x_2 \equiv 0 \bmod 2$ and $x_3 \equiv 1 \bmod 2$. As to $\nu$, it is enough to consider two cases:

(1) $y_1 \equiv y_2 \equiv 0 \bmod 2$, and $y_3 \equiv 1 \bmod 2$;

(2) $y_1 \equiv y_3 \equiv 0 \bmod 2$, and $y_2 \equiv 1 \bmod 2$.

In (1), it is direct to check that $\mathrm{nr}(\mu + \nu) \equiv \mathrm{nr}(i\mu + \nu i) \equiv 0 \bmod 4$, but $\mathrm{nr}(\mu + \nu) - \mathrm{nr}(i\mu + \nu i) \equiv 4 \bmod 8$. Hence either $\mathrm{nr}(\mu + \nu)$ or $\mathrm{nr}(i\mu + \nu i)$ is not divisible by 8. Then we can take $\omega$ to be either $(\mu + \nu)/2$ or $(i\mu + \nu i)/2$. Note that $\omega$ will be in $\mathcal{O}$.

For (2), we use $\gamma = (1 + j)\mu + \nu(1 + j)$, which is

\[ -(x_2 + y_2) + (x_1 + x_3 + y_1 - y_3)i + (x_2 + y_2)j + (x_3 - x_1 + y_3 + y_1)k. \]

Notice that all the coefficients in this linear combination are odd. Therefore, $\omega := \gamma/2$ is in $\mathcal{O}$, and hence $\omega \in T_{\mu,\nu}$. It is also easy to see that $2 \nmid \mathrm{nr}(\omega)$.

Now, suppose that $\mathrm{nr}(\rho)$ is divisible by an odd prime $p$. Since $\rho$ is a right divisor of $\omega_0 := \mu + \nu$ and $\omega_1 := i\mu + \nu i$, it is also a right divisor of $i(\omega_0 + i\omega_1) = i\nu - \nu i$, whose reduced norm is $4(y_2^2 + y_3^2)$. We may conclude that $y_2^2 + y_3^2 \equiv 0 \bmod p$. Using $j$, $k$ instead of $i$, we obtain $y_1^2 + y_3^2 \equiv 0 \equiv y_1^2 + y_2^2 \bmod p$. These three congruences have only one common solution mod $p$, namely $y_1 \equiv y_2 \equiv y_3 \equiv 0 \bmod p$. This shows that $m = y_1^2 + y_2^2 + y_3^2 \equiv 0 \bmod p^2$, which is impossible since $m$ is squarefree.

The case $m \equiv 2 \bmod 4$ can be done by a similar argument. □
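The explicit formula for $\gamma$ in case (2) can be sanity-checked on a small instance. The sketch below takes $m = 5$ with $\mu = 2i + k$ and $\nu = 2i + j$, a pair chosen here purely for illustration that satisfies the parity pattern of case (2), and verifies that $\gamma = (1+j)\mu + \nu(1+j)$ has odd coefficients, satisfies $\gamma\mu = \nu\gamma$, and that $\gamma/2$ has odd reduced norm. The helper names are hypothetical.

```python
def qmul(x, y):
    """Product in (-1,-1/Q); coordinates with respect to 1, i, j, k."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qadd(x, y):
    return tuple(a + b for a, b in zip(x, y))

def nr(x):
    return sum(c * c for c in x)

def gamma(mu, nu):
    """(1 + j) mu + nu (1 + j) for pure quaternions mu and nu."""
    one_plus_j = (1, 0, 1, 0)
    return qadd(qmul(one_plus_j, mu), qmul(nu, one_plus_j))

mu = (0, 2, 0, 1)   # x1 = 2, x2 = 0, x3 = 1, so mu^2 = -5
nu = (0, 2, 1, 0)   # y1 = 2, y2 = 1, y3 = 0, so nu^2 = -5
g = gamma(mu, nu)
print(g, qmul(g, mu) == qmul(nu, g), nr(g) // 4)
```

Here $\gamma = -1 + 5i + j + k$, matching the displayed formula term by term, and $\mathrm{nr}(\gamma/2) = 7$.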


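Proposition 7.11 The action of $I_K$ on $W$ is transitive. Moreover, the stabilizer of any $B(\mu) \in W$ is the subgroup of principal fractional ideals of $K$.

Proof. Let $\mu, \nu \in R$. We have indicated earlier that $T_{\nu,\mu}$ is a $\mathbb{Z}$-module of rank 2. Let $\xi, \eta$ be a $\mathbb{Z}$-basis of $T_{\nu,\mu}$, and set

\[ \mathfrak{b} := \mathfrak{o}_\mu\xi\bar{\eta} + \mathfrak{o}_\mu\mathrm{nr}(\eta). \]

Note that $\alpha \in T_{\nu,\mu}$ if and only if $\bar{\alpha} \in T_{\mu,\nu}$. Hence $\xi\bar{\eta}$ centralizes $K_\mu$, and so $\xi\bar{\eta} \in K_\mu$ by Proposition 7.7. Thus $\xi\bar{\eta} \in K_\mu \cap \mathcal{O} = \mathfrak{o}_\mu$. This shows that $\mathfrak{b}$ is an ideal of $\mathfrak{o}_\mu$. By Lemma 7.10,

\[ \mathcal{O}\mathfrak{b} = \mathcal{O}(\mathfrak{o}_\mu\xi\bar{\eta} + \mathfrak{o}_\mu\eta\bar{\eta}) = \mathcal{O}\xi\bar{\eta} + \mathcal{O}\eta\bar{\eta} = (\mathcal{O}\xi + \mathcal{O}\eta)\bar{\eta} = \mathcal{O}T_{\nu,\mu}\bar{\eta} = \mathcal{O}\bar{\eta}. \]

So, if we use $\mathfrak{a} = \varphi_\mu^{-1}(\mathfrak{b})$ and $\bar{\eta} = \kappa(\mathfrak{a}, \mu)$, then $\mathfrak{a}_\mu = \mathfrak{b}$, $\mathcal{O}\mathfrak{a}_\mu = \mathcal{O}\bar{\eta}$, and $\bar{\eta} \in T_{\mu,\nu}$, that is, $\bar{\eta}\mu\bar{\eta}^{-1} = \nu$. Hence

\[ \Delta(\mathfrak{a}, B(\mu)) = B(\nu), \]

which means that the action of $I_K$ on $W$ is transitive.

For the second assertion, let $\alpha \in K^\times$ and $\mathfrak{a}$ be the principal fractional ideal of $K$ generated by $\alpha$. Then $\varphi_\mu(\mathfrak{a}) = \mathfrak{o}_\mu\beta$, where $\beta = \varphi_\mu(\alpha)$. We may then choose $\beta \in K_\mu$ to be $\kappa(\mathfrak{a}, \mu)$ and obtain

\[ \Delta(\mathfrak{a}, B(\mu)) = B(\beta\mu\beta^{-1}) = B(\mu). \]

Conversely, suppose $\Delta(\mathfrak{a}, B(\mu)) = B(\mu)$. Then we may select $\kappa = \kappa(\mathfrak{a}, \mu)$ such that $\kappa\mu\kappa^{-1} = \mu$. It follows from Proposition 7.7 that $\kappa \in K_\mu$. Hence

\[ \mathcal{O}\mathfrak{a}_\mu = \mathcal{O}\kappa = \mathcal{O}\mathfrak{o}_\mu\kappa. \]

By Proposition 7.6, we have

\[ \mathfrak{a}_\mu = \mathcal{O}\mathfrak{a}_\mu \cap K_\mu = \mathcal{O}\mathfrak{o}_\mu\kappa \cap K_\mu = \mathfrak{o}_\mu\kappa, \]

and so $\mathfrak{a}$ is principal. □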

We have shown that the root bundle $\Delta(\mathfrak{a}, B(\mu))$ depends only on the ideal class of $\mathfrak{a}$. So, given an ideal class $C$ of $K$, we can define a function $\Pi_C : W \to W$ by $\Pi_C(B(\mu)) = \Delta(\mathfrak{a}, B(\mu))$, where $\mathfrak{a}$ is any fractional ideal in $C$. By Proposition 7.11, this function $\Pi_C$ is a permutation on $W$, and the map $C \mapsto \Pi_C$ is an isomorphism sending the ideal class group of $K$ onto a sharply transitive permutation group of $W$. As a result, $|W| = h(m)$.

Finally, let us give a proof of Theorem 7.1. We need to count the number of roots belonging to a bundle $B(\mu)$. Let the group $\mathcal{O}^\times$ act on the set of roots $R$ by conjugation. The stabilizer of a root $\mu$ is

\[ \{\varepsilon \in \mathcal{O}^\times : \varepsilon\mu\varepsilon^{-1} = \mu\} = \mathcal{O}^\times \cap K_\mu = \mathfrak{o}_\mu^\times = \{\pm 1\}. \]

The last equality is from the hypothesis that $m > 1$ and $m \equiv 1, 2 \bmod 4$. Hence $|B(\mu)| = |\mathcal{O}^\times|/2 = 12$, and thus

\[ \psi(m) = |R| = 12|W| = 12h(m). \]
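The counting argument can be watched in action for small $m$. The sketch below, offered only as an illustration, lists the roots of $m$ as integer pure quaternions, lets the 24 units of the Hurwitz order act by conjugation (using $\varepsilon^{-1} = \bar{\varepsilon}$ for units), and collects the resulting bundles. The function names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product
from math import isqrt

def qmul(x, y):
    """Product in (-1,-1/Q); coordinates with respect to 1, i, j, k."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(x):
    return (x[0], -x[1], -x[2], -x[3])

def hurwitz_units():
    """The 24 Hurwitz quaternions of reduced norm 1."""
    units = set()
    for doubled in product(range(-2, 3), repeat=4):
        if len({c % 2 for c in doubled}) == 1 and sum(c * c for c in doubled) == 4:
            units.add(tuple(Fr(c, 2) for c in doubled))
    return units

def roots(m):
    """All mu = xi + yj + zk with integer x, y, z and mu^2 = -m."""
    r = isqrt(m)
    return {(Fr(0), Fr(x), Fr(y), Fr(z))
            for x, y, z in product(range(-r, r + 1), repeat=3)
            if x * x + y * y + z * z == m}

def bundles(m):
    """Orbits of the roots of m under conjugation by the units."""
    units = hurwitz_units()
    remaining = roots(m)
    result = []
    while remaining:
        mu = next(iter(remaining))
        orbit = {qmul(qmul(e, mu), conj(e)) for e in units}
        result.append(orbit)
        remaining -= orbit
    return result

for m in (2, 5, 14):
    print(m, [len(b) for b in bundles(m)])
```

For $m = 2, 5, 14$ the number of bundles is 1, 2, 4 respectively, each of size 12, in agreement with $|W| = h(m)$ and $|B(\mu)| = 12$.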
