Lecture notes from a Math 614 course taught by Mel Hochster in Fall 2015. The notes cover topics such as rings, ideals, modules, polynomial rings, homomorphisms of rings and modules, and Noetherian rings. The course emphasizes the study of systems of equations and their solution sets, which have a strong connection to commutative rings. The notes assume familiarity with these concepts and notations.
by Mel Hochster
Lecture of September 9
We assume familiarity with the notions of ring, ideal, module, and with the polynomial ring in one or finitely many variables over a commutative ring, as well as with homomorphisms of rings and homomorphisms of R-modules over the ring R.
As a matter of notation, N ⊆ Z ⊆ Q ⊆ R ⊆ C are the non-negative integers, the integers, the rational numbers, the real numbers, and the complex numbers, respectively, throughout this course.
Unless otherwise specified, all rings are commutative, associative, and have a multiplicative identity 1 (when precision is needed we write 1R for the identity in the ring R). It is possible that 1 = 0, in which case the ring is {0}, since for every r ∈ R, r = r · 1 = r · 0 = 0. We shall assume that a homomorphism h of rings R → S preserves the identity, i.e., that h(1R) = 1S. We shall also assume that all given modules M over a ring R are unital, i.e., that 1R · m = m for all m ∈ M.
When R and S are rings we write S = R[θ1, ..., θn] to mean that S is generated as a ring over its subring R by the elements θ1, ..., θn. This means that S contains R and the elements θ1, ..., θn, and that no strictly smaller subring of S contains R and the θ1, ..., θn. It also means that every element of S can be written (not necessarily uniquely) as an R-linear combination of the monomials θ1^{k1} · · · θn^{kn}. When one writes S = R[x1, ..., xk] it often means that the xi are indeterminates, so that S is the polynomial ring in k variables over R. But one should say this.
The main emphasis in this course will be on Noetherian rings, i.e., rings in which every ideal is finitely generated. Specifically, for all ideals I ⊆ R, there exist f1, ..., fk ∈ R such that I = (f1, ..., fk) = (f1, ..., fk)R = ∑_{i=1}^{k} R fi. We shall develop a very useful theory of dimension in such rings. This will be discussed further quite soon. We shall not be focused on esoteric examples of rings. In fact, almost all of the theory we develop is of great interest and usefulness in studying the properties of polynomial rings over a field or the integers, and homomorphic images of such rings.
There is a strong connection between studying systems of equations, studying their solution sets, which often have some kind of geometry associated with them, and studying commutative rings. Suppose the equations involve variables X1, ..., Xn with coefficients in a field K. The most important case for us will be when K is an algebraically closed field such as the complex numbers C. Suppose the equations have the form Fi = 0, where the Fi are polynomials in the Xj with coefficients in K. Let I be the ideal generated by the Fi in the polynomial ring K[X1, ..., Xn] and let R be the quotient ring K[X1, ..., Xn]/I. In R, the images xj of the variables Xj give a solution of the equations, a sort of "universal"
solution. The connection between commutative algebra and algebraic geometry is that algebraic properties of the ring R are reflected in geometric properties of the solution set, and conversely. Solutions of the equations in the field K give maximal ideals of R. This leads to the idea that maximal ideals of R should be thought of as points in a geometric object. Some rings have very few maximal ideals: in that case it is better to consider all of the prime ideals of R as points of a geometric object. We shall soon make this idea more formal.

Before we begin the systematic development of our subject, we shall look at some very simple examples of problems, many unsolved, that are quite natural and easy to state. Suppose that we are given polynomials f and g in C[x], the polynomial ring in one variable over the complex numbers C. Is there an algorithm that enables us to tell whether f and g generate C[x] over C? This will be the case if and only if x ∈ C[f, g], i.e., if and only if x can be expressed as a polynomial with complex coefficients in f and g. For example, suppose that f = x^5 + x^3 − x^2 + 1 and g = x^14 − x^7 + x^2 + 5. Here it is easy to see that f and g do not generate, because neither has a term involving x with nonzero coefficient. But if we change f to x^5 + x^3 − x^2 + x + 1 the problem does not seem easy. The following theorem of Abhyankar and Moh, whose original proof was about 150 pages long, gives a method of attacking this sort of problem.
Theorem (Abhyankar-Moh). Let f , g in C[x] have degrees d and e respectively. If C[f, g] = C[x], then either d | e or e | d, i.e., one of the two degrees must divide the other.
Shorter proofs have since been given. Given this difficult result, it is clear that the specific f and g given above cannot generate C[x]: 5 does not divide 14. Now suppose instead that f = x^5 + x^3 − x^2 + x + 1 and g = x^15 − x^7 + x^2 + 5. With this choice, the Abhyankar-Moh result does not preclude the possibility that f and g generate C[x]. To pursue the issue further, note that in g − f^3 the degree 15 terms cancel, producing a polynomial of smaller degree. But when we consider f and g − f^3, which generate the same ring as f and g, the larger degree has decreased while the smaller has stayed the same. Thus, the sum of the degrees has decreased. In this sense, we have a smaller problem. We can now see whether the Abhyankar-Moh criterion is satisfied for this smaller pair. If it is, and the smaller degree divides the larger, we can subtract off a multiple of a power of the smaller degree polynomial and get a new pair in which the larger degree has decreased and the smaller has stayed the same. Eventually, either the criterion fails, or we get a constant and a single polynomial of degree ≥ 2, or one of the polynomials has degree 1. In the first two cases the original pair of polynomials does not generate. In the last case, they do generate.
This is a perfectly general algorithm. To test whether f of degree d and g of degree n ≥ d are generators, check whether d divides n. If so and n = dk, one can choose a constant c such that g − cf^k has degree smaller than n. If the leading coefficients of f and g are a ≠ 0 and b ≠ 0, take c = b/a^k. The sum of the degrees for the pair f, g − cf^k has decreased.
Continue in the same way with the new pair f, g − cf^k. If one eventually reaches a pair in which one of the polynomials is linear, the original pair were generators. Otherwise, one reaches either a pair in which neither degree divides the other, or else a pair in which one of the polynomials is a constant while the other has degree at least 2; in either of these cases, the original pair does not generate.
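As an illustration (not part of the original notes), the degree-reduction procedure above can be sketched in Python with sympy. The function name `generates` and the handling of the terminal cases are my own choices; the sketch works over the rationals, which suffices for these examples.

```python
from sympy import symbols, Poly

x = symbols('x')

def generates(f, g):
    """Decide whether C[f, g] = C[x] by repeated degree reduction:
    while the smaller degree d divides the larger n = d*k, subtract
    c*f**k with c = b/a**k (a, b the leading coefficients) so that
    the larger degree strictly drops."""
    f, g = Poly(f, x), Poly(g, x)
    while True:
        if f.is_zero or g.is_zero:
            h = g if f.is_zero else f   # one poly was a multiple of a power of the other
            return h.degree() == 1
        if f.degree() > g.degree():
            f, g = g, f                 # keep deg f <= deg g
        d, n = f.degree(), g.degree()
        if n == 1:
            return True                 # a linear polynomial generates C[x]
        if d == 0 or n % d != 0:
            return False                # criterion fails, or a constant remains
        k = n // d
        c = g.LC() / f.LC() ** k        # chosen to cancel the leading term of g
        g = g - c * f ** k

f = x**5 + x**3 - x**2 + x + 1
g = x**15 - x**7 + x**2 + 5
print(generates(f, g))
```

On the pair from the text, one reduction step replaces g by g − f^3, of degree 13, and 5 does not divide 13, so the procedure reports that f and g do not generate.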
One then defines the derivative dF/dx, or F′(x), to be P(x, 0), the result of substituting h = 0 in P(x, h). This is the algebraist's method of taking a limit as h → 0: just substitute h = 0.

Given a polynomial F ∈ R[x1, ..., xn] we may likewise define its partial derivatives in the various xi. E.g., to get ∂F/∂xn we identify the polynomial ring with S[xn] where S = R[x1, ..., xn−1]. We can think of F as a polynomial in xn only with coefficients in S, and ∂F/∂xn is simply its derivative with respect to xn when it is thought of this way.
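The h → 0 substitution can be checked with sympy (an illustration, not part of the notes). The sketch uses the fact that, for a polynomial F in one variable, F(x + h) − F(x) is exactly divisible by h in R[x, h].

```python
from sympy import symbols, expand, cancel, diff

x, h = symbols('x h')

def algebraic_derivative(F):
    """F(x + h) - F(x) is divisible by h; writing it as h*P(x, h),
    the derivative is P(x, 0) -- no limits needed."""
    P = cancel((F.subs(x, x + h) - F) / h)   # exact polynomial quotient by h
    return expand(P.subs(h, 0))

F = x**5 + x**3 - x**2 + x + 1
print(algebraic_derivative(F))
print(expand(algebraic_derivative(F) - diff(F, x)))  # 0: agrees with calculus
```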
The Jacobian conjecture asserts that F1, ..., Fn ∈ C[x1, ..., xn] generate (note that the number of the Fi is equal to the number n of variables) if and only if the Jacobian determinant det(∂Fi/∂xj) is identically a nonzero constant. This is true when n = 1 and is known to be a necessary condition for the Fi to generate the polynomial ring. But even when n = 2 it is an open question!
If you think you have a proof, have someone check it carefully — there are at least five published incorrect proofs in the literature, and new ones are circulated frequently.
It is known that if there is a counter-example one needs polynomials of degree at least
Algebraic sets

The problems discussed above are very easy to state, and very hard. However, they are not close to the main theme in this course, which is dimension theory. We are going to assign a dimension, the Krull dimension, to every commutative ring. It may be infinite, but will turn out to be finite for rings that are finitely generated over a field or the integers.
In order to give some idea of where we are headed, we shall discuss the notion of a closed algebraic set in K^n, where K is a field. Everyone is welcome to think of the case where K = C, although for the purpose of drawing pictures, it is easier to think about the case where K = R.
Let K be a field. A polynomial in K[x1, ..., xn] may be thought of as a function from K^n → K. Given a finite set f1, ..., fm of polynomials in K[x1, ..., xn], the set of points where they vanish simultaneously is denoted V(f1, ..., fm). Thus

V(f1, ..., fm) = {(a1, ..., an) ∈ K^n : fi(a1, ..., an) = 0, 1 ≤ i ≤ m}.
If X = V (f 1 ,... , fm), one also says that f 1 ,... , fm define X.
Over R[x, y], V (x^2 + y^2 − 1) is a circle in the plane, while V (xy) is the union of the coordinate axes. Note that V (x, y) is just the origin.
A set of the form V (f 1 ,... , fm) is called a closed algebraic set in Kn. We shall only be talking about closed algebraic sets here, and so we usually omit the word “closed.”
For the moment let us restrict attention to the case where K is an algebraically closed field such as the complex numbers C. We want to give algebraic sets a dimension in such
a way that K^n has dimension n. Thus, the notion of dimension that we develop will generalize the notion of dimension of a vector space.
We shall do this by associating a ring with X, denoted K[X]: it is simply the set of functions defined on X that are obtained by restricting a polynomial function on K^n to X. The dimension of X will be the same as the dimension of the ring K[X]. Of course, we have not defined dimension for rings yet.
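As a computational aside (my own illustration, not from the notes), elements of K[X] can be manipulated by reducing polynomial representatives modulo the defining ideal; here is a sympy sketch for the circle X = V(x^2 + y^2 − 1).

```python
from sympy import symbols, reduced

x, y = symbols('x y')
I = [x**2 + y**2 - 1]       # defining ideal of the circle X

def on_circle(f):
    """Canonical representative of f in K[X] = K[x, y]/I: the
    remainder on division by the ideal generators."""
    return reduced(f, I, x, y)[1]

# x**2 + y**2 and the constant 1 restrict to the same function on X:
print(on_circle(x**2 + y**2))
```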
In order to illustrate the kind of theorem we are going to prove, consider the problem of describing the intersection of two planes in real three-space R^3. The planes might be parallel, i.e., not meet at all. But if they do meet in at least one point, they must meet in a line.
More generally, if one has vector spaces V and W over a field K, both subspaces of some larger vector space, then dim(V ∩ W) = dim V + dim W − dim(V + W). If the ambient vector space has dimension n, this leads to the result that dim(V ∩ W) ≥ dim V + dim W − n. In the case of planes in three-space, we see that the dimension of the intersection must be at least 2 + 2 − 3 = 1.
Over an algebraically closed field, the same result turns out to be true for algebraic sets! Suppose that V and W are algebraic sets in K^n and that they meet in a point x ∈ K^n. We have to be a little bit careful because, unlike vector spaces, algebraic sets in general may be unions of finitely many smaller algebraic sets, which need not all have the same dimension. Algebraic sets which are not finite unions of strictly smaller algebraic sets are called irreducible. Each algebraic set is a finite union of irreducible ones in such a way that none can be omitted: these are called irreducible components. We define dimx V to be the largest dimension of an irreducible component of V that contains x. One of our long term goals is to prove that for any algebraic sets V and W in K^n meeting in a point x, dimx(V ∩ W) ≥ dimx V + dimx W − n. This is a beautiful and useful result: it can be thought of as guaranteeing the existence of a solution (or many solutions) of a family of equations.
We conclude for now by mentioning one other sort of problem. Given a specific algebraic set X = V (f 1 ,... , fm), the set J of all polynomials vanishing on it is closed under addition and multiplication by any polynomial — that is, it is an ideal of K[x 1 ,... , xn]. J always contains the ideal I generated by f 1 ,... , fm. But J may be strictly larger than I. How can one tell?
Here is one example of an open question of this sort. Consider the set of pairs of commuting square matrices of size n. Let M = Mn(K) be the set of n × n matrices over K. Thus,
W = {(A, B) ∈ M × M : AB = BA}.
The matrices are given by their 2n^2 entries, and we may think of this set as a subset of K^(2n^2). (To make this official, one would have to describe a way to string the entries of the two matrices out on a line.) Then W is an algebraic set defined by n^2 quadratic equations. If X = (xij) is an n × n matrix of indeterminates and Y = (yij) is another n × n matrix
Lecture of September 11
The notes for this lecture contain some basic definitions concerning abstract topological spaces that were not given in class. If you are not familiar with this material please read it carefully. I am not planning to do it in lecture.
———————— We mention one more very natural but very difficult question about algebraic sets. Suppose that one has an algebraic set X = V (f 1 ,... , fm). What is the least number of elements needed to define X? In other words, what is the least positive integer k such that X = V (g 1 ,... , gk)?
Here is a completely specific example. Suppose that we work in the polynomial ring in 6 variables x1, x2, x3, y1, y2, y3 over the complex numbers C and let X be the algebraic set in C^6 defined by the vanishing of the 2 × 2 subdeterminants or minors of the matrix

( x1 x2 x3 )
( y1 y2 y3 )

that is, X = V(f, g, h) where f = x1y2 − x2y1, g = x1y3 − x3y1, and h = x2y3 − x3y2. We can think of points of X as representing 2 × 3 matrices whose rank is at most 1: the vanishing of these equations is precisely the condition for the two rows of the matrix to be linearly dependent. Obviously, X can be defined by 3 equations. Can it be defined by 2 equations? No algorithm is known for settling questions of this sort, and many are open, even for relatively small specific examples. In the example considered here, it turns out that 3 equations are needed. I do not know an elementary proof of this fact — perhaps you can find one!
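A quick sympy check (an illustration, not part of the notes) that f, g, h above are the 2 × 2 minors of the matrix, and that they vanish at a sample rank-1 point but not at a rank-2 point.

```python
from sympy import Matrix, symbols

x1, x2, x3, y1, y2, y3 = symbols('x1 x2 x3 y1 y2 y3')

M = Matrix([[x1, x2, x3],
            [y1, y2, y3]])

# The three minors named in the text:
f = x1*y2 - x2*y1
g = x1*y3 - x3*y1
h = x2*y3 - x3*y2

minors = [Matrix([[M[0, i], M[0, j]],
                  [M[1, i], M[1, j]]]).det()
          for i, j in [(0, 1), (0, 2), (1, 2)]]
assert all((m - e).expand() == 0 for m, e in zip(minors, [f, g, h]))

# A rank-1 point (second row is 5 times the first) lies on X:
pt = {x1: 1, x2: 2, x3: 3, y1: 5, y2: 10, y3: 15}
# A rank-2 point does not:
pt2 = {x1: 1, x2: 0, x3: 0, y1: 0, y2: 1, y3: 0}
print([m.subs(pt) for m in minors], [m.subs(pt2) for m in minors])
```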
One of the themes of this course is that there is geometry associated with any commu- tative ring R. The following discussion illustrates this.
For an algebraic set over an algebraically closed field K, the maximal ideals of the ring K[X] (reminder: functions from X to K that are restrictions of polynomial functions) are in bijective correspondence with the points of X — the point x corresponds to the maximal ideal consisting of functions that vanish at x. This is, essentially, Hilbert’s Nullstellensatz, and we shall prove this theorem soon. This maximal ideal may also be described as the kernel of the evaluation homomorphism from K[X] onto K that sends f to f (x).
If R is the ring of continuous real-valued functions on a compact (Hausdorff) topological space X the maximal ideals also correspond to the points of X.
A filter F on a set X is a non-empty family of subsets closed under finite intersection and such that if Y ∈ F and Y ⊆ Y′ ⊆ X, then Y′ ∈ F. Let K be a field. Let S be the ring of all K-valued functions on X. The ideals of S correspond bijectively with the filters on X: given a filter, the corresponding ideal consists of all functions that vanish on some set in the filter. The filter is recovered from the ideal I as the family of sets of the form f^{-1}(0) for some f ∈ I. The key point is that for f and g1, ..., gk ∈ S, f is in the ideal generated by the gi if and only if it vanishes whenever all the gi do. The unit ideal corresponds to the filter which is the set of all subsets of X. The maximal ideals correspond to the maximal filters that do not contain the empty set: these are called ultrafilters. Given a point x ∈ X, there is an ultrafilter consisting of all sets that contain x. Ultrafilters of this type are called fixed. If X is infinite, there are always others: the sets with finite complement form a filter, and by Zorn's lemma it is contained in an ultrafilter. For those familiar with the Stone-Cech compactification, the ultrafilters (and, hence, the maximal ideals) correspond bijectively with the points of the Stone-Cech compactification of X when X is given the discrete topology (every set is open).
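The key point can be made concrete when X is finite. Here is a Python sketch (my own illustration) with K = Q: over Q a sum of squares vanishes only when every term does, which lets us exhibit explicit coefficients witnessing ideal membership; over other fields a different witness would be needed.

```python
from fractions import Fraction

# Q-valued functions on a finite set X, represented as tuples indexed by X.
# Key point: f lies in the ideal generated by g1, ..., gk iff f vanishes
# wherever all the gi vanish.  Off the common zero set, s = sum(gi**2) is
# nonzero (a Q-specific trick), and f = sum_i (f*gi/s) * gi pointwise.
X = range(4)

def in_ideal(f, gs):
    if any(f[p] != 0 for p in X if all(g[p] == 0 for g in gs)):
        return None                 # f misses a common zero: not in the ideal
    hs = []
    for g in gs:
        h = tuple(Fraction(0) if all(gj[p] == 0 for gj in gs)
                  else Fraction(f[p] * g[p], sum(gj[p] ** 2 for gj in gs))
                  for p in X)
        hs.append(h)
    # sanity check: f == sum_i h_i * g_i pointwise
    assert all(f[p] == sum(h[p] * g[p] for h, g in zip(hs, gs)) for p in X)
    return hs

g1 = (0, 1, 0, 2)   # vanishes at points 0 and 2
g2 = (0, 3, 1, 0)   # vanishes at points 0 and 3; common zero set is {0}
f  = (0, 2, 5, 7)   # vanishes on the common zero set
print(in_ideal(f, (g1, g2)) is not None)
```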
We shall see that even for a completely arbitrary commutative ring R, the set of all maximal ideals of R, and even the set of all prime ideals of R, has a geometric structure. In fact, these sets have, in a natural way, the structure of topological spaces. We shall give a brief review of the notions needed from topology shortly.
Categories
We do not want to dwell too much on set-theoretic issues but they arise naturally here. We shall allow a class of all sets. Typically, classes are very large and are not allowed to be elements. The objects of a category are allowed to be a class, but morphisms between two objects are required to be a set.
A category C consists of a class Ob (C) called the objects of C and, for each pair of objects X, Y ∈ Ob (C) a set Mor (X, Y ) called the morphisms from X to Y with the following additional structure: for any three given objects X, Y and Z there is a map
Mor (X, Y ) × Mor (Y, Z) → Mor (X, Z)
called composition such that the three axioms given below hold. One writes f : X → Y to mean that f ∈ Mor(X, Y). If f : X → Y and g : Y → Z then the composition is denoted g ◦ f or gf. The axioms are as follows:

(0) Mor(X, Y) and Mor(X′, Y′) are disjoint unless X = X′ and Y = Y′.
(1) For every object X there is an element denoted 1X or idX in Mor(X, X) such that if g : W → X then 1X ◦ g = g, while if h : X → Y then h ◦ 1X = h.
(2) If f : W → X, g : X → Y, and h : Y → Z then h ◦ (g ◦ f) = (h ◦ g) ◦ f (associativity of composition).
The morphism 1X is called the identity morphism on X and one can show that it is unique. If f : X → Y then X is called the domain of f and Y is called the codomain, target, or range of f , but it is preferable to avoid the term “range” because it is used for the set of values that a function actually takes on. A morphism f : X → Y is called an isomorphism if there is a morphism g : Y → X such that gf = 1X and f g = 1Y. If it exists, g is unique and is an isomorphism from Y → X. If there is an isomorphism from X → Y then X and Y are called isomorphic.
A topological space is called quasi-compact if every open cover has a subcover containing only finitely many open sets, i.e., a finite subcover.
A family of sets is said to have the finite intersection property if every finite subfamily has non-empty intersection. Being quasi-compact is equivalent to the condition that every family of closed sets with the finite intersection property has non-empty intersection. (This is only interesting when the family is infinite.) A quasi-compact Hausdorff space is called compact. We assume familiarity with the usual topology on R^n, in which a set is closed if and only if for every convergent sequence of points in the set, the limit point of the sequence is also in the set. Alternatively, a set U is open if and only if for every point x in U, there exists a > 0 in R such that all points of R^n within distance a of x are in U.
The compact subspaces of R^n are precisely the closed, bounded sets. A topological space is called connected if it is not the union of two non-empty disjoint open subsets (which will then both be closed as well). The connected subsets of the real line are identical with the intervals: these are the subsets with the property that if they contain a and b, they contain all real numbers between a and b. They include the empty set, individual points, open intervals, half-open intervals, closed intervals, and the whole line.
A function f from a topological space X to a topological space Y is called continuous if for every open set V of Y, f^{-1}(V) = {x ∈ X : f(x) ∈ V} is open. It is an equivalent condition to require that the inverse image of every closed set be closed.
We are now ready to continue with our discussion of examples of categories.

(f) Topological spaces and continuous maps give a category. In this category, isomorphism is called homeomorphism.
We now consider some examples in which composition is not necessarily composition of functions.
(g) A partially ordered set (or poset) consists of a set P together with a relation ≤ such that for all x, y, z ∈ P , (1) if x ≤ y and y ≤ x then x = y and (2) if x ≤ y and y ≤ z then x ≤ z. Given a partially ordered set, we can construct a category in which the objects are the elements of the partially ordered set. We artificially define there to be one morphism from x to y when x ≤ y, and no morphisms otherwise. In this category, isomorphic objects are equal. Note that there is a unique way to define composition: if we have a morphism f from x to y and one g from y to z, then x ≤ y and y ≤ z. Therefore, x ≤ z, and there is a unique morphism from x to z, which we define to be the composition gf. Conversely, a category in which (1) the objects form a set, (2) there is at most one morphism between any two objects, and (3) isomorphic objects are equal is essentially the same thing as a partially ordered set. One defines a partial ordering on the objects by x ≤ y if and only if there is a morphism from x to y.
(h) A category with just one object in which every morphism is an isomorphism is essentially the same thing as a group. The morphisms of the object to itself are the elements of the group.
Lecture of September 14
Given any category C we can construct an opposite category C^op. It has the same objects as C, but for any two objects X and Y in Ob(C), Mor_{C^op}(X, Y) = Mor_C(Y, X). There turns out to be an obvious way of defining composition using the composition in C: if f ∈ Mor_{C^op}(X, Y) and g ∈ Mor_{C^op}(Y, Z), we have that f : Y → X in C and g : Z → Y in C, so that f ◦ g in C is a morphism Z → X in C, i.e., a morphism X → Z in C^op; thus g ◦_{C^op} f is f ◦_C g.
By a (covariant) functor from a category C to a category D we mean a function F that assigns to every object X in C an object F (X) in D and to every morphism f : X → Y in C a morphism F (f ) : F (X) → F (Y ) in D such that
(1) For all X ∈ Ob(C), F(1X) = 1_{F(X)}, and
(2) for all f : X → Y and g : Y → Z in C, F(g ◦ f) = F(g) ◦ F(f).
A contravariant functor from C to D is a covariant functor from C to D^op. This means that when f : X → Y in C, F(f) : F(Y) → F(X) in D, and F(g ◦ f) = F(f) ◦ F(g) whenever g ◦ f is defined in C.
Here are some examples.
(a) Given any category C, there is an identity functor 1C on C: it sends the object X to the object X and the morphism f to the morphism f. This is a covariant functor.
(b) There is a functor from the category of groups and group homomorphisms to the category of abelian groups and homomorphisms that sends the group G to G/G′, where G′ is the commutator subgroup of G: G′ is generated by the set of all commutators {ghg^{-1}h^{-1} : g, h ∈ G}; it is a normal subgroup of G. The group G/G′ is abelian. Note also that any homomorphism from G to an abelian group must kill all commutators, and factors through G/G′, which is called the abelianization of G.
Given φ : G → H, φ automatically takes commutators to commutators. Therefore, it maps G′ into H′ and so induces a homomorphism G/G′ → H/H′. This explains how this functor behaves on homomorphisms. It is covariant.
(c) Note that the composition of two functors is a functor. If both are covariant or both are contravariant the composition is covariant. If one is covariant and the other is contravariant, the composition is contravariant.
(d) There is a contravariant functor F from the category of topological spaces to the category of rings that maps X to the ring of continuous R-valued functions on X. Given a continuous map f : X → Y , the ring homomorphism F (Y ) → F (X) is induced by composition: if h : Y → R is any continuous function on Y , then h ◦ f is a continuous function on X.
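The contravariance identity F(g ◦ f) = F(f) ◦ F(g) for this pullback construction can be illustrated with plain Python functions standing in for continuous maps (an informal sketch, not a model of topology; the sample maps are my own).

```python
# F sends a map f : X -> Y to the pullback h |-> h o f on functions.
def F(f):
    return lambda h: (lambda x: h(f(x)))

f = lambda x: x + 1      # a map X -> Y
g = lambda y: 2 * y      # a map Y -> Z
h = lambda z: z * z      # a real-valued function on Z

gof = lambda x: g(f(x))
lhs = F(gof)(h)          # pull h back along g o f in one step
rhs = F(f)(F(g)(h))      # pull back along g, then along f
print(all(lhs(t) == rhs(t) for t in range(5)))
```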
(e) Given a category such as groups and group homomorphisms in which the objects have underlying sets and the morphisms are given by certain functions on those sets, we
subsets of Spec(R) of the form V(I) as its closed sets. Note that V(0) = Spec(R), that V(R) = ∅, and that for any family of ideals {Iλ}λ∈Λ,

⋂_{λ∈Λ} V(Iλ) = V(∑_{λ∈Λ} Iλ).
It remains only to show that the union of two closed sets (and, hence, of any finite number) is closed, and this will follow if we can show that for any two ideals I, J, V(I) ∪ V(J) = V(I ∩ J) = V(IJ). It is clear that the leftmost term is smallest. Suppose that a prime P contains IJ but not I, and choose u ∈ I with u ∉ P. For every v ∈ J, uv ∈ IJ ⊆ P, and since u ∉ P and P is prime, we have v ∈ P. Thus, if P does not contain I, it contains J. It follows that V(IJ) ⊆ V(I) ∪ V(J), and the result follows.
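For R = Z this identity is just a fact about prime factors: the nonzero primes of Z are the ideals (p), and (p) contains (n) iff p divides n. A small sympy check (illustrative only):

```python
from sympy import factorint

# V((n)) corresponds to the set of prime factors of n, so the identity
# V(IJ) = V(I) ∪ V(J) becomes: the prime factors of a*b are the union
# of the prime factors of a and those of b.
def V(n):
    return set(factorint(n))    # the prime factors of n

a, b = 12, 45
print(V(a * b) == V(a) | V(b))
```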
The Zariski topology is T0. If P and Q are distinct primes, one of them contains an element not in the other. Suppose, say, that u ∈ P and u ∉ Q. The closed set V(u) contains P but not Q.
It is easy to show that the closure of the one point set {P }, where P is prime, is the set V (P ). The closure has the form V (I), and is the smallest set of this form such that P ∈ V (I), i.e., such that I ⊆ P. As I gets smaller, V (I) gets larger. It is therefore immediate that the smallest closed set containing P is V (P ).
It follows that {P} is closed if and only if P is maximal. In general, Spec(R) is not T1. Spec becomes a contravariant functor from the category of commutative rings with identity to the category of topological spaces if, given a ring homomorphism f : R → S, we define Spec(f) by having it send Q ∈ Spec(S) to f^{-1}(Q) = {r ∈ R : f(r) ∈ Q}. There is an induced ring homomorphism R/f^{-1}(Q) → S/Q which is injective. Since S/Q is an integral domain, so is its subring R/f^{-1}(Q). (We are also using tacitly that the inverse image of a proper ideal is proper, which is a consequence of our convention that f(1R) = 1S.) f^{-1}(Q) is sometimes denoted Q^c and called the contraction of Q to R. This is a highly ambiguous notation.
We want to talk about when two functors are isomorphic and to do that, we need to have a notion of morphism between two functors. Let F, G be functors from C → D with the same variance. For simplicity, we shall assume that they are both covariant. The case where they are both contravariant is handled automatically by thinking instead of the case of covariant functors from C to Dop. A natural transformation from F to G assigns to every object X ∈ Ob (C) a morphism TX : F (X) → G(X) in such a way that for all morphisms f : X → Y in C, there is a commutative diagram:
    F(X) --F(f)--> F(Y)
     |              |
     TX             TY
     v              v
    G(X) --G(f)--> G(Y)

The commutativity of the diagram simply means that TY ◦ F(f) = G(f) ◦ TX.
This may seem like a complicated notion at first glance, but it is actually very “natural,” if you will forgive the expression.
This example may clarify. If V is a vector space, write V* for the space of linear functionals on V, i.e., for HomK(V, K), the K-vector space of K-linear maps from V to K. Then * is a contravariant functor from K-vector spaces and K-linear maps to itself. (If θ : V → W is linear, θ* : W* → V* is induced by composition: if g ∈ W*, so that g : W → K, then θ*(g) = g ◦ θ.)
The composition of * with itself gives a covariant functor **: the double dual functor. We claim that there is a natural transformation T from the identity functor to **. To give T is the same as giving a map TV : V → V** for every vector space V. To specify TV(v) for v ∈ V, we need to give a map from V* to K. If g ∈ V*, the value of TV(v) on g is simply g(v). To check that this is a natural transformation, one needs to check that for every K-linear map f : V → W, the diagram

    V  ---f--->  W
    |            |
    TV           TW
    v            v
    V** --f**--> W**

commutes. This is straightforward. Note that the map V → V** is not necessarily an isomorphism. It is always injective, and is an isomorphism when V is finite-dimensional over K.
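In finite dimensions the naturality square can be checked numerically. Here is an informal Python sketch in which functionals are ordinary callables and f is a sample linear map; all of the concrete names and values are my own choices.

```python
# T is the canonical map V -> V**: T(v) evaluates functionals at v.
def T(v):
    return lambda g: g(v)

def f(v):
    # a sample linear map R^2 -> R^2, matrix [[1, 2], [0, 3]]
    return (v[0] + 2 * v[1], 3 * v[1])

def f_star_star(xi):
    # f** : V** -> W** sends xi to the functional g |-> xi(g o f)
    return lambda g: xi(lambda u: g(f(u)))

v = (1, 4)
g = lambda w: 5 * w[0] - w[1]    # a sample functional on W
# Naturality: T_W(f(v)) and f**(T_V(v)) agree on the functional g,
# since both reduce to g(f(v)).
print(T(f(v))(g) == f_star_star(T(v))(g))
```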
commutative algebra and in many other parts of mathematics. If we fix an object Z in a category C then we get a covariant functor h_Z mapping C to the category of sets by letting h_Z(X) = Mor(Z, X). If f : X → Y we let h_Z(f) : Mor(Z, X) → Mor(Z, Y) be the map induced by composition — it sends g to f ◦ g. A covariant functor G from C to sets is called representable in C if it is isomorphic to h_Z for some Z ∈ Ob(C). We say that Z represents G. Similarly, we can define a contravariant functor h^Z to sets by h^Z(X) = Mor(X, Z), while h^Z(f) : Mor(Y, Z) → Mor(X, Z) sends g to g ◦ f. A contravariant functor is representable in C if it is isomorphic with h^Z for some Z.
Examples. (a) Let C be the category of abelian groups and group homomorphisms. Let G be any group. We can define a functor F from abelian groups to sets by letting F(A) = Hom(G, A), the set of group homomorphisms from G to A. Can we represent F in the category of abelian groups? Yes! Let Ḡ = G/G′, the abelianization of G. Then every homomorphism G → A factors uniquely G → Ḡ → A, giving a bijection of F(A) with Hom(Ḡ, A). This yields an isomorphism F ≅ h_Ḡ.
(b) Let R be a ring and I an ideal. Define a functor F from the category of commutative rings with identity to the category of sets by letting F(S) be the set of all ring homomorphisms f : R → S such that f kills I. Every homomorphism f : R → S that kills I factors uniquely as R → R/I → S, from which it follows that the functor F is representable and F ≅ h_{R/I}.
(c) In this example we want to define products in an arbitrary category. Our motivation is the way the Cartesian product Z = X ×Y behaves in the category of sets. It has product projections πX : Z → X sending (x, y) to x and πY : Z → Y sending (x, y) to y. To give a function from W → X × Y is equivalent to giving a pair of functions, one α : W → X and another β : W → Y. The function f : W → X × Y then sends w to (α(w), β(w)). The functions α and β may be recovered from f as πX ◦ f and πY ◦ f , respectively.
Now let C be any category. Let X, Y ∈ Ob(C). An object Z together with morphisms πX : Z → X and πY : Z → Y (called the product projections on X and Y, respectively) is called a product for X and Y in C if for all objects W in C the function Mor(W, Z) → Mor(W, X) × Mor(W, Y) sending f to (πX ◦ f, πY ◦ f) is a bijection. This means that the functor sending W to Mor(W, X) × Mor(W, Y) is representable in C. Given another product Z′, π′X, π′Y, there are unique mutually inverse isomorphisms γ : Z → Z′ and δ : Z′ → Z that are compatible with the product projections, i.e., such that πX = π′X ◦ γ and πY = π′Y ◦ γ (the existence and uniqueness of γ are guaranteed by the defining property of the product), and similarly for δ. The fact that the compositions are the appropriate identity maps also follows from the defining property of the product.
Products exist in many categories, but they may fail to exist. In the categories of sets, rings, groups, abelian groups, R-modules over a given ring R, and topological spaces, the product turns out to be the Cartesian product with the usual additional structure (in the algebraic examples, operations are performed coordinate-wise; in the case of topological spaces, the product topology works: the open sets are unions of Cartesian products of open sets from the two spaces). In all of these examples, the product projections are the usual set-theoretic ones. In the category associated with a partially ordered set, the product of
two elements x and y is the greatest lower bound of x and y, if it exists. The point is that w has (necessarily unique) morphisms to both x and y iff w ≤ x and w ≤ y iff w is a lower bound for both x and y. For z to be a product, we must have that z is a lower bound for x and y such that every lower bound for x and y has a morphism to z. This says that z is a greatest lower bound for x and y in the partially ordered set. It is easy to give examples of partially ordered sets where not all products exist: e.g., a partially ordered set that consists of two mutually incomparable elements (there is no lower bound for the two), or one in which there are four elements a, b, x, y such that a and b are incomparable, x and y are incomparable, while both a and b are strictly less than both x and y. Here, a and b are both lower bounds for x and y, but neither is a greatest lower bound.
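A familiar instance (not from the source, offered as an illustration): in the divisibility order on positive integers, w ≤ x means w divides x, greatest lower bounds always exist, and the categorical product of x and y is gcd(x, y). A small Python check of the defining property:

```python
from math import gcd

def is_lower_bound(w, x, y):
    # w <= x in the divisibility order means w divides x
    return x % w == 0 and y % w == 0

x, y = 12, 18
z = gcd(x, y)  # candidate for the product of x and y

# z is a lower bound of x and y ...
assert is_lower_bound(z, x, y)
# ... and every lower bound w has a (unique) morphism to z, i.e., w <= z,
# i.e., w divides z: so z is the greatest lower bound.
assert all(z % w == 0
           for w in range(1, max(x, y) + 1) if is_lower_bound(w, x, y))
```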
The product of two objects in C^op is called their coproduct in C. Translating, the coproduct of X and Y in C, if it exists, is given by an object Z and two morphisms ιX : X → Z, ιY : Y → Z such that for every object W, the map Mor (Z, W) → Mor (X, W) × Mor (Y, W) sending f to (f ◦ ιX, f ◦ ιY) is bijective. This means that the functor sending W to Mor (X, W) × Mor (Y, W) is representable in C. Coproducts have the same sort of uniqueness that products do: they are products (in C^op).
In the category of sets, coproduct corresponds to disjoint union: one takes the union of disjoint sets X′ and Y′ set-isomorphic to X and Y respectively. The function ιX is an isomorphism of X with X′ composed with the inclusion of X′ in X′ ∪ Y′, and similarly for ιY. To give a function from the disjoint union of two sets to W is the same as to give two functions to W, one from each set.
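A sketch of this in Python (an illustration under assumed encodings, not from the source): elements are tagged to force disjointness even when X and Y overlap, and the universal property produces a unique map out of the disjoint union from any pair of maps g : X → W, h : Y → W.

```python
# Coproduct (disjoint union) of sets: tag elements so that
# overlapping X and Y stay disjoint inside Z.
X = {1, 2}
Y = {2, 3}
Z = {("X", x) for x in X} | {("Y", y) for y in Y}

def iota_X(x): return ("X", x)   # injection X -> Z
def iota_Y(y): return ("Y", y)   # injection Y -> Z

# Universal property: a pair of maps g: X -> W, h: Y -> W determines
# a unique f: Z -> W with f . iota_X = g and f . iota_Y = h.
def copair(g, h):
    return lambda z: g(z[1]) if z[0] == "X" else h(z[1])

g = lambda x: 10 * x
h = lambda y: -y
f = copair(g, h)
assert all(f(iota_X(x)) == g(x) for x in X)
assert all(f(iota_Y(y)) == h(y) for y in Y)
```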
In the category of R-modules over a commutative ring R, coproduct corresponds to direct sum. We shall discuss the existence of coproducts in the category of commutative rings later on. In the category associated with a partially ordered set, it corresponds to the least upper bound of the two elements.
Given a ring homomorphism θ : A → R, the ring R becomes an A-algebra if we define ar as θ(a)r. That is, to give a ring R the structure of an A-algebra is exactly the same thing as to give a ring homomorphism A → R. When R is an A-algebra, the homomorphism θ : A → R is called the structural homomorphism of the algebra. A-algebras form a category: the A-algebra morphisms (usually referred to as A-algebra homomorphisms) from R to S are the A-linear ring homomorphisms. If f and g are the structural homomorphisms of R and S respectively over A and h : R → S is a ring homomorphism, it is an A-algebra homomorphism if and only if h ◦ f = g.
Note that every commutative ring R with identity is a Z-algebra in a unique way, i.e., there is a unique ring homomorphism Z → R. To see this, observe that 1 must map to 1R. By repeated addition, we see that n maps to n · 1R for every nonnegative integer n. It follows by taking inverses that this holds for negative integers as well. This shows uniqueness, and it is easy to check that the map that sends n to n · 1R really is a ring homomorphism for every ring R.
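As a concrete check (an illustration, not from the source), take R = Z/6Z, represented below by residues 0–5. The unique ring homomorphism Z → R is n ↦ n · 1R, which here is just reduction mod 6, and one can verify the homomorphism conditions directly:

```python
# The unique ring homomorphism Z -> R sends n to n . 1_R.
# Illustration with R = Z/6Z (elements 0..5, arithmetic mod 6).
def to_Z6(n):
    # n . 1 computed in Z/6Z; Python's % handles negative n correctly
    return n % 6

# It preserves 1, sums, and products:
assert to_Z6(1) == 1
for a in range(-10, 10):
    for b in range(-10, 10):
        assert to_Z6(a + b) == (to_Z6(a) + to_Z6(b)) % 6
        assert to_Z6(a * b) == (to_Z6(a) * to_Z6(b)) % 6
```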
By a semigroup S we mean a set together with an associative binary operation that has a two-sided identity. (The existence of such an identity is not always assumed. Some people use the term “monoid” for a semigroup with identity.) We shall assume the semigroup operation is written multiplicatively and that the identity is denoted 1S or simply 1. A group is a semigroup in which every element has a two-sided inverse.
By a homomorphism of semigroups h : S → S′ we mean a function on the underlying sets such that for all s, t ∈ S, h(st) = h(s)h(t) and such that h(1S) = 1S′.
The elements of a commutative ring with identity form a commutative semigroup under multiplication.
The set of vectors N^n with nonnegative integer entries forms a semigroup under addition with identity (0, ... , 0). We want to introduce an isomorphic semigroup that is written multiplicatively. If x1, ... , xn are distinct elements we can introduce formal monomials x1^k1 · · · xn^kn in these elements, in bijective correspondence with the elements (k1, ... , kn) ∈ N^n. (We can, for example, make all this precise by letting x1^k1 · · · xn^kn be an alternate notation for the function whose value on xi is ki, 1 ≤ i ≤ n.) These formal monomials form a multiplicative semigroup that is isomorphic as a semigroup with N^n: to multiply two formal monomials, one adds the corresponding exponents. It is also innocuous to follow the usual practices of omitting a power of one of the xi from a monomial if the exponent on xi is 0, of replacing xi^1 by xi, and of writing 1 for x1^0 · · · xn^0. With these conventions, xi^k is the product of xi with itself k times, and x1^k1 · · · xn^kn is the product of n terms, of which the i th term is xi^ki.
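The isomorphism with N^n is easy to exhibit in code (a minimal sketch, not from the source): encode the monomial x1^k1 · · · xn^kn by its exponent vector (k1, ... , kn), so that multiplying monomials is exactly adding vectors.

```python
# Formal monomials in x1, ..., xn encoded by their exponent vectors in
# N^n; multiplication of monomials corresponds to addition of vectors.
def mono_mul(k, l):
    return tuple(a + b for a, b in zip(k, l))

# (x1^2 * x2) * (x1 * x3^4) = x1^3 * x2 * x3^4
assert mono_mul((2, 1, 0), (1, 0, 4)) == (3, 1, 4)

# The empty monomial 1 = x1^0 x2^0 x3^0 is the identity of the semigroup.
one = (0, 0, 0)
assert mono_mul((2, 1, 0), one) == (2, 1, 0)
```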
We can likewise introduce the multiplicative semigroup of formal monomials in the elements of an infinite set: it can be thought of as the union of what one gets from its various finite subsets. Only finitely many of the elements occur with nonzero exponents in any given monomial.
Not every commutative semigroup is isomorphic with the multiplicative semigroup of a ring: for one thing, there need not be an element that behaves like 0. But even if we introduce an element that behaves like 0, this still need not be true. The infinite multiplicative semigroup of monomials in just one element, {x^k : k ∈ N}, together with 0, is not the multiplicative semigroup of a ring. To see this, note that the ring must contain an element to serve as −1. If that element is x^k for k > 0, then x^(2k) = 1, and the multiplicative semigroup is not infinite after all. Therefore, we must have that −1 = 1, i.e., that the ring has characteristic 2. But then x + 1 must coincide with x^k for some k > 1, i.e., the equation x^k − x − 1 = 0 holds. This implies that every power of x is in the span of 1, x, ... , x^(k−1), forcing the ring to be a vector space of dimension at most k over Z2, and therefore finite, a contradiction.
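The collapse can be watched happening in the smallest case, k = 2 (an illustrative computation, not from the source): in a characteristic-2 ring where x^2 = x + 1, the elements 1 and x span everything, and the powers of x cycle, so the semigroup {x^k} cannot be infinite.

```python
# Z2[x]/(x^2 + x + 1): represent a + b*x by the pair (a, b), a, b in {0, 1}.
def mul(p, q):
    a, b = p
    c, d = q
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, with x^2 = x + 1 (char 2)
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

x = (0, 1)
powers = [(1, 0)]            # x^0 = 1
for _ in range(6):
    powers.append(mul(powers[-1], x))

# x^3 = 1, so there are only three distinct powers of x: the
# "infinite" semigroup of monomials collapses to a finite one.
assert powers[3] == (1, 0)
assert len(set(powers)) == 3
```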
Given a commutative semigroup S and a commutative ring A we can define a functor G from the category of A-algebras to sets whose value on R is the set of semigroup homomorphisms from S to R. If we have a homomorphism R → R′, composition with it gives a function from G(R) to G(R′). In this way, G is a covariant functor to the category of sets. We want to see that G is representable in the category of A-algebras. The construction is as follows: we put an A-algebra structure on the free A-module with free basis S by defining the product of a1 s1 + · · · + ah sh with a′1 s′1 + · · · + a′k s′k, where the ai, a′j ∈ A and the si, s′j ∈ S, to be ∑_{i,j} (ai a′j)(si s′j), where ai a′j is calculated in A and si s′j is calculated in S. It is straightforward to check that this is a commutative ring with identity 1A 1S. This ring is denoted A[S] and is called the semigroup ring of S with coefficients in A. We identify S with the set of elements of the form 1A s, s ∈ S. It turns out that every semigroup homomorphism φ : S → R (using R for the multiplicative semigroup of R), where R is an A-algebra, extends uniquely to an A-algebra homomorphism A[S] → R. It is clear that to perform the extension one must send a1 s1 + · · · + ah sh to a1 φ(s1) + · · · + ah φ(sh), and it is straightforward to verify that this is an A-algebra homomorphism. Thus, restriction to S gives a bijection from HomA(A[S], R) to G(R) for every A-algebra R, and so A[S] represents the functor G in the category of A-algebras.
We can now define the polynomial ring in a finite or infinite set of variables {xi : i ∈ I} over A as the semigroup ring of the formal monomials in the xi with coefficients in A.
We can also view the polynomial ring A[X] in a set of variables X as arising from representing a functor as follows. Given any A-algebra R, to give an A-homomorphism A[X] → R is the same as to give a function X → R, i.e., the same as simply to specify the values of the A-homomorphism on the variables. Clearly, if the homomorphism is to have value ri on xi for every xi ∈ X, the monomial xi1^k1 · · · xin^kn must map to ri1^k1 · · · rin^kn, and this tells us as well how to map any A-linear combination of monomials. If, for example, only the indeterminates x1, ... , xn occur in a given polynomial (there are always only finitely many in any one polynomial), then the polynomial can be written uniquely as ∑_{k∈E} ak x^k, where E is the finite set of n-tuples of exponents corresponding to monomials occurring with nonzero coefficient in the polynomial, k = (k1, ... , kn) is an n-tuple varying in E, every ak ∈ A, and x^k denotes x1^k1 · · · xn^kn. If the value that xi has is ri, this polynomial must map to ∑_{k∈E} ak r^k, where r^k denotes r1^k1 · · · rn^kn. It is straightforward to check that this does give an A-algebra homomorphism. In the case where there are n variables x1, ... , xn, and every xi is to map to ri, the value of a polynomial P under this homomorphism is denoted P(r1, ... , rn), and we refer to the homomorphism as evaluation