Register  
 
About Us | Help | Sign in
 
   

Revision:Foundations

From The Student Room

(Redirected from Revision Notes: Foundations)

TSR Wiki > Study Help > Subjects and Revision > Revision Notes > Mathematics > Foundations


This page is supposed to be an introduction to some of the most fundamental ideas that you might see if you do a university-level mathematics course. On this page you're not going to find out how to do calculations with complex numbers or vectors, and you're not going to discover anything new about the calculus. Instead you'll learn about some of the most basic questions that have stimulated mathematicians (and philosophers) over the years - questions like "What is a number?" and "What is proof?".

The ideas here are by no means the most important ones you should take in if you intend to study mathematics at a higher level, but they are still important, and will form a part of any good mathematics education.

These notes are in part based on a course that I took in my first year of studying mathematics at Cambridge, entitled 'Numbers and Sets' and lectured by Prof. Tim Gowers, in part based on the book An Introduction to Mathematical Philosophy by Bertrand Russell, and in part are completely from my own mind. My thinking about logic was greatly clarified by reading the introductory books A Very Short Introduction to Logic by Graham Priest, and Logic by Wilfrid Hodges.

Contents

Basic Ideas

What is a number?

This is a question that has commanded the attention of mathematicians for millenia. The answer turns out to be more complicated than you might think at first sight. It's easy to think that we "just know" what a number is, and therefore that the question doesn't need answering. This turns out not to be good enough for doing mathematics with, though - if we want to be able to prove things about numbers then we need to have a more rigorous definition of what they really are.

This can only come a bit later, when we've developed some more terminology. For the moment we'll just describe some of the most commonly used systems of numbers, many of which you will already be familiar with from A Level.

  • The natural numbers are the counting numbers, i.e. they are 0, 1, 2, 3, etc. It's important to realise that we can go on counting forever without reaching a highest number. We use the symbol \mathbb{N} to mean 'all of the natural numbers'.
  • The integers are the whole numbers, including zero and negative numbers. They can be thought of as an extension of the natural numbers. We use the symbol Z to mean 'all of the integers'.
  • The rational numbers are the fractions. We can think of these of any number of the form p/q, where p and q are integers and q is not zero. We hit upon a technicality here -- we find that two rational numbers that look different can be equal. For example, the number 1/2 is equal to 3/6. How do we decide if two rational numbers are equal? We deal with this by saying that two rational numbers p/q and r/s are equal if, and only if, ps = qr. We use the symbol Q to mean 'all of the rational numbers'.
  • The real numbers are the rational numbers, plus the irrational numbers. For example, &sqrt;2 is not a rational number, but it is a real number. We can think of the real numbers as points on a number line. It can be helpful to think of the real numbers as strings of decimals, but this can sometimes be misleading. We have a similar problem as the one we had with the rational numbers - two real numbers with different decimal representations can be equal. For example, a common mistake is to think that the two real numbers 1 and 0.999... (0.9 recurring) are different real numbers. In fact they are the same, although a completely rigorous proof of this can only come later. We use the symbol R to mean 'all of the real numbers'.
  • Something that's not so nice about the real numbers is that we can't square root every real number - we can never find a real number that squares to give a negative number. The complex numbers are formed by introducing the symbol i, which satisfies i2 = -1. A general complex number z is then written as z = x + iy, where x and y are real numbers. We do algebra with complex numbers by treating them as two real numbers added together, and remembering that i * i = -1. We use the symbol C to mean 'all of the complex numbers'.

Proofs

To say anything useful about numbers we have to be able to prove things about them. In any proof (whether it's a mathematical proof or not) it's important to have several things. First we have to know exactly what we are trying to prove, so that we will know when we've proved it! For example, suppose we want to prove that every natural number is either even or odd. We have to know what we mean when we say 'even' and 'odd'.

Fortunately the answer to this isn't too complicated. We say that a natural number n is even if it can be written as n = 2k for some other natural number k. We have to be careful when we define what an odd number is - it is tempting to say that any number that isn't even is odd, but that is just assuming what we're trying to prove! Instead we say that a natural number is odd if it can be written as n = 2k + 1 for some other natural number k.

We also need to know that our conclusion follows logically from our starting points, i.e. we need to have a sound argument. To prove that all natural numbers are either even or odd, we could argue as follows. If n is even, then n = 2k. Then n+1 = 2k+1, so n+1 is odd. On the other hand, if n is odd then n = 2k+1, so n+1 = 2k+2 = 2(k+1), so n+1 is even. Thus no matter whether n is even or odd, n+1 is always either even or odd. But we know that 0 is either even or odd (in fact it's even, since 0 = 2.0, but that doesn't matter) and thus 1 must be even or odd, since 1 = 0+1. But then 2 must be even or odd, since 2 = 1+1. We know that we can go through all of the natural numbers by adding 1 each time, and so all of the natural numbers are either even or odd.

This kind of argument is called mathematical induction. We show that if what we want to prove is true for n, then it must be true for n+1. We then show that it is true for 0, and thus it must be true for all of the natural numbers.

Often we set out our proofs more formally, by stating what we want to prove as a theorem. For example,

Theorem: Let n be a positive integer. If n2 is even, then n is even.
Proof: If n is not even then it must be odd, so n=2k+1. Then n2 = (2k+1)2 = 4k2 + 4k + 1 = 2(2k2 + 2k) + 1, which is odd. So if n2 is even then n must be even.

This kind of argument is called reductio ad absurdum, or proof by contradiction. We assume the opposite of what we're trying to prove, and show that this leads to a contradiction. Thus what we want to prove must be true. In the argument above we assumed that n2 was even and n was odd, and showed that this meant that n2 must be odd. This is a contradiction, and so n can't be odd.

Logic

It's important to know a little bit of logic to be able to follow mathematical proofs. When we're writing about logic we use letters as a shorthand for statements. For example, we might use P for the statement "n2 is even" and Q for the statement "n is even". In our example above we showed that if P is true, then Q is also true. In our shorthand we write this as P \Rightarrow Q, which you can read as "if P then Q", or alternatively as "P implies Q".

We use the symbol '¬' to mean "the negation of", or "the opposite of". For example, the statement ¬P means "the opposite of the statement P". If P is true then ¬P is false, and vice versa.

We also want to be able to talk about multiple statements being true at the same time. For this we form a combined statement P\wedgeQ, which we read as "P and Q". We say that the statement P\wedgeQ is true if, and only if, P is true AND Q is true. Otherwise P\wedgeQ is false.

Similarly we can think about the statement P\veeQ, which we read as "P or Q". We say that P\veeQ is true if P is true, or if Q is true, or if they are both true. If neither P nor Q is true, then we say that P\veeQ is false.

Now we can combine some of our symbols, for example we could look at the statement ¬(P\veeQ), which we read as "not (P or Q)". The only way for this to be true is if P\veeQ is false, i.e. if P is false AND Q is false. But we can write this in shorthand as ¬P\wedge¬Q. This shows that the two statements ¬(P\veeQ) and ¬P\wedge¬Q are equivalent, i.e. they mean the same thing. We write this in our shorthand by putting an equals sign between them:

¬(P\veeQ) = ¬P\wedge¬Q .

Similarly we can look at the negation of the statement P\wedgeQ. We write this as:

¬(P\wedgeQ) = ¬P\vee¬Q .

You should be able to work out why this is.

We can look at the truth or falsehood of implications. For example, when is the statement P\RightarrowQ true? It is an important convention that we count this statement as true, except when P is true AND Q is false. This seems weird at first. For example, following this convention, then statement "if n is even and odd, then n = 17" is counted as true, because it is never the case that n is both even and odd. Similarly, the statement "if pigs fly then I am the Queen" is counted as true, because it is never the case that pigs fly AND I am not the Queen.

The contrapositive of the statement P \Rightarrow Q is the statement ¬Q \Rightarrow ¬P. In fact this is equivalent to the original statement, AS LONG AS we use the convention that P \Rightarrow Q is only false when P is true and Q is false. You should be able to convince yourself of this by thinking about it carefully.

Sets

The idea of a set is very important in mathematics. In the late 19th and early 20th centuries a group of mathematians developed set theory, which was an attempt to prove all of mathematics from the most very basic starting points. In the end this was shown to be unsuccesful, when a logician called Kurt Gödel showed that no matter what you take your starting points to be, there will always be things that you cannot prove using them. However, set theory is still the basis for much mathematics today, and is also important in fields such as computer science, so it's important to know something about it.

A set is a collection of objects. The objects can be anything; for example, I could consider the set of all objects on my desk, or the set of all atoms in the universe. However, we will mostly want to think about sets which are collections of mathematical objects, such as sets of numbers. When we want to specify a set, we have several options. We almost always use curly brackets { and } to enclose the objects in our set. For example, we write {1,2,3} to mean the set consisting of the numbers 1, 2 and 3. We say that the numbers 1, 2 and 3 are members or elements of the set {1,2,3}. If an object a is a member of the set A, then we write aA. For example, 1 is a member of the set {1,2,3} so we can write 1 ∈ {1,2,3}. We can also use the symbol '∉' to mean "is not a member of". This is a general principle in mathematical notation - by putting a slash through a symbol, we mean the opposite of what that symbol normally means.

It is important that each of the objects in the set be different. If there are any repeats then we simply ignore them. Thus the set {1,2,2,3,3} is the same as the set {1,2,3}. The set clearly has three members, so we say that the cardinality of the set is 3. We can also think about the set with no objects, which we call the empty set and write as \emptyset. This has cardinality 0.

There is no reason for us to restrict ourselves to thinking about sets with only a finite number of members. It gets tricky to write out a set with an infinite number of members, though! We get around this by writing "..." to mean "and so on". For example, when we write {1,2,3,...} we are supposed to understand that the set consists of the rest of the natural numbers, i.e. it is the set N. This can be a bit ambiguous however, especially when it's not obvious how to continue the series. We can thus write a set by specifying conditions for membership, where we use the symbol ' :' to mean "such that". For example, the set {n : nN and n is prime} can be read as "the set of all numbers n such that n is a member of the natural numbers, and n is prime". This is just the set of all prime numbers.

We want to be able to talk about two sets being equal. We say that A = B if they have the same members, or more formally, if every element of A is also an element of B,and every element of B is also an element of A. Using this definition we see that the order of objects inside a set doesn't matter; for example {1,2,3} = {3,2,1}. If every element of A is also an element of B then we say that A is a subset of B, and we write AB. This gives a quicker way to say that two sets are equal: We say that A = B if AB and BA.

Working with sets

For sets to be useful, we have to be able to do stuff with them! We want to have ways of combining sets together, for example, and counting how many members a combined set has. One of the most basic operations that you can do with two sets A and B is to take their union, that is, form a bigger set A\cupB which includes all of the members of A and all of the members of B. Using our notation for sets, we write A\cupB = {x : xA or xB}. Thus x is in A\cupB if it is in either A, or B, or both.

We can also form a set which contains only those things which are members of both A and B. This is called the intersection of A and B, and is written as A\capB. Using our notation for sets we write A\capB = {x : xA and xB}. If A and B don't have any of the same members, i.e. if A\capB = ø, then we say that A and B are disjoint.

Finally, we can consider the set which consists of all members of A which are not members of B. This is called the difference of A and B, and is written A\B. Using set notation, we write A\B = {x : xA and xB}, remembering that '∉' means "is not a member of". Note that A\B is not the same set as B\A, just as a - b is not the same number as b - a when we are working with the integers.

How do we prove that two sets are equal? One way is to remember our definition of equality: two sets A and B are equal if every member of A is a member of B and vice-versa. The following theorem will be your first taste of a mathematical argument that makes heavy use of all of our set notation:

Theorem:  A\cup(B\cap C) = (A\cup B)\cap(A\cup C) \,.
Proof: Let x\in A\cup(B\cap C). Then either x\in A, or x\in B\cap C. We treat these two cases separately:
  (i) If x\in A then x\in A\cup B and x\in A\cup C, and thus x\in (A\cup B)\cap (A\cup C).
  (ii) If x\in B\cap C then x\in B and x\in C, and thus x\in A\cup B and x\in A\cup C.
This shows that every member of  A\cup(B\cap C) is also a member of (A\cup B)\cap(A\cup C). We also need to show that the converse holds. If x\in(A\cup B)\cap(A\cup C), then x\in A\cup B and x\in A\cup C. If x\notin A, then x\in B and x\in C, and thus x\in B\cap C. Therefore x\in A\cup(B\cap C).

Don't be put off by the notation here! It can be a little tricky to get used to, but if you work through it step-by-step, making sure that you understand the argument at each stage, then you will be well on your way to understanding arguments about the equality of sets.

Ordered pairs and Cartesian products

An ordered pair is a little like a set which has only two members, but the order of the elements matters. We write an ordered pair using round brackets, as (x,y) rather than {x,y}, to make it clear that we're talking about an ordered pair and not a set. Because the order matters, the two members are allowed to be equal, which you will remember was not allowed for sets. For example, (1,1) is perfectly fine as an ordered pair. We say that two ordered pairs (x,y) and (z,w) are equal if, and only if, both of their elements are equal, i.e. if x = z and y = w. For example, the ordered pairs (3,4) and (4,3) are not equal, which is something different from sets.

As a taster of why sets are so important, notice that we can express an ordered pair in terms of a set, by defining (x,y) to be the set {x, {x,y}}. This is a set whose members are the number x, and the set {x,y}. You should be able to convince yourself that the ordered pairs (x,y) and (w,z) are equal (according to the definition in the previous paragraph) if and only if their respective sets are equal (according to the definition of set equality given earlier).

Now that we know that an ordered pair is, we can define the Cartesian product of two sets A and B, which we write as A \times B. This is the set which consists of all ordered pairs (a,b) such that aA and bB. Using our set notation, we can write A \times B = {(a,b) : aA and bB}. For example, we could define the set \mathbb{Z}^2 = \mathbb{Z}\times\mathbb{Z}, which would be the set of all ordered pairs of integers. If we think of the integers as points in a line, then our new set would be the coordinates of the points in a square grid.

Functions

A function from a set X to a set Y is a way of associating an element yY to every element xX. If we give the function a name, say we call it f, then we often write y = f(x). This notation should be familiar from your school-level studies.

For f to be a function from X to Y, a few conditions have to be satisfied. Firstly, f(x) has to be a member of Y for every xX. For example, "take the square root" is not a function from the real numbers \mathbb{R} to \mathbb{R}, since there is no real number which is the square root of a negative number.

Also, each xX can only be associated to one yY. With the example of "square root", we might think that we can define \mathbb{R}^* to be only the positive real numbers, and then "square root" would be a function from \mathbb{R}^* to \mathbb{R}. However, for any real number x there are two real numbers which are its square root, namely +√x and -√x. Each x can only go to one y, and so this isn't a function either. However, "square root" IS a function from \mathbb{R}^* to \mathbb{R}^*, since every positive real number has a unique positive square root.

As always, we need some notation if we're to talk about functions efficiently. We make the following definitions:

  • Given f from X to Y, we say that X is the domain of f and Y is the range of f. We write f : X \to Y.
  • If xX and yY, with y = f(x), we say that y is the image of x, and x is the preimage of y. We write f : x \mapsto y.

This allows us to define functions economically. For example, to define the square root function we could write f : \mathbb{R}^* \to \mathbb{R}^* and f : x \mapstox. As another example, we could define the function 'sine' by f : \mathbb{R} \to \mathbb{R} and f : x \mapsto sin (x).

We now want to talk about some specific kinds of functions. Assuming that we have a function f from X to Y, we say that

  • f is an injection if no element of Y has more than one preimage. This means that there is at most one element of X associated to each element of Y, or that no two elements of X are associated to the same element of Y.
  • f is a surjection if every element of Y has at least one preimage. This means that there is at least one element of X which is associated to each element of Y. In this case we could also say that f is onto.
  • f is a bijection if it is both an injection and a surjection. In this case, each element of x is associated with exactly one element of Y. We could also say that f is one-to-one.

Composition and inverse of functions

If we have two functions f : X \to Y and g : Y \to Z then we can think about joining these functions together in a chain. First we use f to associate each element of X to an element of Y, and then we use g to associate the resulting elements of Y to elements of Z. The result is a function from X to Z which we call the composition of f and g, and which we write as g \circ f. Notice that we always think about what f does first, and then what g does - this is reversing the order of how you would read g \circ f in English. We remember the order by reading g \circ f as "g of f".

We can use our notation for functions to write the composition of f and g as

g \circ f : X \to Y ,
[Unparseable or potentially dangerous latex formula. Error 5 : 1190x1540] x \mapsto g(f(x)) .

You should notice that the function f \circ g doesn't exist at all, unless Z is a subset of X.

Every set X has a special function from X back to X, which is called the identity function. This simply associates every element of X to itself. We write the identity function for X as idX, and we can use our notation for functions to express it concisely as idX : X \to X and idX : x \mapsto x.

Now we can think about the 'inverse' of a function f. Informally, the inverse function f-1 is the function which 'undoes' the action of f, i.e. it takes f(x) to x. We want to say that if we have two functions f : X \to Y and g : Y \to X, then g is the inverse of f if composing f and g takes each element x back to itself. We have to be careful with this, however, since the order in which we compose f and g will make a difference. More to the point, if f is a surjection then two different elements of X can be mapped to the same element of Y. Which of those two elements is g supposed to map back to?!

We can approach this by treading carefully. Say we have two functions f : X \to Y and g : Y \to X. We say that:

  • g is a left inverse for f if the composition g \circ f = idX. This means that g is a left inverse if g(f(y)) = y for each element yY.
  • g is a right inverse for f if the composition f \circ g = idX. This means that g is a right inverse is f(g(x)) = x for each element xX.
  • g is an inverse for f if it is both a left inverse and a right inverse.

We can now prove a theorem about inverses of functions:

Theorem: Let f be a function from X to Y. Then
  (i) f is an injection if and only if it has a left inverse.
  (ii) f is a surjection if and only if it has a right inverse.
  (iii) f is a bijection if and only if it has both a left and a right inverse.
Proof: We deal with each of these one at a time.
  (i) First we do the 'if' part. We need to show that if g is a left inverse for f, then f(x) = f(y) implies that x = y (since this is what it means to be an injection). But if we just apply g to both sides of f(x) = f(y), then we get g(f(x)) = g(f(y)), which is the same as saying that x = y, since g is a left inverse. To do the 'only if' part, consider yY. If there is an xX such that f(x) = y then define g : Y \to X so that g(y) = x. If not, then let g(y) be anything. Then g(f(x)) = x for every xX.
  (ii) First we do the 'if' part. Let g be a right inverse, and take yY. Then g(y) is a preimage for y, so y has at least one preimage. For the 'only if' part, we define g(y) to be any preimage of y. Then f(g(y)) = y for each yY.
  (iii) If f has both a right and left inverse then it is an injection and a surjection (from (i) and (ii)) and so it is a bijection, and vice-versa.

Binary operations

A binary operation \circ on a set A is a way of combining any pair of elements (x,y)A \times A to form a new one, z = x \circ y. More formally, we define a binary operation to be a function f : A \times A \to A. We make the following definitions:

  • \circ is commutative if x \circ y = y \circ x for all x,yA.
  • \circ is associative if x \circ (y \circ z) = (x \circ y) \circ z for all x,y,zA.
  • eA is an identity if x \circ e = x = e \circ x for all xA.
  • If \circ has an identity e, and xA, then y is an inverse for x if x \circ y = e = y \circ x.

Binary operations will be most important in the context of groups.

Relations

A relation R on a set A is a potential relationship between two elements a, b of A. If a is related to b by the relation R, then we write xRy. Some well-known examples of relations are "=" (equal to), ">" and "<" (greater than or less than). However, relations can be much more arbitrary than this. We will see this later.

Some of the most interesting relations have one or all of the following properties:

  • R is reflexive if xRx for all xA.
  • R is symmetric if whenever xRy we also have yRx, for all x,yA.
  • R is transitive if whenever xRy and yRz, we have xRz for all x,y,zA.

One important example of a relation is one which gives a sense of when two elements of A are 'the same', or have similar properties. This is called an equivalence relation, and it is defined to be any relation which is reflexive, symmetric and transitive. The most obvious example of an equivalence relation is "=", and it might be helpful to think why each of the three properties of equivalence relations are possessed by "=".

Another example comes from defining the parity of x to be 0 if x is even, and 1 if x is odd. Then we can define a relation R on the natural numbers by "xRy if and only if y has the same parity as x". This is an equivalence relation. It is obviously reflexive, since x has the same parity as itself. It is symmetric, since if y has the same parity as x then x has the same parity as y. It is transitive, since if y has the same parity as x and z has the same parity as y, then z has the same parity as x.

We define the equivalence classes of an equivalence relation R by saying that the equivalence class of x is the set E(x) = {yA : yRx}. For example, if our relation is parity equivalence (as defined in the previous paragraph) then the equivalence class of 0 is all of the even numbers, and the equivalence class of 1 is all of the odd numbers. Note that every natural number is a member of either the equivalence class of 0, or the equivalence class of 1, and that no number is a member of both equivalence classes. We might like to ask if this is a general principle.

In fact it is. We define a partition P of the set A to be a set of subsets of A, which satisfies:

  • every element xA is an element of some subset EP, and
  • if EP and FP, then either E\capF = \emptyset, or E = F.

This can be thought of as splitting the set A into subsets which don't have any overlap with each other, and which together cover the whole of A. Every member of A is a member of exactly one of the subsets of the partition. Now we can prove a theorem:

Theorem: If R is an equivalence relation on A then the equivalence classes of R form a partition of A.
Proof: Take x,yA. We must show that either E(x)\capE(y) = \emptyset, or E(x) = E(y). Suppose that E(x)\capE(y) is not empty, so that there is an element zE(x)\capE(y). Then zRx, since zE(x), and since R is symmetric we have xRz.. But z is also in E(y), so we have zRy. Since xRz and zRy we must have xRy, since R is transitive. Now if wE(x) then wRx, and thus wRy by transitivity, so that w is also a member of E(y). Thus E(x) \subset E(y). Similarly we have E(y) \subset E(x), and so E(x) = E(y).

Relations and functions

A more formal way of thinking about relations is to think of the relation R on A as a subset of A \times A, i.e. a set of ordered pairs. We then say that xRy if and only if (x,y)R. For example, consider the set {1,2,3,4}. We want to construct a subset which will give us the relation "less than", or "<". But this is easy - we just choose R to be {(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)}. Similarly we could construct subsets which give us the relations "equal to" or "greater than or equal to".

Under this definition of a relation, an equivalence relation on A is a subset R which has the following properties for all x,y,zA:

  • (x,x)R,
  • If (x,y)R then (y,x)R, and
  • If (x,y)R and (y,z)R, then (x,z)R.

Notice how this definition is more 'basic' than our previous one. We don't even need to know what a relation is - we just need to know what an ordered pair is, and what a subset is.

A natural extension of this is to think about relations between different sets - i.e. the relations of objects xX to objects yY. For example, if X is the all of the natural numbers and Y is the even numbers, then we can think about the relation between X and Y called "is the double of" by considering the subset R of X [Unparseable or potentially dangerous latex formula. Error 1 ]x, i.e y is not related to x.

  • R is called total if for every x,y we either have xRy or yRx.

A relation R on A which is transitive, antisymmetric and total is called an order relation. Notice that a relation which is transitive and non-reflexive must be antisymmetric. If we assume that there are two elements x,y such that xRy and yRx, then we can apply the transitive property to get xRx -- but we said that R was non-reflexive, so we can't ever have both xRy and yRx, and thus R is antisymmetric. This means that we can give an equivalent definition of an order relation, as a relation R which is transitive, non-reflexive and total.


Comments

collapse
Recent Threads
 
collapse If you wanted your gf to lose a bit of weight
started by: Anonymous
replies: 20
last post: 1 Minute Ago
collapse January Exams
started by: Da_hopeful_1
replies: 13
last post: 2 Minutes Ago
collapse Strangest/cheekiest things you've said to a teacher
started by: Kezzi
replies: 99
last post: 2 Minutes Ago
collapse X-Box 360 v. PS3
started by: MichaelScofield
forum: Gaming
replies: 9
last post: 3 Minutes Ago
collapse Will I get into Warwick / St Andrews / Bristol?
started by: joshmohoboob
forum: Economics
replies: 1
last post: 3 Minutes Ago
 
Article Updates