So far I’ve talked about mathematical objects in general rather than numbers specifically. But if you take the existence of sets for granted and assume some axioms common to most versions of set theory, then the non-negative integers can be constructed. The idea is basically to think of each number as the set of all finite sets that have the same size. The trick is to capture the notion of size without invoking numbers in the first place.
The Axiom of the Empty Set asserts the existence of the empty set { }, the set with no elements. If a set exists, the set containing just that set also exists (this follows from the Axiom of Pairing). Thus the set containing the empty set, {{ }}, exists, as does {{{ }}} and so forth. These are distinct objects in set theory even though that may seem odd. The most controversial axiom we need to accept is the Axiom of Infinity, which says that there exists a set S that contains all the sets we can build this way. Explicitly, S = {{ },{{ }},{{{ }}},…}. The elements of a set aren’t ordered, but I did write this one in a suggestive order. Let A0 be the subset of S containing just the “first” element, A1 the subset containing the first and second elements, A2 the subset containing the first, second, and third elements, and so forth. So A0 = {{ }}, A1 = {{ },{{ }}}, and A2 = {{ },{{ }},{{{ }}}}. The Axiom of Power Set says that the set of all subsets of any given set (its power set) exists. Note that T = {A0,A1,A2,…} is a subset of the power set of S.
Intuitively, T is important because its first element contains one element, its second contains two elements, its third contains three, and so forth, so T supplies a reference set for every positive size. The empty set { } plays the same role for size zero.
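The first few of these sets can be built by hand, which makes it easier to see that they really are distinct objects. Here is a minimal Python sketch, assuming frozenset as a stand-in for a mathematical set. The helper names nested_empty and initial_segment are made up for this illustration, and Python's integers are used only for bookkeeping, not as part of the construction.

```python
def nested_empty(depth):
    """Return the empty set wrapped in `depth` extra layers of braces."""
    s = frozenset()
    for _ in range(depth):
        s = frozenset([s])
    return s


def initial_segment(k):
    """Return A_k: the subset of S holding its first k + 1 listed elements."""
    return frozenset(nested_empty(d) for d in range(k + 1))


A0 = initial_segment(0)  # {{ }}
A1 = initial_segment(1)  # {{ }, {{ }}}
A2 = initial_segment(2)  # {{ }, {{ }}, {{{ }}}}

# { }, {{ }} and {{{ }}} are different objects.
assert nested_empty(0) != nested_empty(1) != nested_empty(2)
# Each A_k is a proper subset of the next one.
assert A0 < A1 < A2
```

Only a finite prefix of S and T can be written down this way. The existence of the whole of S is exactly what the Axiom of Infinity supplies.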
Now we want a way of comparing size without using numbers. Two sets have the same size if their elements can be put in one-to-one correspondence. A rule that determines such a correspondence is called a bijection. Formally, we start with any two sets P and Q and construct their Cartesian product P × Q, the set of all ordered pairs whose first element is from P and whose second is from Q. Any subset of P × Q is called a relation between P and Q. Specifically, we want a relation R such that for any x in P there is exactly one y in Q with (x,y) in R, and for any y in Q there is exactly one x in P with (x,y) in R.
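For small sets the bijection condition can be checked mechanically. The following is a rough sketch, assuming a relation is represented as a Python set of pairs. The function name is_bijection and the sample sets are invented for the example.

```python
from itertools import product


def is_bijection(R, P, Q):
    """Check that the relation R (a set of pairs) is a bijection between P and Q."""
    # R must be a subset of the Cartesian product of P and Q.
    if not R <= set(product(P, Q)):
        return False
    # Every x in P is paired with exactly one y in Q.
    each_x_once = all(sum(1 for (a, b) in R if a == x) == 1 for x in P)
    # Every y in Q is paired with exactly one x in P.
    each_y_once = all(sum(1 for (a, b) in R if b == y) == 1 for y in Q)
    return each_x_once and each_y_once


P = {"a", "b", "c"}
Q = {"x", "y", "z"}
R = {("a", "x"), ("b", "y"), ("c", "z")}          # a bijection
not_R = {("a", "x"), ("b", "x"), ("c", "z")}      # "x" is hit twice, "y" never

assert is_bijection(R, P, Q)
assert not is_bijection(not_R, P, Q)
```

The counting to one inside the check is a programming convenience. The set-theoretic definition says "exactly one" using only existence and equality.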
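A set is called finite if it is empty or there exists a bijection between it and one of the elements of T. Let’s call the set of all finite sets F. (In standard Zermelo-Fraenkel set theory this collection is too large to be a set and has to be treated as a proper class, but the idea is the same.)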
Next we consider another type of relation called an equivalence relation, which is only defined on the Cartesian product of a set with itself. An equivalence relation on P is a subset R of P × P such that for any elements x, y, and z in P, (x,x) is in R, (y,x) is in R if (x,y) is in R, and (x,z) is in R if both (x,y) and (y,z) are in R. These properties are called reflexivity, symmetry, and transitivity, respectively. You can write “(x,y) is in R” as x ~ y for convenience. For example, the usual equality of numbers is an equivalence relation, since for any numbers x, y, and z we have x = x, y = x if x = y, and x = z if x = y and y = z.
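The three properties are easy to test by brute force on a small set. This is only an illustrative sketch, again assuming a relation is stored as a Python set of pairs. The name is_equivalence and the two sample relations are hypothetical.

```python
from itertools import product


def is_equivalence(R, P):
    """Check that R, a set of pairs, is an equivalence relation on P."""
    # R must be a subset of the Cartesian product of P with itself.
    if not R <= set(product(P, P)):
        return False
    reflexive = all((x, x) in R for x in P)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, z) in R
                     for (x, y) in R
                     for (w, z) in R
                     if y == w)
    return reflexive and symmetric and transitive


P = {0, 1, 2, 3}
# "x and y have the same parity" satisfies all three properties.
same_parity = {(x, y) for x in P for y in P if (x - y) % 2 == 0}
# "x is at most y" is reflexive and transitive but not symmetric.
at_most = {(x, y) for x in P for y in P if x <= y}

assert is_equivalence(same_parity, P)
assert not is_equivalence(at_most, P)
```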
If two elements are paired together by an equivalence relation then we call them (shockingly) equivalent, and the set of all elements of P equivalent to a given element is called its equivalence class. It can be shown that an equivalence relation partitions its set; that is, every element of P lies in exactly one equivalence class. We can define an equivalence relation on F, the set of all finite sets, as follows: for any elements x and y in F, x is equivalent to y if there exists a bijection between x and y. This effectively sorts all finite sets by size. All sets that have a bijection with A1 belong to the same equivalence class, and since A1 has two elements, we call this equivalence class the number 2. So “2” is just the name we give to the set of all sets that have as many elements as A1.
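To see the sorting in action, here is a small sketch that groups a handful of finite sets into classes by pairing off their elements, without ever counting them. The sample family and the helper names same_size and equivalence_classes are assumptions made for this example. The real equivalence classes contain every finite set of a given size, which no program can list.

```python
def same_size(x, y):
    """Pair off elements one at a time; a bijection exists if both run out together."""
    x, y = set(x), set(y)  # work on copies
    while x and y:
        x.pop()
        y.pop()
    return not x and not y


def equivalence_classes(sets):
    """Group sets so that two sets share a class exactly when same_size holds."""
    classes = []
    for s in sets:
        for cls in classes:
            if same_size(s, cls[0]):
                cls.append(s)
                break
        else:
            classes.append([s])
    return classes


# A1 = {{ }, {{ }}} from the construction above.
A1 = frozenset([frozenset(), frozenset([frozenset()])])

family = [
    frozenset(),
    A1,
    frozenset({"p"}),
    frozenset({"x", "y"}),
    frozenset({7, 8}),
    frozenset({"q"}),
]
classes = equivalence_classes(family)

# The class containing A1 is this family's small slice of "the number 2".
two = next(cls for cls in classes if A1 in cls)
assert frozenset({"x", "y"}) in two and frozenset({7, 8}) in two
```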
You can continue to build arithmetic and larger number systems by using set operations from here. The only catch is that some mathematicians (a small minority) do not accept versions of set theory that allow uncountable sets, and for them the real numbers, the complex numbers, and so on can’t be built in this way.