On Leinster's "Rethinking set theory"
There are 48 comments on this post.There has been a lot of recent discussions regarding Tom Leinster's paper "Rethinking set theory" (arXiv). Being an opinionated person, I only found it natural that I had an opinion on the paper. Now that I have a blog, I have a place to write this opinion as well.
The paper challenges the hegemony of \(\ZFC\) as the choice set theory. It offers an alternative in the form of \(\newcommand{\ETCS}{\axiom{ETCS}}\newcommand{\ETCSR}{\axiom{ETCS+R}}\ETCS\), a categories based set theory. The problem with \(\ETCS\) is that it is slightly weaker than \(\ZFC\). But we also know how much weaker: it lacks the expressibility of the full replacement schema. In this case we can just add a replacement schema-like list of axioms to have \(\ETCSR\).
The list of axioms of \(\ETCS\) is not surprising at all, much like the list of axioms of \(\ZFC\) should not be very surprising. Foundational set theories should, in my opinion, be very "obvious" in formalizing what is a set. We all have this vague idea about sets, and this should be enough. From this point of view \(\ETCS\) and \(\ZFC\) are very similar and that's good.
Leinster's opinion is that \(\ETCS\) is much more natural to work with because it allows to put an emphasis on the universal property, being a structural property rather than material, which is a very important feature in many places in modern mathematics.
The argument presented by Leinster is that this should clear the so-called circularity from \(\ZFC\) (using sets to describe sets -- the circularity, I think, exists only in the natural language level), and it will help to clear out the obvious issues with having every object as a set. The question \(x\in\pi\) seems ridiculous to mathematicians in most cases (except if one is writing an exercise about Dedekind cuts, perhaps). With structural set theory we don't have this issue, it seems. Surely not because one can't express such equation, but because the language makes it much more difficult to do so. He continues to argue, most mathematician has a very naive concept of sets, and they use it without much thinking. Sets of numbers, sets of functions, and so on. \(\ETCS\) have this, and it should allow everyone to keep doing what they did so far, only without seeing the questions "If \(\sqrt2\) is a set, because everything is a set, what are its elements?" or "\(\langle 1,2\rangle = 3\setminus1\)?"
The main problem right now, Leinster writes at the end of the paper, is that there aren't enough books written on the topic and not enough capable teachers. And of course if someone would like to research set theory then learning \(\ZFC\) would be a better choice anyway because much of the current research is done through the eyes of \(\ZFC\).
I have a problem with this approach, and I am certain that I am not the only one. The first, and the most personal problem, I don't like diagrams. When people draw what they mean I lose my ability to understand. People also tell me that diagrams are simpler to follow, but when you draw a diagram with ten points and fifteen arrows (and those are the simpler ones) it would be a lie to yourself to say this is easy to understand. It is also my impression that this gives some sort of false sense of security into the [much loathed] sentence "It is easy to verify that this diagram is commutative.". But that is my problem with categories, and I accept that many people would find it easier to approach mathematics like that.
But it brings me to the first of my two main points. People who are used to diagrams could easily adjust to \(\ETCS\) (if they haven't done so thus far), and that's fine. But often those people would forget that sometimes diagrams are not suitable. I recall an algebraic geometry grad student where I've studied for my M.Sc. who proudly showed me a diagram which his advisor proved to be commutative. What's the big deal? Well apparently it generalizes a lot of theorems from analysis. But to many analysts this diagram is going to be helpful as explaining \(2_\mathbb Q\in\pi\). To people with a hammer everything looks like a nail.
The second point, in which my strong take on material vs. structural set theory is going to kick in (at long last!), is the fact that much of the mathematics being done has nothing to do with set theory. Leinster notes that as well. For most mathematicians there are sets of functions and so on. For some of them the functions are not even sets, they are as atomic as the real numbers are. For these people \(\langle\pi,0\rangle\in\sin\) makes little sense, even though nowadays we all agree that functions can be clearly presented as ordered pairs. Many of these people have so little use of more than naive set theory, that the materialistic approach is better. You have a concrete \(f\in L^2\), or a particular sequence of sequences \(\{x_n\}\subseteq\ell^\infty\).
True, you can write all these things with \(\ETCS\) and you could easily keep a short dictionary for translating arguments back and forth from one representation to another. But for people who are interested in concrete sets. In specific kind of spaces, or even a particular vector space, there is an amount of abstraction which is going to do more harm than good. And there are people like that. There are mathematical physicists, and there are people interested in the particular space \(L^2\) over the \(p\)-adic numbers. There are people interested in \(\mathbb R^3\). Mathematics should give them the tools to deal with those problems.
It is often the case that mathematicians forget that in other fields (physics, chemistry, etc.) people see them as developing tools for others. Mathematicians are blacksmiths. We make hammers, and we make knives. These people often forget that there are blacksmiths who make hammers and anvils for other blacksmiths, and there are people who make the hammer and anvil for those blacksmiths, and so on. Eventually you reach down to the core argument and find someone trying to understand how forcing with amorphous sets into \(L\) works. And yes, we're still in the blacksmith metaphor.
So it is my opinion that challenging the hegemony of \(\ZFC\) is fine, and suggesting a more structural alternative which is "in sync" with the current fashionable field of algebraic geometry is just as fine. But I have serious doubts that many analysts would start to fiddle with categories and diagrams, and with "Why is this even better than using naive and material set theory? Oh, just because it is a "junk theorem" that \(1\in\pi\)? Who cares, we're trying to solve actual problems here. You boys, go play outside".
(Note that I hardly mentioned any set theory based arguments like Francois Dorais did, on that I have a whole other post planned.)
Some relevant things to read through:
- "Rethinking set theory" arXiv link to the paper
- the n-Category cafe "Rethinking set theory" post.
- Francois Dorais' reply to the n-Category post, "Back to Cantor?".
- Mathoverflow discussion about the paper.
- Two Mathoverflow discussion relevant to this point, Set theories without "junk theorems" and Can ZFC prove “false theorems”, and still be consistent? (was “Junk Theorems” follow up).
There are 48 comments on this post.
(Jan 05 2013, 19:41)
Hi, again. I'd like to reply briefly.
And that's fine; of course you're not alone. (It's probable that you could learn to like them, but that's not the point.) I think Tom is really trying to reach out to people who, like you, don't like diagrams all that much. (It's arguable that he wasn't 100% successful -- yet -- you can see that he's not completely happy about how his axiom 7 was written out, for instance.)
But I really think the commutative diagram presentation is something essentially optional, and people shouldn't get too hung up on it. Ultimately the point is that most proofs that we write in mathematics (maybe not set theorists who are working explicitly with models of ZFC in its standard signature) -- proofs in analysis, number theory, algebraic geometry, etc. -- make no reference to membership trees and make perfect sense as structural mathematics. (Meaning nothing particularly political; it just means that what mathematics we care about here is invariant with respect to isomorphism; you know, chairs, tables, beer mugs, and all that.) We should be able to found this natural mode of expression which has isomorphism-invariant meaning in some way, and ETCS (or ETCS R if you'd rather) is one way to do that. However, that particular presentation is by category theorists and maybe for those who don't mind category theory (and who even find it useful in their lives).
Have you had a look at Mike Shulman's SEAR? It's another way to give a foundation to structural or isomorphism-invariant ways of speaking, but the language doesn't require commutative diagrams. The essential content is the same as ETCS R (well, that's true of SEAR C), but the mode of presentation is different and you personally might find it more congenial.
Final comment: around where you say, "Many of these people have so little use of more than naive set theory, that the materialistic approach is better", I get the impression that you misunderstand. I and every use elements, membership notation, etc. too -- all the time -- as does any category theorist, as does any mathematician. Often I strongly prefer such notation to commutative diagram notation; it depends on circumstances. But the overwhelming tendency is to use such notation in structural ways, and that -- not commutative diagrams per se -- is the real issue in Tom's paper as I see it.
(Jan 06 2013, 08:45 In reply to Todd Trimble)
Thank you for the comment. Let me answer it in parts.
First, it's not just that I dislike diagrams - I have a problem with categorical approach to mathematics. Not because I am such a materialistic person, I just feel that elements are more important than functions. True, one cannot completely abandon categories when doing mathematics, it is perfectly reasonable to find yourself in terms of one category or another. But the approach feels fake to me.
To your main point about the proofs of isomorphism-invariant properties, I agree completely. But I feel that \(\ZFC\) is more or less an assembler which allows you to access the CPU directly. Most proofs in mathematics are written informally "We do this, we do that..." and even in set theory when one is talking about ordered pairs, one hardly refers to their Kuratowski-based definition. In mathematics we give "second-order proofs" in the sense that the statements are often schemes of claims which we can apply to any object with such-and-such properties. Nobody cares what the real numbers are, and set theorists often jump around between the Baire space, the Cantor space, plain power set of \(\omega\), and sometimes directly reference \(\mathbb R\) when talking about the real numbers.
But it seems to me that Tom's issue, as well many of the people participated in the MathOverflow discussions linked in the last reference, with \(\ZFC\) is that it allows things like \(\langle 0,1\rangle\subseteq 3\) which make no sense mathematically. I can assure you, though, that after you write one piece of code in x86 assembler which rewrites itself as it runs that manipulating all objects as binary and manipulating all objects as sets make perfect sense. This is what annoys me, in some sense, that people have an issue that in a system which allows you to encode everything as sets you can write some weird junk. Equivalently, this would be baffling to those people that if you write a piece of code in Common Lisp then the code itself is just a list and the program can manipulate itself (easier than to do that in assembler, I have to admit).
To your two parts, I have yet to read about \(\axiom{SEAR}\) (but I saw you mentioning it in the comments to Tom's original post on the n-Cafe); and I might be misunderstanding. I really feel that sometimes I am misunderstanding (which is not shocking because many mathematicians live in their own private worlds, and I am certainly one of those), but it seems to me that sometimes people from a strong category background try to sell it to others as a better foundation than set theory (even if they do start with the sentence "this is not instead of set theory, this is set theory" -- which makes it worse in my opinion). Even though I get the impression that Tom is a really nice guy, it feels kind of condescending when the only problem with \(\ETCS\) is that there aren't enough books about it. In this aspect, it seems, there should be absolutely no problem with \(\ZFC\).
(Jan 07 2013, 00:14 In reply to Todd Trimble)
I am going to revert to the philosophical style of using superscripts to distinguish different meanings of the same symbol. There are three different objects denoted by \(\pi\) in my previous comment: \(\pi^{(N)}\) is the pre-formal number \(\pi\) that we learned in grade school, which is neither a function nor a set; \(\pi^{(S)}\) is a particular set in the standard model of \(\ZFC\); and \(\pi^{(F)}\) is a certain element of the set \(\mathbb{R}\) in the standard model of \(\ETCS\). (For the sake of this argument it makes no difference which isomorphic copy is arbitrarily chosen to be called \(\mathbb{R}\), either in \(\ZFC\) or in \(\ETCS\).)
So my previous comment should read: If it is legitimate to ask in \(\ETCS\) whether \(\pi^{(F)}\) is a bijection, even though \(\pi^{(N)}\) is not a function, it’s equally legitimate in \(\ZFC\) to ask whether \(1^{(S)}\in\pi^{(S)}\) even though \(\pi^{(N)}\) is not a set.
Some people use the terminology "junk theorem" to refer to a result in \(\ZFC\) such as "\(1 \in \pi\)", which is to say \(1^{(S)}\in\pi^{(S)}\). Leinster does not use that term in his paper, but at the beginning of page 2 he makes the argument that the ability of \(\ZFC\) to express things such as \(\pi\cap \mathbb{R}\) is at odds with mathematical practice. If I can try to flesh out this argument, the issue is that in normal practice we are concerned with \(\pi^{(N)}\) which is not a set, and which cannot be intersected with \(\mathbb{R}\).
However, in normal mathematical practice, \(\pi^{(N)}\) is not a function either, so it would be equally at odds with normal mathematical practice to ask "Is \(\pi\) a bijection?". However, \(\ETCS\) uses \(\pi^{(F)}\), and so \(\ETCS\) does allow us to ask "unnatural" questions of that sort.
Thus, if our goal is to avoid "junk theorems", \(\ETCS\) does not remedy the problems of \(\ZFC\), it only replaces them with new problems. The existence of junk theorems is the natural result of trying to represent the vast multitude of types of mathematical objects using only a handful of types of objects.
(Jan 07 2013, 00:15 In reply to Todd Trimble)
This blog appears to strip the HTML: superscript tag. In my other comment the symbols (F), (N), and (S) should be superscripts.
(Jan 07 2013, 00:25 In reply to Carl Mummert)
Carl, I converted your notation to MathJax, which is active in the comments and can be useful.
(Jan 06 2013, 23:34 In reply to Todd Trimble)
@Carl: similar to a query below, I ask you why you think it is "pathological" to ask whether an element, where in ETCS we define an element of \(X\) to be a function \(x: 1 \to X\), is a bijection. All it would mean is that \(X\) is terminal; this is a perfectly sensible thing to ask.
(Jan 06 2013, 23:28 In reply to Carl Mummert)
Sorry: what do you mean that an element \(x: 1 \to X\) in ETCS is "not really a function"?
(Jan 06 2013, 17:28 In reply to Todd Trimble)
Dear Todd, thank you for the comment. I am happy to see that you admire my honesty, I always thought that if people from algebraic geometry can say they don't care and don't like logic and set theory, then I should have all the permission in the world to state as clearly as possible that diagrams and universal properties are overrated. :-)
I agree with you very much. Nobody really cares if the natural numbers are the finite ordinals, or some sets of sets of sets which appear in the embedding of \(\omega\) into the construction of \(\mathbb R\) by Dedekind-cuts over the rationals (which are a quotient space of the integers, which themselves are a quotient space of \(\omega\)). But this is sort of a similar case to what happens sometimes with mathematicians which argue for and against the following claim: \[\mathbb{N\subseteq Z\subseteq Q\subseteq R\subseteq C}.\] One can say that the integers are really complex numbers, and one could say that we may identify them within the complex numbers via a canonical embedding. But if you really sit to think about it you should, probably, come to the conclusion that this really depends on how you view the universe of mathematics. What is the canonical and atomic start that you assume to exist.
If you assume that everything you could wish for (consistently, anyway) exists, then you can talk about subsets or inclusion maps without caring (either for or against), but such approach would make the whole "Let's construct the real numbers from the rationals" a bit irrelevant. If the real numbers already exist, why do I have to construct them again? And if I do, does it matter if I use the copy made from real real numbers, or the copy made from the sets of rationals? That's an irrational question, of course nobody gives a rats toches. And I think that everyone in mathematics would agree to that points: (1) it doesn't matter how you construct the real numbers; (2) there is little point in constructing something which already exists.
But why, if so, we even bother with these constructions? It is to show that if we have the rationals then we really have the real numbers just at the tip of our Dedekind completion; and if we have the integers then we have the rationals; and if we have the natural numbers then we have the integers... and apparently in set theory if we have the empty set and then we have it all. I find this much more philosophically pleasing, to know that from the assumption that one thing exists I can deduce so much more. It has this glowing and inherent beauty which I am sure that many people could appreciate. But I digress.
Let me return to the point in your comment. I think that the complaints that "encoding everything via sets and model theory generates a lot of junk" is a bad argument against using set theory as a foundation. In particular when you already know that the alternatives are equivalent to it. This would be like saying "I don't like inaccessible cardinals. I don't understand them, so I use Tarski-Grothendiek set thery instead". You can lie to yourself, but you can't lie to the mathematics.
But there is more. Set theorists care about membership elements, but mainly because this is the language of set theory. Of course that if I have a transitive model of \(\ZF\) and I extend it by forcing then I want to make sure that the extension is a transitive model of \(\ZF\) again. So I do care what are the elements. But at the same time, it is not uncommon (at all!) to see the following written in advanced set theory texts something of the following form:
What does that mean? We took a model which mimics something quite large, and we insisted that it would contain an uncountable set, but then we replaced everything by countable sets. Certainly \(\omega_1^N\neq\omega_1\), the former is countable and the latter is not. But they function the same from the point of view of the theory. Which is what we care about. We don't even care what sort of countable set this is.
But set theorists don't complain about that. They know what is important. It seems to me, somehow, that most mathematicians don't care at all about that. They want to know that there's someone down there in the basement office which ensures that whatever they do is consistent, and often they don't even care about that unless someone points out a mistake in their work. But some people are unhappy about this, they feel that a different foundation would be better, and that's fine. But the problem is that their arguments against set theory are often lacking. They focus on the unimportant rather on the important. They focus on the fact that by considering only the membership relation we have a lot of "noise" which most people don't even notice.
I feel that I'm rambling and that I completely lost all track of my argument. I may return and revisit it in another comment later. But my point is that of course we think in a structural way about mathematics. Even about set theory. Insisting that a foundation is bad just because you dig deep enough to run into strange fossils in the ground... sounds ridiculous to me.
[As for your question about TeX, you can use all the usual MathJax code, and in addition \ZF, \ZFC, and \ETCS. I will remove your first comment and my reply, because there is content worth reading in the comments now.]
(Jan 06 2013, 21:26 In reply to Todd Trimble)
Okay, after taking a nap I have two new points to add:
The first is that sometimes we do care about the elements, because it makes things easier. For example when talking about countable models we can insist that they are all just relations on \(\omega\), and this way we may classify some sort of sets. This is useful in descriptive set theory. And descriptive set theory itself has useful applications, so maybe sometimes it does make sense to insist that the elements of something are going to be natural numbers. Even if that thing is a model of a subtheory of \(\ZFC\).
The second point which is what Carl brought elsewhere on this page. The same arrows aimed at \(\ZFC\) can be easily turned at \(\ETCS\). It is equally easy to ask about "senseless things" which happen in \(\ETCS\) but not in general mathematics. And this is not a critique against \(\ETCS\), but rather against the critique aimed at \(\ZFC\). It seems to me that the only reason people insist that set theory (in the classical sense, that is) has more junk than, say a category-influenced theory, is because a lot of people nowadays talk in arrows and diagrams (with or without this being applicable directly to \(\ETCS\)) and set theorists avoid diagrams like the plague, not all of them but some. This means that set theorists now speak in another language, and people always prefer to talk in a closer language to their native tongue. Suddenly \(\ETCS\) sounds much more appealing, it is closely related to the language of arrows and diagrams, whereas set theory has this complex way to represent everything by sets and so on.
As for your last remark, I did think about it recently. How an approach for combining both aspects of set theory can be done. I realized that I know far too little about algebraic set theories, and I can't really give serious thought to this question because of that. I will add that I am planning on a post or two regarding my venture into the world of model categories during September, and what it had taught me about an arrow-y approach to set theory.
(Jan 06 2013, 03:58)
Where have you seen gut-churning commutative diagrams, Asaf? The sorts of diagrams that turn up in material for beginners are never more than a few vertices, and certainly not so visually confusing it takes awesome powers of category-fu to figure them out. Certainly there are some intimidating examples out there (Igor Bakovic holds the record in my books for crazy diagrams), but these aren't necessary for introducing category theory/structural set theory/geometry/what have you.
But Todd put it better than me.
(Jan 06 2013, 15:22)
I wonder whether part of the issue is that set theorists are usually prepared with at least a short course in formal logic, while general mathematicians often don't have any exposure to formal logic. In particular, the issue is that first-order logic is not "type safe", and I would suspect that people who have had exposure to first-order logic take this for granted, while those who have not may find it surprising. The reason is that syntax, in first-order logic, is defined independently of semantics, so that whether a formula is well-formed (syntactically correct) is independent of what the formula means. In the usual single-sorted first order logic, any function symbol in the signature can be applied to any element of any model. This is a consequence of the way that "atomic formulas" are defined in first-order logic.
In ETCS as presented by Leinster, if it is formalized in the most direct and obvious way in two-sorted first-order logic, with one sort for "sets" and one sort for "functions", and the symbol \(\circ\) is part of the signature as an operation taking two functions and giving a third function, then any two functions can be composed, for example the complex logarithm can be composed with the function that squares a \(2\times 2\) matrix of integers. This will cause no problems whatsoever, of course, but it is another of the "pathological" formulas, like \(1 \in \sin\), that trouble some.
Perhaps Leinster has a different formal logic in mind for his definitions - I don't think he mentions the formal syntax of his system at all. Such an omission is natural for someone not thinking about formal logic, and the point of Leinster's paper is just to give an informal, "naive", treatment. To completely avoid pathological formulas seems to require either not including \(\circ\) as a syntactical symbol at all, and always "talking around it," or moving to a complex type theory where the signature itself depends on the universe of the model being considered. The latter loses the elegant separation of syntax from semantics which was one of the earliest developments in model theory.
This issue of "type safety" arises also in programming languages. Some languages, such as C, have little run-time type checking. I can take a 64 bit integer and treat it as an array of eight 8-bit characters - with implementation-defined results. Other languages are much more strict, not allowing access to the individual bytes of an integer. The benefits of each type of programming language have been extensively discussed in the computer science literature. In working mathematics we have humans to check that the formulas we actually write make sense - nobody accidentally writes \(1 \in \sin\) in an analysis paper - so it is not clear that there is any real advantage to stronger "type checking" in the foundations.
(Jan 06 2013, 15:40)
On a somewhat separate note, I am not completely convinced that Leinster's development in Section 2 solves the problems he describes in the introduction under "Why rethink?". In the definitions presented, π is a function (because every "element" is a function with domain 1). Thus (as with any function) we can legitimately ask whether π is injective, surjective, invertible, etc., which would seem equally pathological in a calculus text as asking what its members are. Of course π is a collection of different functions (one for each set that contains π, naively), and so the answers will actually vary depending on which copy of π we look at.
(Jan 06 2013, 15:47 In reply to Carl Mummert)
Well, as much as I understand elements are functions from a terminal object into arbitrary sets, such function has to be injective because its domain is a singleton. But I don't think that it is going to be surjective. But I might be misunderstanding, as pointed out by Todd (and I admit to not overthinking this too much).
(Jan 06 2013, 21:02 In reply to Todd Trimble)
Surely, if it is legitimate to ask in ETCS whether π is a bijection, even though π is not "really" a function, it's equally legitimate in ZFC to ask whether \(1 \in \pi\), even though π is not "really" a set. Here "really" indicates our pre-formal understanding of the number π.
Leinster writes, "However, our understanding that the encoding is not to be taken too seriously does not alter the bare facts: that in ZFC, it is always valid to ask of a set ‘what are the elements of its elements?’, and in ordinary mathematical practice, it is not. Perhaps it is misleading to use the same word, ‘set’, for both purposes."
One could give a parallel argument that the encoding of ETCS should not be taken too seriously: in ETCS it is always valid to ask which elements of a set (say, a set of real numbers) are bijections, but in ordinary mathematical practice it is not always valid.
This is in no way a criticism of ETCS, or a criticism of ZFC. Every foundational system that I am aware of adds some of its own baggage to the bare mathematical objects that are being formalized. I am only criticizing the argument that is presented in the section "Why rethink?" of the arXiv paper by arguing that the same sort of criticism aimed there at ZFC also applies to ETCS.
(Jan 06 2013, 20:23 In reply to Carl Mummert)
Yes. Of course, it's no sin to want to be able to formalize theories with partially defined functions (with domains defined by e.g. equations between total functions). If this is felt to be syntactically awkward, requiring workarounds, maybe that's saying more about the partial awkwardness of traditional forms of logical syntax, making some things look harder than they actually are. In retrospect, for the purpose of reader-friendly exposition, Tom was wise to avoid getting into such details.
As you were perhaps alluding to (?), another tack to take is to define the theory of categories as a dependent type theory, where we have \(\; \vdash O: \mathrm{Type}\), dependent types \(x, y: O \vdash \hom(x, y): \mathrm{Type}\), and judgments like
\[x, y, z: O, f: \hom(x, y), g: \hom(y, z) \vdash c(f, g): \hom(x, z).\]While complex, the syntax of dependent type theory is cleanly separated from its semantics (which generally takes place in a locally cartesian closed category such as \(Set\)).
(Jan 06 2013, 20:50 In reply to Todd Trimble)
If the goal is to avoid being able to do "pathological" things such as asking whether \(1 \in \sin\) or ask whether the integer \(2\) is a bijection, then a strict type theory is likely to be the best way to achieve that. Neither ZFC (every element is a set) nor ETCS as Leinster describes it (every element is a function) has that sort of type theory, so they each allow their own sort of "pathological" questions to be expressed.
In the arXiv paper, Leinster defines "for each set X, set Y and set Z, an operation assigning to each f:X→Y and g:Y→Z a function g ◦ f : X → Z". The literal reading of that merges of syntax and semantics: for each semantic triple \((X,Y,Z)\) the syntax includes a function symbol \(\circ_{(X,Y,Z)}\). This is (in a crucial way) not the same as "a partial operation \(\circ\) which, for all sets \(X,Y,Z\), takes a function f:X→Y and a function g:Y →Z to produce a function g ◦ f : X → Z".
(Jan 06 2013, 18:50 In reply to Asaf Karagila)
Don't worry, Asaf -- I would just having some slight fun at your expense, taking your "boys" comment in your post and turning it against you. :-) Please don't take it too seriously.
Personally, I think the cumulative hierarchy in set theory is a quite legitimate and intricate object of study in its own right, and God bless -- I can understand how many people would find it fascinating. Yes, you do think in a structural way about structures \((M, \in)\) -- of course you do! (With the same remarks as before: you don't need to care what the elements of \(M\) are "actually made of", do you? That's a distraction.)
Let me reply to the stuff about constructions. In ETCS, the type \(\mathbb{R}\) is not given at the outset. (We are given a type of natural numbers \(\mathbb{N}\); in some sense it plays a role of "axiom of infinity", although that's really not the best way to think of it: it would be better to think of it as a structure which enables recursive constructions.) Instead, we construct such a type -- and as usual, there are various ways of doing it. For instance, we can construct a type of Dedekind cuts by defining a suitable subobject \(\mathbb{R} \to P(\mathbb{Q}) \times P(\mathbb{Q})\). It's not so different from what you're used to. (What is much more interesting is doing this in more generality, not in ETCS but in a general topos, and examining what this gives you. Not just as mental masturbation -- pardon me -- but because it leads to mathematics of real interest to others.)
With reference to the basement metaphor,
I think that many mathematicians might very well have an image of foundations as a junk-strewn basement down there somewhere, and don't want to visit particularly. And as I tried to illustrate, from the standpoint of ordinary mathematics, there is a lot of junk about, that mathematicians rightly ignore! But in Tom's rethinking, it doesn't have to be this way -- one should be able to give a guided tour, proudly inviting mathematicians in to see a nice clean environment which is well-adapted to their purposes. I don't want to overreach too much, but that's the dream anyway.
Could it be that what they are focusing on is something that the set theorists would rather not have them notice? (Nothing to see here, folks -- let's move along!) If there are unimportant or junky features about, maybe we should trim them out, not by simply ignoring their presence, but by cleaning up the joint!
Let me not be too polemical, though. The more I think of this, the more I'm thinking that François has a good idea. After all, I did say that I thought the cumulative hierarchy was a fine and legitimate and rich and intricate object of mathematical study. (As you might expect, the categorists have their own way of viewing it, via Algebraic Set Theory.) I should think it would be a tricky project, merging the two foundations into one, but it's probably worth a try. Don't you think?
(Jan 06 2013, 17:57 In reply to Carl Mummert)
Sure you can legitimately ask! And nothing pathological about asking, although it's probably not a question that a calculus student would think to ask. But since you did ask: any morphism out of \(1\) is injective (vacuously, because \(1\) is terminal). Such a morphism is surjective if and only if it's an isomorphism.
Much more seriously: the #1 difference between how ZF thinks about elements and how ETCS thinks about elements is that when we interpret the notation \(x \in A\) in ETCS, we mean \(x\) is a morphism \(1 \to A\), so that elementhood is not so much a relation as it is a typing declaration ("the element \(x\) is of type \(A\)"). Thus, in ETCS we do not have a "\(\pi\)" as something that exists in free-floating isolation; an element is always attached to and considered within an ambient context (e.g., a type of real numbers).
If you want to form a "set" like \(\{0, \pi\}\), you can do that (in a manner of speaking) provided that the elements \(0\) and \(\pi\) are of the same type (say \(\mathbb{R}\) for the sake of discussion). But that's an abuse of language: we are really forming a subset \(i: \{0, \pi\} \hookrightarrow \mathbb{R}\). Similarly, we often abuse language as follows: given \(x\) of type \(\mathbb{R}\) and a subset \(i: A \hookrightarrow \mathbb{R}\), we write "\(x \in A\)" to mean that the morphism \(x: 1 \to \mathbb{R}\) factors through the map \(i\) (by definition of "subset", the map \(i\) is injective, so the factoring will be through a unique arrow \(1 \to A\)).
(Jan 06 2013, 17:34 In reply to Todd Trimble)
Of course, everyone who has had modest exposure to logic knows it. The option of replacing the function symbol \(\circ\) in the signature with a relation symbol as you describe is exactly what I mean by "talking around it." This is the standard way to eliminate any function symbol from a signature, with the usual drawbacks.
(Jan 06 2013, 17:17 In reply to Asaf Karagila)
Treating the functions naively, the π in {π} is surjective but the π in {0,π} is not.
(Jan 06 2013, 17:34 In reply to Carl Mummert)
Depending on your definition, a function may required to have a specified co-domain as well. But then \(\pi\) of the real numbers is not the same as \(\pi\) which is in \(\{\pi\}\) anyway.
(Jan 06 2013, 17:04 In reply to Carl Mummert)
Nah, composition is a partially defined operation. One way of formalizing is to introduce source and target functions \(s\) and \(t\) and a ternary composition relation \(c\) where \(c(f, g, h)\) means "\(f \circ g = h\)", together with some obvious axioms that spell out the domain of the composition function, the uniqueness of such \(h\) given \(f\) and \(g\), etc. You can do this in a singly-typed way (where the type is that of morphisms) if you want.
Of course, all this is routine, and rest assured Tom knows it.
(Jan 06 2013, 16:55 In reply to David Roberts)
I don't know if you've seen some of the early books by Peter May, but he has some amazing whoppers of commutative diagrams. Is it his book on \(E_\infty\) ring spaces and \(E_\infty\) ring spectra?
(Jan 06 2013, 16:46 In reply to Asaf Karagila)
Obviously I can't make you like categories! But I think to myself, "what a pity -- you're missing out on a lot of fun!" :-) I do admire your honesty, though, in coming straight out admitting your visceral dislike of them, even if I don't see anything to admire in the admission itself.
In view of such strong dislike, I might be wasting my breath, but more optimistically it might be interesting to see what we can agree on. I guess we can agree, just by looking at the logician's notion of theory and model, on the basic sorts of stuff that mathematicians talk about. You have a domain (or domains = types) to serve as "universes of discourse", and constituting the discourses you have certain functions, certain relations, certain constants (which can be considered functions of arity 0), subject to some specified axioms. So the basic stuff in mathematical structures is relations and functions -- those are the main actors. The basic structuralist observation then is that mathematics consists of a vast network of structures and ways of mediating between such structures. For example, we have a notion of model homomorphism or structure homomorphism which is essentially a structure-preserving function between sets equipped with structure. (And this network of structures and maps between structures is what a really categorical foundations would be trying to get at -- we have not talked at all about Lawvere's "Category of categories as a foundation of mathematics").
Against this conception, where what matters in a structure is not what the elements of the structure are "made of" (according to Hilbert, the elements could be tables, chairs, and beer mugs for all we care) but how they are being related to each other within the structure. Thus maps between structures correspondingly don't care about the substance of the elements, but only about preserving the relations between them.
I think that's really the main problem that some people have with the ZFC picture. It's partly this \(\langle 0, 1 \rangle \subseteq 3\) business, yes (by the way, can you tell me how to write math here? and would you mind fixing this post if it doesn't come out right?). But it's perhaps more the fact that no one really takes the membership relation seriously, and thus it's mostly just extraneous baggage or junk. What I mean is: take say the determinant mapping \(M_n(\mathbb{C}) \to \mathbb{C}\). Does this preserve elements of elements of elements? The correct answer is: nobody cares! We only care about preservation of relevant structure.
(Of course, I immediately have to qualify that. Set theorists, studying models of ZFC and homomorphisms, embeddings etc. between them, do of course care about preservation of the elementhood relation through all levels. Well, to hark back precisely to the end of your own post: you boys have fun doing that. Go on outside and play -- we mathematicians have work to do! (-: )
Seriously, I think that's the point of Lawvere's objection: that the reduction of mathematics to a global membership relation -- which we then proceed to ignore in the actual practice of mathematics! -- means that this reduction is laden and riddled with irrelevancies. (I think that's what he means by "a speculative foundations".) What if we tried a different tack, a foundations that really takes itself seriously and is not riddled with structural irrelevance for mathematics?
There are various ways of doing that. ETCS is one, but apparently not one that is very congenial to many traditional set theorists, for a variety of reasons, maybe one being that many set theorists "don't like categories" (well, okay, but again I think that's a pity). Maybe some set theorists don't like the privileging of functions over elements (although one take is that functions are generalized elements). But there are many ways to skin a cat. SEAR (Sets, Elements, And Relations) is another way to found mathematics in a structurally sensitive way; as the title implies, it privileges elements and relations. (Being a dependent type theory, it might take a little getting used to; the big difference from ZF is that \(x \in A\) is not read as a general relation between elements and sets, but as a typing declaration: each element \(x\) is assigned a type or set \(A\).) But anyway, I definitely recommend it to your attention (link).
(Jan 11 2013, 18:52 In reply to Asaf Karagila)
Wow, I missed a lot of interesting conversations. I went to Berlin to isolate myself so that I could concentrate on preparing this semester's teaching: so, I was thinking about Fourier series, not sets. Although come to think of it, there are some excellent historical connections between the two subjects...
Other people have raised lots of big issues, but I'm just going to comment on some small things to do with my paper. Asaf wrote:
> even if they do start with the sentence “this is not instead of set theory, this is set theory” — which makes it worse in my opinion
I can more or less understand you simply disagreeing: e.g. elsewhere, François explained that he prefers to reserve the term "set theory" for theories that have an extensional membership relation. But why does it make things worse?
> it feels kind of condescending when the only problem with ETCS is that there aren’t enough books about it
Sorry that came out as condescending.
That remark of mine was in a paragraph where I listed what I saw as the disadvantages of teaching ETCS as an axiomatic set theory, relative to teaching ZFC as an axiomatic set theory. As Todd has pointed out, such discussions are hampered by the fact that very few of us have either taught or been taught such a class. (Here I'm imagining a group of students who haven't previously been exposed to ZFC or variants.) I don't know anyone who has taught it other than Bill Lawvere and presumably Bob Rosebrugh. So pronouncements on this tend to be rather speculative, and necessarily mine are too.
I have just two significant pieces of experience teaching set theory. One was last year, when I supervised an advanced undergraduate project (or reading course) based on Lawvere and Rosebrugh's book. The student had seen naive set theory before, but not axiomatic set theory of any kind. This made me acutely aware of the lack of resources on categorical set theory. I think I gave her a couple of other things too: maybe pointers to one of Todd's posts, or just possibly a paper of Colin McLarty? I don't remember. But it was evidently frustrating for her that when she was having trouble with parts of Lawvere and Rosebrugh, there was no obvious other place to turn.
The other experience is a bit less relevant to the current conversation, but maybe still interesting. Also last year, I lectured an introductory undergraduate course called "Foundations of Pure Mathematics", which covered all sorts of basic things like modular arithmetic (starting from the definition of "\(a \equiv b\ (\mod n)\)"), symmetries of plane figures (in preparation for the definition of group), and naive set theory. I'd inherited lecture notes from the previous lecturers, and what struck me about the part on set theory was that all ten ETCS axioms made some kind of appearance. They weren't done formally, and that wouldn't have been appropriate for that low-level a course. But the thing is, the previous lecturers hadn't deliberately put in diluted versions of the ETCS axioms, and indeed didn't know ETCS; they'd just put in whatever they thought students needed to know.
(Jan 07 2013, 04:31 In reply to Todd Trimble)
Well, the empty set is a set, and it can't be the element of another set, in ETCS. And also, you are asking to compare a set and an element for equality, which is a type error in ETCS.
(Jan 07 2013, 04:35 In reply to Todd Trimble)
And recall that given a one-element set, it is essentially anonymous, and it makes to sense to ask a question which isn't invariant under the unique isomorphism between any pair of one-element sets.
(Jan 07 2013, 04:35 In reply to Todd Trimble)
And recall that given a one-element set, it is essentially anonymous, and it makes no sense to ask a question which isn't invariant under the unique isomorphism between any pair of one-element sets.
(Jan 07 2013, 05:27 In reply to Todd Trimble)
When you ask, "is there a difference between \(\emptyset\) and the unique element of \(\{\emptyset\}\), you're obviously looking through the lens of how you've always considered sets. Formal ETCS works a little differently, and for people entrenched (sorry, can't think of a more polite word right now) in ZFC, it could take some getting used to, and it could at times make communication a little difficult. Some will decide they hate ETCS and always will; others will strive to maintain an open mind and eventually come to appreciate it (that and more elaborated forms of categorical logic and type theory), at least to some degree, because some of these ideas turn out to be quite amazing and powerful. But I'll try to answer any questions you may have as honestly as I can.
First, in ETCS, by \(\emptyset\) we mean an initial object. I mean any initial object (there can be lots!), although any two are uniquely isomorphic. Depending on how ETCS is presented, either one assumes an initial object exists as part of the axioms, or under a more parsimonious list of axioms, one constructs it.
In the very strictest sense, \(\{\emptyset\}\) isn't a well-formed type/set in ETCS. But if we relax a little bit, I'd say the most natural way to interpret the naive expression is as \(P(\emptyset)\), the power set of \(\emptyset\). This is a terminal set, i.e., a singleton. Being terminal, it has a unique element \(1 \to P(\emptyset)\). To answer your question, for any set \(X\) there is a natural bijection between elements \(1 \to P(X)\) and subsets of \(X\), and so on such general grounds the element \(1 \to P(\emptyset)\) corresponds to the unique subobject \(\emptyset \subseteq \emptyset\). But we'd never say that the element "is" \(\emptyset\), because it is a gross typing error to ask whether an element equals a set! That said, and in view of the natural bijection, for any subset inclusion \(i: A \hookrightarrow X\) we might sensibly write \([i]: 1 \to P(X)\) (or just \([A]: 1 \to P(X)\) if the inclusion is understood), and in view of that, I'd write \([\emptyset]: 1 \to P(\emptyset)\) to denote the formal interpretation of the unique element of "\(\{\emptyset\}\)".
On other matters...
Well, I disagree. (Also, "sticks and stones...")
For now I'm happy to continue this discussion, but it would probably be better for the sake of discussion not to make a sweep-of-the-hand condemnation like "preposterous". Maybe you'd like to sleep on it, or mull over it a while? If you have well-thought out objections, please do share them.
(Jan 07 2013, 05:30 In reply to Todd Trimble)
Huh, my last reply got cut off somehow. In brief, I said that I found the first sentence confusing, but that the penultimate sentence is more or less the idea.
(Jan 08 2013, 01:29 In reply to Asaf Karagila)
Are you thinking of taking a skeleton of a model of ETCS?
(Jan 08 2013, 01:49 In reply to David Roberts)
Not exactly. All singletons would be isomorphic, and we don't want that. We want only to reverse a part of the isomorphisms, essentially those which violate the extensional part.
We want that \(1\to\{\pi\}\) and \(1\xrightarrow{\pi}\mathbb R\) to be the same thing. In particular we want \(\mathbb R\) to be equal to the union of the singletons of its elements.
(Jan 08 2013, 02:42 In reply to Asaf Karagila)
Well \(\mathbb{R}\) is the union of the singletons of its elements, every set (i.e. object in a model of ETCS) is so, in that it is colimit over all functions to it from a terminal object. However, it gets fiddly if you think of this in a material way because the set indexing the union is the set you are trying to form, so this doesn't really tell you much. Actually in a skeleton all singletons would be equal, in any old model of ETCS all singletons are uniquely isomorphic anyway.
I'm not sure I understand what you mean when you say "We want only to reverse a part of the isomorphisms".
(Jan 09 2013, 02:41 In reply to Asaf Karagila)
I replied to this but it got lost, and I can't remember all I said. Not sure what happened. But yes, the class of cardinals in the category of ZFC sets (and possibly ZF sets) is a skeleton.
The other stuff I can't think how to do it, at least at present, but it doesn't seem 'right'. The model category stuff wouldn't help, I don't think.
(Jan 08 2013, 02:56 In reply to David Roberts)
Correct me if I'm wrong but:
So I want not to identify all the singletons. I want to identify the elements by some "natural" equivalence (I'm using quotation marks because I still haven't completely figured out what natural means in category theory :-)). So when taking equivalence classes I am not going to collapse all the singletons into one equivalence class. Instead I am going to collapse all the elements which are \(\pi\) of the real numbers into one class.
Kind of like how in a model category you reverse the (w) arrows and close it up to consider the skeleton, but since we are sort of free to declare what is a (w) arrow, up to some axioms of course, this can be done quite loosely. So we say that an arrow is (w) if the generalized elements represent the same "extensional" element. So to speak. Now when we collapse the category to its skeleton... can we say something meaningful about the result?
(Jan 08 2013, 01:51 In reply to Carl Mummert)
See also http://ncatlab.org/nlab/show/pure set for the construction of material sets out of structural ones.
(Jan 07 2013, 02:26 In reply to Asaf Karagila)
What do you mean by 'may' be required to have a specified codomain? (tongue removed from cheek)
In ETCS, and category theory more generally, functions/maps/arrows always come specified with their codomain. In some settings, one cannot even compare functions for equality if their domain and codomain don't match, and in cases when we ask, 'is \(x = 2\)?' where \(x\) is a real number i.e. an arrow \(1\to \mathbb{R}\) then, we have the implicit retyping of \(2\) via the canonical inclusion.
(Jan 07 2013, 04:28 In reply to Asaf Karagila)
I think that this having functions 'know' what their codomain is one of the major points of difference between material and structural set theories. In essence, elements of sets in ZFC (and related systems) are also functions from the one-element set containing that element, but with no need to come equipped with a codomain, they are in essence 'free living', in that they can be elements of a whole bunch of sets, or none, until we decide where they need to live for the purposes of a proof or construction - much unlike elements in structural set theories, where elements can only live in exactly one set.
(Jan 07 2013, 11:43 In reply to David Roberts)
When we think about sets we think about them so differently, it's interesting. Also I should point that in \(\ZFC\) every set is an element of a larger set, so no element can float "outside all sets".
(Jan 07 2013, 11:25 In reply to Todd Trimble)
I apologize for the use of the word "preposterous". I have edited my comment and changed it. I didn't mean to offend anyone.
As for the empty set not being an element, I am starting to get the hang of it. Okay, so sets and elements are completely separated. But this is again counterintuitive (perhaps because I was indoctrinated with classical set theory) because now we don't have a set which is a \(\sigma\)-algebra, or a topology, or even a power set. Yes, we have representations of those, but if we are going to be foundational then at least partially I would expect the philosophical views I have to be expressed by the theory. Much like many people see real numbers as objects which are not sets, and have a bit of a problem with the whole \(\ZFC\) encoding of everything, I would feel uncomfortable replacing the notion of a power set by the notion of a generalized collection of functions and some natural morphisms into it.
Not to mention that having many empty sets is even stranger. If anything then I expect sets to be extensional.
But let me raise the following question: suppose we are given a model of \(\ETCS\) and we come up with this crazy equivalence relation, something like being invariant under some kind of transformations, so for example all the \(\pi\)'s in the model will be equivalent, and all the empty sets will be equivalent, and so on. Certainly this is stronger than just "being isomorphic" (which would correspond, as I understand, to equicardinality).
What happens then? Do we have some sort of extensional and reasonably-material set theory at our disposal?
(Jan 07 2013, 15:53 In reply to Asaf Karagila)
@Asaf: it is indeed possible to create a model of ZF from a model of ETCS Replacement. There is some description at http://ncatlab.org/nlab/show/SEAR of how to do this for the related theory SEAR . I recommend reading the axioms for SEAR, separately, for a different perspective on ETCS.
(Jan 07 2013, 03:33 In reply to David Roberts)
Well, coming from a set theoretical mind insisting on contextualizing with codomains sounds particularly burdening to me. Would \(\sin(x)\) be different on the real numbers as a function whose codomain is \(\mathbb R\) and as a function whose codomain is \([-1,1]\)?
I do understand that sometimes it's more natural to think that way, and I accept that it might be the case for category theory. This brings me back to what I wrote Todd in one of the recent replies about type casting.
(Jan 07 2013, 01:39 In reply to Todd Trimble)
I see; thanks for explaining. Interesting point! But I don't see the situation as symmetric.
I certainly agree that if some Joe stopped me on the street and asked me, "is \(\pi\) a bijection?", I'd find it a weird question. Let's assume I have the wherewithal to say not, "what, are you crazy? that doesn't make any sense!" but rather, "what do you mean?" If Joe follows up by asking, "if you consider the real number \(\pi\) as a function \(1 \to \mathbb{R}\), according to how you construct \(\mathbb{R}\) in ETCS, then is it a bijection?", then I would smile and say, "of course not". I'd still think it was a weird question, or more precisely a really stupid question, but at least the question would have invariant meaning (i.e., independent of how I construct \(\mathbb{R}\)), and has only one possible answer. Under any construction of the reals that you could possibly name, whether by Dedekind cuts, Cauchy sequences, what have you, the answer is definitely "no", without any pause for thought, because this just asks whether there is more than one real number.
Now replay the same situation with "is \(1\) an element of \(\pi\)?" If he makes the question more precise to an analogous degree by following with, "if you consider the real numbers \(1\) and \(\pi\) as sets, according to how you construct real numbers as ZFC-sets, then is \(1 \in pi\)?", then the question does not have a definite answer since it would depend on the construction. I could just say that: "it would depend on the construction, and I don't have a fixed preferred construction".
The difference is precisely in the fact that answers to questions expressible in ETCS are invariant with respect to isomorphism. I consider that a crucial difference.
(Jan 07 2013, 03:10 In reply to Todd Trimble)
Asaf, not quite. The point is that by precisifying the pre-formal question as a formal question in ETCS, "the element \(\pi\)", as an morphism with domain \(1\), automatically becomes attached to a context (namely, the codomain of the morphism).
Thus, if Joe specifies a context "\(\pi\) qua real number", then we interpret that as meaning a morphism \(1 \to \mathbb{R}\), however I choose to construct \(\mathbb{R}\). This may seem slightly loose, since it's true that there are different constructions of \(\mathbb{R}\) which might not be equal on the nose as objects. But this makes no difference in answering questions that can be asked in suitably formalized ETCS: the ordered field of real numbers \(\mathbb{R}\) is uniquely specified up to unique isomorphism (by an appropriate universal property), and that's all we need to answer the question definitively.
In ETCS, there is no such thing as an element \(\pi\) free floating in space, so to speak, unattached to a context. Just as in type theory, every term is of a specific declared type. This point cannot be emphasized strongly enough for structural mathematics.
(Jan 07 2013, 03:29 In reply to Todd Trimble)
So there is a difference between \(\varnothing\) and the unique element of \(\{\varnothing\}\)? Where I agree that by assigning a type to the parameter you invariably ensure that the context is valid for the question, it becomes philosophically or informally confusing.
This is actually why I like the set theory encoding better. It frees you up from type checking and type casting all the time, and whenever you want you are free to insist on a particular feature. You see this often when talking about trees, you can always assume the underlying set is an ordinal and that the tree ordinal is a subset of the \(\in\) order of the said ordinal. This is magnificently great because it makes it very simple to understand what and how you are going about the tree order.
I mentioned before that there is some philosophical difference between starting with \(\mathbb C\) and assuming \(\mathbb{N,Z,Q,R}\) are all subsets, or starting with the empty set and generating all the rest. Your answer tells me which side you (or at least the philosophy behind \(\ETCS\)) take. \(\pi\) the real number is not \(\pi\) the transcendental number and it is not \(\pi\) from the singleton \(\{\pi\}\) and it is not \(\pi\) from the complex numbers. All these sets are just ideas, it seems, and they are connected by some type-casting functions.
To return, if so, to the programming and encoding analogy, \(\ETCS\) optimizes for speed whereas \(\ZFC\) optimizes for memory... :-)
(Jan 07 2013, 02:21 In reply to Todd Trimble)
But if Joe would just ask "Is \(\pi\) a surjective function?", then your reply would have to be something along the lines "Well, do you mean \(\pi\) as in the element of \(\{\pi\}\) or \(\pi\) as in the real number \(\pi\)?", and there is a difference between them. For some strange reason. How is this different than \(\pi\) in the Dedekind cuts representation vs. \(\pi\) as the Cauchy completion representation vs. other ways to have \(\pi\) in \(\ZFC\)?
(Jan 07 2013, 17:55)
No problem about "preposterous". I actually had slightly guilty conscience after bringing it up, because this is after all your blog, and I needn't be so sensitive!
I've just been reading Mike Shulman's excellent new post at the Café on type theory, and it made me more sympathetic toward the discomforts you feel about ETCS. In particular, in dealing with power sets in ETCS, there is a lot of shifting back and forth between actual subsets and their parametrizations as elements \(1 \to P(X)\), and a certain amount of notational bureaucracy as a result. Also, he made some pertinent remarks about sets in ZFC as data structures, which struck a chord in me. For example, in ZFC we can define the concept of group by a single class formula (i.e., a certain class of ordered quadruples \((G, m, e, \mathrm{inv})\)) whereas in ETCS one has to say, "a group is a set equipped with..." -- see especially his comment here.
In his new post, I think Mike has put forward a really persuasive argument that dependent type theory with a univalence axiom handles these infelicities of ETCS very well, on top of being a robust foundations for structural set theory (and more, since first-order logic is not something "meta" to DTT, but built right in, since propositions are themselves types in this set-up, and proving a proposition simply means constructing a term of that type).
So in summary, categorical approaches to foundations appear to have moved on, and pretty soon debates between ZFC and ETCS R will come to seem like relics from an earlier time, I predict. It might be true, however, that ETCS can still provide a useful stepping stone for people to move conceptually from ZFC (or material set theory) to dependent type theory, intensional dependent type theory, homotopy type theory, etc. The key step is to understand the big difference in how structural approaches treat elementhood as opposed to how material approaches treat it (and so I hope that in a small way, this discussion has been useful! it's been useful to me).
I'll just take up one point from your last comment:
Right; I recall François made a similar comment at the Café, that he felt that extensionality was the hallmark of a set theory. I had responded to him there there is in fact a kind of extensionality principle in ETCS, not in the global form that given any two sets are equal if they have the same elements (which you can't even express in suitably formalized ETCS), but rather in the form that given two subsets \(A\), \(B\) of a given set, they are equal as subsets if and only if they have the "same elements". I should probably explain this more carefully, although Mike touched upon this as well in his post, where he explains how material membership is reflected in ETCS is reflected in passing to subset lattices.
So, as we've been saying, in ETCS the expression "\(x \in X\)" is interpreted as saying that \(x\) is a morphism of the form \(1 \to X\), so that \(x \in X\) is essentially treated as a typing declaration \(x: X\). But relative to a given set \(X\), there is in ETCS a canonical membership relation which is a subset \(\in_X \subseteq X \times P(X)\). Thus, given an element \(x \in X\) and a subset \(A \hookrightarrow X\), it makes sense to ask whether the morphism \(x: 1 \to X\) factors into the form \(1 \to A \hookrightarrow X\), and this is what it means for the relation \(x \in_X A\) to hold (it is the same thing as \((x: 1 \to X, [A]: 1 \to P(X)) \in X \times P(X)\) to belong to \(\in_X\) in the same sense of belonging). So while in ETCS we don't have extensionality for a global membership relation \(\in\), there is a local extensionality principle for local membership relations \(\in_X\).
(Jan 11 2013, 19:26)
I don't completely disagree with this prediction, but I have my doubts. Simple type theory was proposed as an alternative to set theory in the early 20th century and failed. Today, the much improved dependent type theory (with univalence) is proposed as an alternative. Unfortunately, the main reason why simple type theory failed is still there: type theory is not sufficiently accessible. I predict that type theory will not succeed until the accessibility problem is addressed.
I didn't reply at the Café since your observation is correct, but let me point out that ETCS has a still broader extensionality axiom — Axiom 4:
However, this is extensionality for functions not extensionality for sets. Subsets of \(A\) in ETCS have three forms: (a) monic maps \(B \to A\), (b) maps \(A \to 2\), (c) elements of \(P(A)\). Only (b) and (c) have extensionality in the sense you say and in both cases \({\in}\) has to be slightly reinterpreted and this slight twist reduces extensionality for subsets to extensionality for functions. This is one of the many reasons why I suggested (and still suggest) that ETCS is a theory of functions rather than a theory of sets. This is just an insignificant nomenclature issue. I think it is part of a broader trend in category theory where categories are usually named after their objects instead of their arrows, which is strange since the arrows are usually the main protagonists while the objects only play supporting role...