Asaf Karagila
I don't have much choice...

How to prove theorems?


Oddly enough, one of the questions I hear from starting Ph.D. students is "how do you prove theorems?", so let's talk about that.

I'm saying "oddly enough", by the way, because first of all, I am someone they come to with this question, and in my mind I had just finished my Ph.D. (no, do not tell me it's been six years since I submitted my thesis), and secondly I remember having similar thoughts when I started, and I look back and find them odd. Let me also clarify that when I ask the person what they mean, they usually say the same thing: "how do you know that you've proved a theorem and not a proposition or a corollary?", which I usually understand as "what results are worth publishing?"

Let me start with a story. If you go to the Papers page and open my M.Sc. thesis, you'll find a theorem: for any field \(F\) and any \(\lambda\) it is consistent with \(\ZF+\DC_\lambda\) that there is an \(F\)-vector space whose proper subspaces all have dimension \(\lambda\), but the space itself does not have a basis.

As far as M.Sc. theses go in Israel, it was a fairly decent one. I took the original proof, due to Läuchli, done in the context of set theory with atoms and over a countable field, I understood it, I understood the technical basis for symmetric extensions, and I made the obvious generalisation by simply repeating that proof of Läuchli, but in a broader context and with a slightly different toolkit.

When I finished my thesis, Matti Rubin, who was one of the examiners, told me that he would not read the thesis. Instead, I would come to his office and, for as long as it took, we would toil over all the details. It took us two days, each with a three-hour session. But it was great. At the end of it, Matti was very pleased with my work. And if you've never seen Matti smiling his nicotine-stained smile, and saying with his characteristic lisp "Very good work", you have missed something truly great. I have many stories from these six hours, but that's for another day.

I want to focus on one. Many students at Ben-Gurion (at least those who would go on to pursue a Ph.D.) would do something similar for their thesis, in the sense of a slight generalisation, and many of them would publish that as a small paper. That's great practice, both in understanding how to write the same result in two different contexts, as well as in preparing a result for publication in general. Matti told me at some point that I should write this as a paper. But I declined. I decided it was too trivial; after all, I hadn't done any new mathematics. I had not proved any theorems. All I did was regurgitate a proof from 1963 in a slightly different context.

Matti said that I'm making a mistake. It's a result that I already have, that some people might find interesting, and it's a first paper, practically for free. I didn't care. I had moved on to prove a new thing that was more interesting to me, about decreasing sequences of cardinals, which led me to embedding partial orders into the cardinals. Well, you can find the paper.

Years have passed, and I've mentioned my result here and there, often on MathOverflow or Mathematics Stack Exchange, mainly as a curiosity with a smidgen of pride. In 2018 most of the work from that chapter, and a bit more, was published by a colleague. I don't think they were aware of my work, and I may have emailed them after the fact. Matti was right. I could have published that six years earlier. But I didn't, because I felt that wasn't a theorem.

The last chapter of my thesis was also published, even more recently. Let me digress and tell you about that one as well, since we're here, and it's my blog, and I've got time. When I finished my work on the main result, mentioned above, Uri [Abraham] told me that it's fine, but it would be nice, since I have a few weeks left in the summer, to contact one of the people I know from MathOverflow and ask them to suggest a recent paper and just describe a short result, and if I can add an epsilon to it, even better. I was referred to a preprint by David Feldman and Mehmet Orhon, which proved that for any fixed \(n<\omega\), if given \(n+1\) sets, two of them are comparable (by injections), then the axiom of choice holds. They sent the paper to Andreas Blass, who offered a different, and perhaps simpler, proof. I read the paper, I understood it, and I noticed we can push it by an epsilon: the theorem holds for surjections as well. While researching, I came across a footnote in one of the Rubin–Rubin books mentioning a similar result due to Tarski (I mean, who else). It was in the Notices of the American Mathematical Society. Luckily, the library had a copy, and so I could go and dig it out of there. It was a three-sentence proof that was effectively Andreas' proof, stripped of the details, down to the bone. He even included the obvious question about infinite families of sets.

Again, I did not feel like that merits a publication of any sort. First and foremost, since it was, in a sense, undermining Feldman and Orhon, but more importantly, since the result was practically proved by Tarski in his notice (he did not use the surjection order, but the proof goes through practically the same way). This, again, was nothing but a gleam in my eye when I mentioned that result in passing over a drink. But, again, this was published recently.

Do I regret "losing" a publication or two? Not at all. I have always felt that by not insisting on stopping to publish every single result, I allowed myself more breadth and more breathing room to progress and find newer and more exciting ideas. It is true that in some cases I did publish results of which I did not think too highly, but those were always new, to the extent possible. If you look at the very early papers about forcing, quite a few of them were only a couple of pages long, published in rapid succession, simply because the authors were excited to release them. But I digress.

So. How do you know which results are theorems, and which of those are new, and not a corollary of a proposition of Tarski from half a century ago or some such? Well. Experience, experience, experience. Of course. What else? This is why we have supervisors, advisors, and mentors. To help and guide those who are coming into research, so they understand which bits are the important ones. And sometimes, you will look back and realise that something that you did a few years ago and never published is important enough, and sometimes someone else will have noticed that already. But that is okay. Regretting things you can't fix anymore is a waste of energy. When Yair [Hayut] and I found out that the original work on successors of measurable cardinals was actually done by Hugh Woodin some 35 years earlier, Uri told me that we should be proud that we came up with a proof of Woodin all by ourselves. So, instead of feeling bad about it, it is a good idea to move on. And Yair and I did, and we managed to salvage the result by connecting it to the work on critical cardinals. But, I digress again.

In the other direction, perhaps, I remember noticing at some point that the absoluteness trick that we use to prove certain statements from \(\ZF\) by proving them from \(\ZFC\) and finding suitable inner models can be extended slightly (e.g., "If \(\omega_1\) and \(\omega_2\) are singular, then \(0^\#\) exists" is proved by essentially identifying an inner model of \(\ZFC\) wherein Jensen's Covering Lemma fails and pointing out that "\(0^\#\) exists" is upwards absolute). How slightly? Very. Very slightly. I've called these "ordinal bounded quantifiers", and I showed that a universal ordinal bounded quantifier on top of an upwards absolute sentence is provable from \(\ZF\) if and only if it is provable from \(\ZFC\). I wrote that up and sent it to a journal. Luckily, I got an email from a colleague who suggested that we could perhaps expand this idea and use it to prove some interesting theorems. We never did, but I did put a hold on the refereeing process. After some six months the editor asked me to make a decision: do I withdraw the paper or do they send it to a reviewer? I asked Assaf Rinot for advice, and he pointed out that nobody would blame me for having a crummy paper from when I was a Ph.D. student; this is part of the experience that Ph.D. students go through. But at the same time, I already had some papers out and some papers coming together, and it seemed like my publications list was going to look good for a Ph.D. in set theory, even without that one. Perhaps, then, he suggested, I shouldn't have it published. I ended up withdrawing the paper, and it will remain as a note on arXiv forevermore.

Now. Some of you have been reading thus far, and still have no bloody idea how to tell if something you've proved is a theorem. I gave you nothing. I just told you a long story about my two theorems that were stolen from me, or two propositions that I felt were unworthy and other people felt were good enough for a paper. Or something in between. And one more story about me almost publishing a proposition as a theorem. Well. The point I am trying to make is that it's just a matter of experience.

Sometimes, it is the folly of youth, when we feel that the work we did is worthless, when in fact it is worth something. On other occasions, it is the other way around, and we feel like something could be of great interest, but it is nothing more than a curiosity.

My suggestion, therefore, is to just hold on to these curiosities. Talk to colleagues. Talk to your mentor. If the result is somehow novel in an area where you feel your supervisor isn't an expert (say, cardinal characteristics, when your supervisor is an expert on the axiom of choice), they can reach out to another expert, or you can too. At the end of the day, set theory is a fairly small community and we're all fairly friendly with one another. Ask if the statement is known, ask if it connects to any problems, ask for feedback. Just be nice about it, and remember that academics tend to end up super busy sometimes (oh lord, have mercy on my soul and merci on my desk).

Over time you will build the confidence. There is no other way. This is just like cooking. You want to make some pumpkin gnocchi with a browned butter and sage sauce. You consult a recipe, you practice, you ask people for advice, and you give it a go. Eventually you get the hang of it, and you understand what you did wrong, and where you got lucky in the previous attempts. True story.

So, how do you prove theorems? I don't know. You just prove things, and eventually you can start making the distinctions. If nothing else, consider this a preemptive lesson on the importance of a good mentor or three.

