for academia, mathematics, music, art and philosophy

A surprising story of how a computer was taught to prove some theorems in finitely complete categories

Notes for the talk given at the 2021 Congress of South African Mathematical Society on 29 November. 

1. Finitely complete categories

A finitely complete category is a category that has finite products and equalizers (and hence, all finite limits). Not every category is finitely complete, but most categories of mathematical structures are. 

There is a representation theorem for finitely complete categories (Yoneda embedding), which allows to present any category as a full subcategory of a (larger) category of presheaves of sets, which is closed under all limits that exist in the category. This means that a lot of times, proving a theorem in a finitely complete category involving finite limits, reduces to proving the same theorem in the category of sets. 

For instance, the fact that the product of objects is commutative, up to a canonical isomorphism, can be deduced from the fact that the same is true for the cartesian product of sets. Or, the fact that the composite of two monomorphisms is a monomorphism can be deduced from the fact that composite of two injective functions is injective.

2. Mal'tsev conditions

What would we want to prove in a finitely complete category that is not easily true in the category of sets? 

A lot of Mal'tsev conditions encountered in universal algebra can be reformulated as properties expressible in any category. Historically first Mal'tsev condition, stating that there is a ternary term "p" satisfying the following identities, is such.
The "x"-s here represent variables of the algebraic theory of a variety of universal algebras. The condition above states that it is possible to express an operator "p" using basic operators of the theory, such that the identities become theorems of the algebraic theory (once the "x"-s are universally quantified); equivalently, these identities hold in every universal algebra of the variety. 

Such term "p" can be found in varieties of group-like structures, where we can set:
An abstract finitely complete category, however, possesses no algebraic theory and so it becomes impossible to talk about the term "p" (at least, not quite, see this paper from 2008, by D. Bourn and myself). Nevertheless, it is possible to reformulate the property of the existence of "p" as a property of the category of algebras in the variety. And there are many other Mal'tsev conditions which can be so reformulated!

3. Matrix properties

What we may be interested in, then, is when does one such reformulated Mal'tsev condition imply another, in any finitely complete category? 

The reformulated Mal'tsev conditions that we consider can be described by matrices of integers that appear as subscripts of the corresponding system of term identities. For instance, in the case of the system considered above, the matrix would be
Note that we can disregard the RHS of the equalities in the system if we agree to always use the subscript "0" for the variable there. 

To describe the corresponding condition on a category, we first need the following notion:
Such notion can be produced for every matrix M of integers, of arbitrary dimensions. The number of rows of the matrix determines the arity of the "internal binary relation" for which "strict M-closedness" will be defined. 

The condition on a finitely complete category corresponding to a matrix M states that every internal relation in it (of suitable arity) is strictly M-closed. In the case of the matrix M considered above, this is the familiar property that defines a Mal'tsev category as a finitely complete category in which every internal relation is difunctional, considered here:
To prove that one matrix property implies another in any finitely complete category, we would have to begin with an internal relation R for which we want to establish a property N, and build from R, using finite limits, another internal relation S for which we would apply a property M to conclude that R has the property N. Thanks to the representation theorem above, this reduces to doing the same inside the category of sets. However, it does not really simplify the task: finding the construction of a suitable relation S to get the desired property of R may be tricky.

4. Michael's work

A surprising insight comes from Michael Hoefnagel's work on "majority categories". These are categories defined by the matrix    
The corresponding term can be found in varieties of lattice-like structures, where
In attempting to show that protoarithmetical categories in the sense of D. Bourn are not the same as Mal'tsev majority categories, he discovered that the duals of the categories of relations (on sets) are able to detect implications of matrix properties (see this paper of his). This leads to formulating an algorithm, that we implemented on a computer, for deciding implications of matrix properties. 

The story of how exactly one arrives to the algorithm is an interesting one in its own right. It actually builds on ideas contained in this paper of Thomas Weighill, which is another interesting story that comes out of an observation of how the Mal'tsev property (corresponding to the first matrix considered above) relates to a separation axiom in topology.

5. The algorithm

Here we cut short all of these interesting stories and instead, present the algorithm directly, on a concrete example. 

As we will explain below, the following table gives a proof of the fact that a Mal'tsev majority category has the matrix property of "arithmetical varieties".
The first row of this table states the theorem. The purple matrix is the Mal'tsev matrix. The orange matrix is the majority matrix. The blue matrix is the matrix corresponding to the property of arithmeticity. The theorem is thus that the first two matrix properties together imply the third. According to our algorithm, the following procedure allows to confirm this implication. 

We want to introduce new columns to the blue matrix until we get a column of 0's. To do that, consider the orange matrix in the second row. Each row in this matrix is obtained from the orange matrix in the first row by renaming its entries (the renaming rule may be different across different rows). The corresponding renaming of entries in the invisible column of 0's of the orange matrix in the first row, results in the blue column in the second row. Each row of the blue column thus shows to what was 0 renamed in the corresponding row of the orange matrix (in the second row of the table). Furthermore, every column of the orange matrix in the second row of the table must be a column of the blue matrix in the first table. Similarly, the in the third row of the table we finally obtain the blue column of 0's. This time, we rely on the purple matrix rather than the orange one. Just like the orange one, that purple matrix has every row obtained from a row of the purple matrix in the first row of the table by renaming of its entries, while every column of the purple matrix is one of the previous blue columns (including the one obtained in the previous step).

That the successful execution of the procedure above can confirm the theorem was already known, at least implicitly, from this paper of mine, which is one of the first two papers on matrix properties. What is new is that we now know that a theorem stating implication of matrix properties will be true if and only if it can be established using the algorithm. Thus, in the language of mathematical logic, we had soundness and now we have established completeness.

6. Computer enters the scene

Execution of the algorithm above may not be easy at all. Although the process is always finite, there may be a significant amount of steps we need to take, before we reach the destination. By hand, this task may take a very long time. In fact, it may take a long time even using a computer. Nevertheless, the computer can do better. The following computer-generated image shows all implications between equivalence classes of matrix properties given by "binary matrices" having four rows and up to seven columns, where black spots represent entry 1 and white ones represent entry 0:


We leave it as an exercise to find the Mal'tsev and majority matrices in this diagram (note that matrix properties are invariant under permutation of columns in the representing matrix). 

Interestingly, the poset above (which grows from left to right) appears to have the smallest element -- the left-most matrix. This matrix is in fact the one corresponding to the property of arithmeticity. As Emil van der Walt (grandson of Andries van der Walt) proved in his first year of undergraduate studies, this matrix property implies every other binary matrix property. It has been furthermore shown that every binary matrix property either implies the Mal'tsev one, or is implied by the majority property.

7. Concluding remarks

Let me conclude by adding to the title: not only was a computer taught to prove some theorems in finitely complete categories, but the computer also helped us to identify new theorems that we, humans, were able to prove (and which would be impossible for the computer to prove using the algorithm, since they deal with an infinite number of matrices).

The results described in this talk are taken from two recent papers, the first by Michael Hoefnagel, Pierre-Alain Jacqmin and myself, and the second, by the three of us and Emil van der Walt. The work on these papers is complete and links to them will be placed here once they are published. You can see the arXiv preprint of one of them (note however that a slightly revised version of this preprint is still to be posted). The work on both of these papers was carried out in 2020-2021, and started with a visit of Pierre-Alain Jacqmin to Stellenbosch in January 2020.
Read More