The long break in blogs is because I am preparing to
move to
another university next week, which happens to be a few thousand kilometers away. I am not sure how many people actually read this thing though - leave a comment to let me know you are here (If you wish to remain anonymous, then something like "Hi from anonymous person number n+1", where n is the number of the last anonymous person).
I'll hold off on the symmetry stuff for a while, because I am still absorbing a few references from some
kindly local mathematicians, which may answer one of my questions from last time (at least partially). So, I'll talk about something else.
While browsing through the library the other day, the following caught my eye:
Donald N. Cohen: Knowledge Based Theorem Proving and Learning.
(no relation, honest). This turned out to be a copy of Cohen's PhD thesis, wherein he describes his admirable attempt to make an automated theorem prover that is less stupid than usual. Sadly, since the thesis dates from 1981 (making it older than me!), a large chunk of the work concerns efforts to make the program more memory efficient, despite his "large" 256kb memory space :) The basic premise is that humans hardly ever prove theorems starting from the axioms (the exception being when one is cleaning up the trivial consequences of a new definition, such as that of a group). However, this is something that has not been adopted much in the automated reasoning community, for some good reasons mind you.
Cohen first sets about creating a database that can store new theorems, as well as new *techniques*. The latter is particularly interesting, because anyone who has done tedious high school maths homework knows that once you learn the trick to solving a problem, you can just blindly follow the recipe when you notice a new instance of the problem, as opposed to solving the same problem again.
Cohen focuses his attention on something that often occurs in a backtriacking search - you end up solving the same subproblem over and over again. The computer science community has developed many heuristics for avoiding this situation, including backjumping (don't backtrack to the previous node, backtrack to the previous significant point) and look-ahead (what would happen if I picked this option?). Cohen's approach is to have the computer "learn" about the subproblem and how to solve it efficiently. A more recent approach, is the following:
Andrew Slater:
Investigations into Satisfiability Search.
The basic approach there is to use
relevant logic, which essentially says that when proving an implication, the assumptions should all be "relevant" to the conclusion. For example, while the statement "It is raining and I am wearing my leather shoes, so it is wet outside" is technically correct, what shoes I am wearing is not really relevant. Using this philosophy, Slater creates an algorithm for reordering a search tree, so that assumptions are relevant to "conclusions", thus sidestepping the problem of the computer re-solving subproblems.
The "learning" in Cohen's thesis is hard to spot. What he calls learning seems to be the ability for the program to remember succesful proof attempts and the pattern of the formulae that it worked on (he works with the typed lambda calculus, but this is not really important). The trick here is to then avoid a "combinatorial explosion", where the prover ahs
too many options at any given stage, so spends far too much time trying different alternatives. Cohen makes some steps towards giving the prover "semantic" knowledge, so that it can reason with models of the formulae (for example, when trying to prove a statement in, say, topology, one usually has a stock of different spaces to test the statement out on, to see what makes it tick). Cohen does not make much progress on this area (and who can blame him, given the number of tasks he sets himself to complete!). A more recent piece of software along the last lines, is
SCOTT (Semantically Constrained OTTer).
So, what more can be said about making automated theorem provers more intelligent, using all that we have discovered int he 24 years since Cohen's thesis? Certainly, machine learning has come on leaps and bounds with, for example, support vector machines proving to be very good at classification tasks. So too has knowledge representation and database theory. As far as building models of theories goes, there are now
very powerful systems for doing modern algebra (groups, rings, fields,...), which can be used to generate as many models as you please. Thus, it seems that the time is ripe to develop software that reasons and learns a lot more like humans...