# A Rule-based Revolution

## Organizing Mathematical Knowledge as a Rule-based Decision Tree

### Albert D. Rich, Applied Logician Hawaii Island, July 2015

I am the author of 3 implementations of the muLisp artificial intelligence development system, and coauthor of the muMath and Derive computer algebra systems. However, I am convinced my current research will be by far my most important and enduring contribution to the fields of math and computer science.

Since the beginning of recorded history, mathematicians have been amassing mathematical truths at an ever greater rate and into ever more esoteric realms. However, this vast amount of knowledge has not been systematically organized so the precise formula required to solve a particular problem can easily be found. It's like leaves scattered on a lawn, rather than organized into a heavily branched tree with each leaf attached to the appropriate limb.

#### It's Feasible

I am convinced much of mathematical knowledge can be organized into a rule-based decision tree that is unique and optimal. Others have proposed similar such grandiose programs, but the rule-based paradigm actually makes it possible to accomplish it. And this is not just armchair speculation.

Over my almost 40-year career implementing computer algebra systems, I am driven to the realization that rule-based systems are the optimal way to get optimal results. This is not limited to my current focus: indefinite integration. muMath and Derive are general purpose programs able to solve a broad range of problems. Throughout their development these systems gained knowledge by the addition of new rules and the generalization of existing ones.

David Stoutemyer and I began the design and implementation of muMath in the late 1970s. From the earliest versions, despite being limited to a 64 kilobyte address space, the system provided a mechanism for assigning rules to the property lists of functions and operators. When an expression was being evaluated, the property list of its top-level function or operator was searched to see if there was a rule specifically for the expression's second-level function or operator. If so, the rule was applied and the process repeated recursively until no more rules applied.

Although crude by today's standards, this two-level pattern matching mechanism was used extensively in the implementation of muMath. Also its modularity and simplicity allowed users to add their own rules to the system. Derive was also heavily dependent on rules to simplify, solve and integrate expressions; but the rules were "hard wired" in programming code. Although more powerful than muMath, Derive was often criticized for not allowing users to define their own rules for built-in functions and operators.

Computer hardware and software are highly ephemeral, so the mathematical knowledge in muMath and Derive will soon be lost. After development of Derive was terminated prematurely in 2005, I was determined to find a way to pass on the accumulated knowledge in such systems that would survive changing technology.

To that end, I began implementing a Rule-Based Integrator, nicknamed Rubi. Currently it consists of over 6000 integration rules organized into a decision tree that uniquely determines the appropriate rule to apply to a given integrand. Stored as a simple directed graph, this decision tree is what distinguishes Rubi from the program code used by its predecessors to select rules.

Indefinite integration accounts for only a tiny fraction of all mathematical knowledge. However, based on my experience implementing muMath and Derive, I see no reason why much of the rest of mathematics cannot be organized as a rule-based decision tree as well. Certainly much of analysis including equation solving, expression simplification, differentiation, summation, limits, etc. can be automated using this paradigm.

#### It's Desirable

So assuming a discipline of mathematics can be reduced to a rule-based decision tree, is it worth the considerable effort required? My experience implementing Rubi clearly indicates the answer is a resounding yes! The benefits include:

• While loading a rule-based system, it is easy to enclose each rule in a "wrapper" that displays the rule when it is applied and temporarily suspends evaluation so the result of the application is displayed. This is exactly what occurs in Derive 6.1, and now Rubi, when showing the steps required to simplify an expression. Not only is this ability to show steps great for pedagogical purposes, when debugging the system it makes it relatively easy to track down errant rules.

• The ability to craft rules tailored for specific classes of problems makes possible the fine control required to produce optimal results. Whereas systems that depend on monolithic, one-size-fits-all algorithms frequently return dramatically inferior ones. For example, there are hundreds of problems in the integration test suite where the major commercial systems return incomprehensible, multi-page antiderivatives, but Rubi returns a simple, compact one involving only elementary functions.

• A rule-based system solves problems by evaluating predicate tests in a decision tree to determine the appropriate rule to apply. For a balanced tree having n rules, log2 n test evaluations are required to select the appropriate rule. For example, since Rubi 4.8 has 6,178 rules in a relatively balanced tree, it only needs to evaluate 12.6 tests on average to select an integration rule.

To fully solve a problem several such applications may be required. For example, Rubi uses 308,301 rule applications to integrate the 55,430 problems in its test suite, for an average of 5.56 applications per problem.

Therefore, only about 70 (12.6 x 5.56) tests are required to fully integrate a typical problem in the test suite. What's more, the tests are fast and easy to perform. For example, common in Rubi are tests to determine if an exponent is an integer, fraction or symbolic; or if the discriminant of a quadratic (b2-4ac) is positive, negative or of unknown sign.

But since Rubi 4.8 is slowed down by its use of pattern matching to search down a list of over 6,000 rules, it is only able to integrate expressions at roughly the same rate as Mathematica's built-in integrator. However, the forthcoming version 5 of Rubi uses the highly efficient if-then-else control construct instead of pattern matching to select rules. Preliminary testing indicates Rubi 5 integrates expressions almost 2 orders of magnitude faster than Rubi 4.8 or Mathematica.

• Since virtually all computer algebra systems provide an if-then-else control construct, it will be relatively easy to port Rubi 5 to a variety of such systems. Already written are programs that automatically translate if-then-else decision trees written in Mathematica into the syntax used by 3 other computer algebra systems.

• In a rule-based system each rule is independent in the sense that it can be added to, or deleted from, the system without affecting the other rules. This modularity makes implementing such systems amenable to group development with individuals able to focus on their own areas of expertise.

• The decision tree of a rule-based system must provide an appropriate rule for every possible instance of a given form. During development, "holes" in the tree indicate exactly where new rules can and need to be discovered. Time and again during the development of Rubi, filling such holes has resulted in finding hitherto unknown (at least to me) integration formulas. Thus the process of implementing a rule-based system often leads to the discovery of some exciting new mathematics.

All the integration rules, test suite problems and results demonstrating the above benefits are freely available on Rubi's website.

#### It's Real

Finally it should be noted that Rubi is an ever improving, but imperfect, model of the actual integration decision tree existing in some Platonic sense. In the same way, software simulating a hurricane is just an imperfect model of the actual hurricane blowing in the physical world. So Rubi may be just a model, but she's a darn good-looking one.

Not only does an optimal integration decision tree exist, it is unique. Time and again, in order to achieve optimal antiderivatives for integrands of a given form, I was forced to restructure Rubi's decision tree and/or discover new rules. Often this repetitive process severely tested my deeply held belief that an optimal set of rules for integrating expressions must exist. But eventually the process always did converge, leaving me no choice in the design of the decision tree.

Rubi provides proof-of-concept of the utility of organizing mathematical knowledge in the form of a rule-based decision tree. Hopefully it will convince others in the math and computer science communities to join the rule-based revolution, and explore whether it is applicable to their field of interest as well.

The above is a preprint of an article to appear in the 25th anniversary issue of the Derive Newsletter published by the Derive & CAS-TI User Group.