The main class of linear algebra problems for which symbolic manipulation systems are helpful, however, is for systems with parameters. These can usually be cast in the form of matrices with multivariate polynomial entries. For simplicity we concentrate first on the univariate case.
The main goals in this chapter are to show that the solutions of some parametric linear systems can be found using standard computer algebra packages, and that these solutions can be evaluated efficiently, and that they can give insight into the nature of the problem under study. A secondary goal is to show the limitations of this approach (again, efficiency and numerical stability of the resulting expressions are critical).
We will also see for the first time the problem of `specialization' --- it is common for computer algebra programs to produce an answer for a parametrized problem which is correct for `most' values of the parameters but is wrong for some values of the parameters. These errors can be visible (when, for example, singularities appear) or invisible. It is the invisible errors that concern the mathematician, as it is often the `special cases' that give the most insight into the nature of the problem.
We introduce some new (but simple) mathematics, the Row Echelon Decomposition, to help deal with this problem. This is an example of the `proviso' approach in computer algebra [34].
Consider the following example problem, adapted from the introductory text [82] and discussed in the present context in [36].
Problem. below is the augmented matrix for a system of linear equations. Find the values of the parameter x for which the system has no, one, and infinitely many solutions.
During row reduction, we divide by pivots and , both of which could equal zero and thus define special cases. Putting these cases to one side, we obtain the general row-echelon form of as
We must now go back to the original problem and calculate the special cases. The points we wish to make are clear if we only consider one root of , namely x=0, and of , namely . These special cases then give the row-echelon forms
and
From the above analysis the complete solution to the original problem is clear.
The important point to notice in this example is the fact that the special cases are only apparent while the row reductions are being carried out. However, computer algebra systems such as Maple typically take the problem as input and return one and only one answer. When this is the case, the only matrices that the user sees are (1) and (2), and all of the intermediate manipulations are lost.
Exercises
rref
to calculate the row echelon
form of (1).
solve by hand . Use Maple to solve the system, and compare your solutions with Maple's. Is Maple correct?
One can expect that the special case would be noticed by most users, because in that case the entries in the last column are undefined (of course the only x for which this is true are complex, and thus may be of no interest). There is nothing, however, in the general result ( 2) that gives any hint that is a special case and that the general result (2) must then be replaced by (3). Clearly some information has been lost during the transformation of (1) into (2). It is for all cases like this that we wish to devise a better way of proceeding.
There are essentially four approaches to this problem:
rref
and it is
usually satisfactory, if you don't mind being lied to occasionally.
(To be fair, see the remark following this list).
Remark on indeterminates versus parameters. Item (1) above is a fair criticism of Maple (and all other CAS) only if you take the view that parameters have values: p = 1 + x is a real number, given a certain (as yet unknown) real number x. This is the most natural view of polynomials and functions, but it is not the only one. Many abstract algebra courses discuss the algebra of polynomials in an indeterminate x, with (say) complex coefficients. In this perspective, the zero polynomial is , and p = 0 implies that p is the zero polynomial (and says nothing about x). Thus p = 1 + x is never zero, and so if you divide another polynomial by it, there is no problem. Thus in this perspective and that is the end of the story --- x is indeterminate and hence can never equal any particular value, and so in particular is not -1. This viewpoint is not natural to most people (I seem to remember arguments about it when first taking number theory some years ago) but does simplify the work greatly, by eliminating the worrisome zero divisor problem from your algebra.
This algebraic point of view is the one taken by most computer algebra systems, and not just in the context of polynomials or linear systems of equations. The Risch decision procedure, for example, is a procedure for working in differential fields and is essentially algebraic, and a related analytical difficulty surfaces there (see Chapter 7). This book takes the analytic point of view (because I think it is more interesting and useful --- please feel free to disagree) and I will point out some consequences of this shift in point of view on the use of computer algebra systems.
As the simplest possible example, consider solving the equation kx = 1 for x. All computer algebra systems (except Mathematica using its Reduce command, and Theorist) return , which is true only when k is not 0 (but is always true from the algebraic point of view). This, however, might be counted acceptable because the failure is visible (i.e. easily noticed) in the special case k=0. Now consider solving the equation kx = k. All systems, with the same exceptions, give only the (algebraically but not analytically correct) solution x = 1. The fact that there was a division by k is no longer evident from the general solution, and thus a special case has been lost from view. Row reduction is essentially similar to the last case, because it is possible to lose any visible reminder of the fact that divisions were made by quantities that could be zero.
Note that the definition of row-echelon form contributes to the difficulties in its automatic implementation. First, row reduction is usually presented as a transformation of one matrix into another. This old-fashioned point of view has largely been replaced by the idea of matrix factorizations in other topics of linear algebra (for example, the Gram-Schmidt reduction has been replaced by QR-factorization, Gauss-Jordan reduction by PA = LU factorization, and even the FFT is now being looked at from the factorization point of view [102]). The fact that this is a transformation, rather, means that some information is hidden from the user.
It is the normalization, however, which causes the central difficulty. A key property of row-echelon form is the fact that it is unique --- many proofs in linear algebra rely on this fact --- and to obtain uniqueness, we must normalize the leading term in each row to 1, thus forcing the division that is the point of discussion. One way to tackle the problem would be to change the definition of row-echelon form to remove the need to normalize to 1, and this leads, for example, to the Hermite normal form for matrices of integers or univariate polynomials (see e.g. [80]). However, the Hermite normal form is not defined for the general algebraic objects encountered in applied mathematics, necessarily. In any event, the same difficulty of `losing' a division by a parameter can be constructed for the Hermite normal form by considering polynomials with parameters in the coefficients.
By defining the row-echelon decomposition or factorization (the terms are used interchangeably here) we can resolve the difficulties described above, without requiring changes either in the definitions of linear algebra or in computer-algebra systems. Factorization offers many advantages over a transformation. Factorizations are written as equations, in which the given matrix is written as a product of matrices with special properties. The key advantage of this for our purpose is that if we express row reduction as an equation, the divisors must appear somewhere in that equation, and that means that a computer system does not have to watch during the reduction to identify special cases for us. The row-echelon decomposition thus preserves all information about the original matrix, and moreover presents it in a convenient form for the investigation of special cases.
The existence and continuity of a row-echelon decomposition is examined in Theorems 1 and 2 of [36]. We reproduce the theorems and examples here.
Theorem Theorem 1: If A is an matrix over C, there exists a permutation matrix P, a unit lower triangular matrix L, and a nonsingular upper triangular matrix U such that
where R is the (unique) row-echelon form of A.
Remark. This decomposition subsumes the definition of the row echelon form and the LU-factorization of a matrix. If a (square) matrix is nonsingular, its row-echelon form R is just the identity matrix. If A is singular or non-square, then there exists a square nonsingular matrix F such that A = F R where R is the unique row-echelon form of A [82]. Applying the LU-factorization to F gives the decomposition above. Further, we need do no extra computation to discover P, L, and U --- we merely remember the steps in the computation of R.
Definition. A family of matrices is continuous at x=a if for all there exists such that implies . The 2-norm is used for explicitness, both for the k-dimensional parameter space and for the space of matrices.
Theorem Theorem 2. Let be a matrix depending upon one or more variables or parameters x. Let A be continuous at a point x=a. If has a row-echelon decomposition in a neighbourhood of x=a (with some particular permutation matrix P)
with , then R is continuous at a.
Remark. There may be only one permutation matrix P which gives such a factorization, or there may be several. The hypotheses of the theorem say that if such a factorization exists with , then R is continuous. Verification that such a factorization exists is easy --- one just computes it.
Corollary 1: The only values of x at which can be discontinuous are those for which .
For proofs, see [36].
The continuity of the row echelon form is exactly the analytic expression of the computer-algebra concept of `correctness on specialization' for this problem. That is, the row echelon form will be correct on specialization if we can substitute the value of the parameter into the formula for and get the correct row echelon form of . Since the expressions contained in will always be continuous functions of the parameters if the original matrix is continuous at x, substitution of values into always gets the limiting value: if the actual row echelon form is discontinuous at that point, the computed formula will be incorrect `on specialization'.
Exercises
c := 1 while (n > 0) do # loop invariant: u = c*x^n at this point if n is even then n := n/2 x := x^2 else c := c*x n := n - 1 fi od terminate and return c (which must now be x^n).
mpower(A,n)
. Test your routine.
and check your answer.