The main class of linear algebra problems for which symbolic manipulation systems are helpful, however, is for systems with parameters. These can usually be cast in the form of matrices with multivariate polynomial entries. For simplicity we concentrate first on the univariate case.
The main goals in this chapter are to show that the solutions of some parametric linear systems can be found using standard computer algebra packages, and that these solutions can be evaluated efficiently, and that they can give insight into the nature of the problem under study. A secondary goal is to show the limitations of this approach (again, efficiency and numerical stability of the resulting expressions are critical).
We will also see for the first time the problem of `specialization' --- it is common for computer algebra programs to produce an answer for a parametrized problem which is correct for `most' values of the parameters but is wrong for some values of the parameters. These errors can be visible (when, for example, singularities appear) or invisible. It is the invisible errors that concern the mathematician, as it is often the `special cases' that give the most insight into the nature of the problem.
We introduce some new (but simple) mathematics, the Row Echelon Decomposition, to help deal with this problem. This is an example of the `proviso' approach in computer algebra [34].
Consider the following example problem, adapted from the introductory text [82] and discussed in the present context in [36].
Problem.
below is the augmented matrix
for a system of linear equations. Find the values of the parameter
x for which the system has no, one, and infinitely many solutions.
During row reduction, we divide by
pivots
and
, both of which could equal zero
and thus define special cases.
Putting these cases
to one side, we obtain the general row-echelon form of
as
We must now go back to the original problem and calculate the special cases.
The points we wish to make are clear if we only consider one root
of
, namely x=0, and of
, namely
.
These special cases then give the
row-echelon forms
and
From the above analysis the complete solution to the original problem is clear.
The important point to notice in this example is the fact that the special cases are only apparent while the row reductions are being carried out. However, computer algebra systems such as Maple typically take the problem as input and return one and only one answer. When this is the case, the only matrices that the user sees are (1) and (2), and all of the intermediate manipulations are lost.
Exercises
rref to calculate the row echelon
form of (1).

solve by hand
. Use Maple to solve the
system, and compare your solutions with Maple's. Is Maple correct?
One can expect that the special case
would be noticed by most users,
because in that case the entries in the
last column are undefined (of course the only x for
which this is true are complex, and thus may be of no
interest).
There is nothing, however, in the general result (
2) that gives any hint that
is a special case and that the general result (2) must then
be
replaced by (3).
Clearly some information has been lost during the
transformation of (1) into (2).
It is for all cases like this that we wish to
devise a better way of proceeding.
There are essentially four approaches to this problem:
rref and it is
usually satisfactory, if you don't mind being lied to occasionally.
(To be fair, see the remark following this list).
?' `Is x > my/temporary/local/variable?' This is
the sort of thing that Macsyma does. Sometimes it is what you want,
but often the CAS winds up asking questions that the user
cannot possibly answer. Further, there may be an awful lot
of questions, more than the user can bear to answer. And finally,
this is not at all suitable to a batch or file-reading environment.
'.
If the proviso is in fact invalid --- `Oh, nuts, I forgot to specify that
' --- then it is the user's responsibility
to re-do the calculation.
Remark on indeterminates versus parameters.
Item (1) above is a fair criticism of Maple (and all other CAS)
only if you take the
view that parameters have values: p = 1 + x is a real number,
given a certain (as yet unknown) real number x. This is the
most natural view of polynomials and functions, but it is not
the only one. Many abstract algebra courses discuss the algebra
of polynomials in an indeterminate x, with (say) complex
coefficients. In this perspective, the zero polynomial is
, and p = 0 implies that p is the zero polynomial (and
says nothing about x). Thus p = 1 + x is never
zero, and so if you divide another polynomial by it, there is no
problem. Thus in this perspective
and that is the end of the story --- x is indeterminate and hence
can never equal any particular value, and so in particular is not
-1. This viewpoint is not natural to most people (I seem to
remember arguments about it when first taking number theory some years
ago) but does simplify the work greatly, by eliminating the
worrisome zero divisor problem from your algebra.
This algebraic point of view is the one taken by most computer algebra systems, and not just in the context of polynomials or linear systems of equations. The Risch decision procedure, for example, is a procedure for working in differential fields and is essentially algebraic, and a related analytical difficulty surfaces there (see Chapter 7). This book takes the analytic point of view (because I think it is more interesting and useful --- please feel free to disagree) and I will point out some consequences of this shift in point of view on the use of computer algebra systems.
As the simplest possible example,
consider solving the equation kx = 1 for x. All
computer algebra systems (except Mathematica using its
Reduce command, and Theorist) return
,
which is true only when k is not 0
(but is always true from the algebraic point of view).
This, however, might be counted acceptable because the
failure is
visible (i.e. easily noticed) in the special case k=0.
Now consider solving
the equation kx = k.
All systems, with the same exceptions, give only the
(algebraically but not analytically correct) solution x = 1.
The fact that there was a division by k is no longer
evident
from the general solution, and thus a special case has been
lost from view.
Row reduction is essentially similar to the last case,
because it is possible
to lose any visible reminder of the fact that divisions were
made by
quantities that could be zero.
Note that the definition of row-echelon form contributes to the difficulties in its automatic implementation. First, row reduction is usually presented as a transformation of one matrix into another. This old-fashioned point of view has largely been replaced by the idea of matrix factorizations in other topics of linear algebra (for example, the Gram-Schmidt reduction has been replaced by QR-factorization, Gauss-Jordan reduction by PA = LU factorization, and even the FFT is now being looked at from the factorization point of view [102]). The fact that this is a transformation, rather, means that some information is hidden from the user.
It is the normalization, however, which causes the central difficulty. A key property of row-echelon form is the fact that it is unique --- many proofs in linear algebra rely on this fact --- and to obtain uniqueness, we must normalize the leading term in each row to 1, thus forcing the division that is the point of discussion. One way to tackle the problem would be to change the definition of row-echelon form to remove the need to normalize to 1, and this leads, for example, to the Hermite normal form for matrices of integers or univariate polynomials (see e.g. [80]). However, the Hermite normal form is not defined for the general algebraic objects encountered in applied mathematics, necessarily. In any event, the same difficulty of `losing' a division by a parameter can be constructed for the Hermite normal form by considering polynomials with parameters in the coefficients.
By defining the row-echelon decomposition or factorization (the terms are used interchangeably here) we can resolve the difficulties described above, without requiring changes either in the definitions of linear algebra or in computer-algebra systems. Factorization offers many advantages over a transformation. Factorizations are written as equations, in which the given matrix is written as a product of matrices with special properties. The key advantage of this for our purpose is that if we express row reduction as an equation, the divisors must appear somewhere in that equation, and that means that a computer system does not have to watch during the reduction to identify special cases for us. The row-echelon decomposition thus preserves all information about the original matrix, and moreover presents it in a convenient form for the investigation of special cases.
The existence and continuity of a row-echelon decomposition is examined in Theorems 1 and 2 of [36]. We reproduce the theorems and examples here.
Theorem Theorem 1:
If A is an
matrix over C,
there exists a permutation matrix P,
a unit lower triangular matrix L,
and a nonsingular upper triangular matrix U such
that

where R is the (unique) row-echelon form of A.
Remark. This decomposition subsumes the definition of the row echelon form and the LU-factorization of a matrix. If a (square) matrix is nonsingular, its row-echelon form R is just the identity matrix. If A is singular or non-square, then there exists a square nonsingular matrix F such that A = F R where R is the unique row-echelon form of A [82]. Applying the LU-factorization to F gives the decomposition above. Further, we need do no extra computation to discover P, L, and U --- we merely remember the steps in the computation of R.
Definition. A family of matrices
is
continuous at x=a if
for all
there exists
such that
implies
. The 2-norm is used for
explicitness, both
for the k-dimensional parameter space and for the space of
matrices.
Theorem Theorem 2.
Let
be a matrix depending upon one or more
variables
or
parameters x. Let A be continuous at a point x=a.
If
has a row-echelon decomposition in a
neighbourhood
of x=a (with some particular permutation matrix P)

with
, then R is continuous at a.
Remark. There may be only one permutation matrix P which
gives such a factorization, or there may be several. The hypotheses
of the theorem say that if such a factorization exists with
, then R is continuous. Verification that such
a factorization exists is easy --- one just computes
it.
Corollary 1: The only values of x at which
can be
discontinuous are those for which
.
For proofs, see [36].
The continuity of the row echelon form is exactly the analytic
expression of the computer-algebra concept of `correctness on
specialization' for this problem.
That is, the row echelon form will be correct
on specialization if we can substitute the value of the parameter
into the formula for
and get the correct row echelon
form of
. Since the expressions contained in
will
always be continuous functions of the parameters if the original
matrix is continuous at x, substitution of values into
always gets the limiting value: if the actual row echelon form
is discontinuous at that point, the computed formula will be
incorrect `on specialization'.
Exercises
that discontinuous changes in the
row echelon form may occur.
Consider

and
show that
if and only if
is an eigenvalue of B.
some of the pivots are zero, in spite of the fact that
is not an eigenvalue. Discuss how to remedy this, and
the consequences for the theorem.
the row-echelon form might be continuous.
Consider

when b = a,
the row echelon form is correct unless also c=a.
This problem shows that Theorem 2 is as strong as
possible, in that parameter values rendering U singular
can only be said to `need further investigation'.
given x and n runs as follows.
c := 1
while (n > 0) do # loop invariant: u = c*x^n at this point
if n is even then
n := n/2
x := x^2
else
c := c*x
n := n - 1
fi
od
terminate and return c (which must now be x^n).
.
mpower(A,n). Test your routine.
where A is

and check your answer.