Consider the instruction
X = B - X;
A simple program will subtract X from B, store the result in a temporary T1, and copy T1 into X. It would be faster if the program recognised that the result could be stored directly into X. This would happen automatically if the program could look at the instruction first and mark X as temporary.
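The idea of looking at the whole instruction before computing anything can be sketched as follows. This is only an illustration under stated assumptions, not the library's actual mechanism: the class Mat and its member a are hypothetical, and the example handles only element-wise subtraction, where writing straight into the target is safe even when the target is also an operand.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Mat;

// Hypothetical node: records the operands of B - X, computes nothing yet.
struct SubtractedMatrix {
    const Mat* left;
    const Mat* right;
};

// Hypothetical stand-in for a matrix class (stored as a flat array).
struct Mat {
    std::vector<double> a;
    explicit Mat(std::vector<double> v) : a(std::move(v)) {}

    // The assignment sees the whole instruction, so it can store the
    // result directly in *this instead of going through a temporary T1.
    Mat& operator=(const SubtractedMatrix& e);
};

// operator- only builds the node; the subtraction happens on assignment.
inline SubtractedMatrix operator-(const Mat& l, const Mat& r) {
    return {&l, &r};
}

Mat& Mat::operator=(const SubtractedMatrix& e) {
    const std::size_t n = e.left->a.size();
    assert(e.right->a.size() == n);
    a.resize(n);  // no reallocation when *this already has n elements
    // Element k of the result depends only on element k of each operand,
    // so overwriting a[k] is safe even if *this is one of the operands.
    for (std::size_t k = 0; k < n; ++k)
        a[k] = e.left->a[k] - e.right->a[k];
    return *this;
}
```

With these definitions, X = B - X; overwrites the storage of X in place and creates no intermediate matrix.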
C programmers would expect to avoid the same problem with
X = X - B;
by using the operator -=
X -= B;
However, this is an unnatural notation for non-C users, and it may be nicer to write X = X - B; and know that the program will carry out the simplification.
Another example where this intelligent analysis of an instruction is helpful is in
X = A.i() * B;
where i() denotes inverse. Numerical analysts know it is inefficient to evaluate this expression by carrying out the inverse operation and then the multiply. Yet it is a convenient way of writing the instruction. It would be helpful if the program recognised this expression and carried out the more appropriate approach.
I regard this interpretation of A.i() * B as just providing a convenient notation. The objective is not to correct the errors of people who are unaware of the inefficiency of A.i() * B if interpreted literally.
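As a rough sketch of how such a notation could be interpreted, the inverse operation can return a small node that merely remembers A, and the multiply can recognise that node and solve A X = B instead. Everything here is an assumption made for illustration: Mat is a plain vector-of-rows alias, inverse() is a free function standing in for A.i(), and Gaussian elimination with partial pivoting is just one reasonable choice of solver.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a matrix class: a vector of rows.
using Mat = std::vector<std::vector<double>>;

// Node produced by inverse(): holds a pointer to A and computes nothing.
struct InvertedMatrix {
    const Mat* a;
};

inline InvertedMatrix inverse(const Mat& a) { return InvertedMatrix{&a}; }

// inverse(A) * B is recognised here and evaluated by solving A X = B.
// The inverse of A is never formed.
Mat operator*(const InvertedMatrix& inv, const Mat& b) {
    Mat a = *inv.a;  // working copy of A, reduced to upper triangular form
    Mat x = b;       // right-hand sides, overwritten by the solution
    const std::size_t n = a.size();
    assert(b.size() == n);

    for (std::size_t col = 0; col < n; ++col) {
        // Partial pivoting: pick the largest entry in this column.
        std::size_t piv = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[piv][col])) piv = r;
        assert(a[piv][col] != 0.0);  // A must not be singular
        std::swap(a[col], a[piv]);
        std::swap(x[col], x[piv]);

        // Eliminate the entries below the pivot.
        for (std::size_t r = col + 1; r < n; ++r) {
            const double f = a[r][col] / a[col][col];
            for (std::size_t c = col; c < n; ++c) a[r][c] -= f * a[col][c];
            for (std::size_t c = 0; c < x[r].size(); ++c) x[r][c] -= f * x[col][c];
        }
    }

    // Back substitution, one column of the right-hand side at a time.
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t c = 0; c < x[i].size(); ++c) {
            double s = x[i][c];
            for (std::size_t k = i + 1; k < n; ++k) s -= a[i][k] * x[k][c];
            x[i][c] = s / a[i][i];
        }
    }
    return x;
}
```

The user still writes Mat X = inverse(A) * B;, which reads like the literal expression but is carried out as a linear solve.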
There is a third reason for the two-stage evaluation of expressions, and this is probably the most important one. In C++ it is quite hard to return the result of an operator function (such as * or +) without a copy. This is particularly the case when an assignment (=) is involved. The mechanism described here provides one way of avoiding this in matrix expressions.
To carry out this intelligent analysis of an instruction, matrix expressions are evaluated in two stages. In the first stage a tree representation of the expression is formed. For example, (A+B)*C is represented by the tree
        *
       / \
      +   C
     / \
    A   B
Rather than adding A and B, the + operator yields an object of a class AddedMatrix, which is just a pair of pointers to A and B. Then the * operator yields a MultipliedMatrix, which is a pair of pointers to the AddedMatrix and C. In the second stage the tree is examined for any simplifications and then evaluated recursively.
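A minimal sketch of this two-stage scheme is given below. The names AddedMatrix and MultipliedMatrix come from the description above; everything else (the Expr base class, the Mat class, the evaluate() member) is hypothetical and chosen only to keep the example short. The simplification pass is omitted, and this sketch copies freely during evaluation, which a real implementation would take care to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Mat;

// Base class for every node of the expression tree.
struct Expr {
    virtual ~Expr() = default;
    virtual Mat evaluate() const = 0;
};

// Hypothetical dense matrix; a leaf of the tree.
struct Mat : Expr {
    std::size_t rows, cols;
    std::vector<double> a;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
    Mat evaluate() const override { return *this; }  // a leaf evaluates to itself
};

// A pair of pointers to the operands of +; no arithmetic is done here.
struct AddedMatrix : Expr {
    const Expr* left;
    const Expr* right;
    AddedMatrix(const Expr& l, const Expr& r) : left(&l), right(&r) {}
    Mat evaluate() const override {
        Mat x = left->evaluate();
        const Mat y = right->evaluate();
        assert(x.rows == y.rows && x.cols == y.cols);
        for (std::size_t k = 0; k < x.a.size(); ++k) x.a[k] += y.a[k];
        return x;
    }
};

// A pair of pointers to the operands of *.
struct MultipliedMatrix : Expr {
    const Expr* left;
    const Expr* right;
    MultipliedMatrix(const Expr& l, const Expr& r) : left(&l), right(&r) {}
    Mat evaluate() const override {
        const Mat x = left->evaluate();
        const Mat y = right->evaluate();
        assert(x.cols == y.rows);
        Mat z(x.rows, y.cols);
        for (std::size_t i = 0; i < x.rows; ++i)
            for (std::size_t j = 0; j < y.cols; ++j)
                for (std::size_t k = 0; k < x.cols; ++k)
                    z(i, j) += x(i, k) * y(k, j);
        return z;
    }
};

// Stage one: the operators only build the tree.
inline AddedMatrix operator+(const Expr& l, const Expr& r) { return AddedMatrix(l, r); }
inline MultipliedMatrix operator*(const Expr& l, const Expr& r) { return MultipliedMatrix(l, r); }
```

Stage two is then triggered explicitly in this sketch, for example Mat X = ((A + B) * C).evaluate();. The temporary AddedMatrix lives until the end of that full expression, so the pointers held by the MultipliedMatrix remain valid while the tree is evaluated.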
Further possibilities not yet included are to recognise A.t()*A and A.t()+A as symmetric or to improve the efficiency of evaluation of expressions like A+B+C, A*B*C, A*B.t() (t() denotes transpose).
One of the disadvantages of the two-stage approach is that the types of matrix expressions are determined at run-time. So the compiler will not detect errors of the type
Matrix M; DiagonalMatrix D; ....; D = M;
We don't allow conversions using = when information would be lost. Such errors will be detected when the statement is executed.
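The run-time nature of this check can be illustrated with a small sketch. It assumes a hypothetical BaseMatrix base class, a MatrixType tag, and std::invalid_argument as the error; the library's own classes and error handling may differ. The point is only that D = M compiles, because the right-hand side is seen as a general matrix expression, and fails when executed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical run-time tag for the kind of matrix an expression yields.
enum class MatrixType { Rectangular, Diagonal };

struct BaseMatrix {
    virtual ~BaseMatrix() = default;
    virtual MatrixType type() const = 0;
};

// General square matrix (n x n), stored as a flat array.
struct Matrix : BaseMatrix {
    std::size_t n;
    std::vector<double> a;
    explicit Matrix(std::size_t n_) : n(n_), a(n_ * n_, 0.0) {}
    MatrixType type() const override { return MatrixType::Rectangular; }
};

// Diagonal matrix: only the diagonal is stored.
struct DiagonalMatrix : BaseMatrix {
    std::vector<double> d;
    explicit DiagonalMatrix(std::size_t n) : d(n, 0.0) {}
    MatrixType type() const override { return MatrixType::Diagonal; }

    // The compiler accepts any BaseMatrix on the right-hand side; whether
    // the conversion would lose information is only known at run time.
    DiagonalMatrix& operator=(const BaseMatrix& rhs) {
        if (rhs.type() != MatrixType::Diagonal)
            throw std::invalid_argument(
                "cannot assign a general matrix to a DiagonalMatrix");
        d = static_cast<const DiagonalMatrix&>(rhs).d;
        return *this;
    }
};
```

In this sketch, Matrix M(2); DiagonalMatrix D(2); D = M; compiles without complaint and throws when the assignment is executed.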