Consider the instruction
    X = A + B + C;

To evaluate this, a simple program will add A to B, putting the total in a temporary T1. Then it will add T1 to C, creating another temporary T2, which will be copied into X. T1 and T2 will sit around till the end of the execution of the statement and perhaps of the block. It would be faster if the program recognised that T1 was temporary and stored the sum of T1 and C back into T1 instead of creating T2, and then avoided the final copy by just assigning the contents of T1 to X rather than copying. In this case there will be no temporaries requiring deletion. (More precisely, there will be a header to be deleted but no contents.)
For an instruction like
    X = (A * B) + (C * D);

we can't easily avoid one temporary being left over, so we would like this temporary deleted as quickly as possible.
I provide the functionality for doing all this by attaching a status variable to each matrix. This indicates if the matrix is temporary so that its memory is available for recycling or deleting. Any matrix operation checks the status variables of the matrices it is working with and recycles or deletes any temporary memory.
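The following is a minimal sketch of the status-variable idea, not the library's actual code. The names `TaggedMatrix`, `Status`, `add`, `assign` and the counter `fresh_buffers` are all hypothetical, and a matrix is reduced to a flat array to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// All names here are hypothetical; this is not the library's real interface.
enum class Status { Permanent, Temporary };

struct TaggedMatrix {
    std::vector<double> data;   // the contents (a flat array, for simplicity)
    Status status;              // the status variable attached to the matrix

    explicit TaggedMatrix(std::vector<double> d = {},
                          Status s = Status::Permanent)
        : data(std::move(d)), status(s) {}
};

int fresh_buffers = 0;  // counts result buffers that had to be newly created

// Returns a + b, tagged Temporary. If a is itself Temporary, its storage is
// recycled for the result and a is left with a header but no contents.
TaggedMatrix add(TaggedMatrix& a, const TaggedMatrix& b) {
    assert(a.data.size() == b.data.size());
    std::vector<double> sum;
    if (a.status == Status::Temporary) {
        sum = std::move(a.data);   // recycle: take over a's memory
        a.data.clear();            // make the emptied state explicit
    } else {
        sum = a.data;              // a is permanent, so it must be preserved
        ++fresh_buffers;
    }
    for (std::size_t i = 0; i < sum.size(); ++i) sum[i] += b.data[i];
    return TaggedMatrix(std::move(sum), Status::Temporary);
}

// X = src. A Temporary source hands its contents over; a Permanent one is copied.
void assign(TaggedMatrix& x, TaggedMatrix& src) {
    if (src.status == Status::Temporary) {
        x.data = std::move(src.data);
        src.data.clear();
    } else {
        x.data = src.data;
    }
    x.status = Status::Permanent;
}
```

With these helpers, X = A + B + C corresponds to `T1 = add(A, B)`, `T2 = add(T1, C)`, `assign(X, T2)`. Only the first addition creates a new buffer; the second reuses the memory of T1 and the assignment moves it into X.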
An alternative or additional approach would be to use reference counting and delayed copying (also known as copy on write). If a program requests a matrix to be copied, the copy is delayed until an instruction is executed which modifies the memory of either the original matrix or the copy. If the original matrix is deleted before either matrix is modified, then, in effect, the values of the original matrix are transferred to the copy without any actual copying taking place. This solves the difficult problem of returning an object from a function without copying and saves the unnecessary copying in the previous examples.
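Delayed copying can be illustrated with a short sketch. It assumes that `std::shared_ptr` is an acceptable stand-in for a hand-written reference counter and that the program is single-threaded; the class `CowMatrix` and the counter `deep_copies` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

int deep_copies = 0;  // counts real copies of a data array, for illustration

// Hypothetical class; single-threaded sketch only.
class CowMatrix {
public:
    CowMatrix(std::size_t rows, std::size_t cols, double fill = 0.0)
        : rows_(rows), cols_(cols),
          data_(std::make_shared<std::vector<double>>(rows * cols, fill)) {}

    // The implicit copy constructor and assignment copy only the pointer and
    // the dimensions, which increments the shared counter: the "delayed" copy.

    double get(std::size_t r, std::size_t c) const {
        return (*data_)[index(r, c)];
    }

    void set(std::size_t r, std::size_t c, double value) {
        detach();                       // copy now, if anyone else shares the data
        (*data_)[index(r, c)] = value;
    }

    bool shares_storage_with(const CowMatrix& other) const {
        return data_ == other.data_;
    }

private:
    std::size_t index(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return r * cols_ + c;
    }

    void detach() {
        if (data_.use_count() > 1) {    // counter > 1: the array is shared
            data_ = std::make_shared<std::vector<double>>(*data_);
            ++deep_copies;
        }
    }

    std::size_t rows_, cols_;
    std::shared_ptr<std::vector<double>> data_;
};
```

After `CowMatrix b = a;` the two objects share one array, and the first `b.set(...)` performs the real copy. If the original goes out of scope before any write, the counter drops back to one and a later `set` on the copy modifies the array in place, so the values are handed over without ever being copied.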
There are downsides to the delayed copying approach. Typically, for delayed copying one uses a structure like the following:
    Matrix
      |
      +------> Array Object
      |          |
      |          +------> Data array
      |          |
      |          +------- Counter
      |
      +------ Dimension information

where the arrows denote a pointer to a data structure. If one wants to access the Data array, one will need to track through two pointers. If one is going to write, one will have to check whether one needs to copy first. This is not important when one is going to access the whole array, say, for an add operation. But if one wants to access just a single element, then it imposes a significant additional overhead on that operation. Any subscript operation would need to check whether an update was required, even a read, since it is hard for the compiler to tell whether a subscript access is a read or a write.
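The cost on subscripting can be seen in a small sketch. It again assumes `std::shared_ptr` as the counter mechanism and a single-threaded program; `CowVector` is a hypothetical class, and the layout only approximates the structure described above (the counter lives in the `shared_ptr` control block rather than beside the data array).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical class; single-threaded sketch only.
class CowVector {
public:
    explicit CowVector(std::size_t n)
        : data_(std::make_shared<std::vector<double>>(n, 0.0)) {}

    // Non-const subscript: hands out a writable reference, so it has to
    // assume a write may follow and run the sharing check on every call.
    double& operator[](std::size_t i) {
        assert(i < data_->size());
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<double>>(*data_);
        return (*data_)[i];             // second pointer hop to reach the element
    }

    // Const subscript: known to be a read, so no check and no copy.
    double operator[](std::size_t i) const {
        assert(i < data_->size());
        return (*data_)[i];
    }

    bool shares_storage_with(const CowVector& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::vector<double>> data_;
};
```

In this sketch, `double x = b[0];` on a non-const `b` that shares its array with another object triggers a full copy even though nothing is written, because overload selection depends on whether `b` is const and not on how the result is used. Reading through a const reference avoids the copy, but it relies on the caller to arrange that.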
Some matrix libraries don't bother to do this. So if you write A = B; and then modify an element of one of A or B, then the same element of the other is also modified. I don't think this is acceptable behaviour.
Delayed copy does not provide the additional functionality of my approach but I suppose it would be possible to have both delayed copy and tagging temporaries.
My approach does not automatically avoid all copying. In particular, you need to use a special technique to return a matrix from a function without copying.