Tuning matvec library. ====================== I. Introduction Modern computers supports a hierarchy of memory. We can consider three level model of memory: registers (fastest), cache memory (medium), RAM memory (slow). Modern processor can make several tens of multiplications for time of getting operands from main memory. So efficient implementation of elemantary vector/matrix operations: dor product, product of a mastrix and a vector, matrix product, matrix inversion is far from elementary. If an operand is not found in cache, this event is called cache miss, data will be requested rom main memory and processor will wait for completeion of this operation. The purpose of optimization is to minimize the number of cache misses. It can be achived by immplementing such a modificaion of alogrithm which performs operations under block of vectors and matrix. At the same time such algoritms will have low performance at low dimensionns due to overheads for groupping operands in blocks. An effricient procedure matches several algorithms which have the optimal perfomance at a range of dimensions. Some operations, f.e. multiplication of rectangular matrixes, is done in highly optimized BLAS library. However, this library does not support operations under symmetric matrixes in upper triangular representation. Matvec library provides a set of highly optimized routines for these operations whcih in turn call BLAS routines. II. How to tune. Matvec supports a set of parameters which describes dimensions for block algorithms. Optiomal value of these parameters depends on an architecture of the specific processor. There is a program matvec_test which execute specific routine under consideration and measures its performance. A user can change values of the matvec parameters, compile the routien under consideration, link matvec_test with new version of matvec linrary and run a test. Parameters are located in file $MK5_ROOT/include/matvec.i . Upon completion tuning these paramters should be copied to $Mk5_ROOT/local/solve.i file since matvec.i is a temporary file. Then Calc/Solve should be re-compiled and re-linked with use of command $MK5_ROOT/support/make_all_mk5 III. Parameters to be tuned. 1) MUL_MM_IT_S ( C_s = A_r * B_r(T) ) where is A_r, B_r are rectangular matrices, C_s is a square symmetric matric un upper triangular representation. DB2__MUL_MM_IT_S -- maximal dimension when result C_s is first held in memory as a rectangular matrix and them isx transform to a sym. matrix. DB3__MUL_MM_IT_S -- Block size. 2) MUL_MM_TI_S ( C_s = A_r(T) * B_ ) where is A_r, B_r are rectangular matrices, C_s is a square symmetric matric un upper triangular representation. DB2__MUL_MM_T_S -- maximal dimension when result C_s is first held in memory as a rectangular matrix and them isx transform to a sym. matrix. DB3__MUL_MM_TI_S -- Block size. ... document will be completed later ... Leonid Petrov 20-MAY-2003 18:12:45