Tiny Vector Matrix library using Expression Templates | Sourceforge Project Page |
Some implementations are using a loop with temporaries even if there is a solution with ET. Than the loops are faster than MT.
element_wise
because they are element wise (and not strictly mathematical) operations.But there is more: some functions do element wise operations per se (e.g. vector addition) and are NOT inside namespace element_wise. Furthermore, all comparison operators perform element wise operations.
Functional
instead of naming them with the fncl_
prefix they currently have. (I did beforehand, and have thought better since). Unfortunately, this technique doesn't work well. I got compiler errors like:
template <class T> Functional::xyt<T>' is not a function conflict with `template <class E> xyz(Xpr<E>)' in call to `xyz'
when trying:
typedef Vector<double, 3> vector3d; vector3d t1(1,2,3); vector3d t2(t1); vector3d r; r = sqrt( t1 * t2 );
ADL (argument dependent lookup), aka Koenig Lookup, is causing the compiler to check for a match in namespace Functional, since the template instantiation is part of Functional (the Xpr stuff), it matches before the global namespace (the Vector stuff) is checked. Writing:
r = ::sqrt( t1 * t2 );
seems to solve the problem at first glance. However, to force the user of the library into this syntax is painful and could probably run cause other problems with other namespaces (I haven't checked this). Therefore, all "Functionals" have the prefix fncl_.
This is very easy to understood if you see a matrix-vector product. Both contain different data in different (unique, non-overlapping) memory regions -- hence, they are alias free. Contrast this with a matrix-matrix multiply which maybe can have an aliasing, e.g. . When source and destination memory regions are the same, the computed results may be wrong. (Probably they will be.) But, is alias free.
Let's see an example in detail:
Matrix<double,3,3> M1; M1 = 1,2,3,4,5,6,7,8,9; cout << "M1 = " << M1 << endl; M1 = trans(M1); cout << "M1 = " << M1 << endl;
M1 = Matrix<d, 3, 3> = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] M1 = Matrix<d, 3, 3> = [ [1, 4, 7], [4, 5, 8], [7, 8, 9] ]
Unfortunately, to avoid the aliasing problem, you must use temporaries as shown here:
matrix_type temp_A(A); A = temp_A * B; cout << "matrix_type temp_A(A);\n" << "A = temp_A * B = " << A << endl;
Starting with tvmet release 1.4.1 there is a new function alias. These function use a proxy to call special member functions of the Matrix/Vector class. These member functions introduce the temporary for you.
typedef tvmet::Matrix<double, 3, 3> matrix_type; matrix_type M; std::generate(M.begin(), M.end(), tvmet::util::Incrementor<matrix_type::value_type>()); std::cout << "M = " << M << std::endl; alias(M) = trans(M); std::cout << "M = " << M << std::endl;
M = [ [1, 4, 7], [2, 5, 8], [3, 6, 9] ]
Unfortunately, the g++ compiler throws in the towel sometimes even on transposing matrices. Because of this, tvmet offers specialized functions which speed up at runtime (about factor 2 ... 3) using meta templates.
using namespace tvmet; Matrix<double, 6, 3> M1(0); // will be transposed to be conform to vector size Vector<double, 6> v1(0); Vector<double, 3> v2(0); M1 = ... v1 = ... v2 = Mtx_prod(M1, v1); // equal to: v2 = trans(M1)*v1;
MMt_prod
MtM_prod
trans_prod
Internally, an expression template may contain other expression templates (meta templates inside as well as) too - the compiler will unroll all of these expression into a single resultant expression (which is a hard job). Sometimes the code generated from this is worse (from a performance point of view) than just using simple temporaries.
You can chain matrix-matrix and matrix-vector operations without writing temporaries by yourself (if this is what you want).
tvmet::Matrix<double,3,2> B; tvmet::Matrix<double,3,3> D; tvmet::Matrix<double,2,2> K; B = -0.05, 0, 0, 0.05, 0.05, -0.05; D = 2000, 1000, 0, 1000, 2000, 0, 0, 0, 500; K = trans(B) * D * B;
// as before K = tvmet::Matrix<double,2,3>(trans(B) * D) * B;
K = prod(tvmet::Matrix<double,2,3>(prod(trans(B), D)), B);
At this moment an intelligent cache and pre-evaluating strategy is missing by tvmet.
some notes ... on temporaries
Author: |