Tiny Vector Matrix library using Expression Templates
The mutable keyword is required. This is used by the CommaInitializer only.
The typename keyword is used extensively here.
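To illustrate why a comma initializer can end up needing mutable, here is a minimal sketch. It is not taken from the tvmet sources: the names Vec3, CommaInit and active_ are hypothetical, and the assumption is that the initializer's copy constructor has to switch off its source, which it can only reach through a const reference.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 3-element vector; NOT tvmet's actual class.
struct Vec3 {
    double data[3];

    class CommaInit {
    public:
        CommaInit(Vec3& v, double first) : vec_(v), index_(0), active_(true) {
            vec_.data[index_++] = first;
        }
        // The copy takes over the "active" role. The source is only
        // reachable as a const reference, hence the mutable flag below.
        CommaInit(const CommaInit& other)
            : vec_(other.vec_), index_(other.index_), active_(true) {
            other.active_ = false;
        }
        // The active initializer zero-fills whatever was left unset.
        ~CommaInit() {
            if (active_)
                for (std::size_t i = index_; i < 3; ++i) vec_.data[i] = 0.0;
        }
        CommaInit& operator,(double next) {
            assert(index_ < 3 && "too many initializers");
            vec_.data[index_++] = next;
            return *this;
        }
    private:
        Vec3& vec_;
        std::size_t index_;
        mutable bool active_;   // modified through a const reference in the copy ctor
    };

    // v = 1.0, 2.0, 3.0;  starts the comma chain with the first value.
    CommaInit operator=(double first) { return CommaInit(*this, first); }
};
```

With this sketch, `v = 1.0, 2.0, 3.0;` fills all three elements, and `v = 5.0;` sets the first element and zero-fills the rest when the temporary is destroyed.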
There are certain difficulties: see ambiguous overload for ... Compiler Error (also, please read about GNU C++ Compiler v2.96 (Rh7.x, MD8.x)). Furthermore, there are problems with functions and operators declared in the namespace element_wise: the compiler doesn't seem to find them, even though it does know about namespace tvmet. It appears to be a problem with nested namespaces and the compiler's ability to perform function/operator lookup, especially during regression tests: matrix /= matrix compiles inside a single file but not in the regression tests, which is contradictory.
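For readers who see the same symptom on a newer compiler, one thing worth ruling out is ordinary C++ name lookup: argument-dependent lookup searches the namespace a type is declared in, but not namespaces nested inside it. The following sketch uses hypothetical names (lib, Mat, divide_in_place) and is not tvmet code; it only shows how an operator in a nested namespace is made visible with a using-directive.

```cpp
#include <cassert>

namespace lib {                      // hypothetical stand-in for a library namespace
    struct Mat { double v; };

    namespace element_wise {         // operators live one level deeper
        inline Mat& operator/=(Mat& lhs, const Mat& rhs) {
            lhs.v /= rhs.v;
            return lhs;
        }
    }
}

inline void divide_in_place(lib::Mat& a, const lib::Mat& b) {
    // ADL looks in lib (where Mat is declared) but not in lib::element_wise,
    // so the operator has to be brought into scope explicitly.
    using namespace lib::element_wise;
    a /= b;
}
```

Whether this was the cause of the regression test failures described above is not known; the old compilers may well have had genuine lookup bugs.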
Porting to gcc v2.95.3 requires a lot of knowledge and effort--unfortunately, I don't have enough of either. The examples do compile and the regression tests build partially.
Matrix and vector operators are working, but don't expect too much.
Blitz++ uses a hasFastAccess() flag to choose between _bz_meta_vecAssign::fastAssign (without bounds checking) and _bz_meta_vecAssign::assign (with bounds checking). This isn't really necessary for operations on blitz::TinyVector, since the flag is always true. Nevertheless, it is important for the asm code produced by gcc-c++-2.96-0.48mdk. Generally, the code generated for Blitz++ by gcc-2.96 is better than that for tvmet because of this (tested!).
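The idea of such a flag can be sketched as follows. This is not the Blitz++ implementation: SmallVec, fastRead, read and assign are hypothetical names, and only the hasFastAccess() name is borrowed from the text above. Because the flag is a constant for fixed-size storage, an optimizing compiler can drop the checked branch entirely.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size vector; NOT blitz::TinyVector.
template <std::size_t N>
struct SmallVec {
    double data[N];

    // Always true for fixed-size storage, so the check below folds away.
    bool hasFastAccess() const { return true; }

    double fastRead(std::size_t i) const { return data[i]; }   // no bounds check
    double read(std::size_t i) const {                         // with bounds check
        assert(i < N);
        return data[i];
    }
};

template <std::size_t N>
inline void assign(SmallVec<N>& dst, const SmallVec<N>& src) {
    if (src.hasFastAccess()) {
        for (std::size_t i = 0; i < N; ++i) dst.data[i] = src.fastRead(i);
    } else {
        for (std::size_t i = 0; i < N; ++i) dst.data[i] = src.read(i);
    }
}
```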
I got into trouble with stl_relops.h, where miscellaneous operators are defined. A simple define of __SGI_STL_INTERNAL_RELOPS in the config header doesn't solve the problem; only commenting out the header does, see ambiguous overload for ... Compiler Error. Because of this problem, the regression tests don't compile with this version. Projects which do not use the relational operators are not affected.
It seems that the inlining performed by this compiler collection isn't very smart. I got a lot of warnings: can't inline call to ... So, it would be best to use the GNU C++ Compiler v3.0.x and later compilers.
Due to the nature of ET and MT there is a need for a high level of inlining. The v3.0.x compilers seem to do this well compared to the v2.9x compilers, which produce inline warnings.
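To see where the inlining pressure comes from, consider this minimal expression template sketch. The namespace sketch and the types Vec3 and AddExpr are hypothetical and much simpler than tvmet's real classes. Every element of a + b + c is computed through a chain of small operator[] calls, and the result is only efficient if the compiler inlines all of them.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {   // hypothetical namespace; NOT tvmet's classes

// Node representing "lhs + rhs" without computing anything yet.
template <class L, class R>
struct AddExpr {
    const L& lhs;
    const R& rhs;
    AddExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
    // Each nesting level adds one more call that has to be inlined.
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

struct Vec3 {
    double data[3];
    double operator[](std::size_t i) const { return data[i]; }

    // Assigning an expression evaluates it element by element, in one loop.
    template <class E>
    Vec3& operator=(const E& expr) {
        for (std::size_t i = 0; i < 3; ++i) data[i] = expr[i];
        return *this;
    }
};

// Builds the expression tree; found through argument-dependent lookup.
template <class L, class R>
inline AddExpr<L, R> operator+(const L& l, const R& r) {
    return AddExpr<L, R>(l, r);
}

} // namespace sketch
```

Here `d = a + b + c;` builds an AddExpr<AddExpr<Vec3, Vec3>, Vec3> and evaluates it inside the assignment loop, so no temporary vectors are created.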
This compiler works great with the STLPort-4.5.3 (http://www.stlport.org) implementation of the STL/C++ Library, Tiny Vector and Matrix template library and cpp-unit.
The primary goal is conformance to the standard ISO/IEC 14882:1998.
There are some problems with the GNU C++ compiler collection on the regression tests due to some bugs (IMO).
Anyway, here is the code from examples/ray.cc on gcc 3.3.3 using -O2 -DTVMET_OPTIMIZE:
    movl 16(%ebp), %edx
    movl 12(%ebp), %ebx
    movl 8(%ebp), %esi
    fldl 8(%edx)
    fldl 16(%edx)
    fmull 16(%ebx)
    fxch %st(1)
    movl %ebx, -24(%ebp)
    fmull 8(%ebx)
    movl %edx, -32(%ebp)
    fldl (%edx)
    fmull (%ebx)
    fxch %st(1)
    movl %edx, -60(%ebp)
    movl %edx, -12(%ebp)
    faddp %st, %st(2)
    faddp %st, %st(1)
    fadd %st(0), %st
    fstpl -56(%ebp)
    movl -56(%ebp), %ecx
    movl -52(%ebp), %eax
    movl %ecx, -20(%ebp)
    movl %eax, -16(%ebp)
    movl %ecx, -40(%ebp)
    movl %eax, -36(%ebp)
    movl %ecx, -68(%ebp)
    movl %eax, -64(%ebp)
    fldl (%edx)
    fmull -20(%ebp)
    fsubrl (%ebx)
    fstpl (%esi)
    fldl 8(%edx)
    fmull -20(%ebp)
    fsubrl 8(%ebx)
    fstpl 8(%esi)
    fldl 16(%edx)
    fmull -20(%ebp)
    fsubrl 16(%ebx)
    fstpl 16(%esi)
    addl $64, %esp
    popl %ebx
    popl %esi
    popl %ebp
    ret
There is no assembler output for our examples/ray.cc, since I don't have this compiler yet (yes, I need to update my Linux system ;-).
The problem is related to anonymous enum types, which this release doesn't like.
If you have used it successfully, including regression and/or benchmark tests, please let me know.
Maybe there is a solution with other standard library implementations like STLPort (on a quick try, STLPort doesn't recognize pgCC). If you know more about this, please let me know.
Anyway, the code produced is very poor even if I use high inlining levels like the command line option -Minline=levels:100 which increases the compile time dramatically! The benchmark tests have not been done. Unfortunately, my trial period has expired. I haven't any idea if this compiler will pass the regression tests.
The code produced isn't very compact compared with the Intel or GNU compilers. Anyway, it works, but the compile time increases dramatically even at higher inline levels.
The produced code looks good, but I haven't done a benchmark to compare it with gcc-3.0.x, since the compile time for the benchmark test increases dramatically.
I have not run any regression tests due to the compile time needed by my AMD K6/400 Linux box ...
The code produced for examples/ray.cc is more compact than that of the GNU C++ Compiler v3.3.
Anyway, here is the code from examples/ray.cc using -O2 -DTVMET_OPTIMIZE:
    movl 4(%esp), %ecx
    movl 8(%esp), %edx
    movl 12(%esp), %eax
    fldl (%edx)
    fmull (%eax)
    fldl 8(%edx)
    fmull 8(%eax)
    fldl 16(%edx)
    fmull 16(%eax)
    faddp %st, %st(1)
    faddp %st, %st(1)
    fldl (%eax)
    fxch %st(1)
    fadd %st(0), %st
    fmul %st, %st(1)
    fxch %st(1)
    fsubrl (%edx)
    fstpl (%ecx)
    fldl 8(%eax)
    fmul %st(1), %st
    fsubrl 8(%edx)
    fstpl 8(%ecx)
    fldl 16(%eax)
    fmulp %st, %st(1)
    fsubrl 16(%edx)
    fstpl 16(%ecx)
    ret
The Microsoft Visual C++ Toolkit 2003 and Visual C++ versions prior to 7.1 do not compile the library; unfortunately, you will get an undefined internal error.
Anyway, here is the code from examples/ray.cc:
    push ebp
    mov ebp, esp
    and esp, -8 ; fffffff8H
    sub esp, 28 ; 0000001cH
    mov eax, DWORD PTR _ray$[ebp]
    mov ecx, DWORD PTR _surfaceNormal$[ebp]
    fld QWORD PTR [eax+16]
    fmul QWORD PTR [ecx+16]
    push ebx
    fld QWORD PTR [eax+8]
    push esi
    fmul QWORD PTR [ecx+8]
    push edi
    mov edi, DWORD PTR $T35206[esp+52]
    faddp ST(1), ST(0)
    mov DWORD PTR $T35027[esp+60], edi
    fld QWORD PTR [eax]
    pop edi
    fmul QWORD PTR [ecx]
    faddp ST(1), ST(0)
    fadd ST(0), ST(0)
    fstp QWORD PTR $T35206[esp+36]
    mov esi, DWORD PTR $T35206[esp+40]
    mov edx, DWORD PTR $T35206[esp+36]
    mov ebx, DWORD PTR $T35265[esp+40]
    mov DWORD PTR $T35027[esp+44], edx
    mov edx, DWORD PTR _reflection$[ebp]
    mov DWORD PTR $T35027[esp+48], esi
    fld QWORD PTR $T35027[esp+44]
    fmul QWORD PTR [ecx]
    pop esi
    mov DWORD PTR $T35027[esp+36], ebx
    pop ebx
    fsubr QWORD PTR [eax]
    fstp QWORD PTR [edx]
    fld QWORD PTR $T35027[esp+36]
    fmul QWORD PTR [ecx+8]
    fsubr QWORD PTR [eax+8]
    fstp QWORD PTR [edx+8]
    fld QWORD PTR $T35027[esp+36]
    fmul QWORD PTR [ecx+16]
    fsubr QWORD PTR [eax+16]
    fstp QWORD PTR [edx+16]
    mov esp, ebp
    pop ebp
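For orientation, the listing above can be read back into plain C++. The following is a reconstruction from the assembly only, not the actual source of examples/ray.cc: the parameter names come from the symbols _ray$, _surfaceNormal$ and _reflection$ in the listing, while the function name reflect and the use of raw arrays are assumptions.

```cpp
#include <cassert>

// Reconstructed from the assembly; NOT the original examples/ray.cc source.
// Computes: reflection = ray - 2 * dot(ray, surfaceNormal) * surfaceNormal
inline void reflect(double (&reflection)[3],
                    const double (&ray)[3],
                    const double (&surfaceNormal)[3]) {
    // Three multiplies and two adds, then doubled (the fadd ST(0), ST(0) step).
    const double twoDot = 2.0 * (ray[0] * surfaceNormal[0] +
                                 ray[1] * surfaceNormal[1] +
                                 ray[2] * surfaceNormal[2]);

    // One multiply, one reversed subtract and one store per component.
    for (int i = 0; i < 3; ++i)
        reflection[i] = ray[i] - twoDot * surfaceNormal[i];
}
```

A well-optimized build of the expression template version should come close to this hand-written form: nine floating-point loads or multiplies for the arithmetic and three stores, with no temporaries written to the stack.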