Tiny Vector Matrix library using Expression Templates

Compiler Support

General compiler Requirements

This library is designed for portability - no compiler-specific extensions are used. Nevertheless, there are a few requirements, all of which are part of the C++ standard:

Support for the mutable keyword is required. This is used by the CommaInitializer only.
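To illustrate why a comma initializer needs mutable, here is a minimal sketch. It is not tvmet's CommaInitializer: the names Vec3 and CommaList are hypothetical, and the behavior (a single value is broadcast to all elements, a comma list assigns element by element) is an assumption made for this example. The copy constructor receives its source as a const reference but still has to cancel the source's pending fill, which only works because the flag is declared mutable.

```cpp
#include <cstddef>

struct Vec3;  // hypothetical 3-element vector, not tvmet's Vector

// Hypothetical helper returned by Vec3::operator=(double).
class CommaList {
public:
    CommaList(Vec3& v, double first);

    // The source arrives as a const reference, yet its pending fill must be
    // cancelled so that only one object fills the vector on destruction.
    CommaList(const CommaList& other)
        : vec_(other.vec_), first_(other.first_),
          count_(other.count_), fill_(other.fill_) {
        other.fill_ = false;  // legal only because fill_ is mutable
    }

    ~CommaList();                        // broadcasts if no list followed
    CommaList& operator,(double value);  // stores the next element

private:
    Vec3&        vec_;
    double       first_;
    std::size_t  count_;
    mutable bool fill_;
};

struct Vec3 {
    double data[3];
    CommaList operator=(double first) { return CommaList(*this, first); }
};

inline CommaList::CommaList(Vec3& v, double first)
    : vec_(v), first_(first), count_(1), fill_(true) {
    vec_.data[0] = first;
}

inline CommaList& CommaList::operator,(double value) {
    fill_ = false;  // an explicit list was given, so nothing is broadcast
    if (count_ < 3) vec_.data[count_++] = value;
    return *this;
}

inline CommaList::~CommaList() {
    if (fill_)
        for (std::size_t i = 1; i < 3; ++i) vec_.data[i] = first_;
}
```

With these definitions, v = 1.0, 2.0, 3.0; assigns the three elements in order, while w = 5.0; sets every element to 5.0.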

The typename keyword is used extensively here.

The namespace concept is required. The tvmet library is itself a namespace. To avoid collisions of operators, there is also an element_wise namespace within tvmet.
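The following sketch shows how a nested namespace can keep two operators with identical signatures apart. The namespace demo stands in for tvmet, and Mat2 is a hypothetical 2x2 matrix type; which operators tvmet actually places in element_wise is not shown here.

```cpp
namespace demo {  // stands in for tvmet in this sketch

struct Mat2 { double m[4]; };  // hypothetical 2x2 matrix, row-major

// Algebraic matrix product.
inline Mat2 operator*(const Mat2& a, const Mat2& b) {
    Mat2 r = {{ a.m[0]*b.m[0] + a.m[1]*b.m[2],  a.m[0]*b.m[1] + a.m[1]*b.m[3],
                a.m[2]*b.m[0] + a.m[3]*b.m[2],  a.m[2]*b.m[1] + a.m[3]*b.m[3] }};
    return r;
}

namespace element_wise {

// Same operator and signature, but applied element by element. Living in a
// nested namespace keeps it from colliding with demo::operator* above.
inline Mat2 operator*(const Mat2& a, const Mat2& b) {
    Mat2 r = {{ a.m[0]*b.m[0], a.m[1]*b.m[1], a.m[2]*b.m[2], a.m[3]*b.m[3] }};
    return r;
}

}  // namespace element_wise
}  // namespace demo
```

In this sketch a * b selects the matrix product through argument-dependent lookup, whereas the element-wise version has to be named explicitly, for example demo::element_wise::operator*(a, b).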

Partial specialization is needed for the extrema functions min and max to distinguish between vectors and matrices. This allows tvmet to return an object with a specific behavior. (The location of an extremum in a matrix has a (row, column) position whereas a vector extremum has only a single index for its position).
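A minimal sketch of this idea follows. The types Vector, Matrix and Extremum and the function maximum are hypothetical stand-ins, not tvmet's own declarations. The result type is partially specialized so that a vector extremum carries one index and a matrix extremum carries a row and a column.

```cpp
#include <cstddef>

// Hypothetical stand-ins for a fixed-size vector and matrix.
template<class T, std::size_t N>
struct Vector { T data[N]; };

template<class T, std::size_t R, std::size_t C>
struct Matrix { T data[R * C]; };  // row-major

// Primary template: declared only, so unsupported types fail to compile.
template<class Container> struct Extremum;

// Partial specialization for vectors: value plus a single index.
template<class T, std::size_t N>
struct Extremum< Vector<T, N> > {
    T value;
    std::size_t index;
};

// Partial specialization for matrices: value plus (row, column).
template<class T, std::size_t R, std::size_t C>
struct Extremum< Matrix<T, R, C> > {
    T value;
    std::size_t row, col;
};

template<class T, std::size_t N>
Extremum< Vector<T, N> > maximum(const Vector<T, N>& v) {
    Extremum< Vector<T, N> > best = { v.data[0], 0 };
    for (std::size_t i = 1; i < N; ++i)
        if (v.data[i] > best.value) { best.value = v.data[i]; best.index = i; }
    return best;
}

template<class T, std::size_t R, std::size_t C>
Extremum< Matrix<T, R, C> > maximum(const Matrix<T, R, C>& m) {
    Extremum< Matrix<T, R, C> > best = { m.data[0], 0, 0 };
    for (std::size_t i = 1; i < R * C; ++i)
        if (m.data[i] > best.value) {
            best.value = m.data[i];
            best.row = i / C;  // recover the 2D position from the flat index
            best.col = i % C;
        }
    return best;
}
```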

The GNU Compiler Collection

The GNU compiler collection is the main compiler used for developing this library. It also compiles the library the fastest.

GNU C++ Compiler v2.95.3

Gcc v2.95.3 is the last official release of the version 2 series from gnu.org. Since this compiler meets the General compiler Requirements it does work, but only partially.

There are certain difficulties - see ambiguous overload for ... Compiler Error (also, please read about GNU C++ Compiler v2.96 (Rh7.x, MD8.x)). Furthermore, there are problems with functions and operators declared in the namespace element_wise: the compiler doesn't seem to find them, even though it does know about namespace tvmet. It appears to be a problem with nested namespaces and the compiler's function/operator lookup. The behavior is inconsistent: matrix /= matrix compiles inside a single file but not in the regression tests.

Porting to gcc v2.95.3 requires a lot of knowledge and effort--unfortunately, I don't have enough of either. The examples do compile and the regression tests build partially.

Matrix and vector operators are working, but don't expect too much.

GNU C++ Compiler v2.96 (Rh7.x, MD8.x)

This compiler isn't an official release of the GNU compiler group; it is shipped by Red Hat and other distributors.

Blitz++ uses a hasFastAccess() flag to choose between _bz_meta_vecAssign::fastAssign (without bounds checking) and _bz_meta_vecAssign::assign (with bounds checking). This isn't really necessary for operations on blitz::TinyVector, since the flag is always true there. Nevertheless, it matters for the asm code produced by gcc-c++-2.96-0.48mdk. Generally the code gcc-2.96 generates for Blitz++ is better than the code it generates for tvmet because of this (tested!).

I got into trouble with stl_relops.h, where miscellaneous operators are defined. Simply defining __SGI_STL_INTERNAL_RELOPS in the config header doesn't solve the problem; only commenting out the header's contents does, see ambiguous overload for ... Compiler Error. Because of this problem, the regression tests don't compile with this version. Projects which do not use the relational operators are not affected.

It seems that the inlining performed by this compiler collection isn't very smart. I got a lot of warnings: can't inline call to ... So, it would be best to use the GNU C++ Compiler v3.0.x and later compilers.

GNU C++ Compiler v3.0.x

These compilers produce better code than the GNU C++ Compiler v2.96 (Rh7.x, MD8.x)! Even the problems with blitz++ fastAssign have vanished. This compiler also conforms to the standard. The regression tests compile and run successfully.

Due to the nature of expression templates (ET) and meta templates (MT), a high level of inlining is needed. The v3.0.x compilers seem to do this well compared with the v2.9x compilers, which produce inline warnings.

This compiler works great with the STLPort-4.5.3 (http://www.stlport.org) implementation of the STL/C++ Library, the Tiny Vector and Matrix template library and cpp-unit.

GNU C++ Compiler v3.1

tvmet does compile with this new GNU C++ compiler. The produced code looks as good as the code created by GNU C++ Compiler v3.0.x. (Does anyone have time to make a benchmark?)

The primary goal is conformance to the standard ISO/IEC 14882:1998.

GNU C++ Compiler v3.2.x

The Application Binary Interface (ABI) has changed once again, but this doesn't affect tvmet since it isn't a binary library; it consists only of templates compiled inside the client code.

There are some problems with the GNU C++ compiler collection on the regression tests due to some bugs (IMO).

See also: Failed regression tests.

GNU C++ Compiler v3.3

Tested and works fine. There are only some warnings about failed inlining, which don't concern tvmet directly.

Anyway, here is the code generated from examples/ray.cc by gcc 3.3.3 using -O2 -DTVMET_OPTIMIZE:

Assembler (IA-32 Intel® Architecture):

  movl  16(%ebp), %edx
  movl  12(%ebp), %ebx
  movl  8(%ebp), %esi
  fldl  8(%edx)
  fldl  16(%edx)
  fmull 16(%ebx)
  fxch  %st(1)
  movl  %ebx, -24(%ebp)
  fmull 8(%ebx)
  movl  %edx, -32(%ebp)
  fldl  (%edx)
  fmull (%ebx)
  fxch  %st(1)
  movl  %edx, -60(%ebp)
  movl  %edx, -12(%ebp)
  faddp %st, %st(2)
  faddp %st, %st(1)
  fadd  %st(0), %st
  fstpl -56(%ebp)
  movl  -56(%ebp), %ecx
  movl  -52(%ebp), %eax
  movl  %ecx, -20(%ebp)
  movl  %eax, -16(%ebp)
  movl  %ecx, -40(%ebp)
  movl  %eax, -36(%ebp)
  movl  %ecx, -68(%ebp)
  movl  %eax, -64(%ebp)
  fldl  (%edx)
  fmull -20(%ebp)
  fsubrl  (%ebx)
  fstpl (%esi)
  fldl  8(%edx)
  fmull -20(%ebp)
  fsubrl  8(%ebx)
  fstpl 8(%esi)
  fldl  16(%edx)
  fmull -20(%ebp)
  fsubrl  16(%ebx)
  fstpl 16(%esi)
  addl  $64, %esp
  popl  %ebx
  popl  %esi
  popl  %ebp
  ret
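
For orientation, the listing above computes a three-element dot product, doubles it, and then stores first[i] - factor * second[i] for each of the three elements. That pattern matches the usual reflection formula, reflection = ray - 2 * dot(ray, normal) * normal. The following plain C++ sketch spells the arithmetic out without tvmet. It is an assumption about what the example does, not a copy of examples/ray.cc, and Vector3d, dot and reflect are hypothetical names.

```cpp
// Hypothetical plain 3-vector; tvmet's own vector type is not used here.
struct Vector3d { double x[3]; };

inline double dot(const Vector3d& a, const Vector3d& b) {
    return a.x[0] * b.x[0] + a.x[1] * b.x[1] + a.x[2] * b.x[2];
}

// reflection = ray - 2 * dot(ray, normal) * normal
// The surface normal is assumed to have unit length.
inline void reflect(Vector3d& reflection,
                    const Vector3d& ray,
                    const Vector3d& surfaceNormal) {
    const double k = 2.0 * dot(ray, surfaceNormal);
    for (int i = 0; i < 3; ++i)
        reflection.x[i] = ray.x[i] - k * surfaceNormal.x[i];
}
```

A well-inlined expression template version should boil down to roughly these three multiply-subtract steps, which is what the compact listings on this page show.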

GNU C++ Compiler v3.4.x

Version 3.4.3 works fine, starting with tvmet release 1.7.1. The problem before that release was the correct syntax for the CommaInitializer template declaration and implementation.

There is no assembler output for our examples/ray.cc, since I don't have this compiler yet (yes, I need to update my linux system ;-)

GNU C++ Compiler v4.0.x

Forget it. I spent weeks trying to find a solution for this compiler release that also works on older releases and doesn't require big changes.

The problem is related to anonymous enum types, which this release doesn't like.
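The exact construct that GCC 4.0.x rejects is not described here, so the following sketch only illustrates what an anonymous enum in template code looks like. TinyVec and SameSize are hypothetical names, and the assumption is that such enums serve as compile-time constants.

```cpp
#include <cstddef>

// Hypothetical fixed-size vector. The anonymous enum exposes the size as a
// compile-time constant that needs no storage.
template<class T, std::size_t Sz>
struct TinyVec {
    enum { Size = Sz };
    T data[Size];
};

// Enumerators can feed further compile-time computation. The casts to int
// avoid comparing two distinct unnamed enum types directly.
template<class A, class B>
struct SameSize {
    enum { value = (int(A::Size) == int(B::Size)) };
};
```

Each enum { ... } declaration here introduces a distinct unnamed type, once per template instantiation.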

GNU C++ Compiler v4.1.x

Works fine - tested with 4.1.2 to be more precise.

Kai C++

This has not been tested. Unfortunately Kai's compiler is no longer shipped; one should use the Intel compiler instead (see the Intel Compiler section).

If you have used it successfully, including regression and/or benchmark tests, please let me know.

Portland Group Compiler Technology

Portland Group C++ 3.2

The Portland Group C++ compiler is shipped with the RogueWave Standard C++ Library, which provides conformance to the standard. Unfortunately, the <cname> C library wrapper headers and the C++ overloads of the math functions are not provided on all platforms, see <http://www.cug.com/roundup>. The downloadable evaluation version 3.2-4 for Linux is affected, for example. At first glance, tvmet does compile with pgCC since it has the great EDG front-end.

Maybe there is a solution with other standard library implementations like STLPort (on a quick try, STLPort doesn't recognize pgCC). If you know more about this, please let me know.

Anyway, the code produced is very poor even with high inlining levels such as the command line option -Minline=levels:100, which increases the compile time dramatically! The benchmark tests have not been done. Unfortunately, my trial period has expired. I have no idea whether this compiler will pass the regression tests.

Portland Group C++ 5.1

The Portland Group C++ compiler is shipped with the STLport Standard C++ Library, cool!

The code produced isn't very compact compared with that of the Intel or GNU compilers. Anyway it works, but the compile time increases dramatically, even more so at higher inline levels.

Intel Compiler

Intel Compiler v5.0.1

This compiler complains even more than gcc-3.0.x regarding template specifiers (e.g. correct spaces for template arguments to std::complex are needed even when the template is not instantiated).

The produced code looks good, but I haven't done a benchmark to compare it with gcc-3.0.x since the compile time for the benchmark test increases dramatically.

I have not run any regression tests due to the compile time needed by my AMD K6/400 Linux box ...

Intel Compiler v6.0.x

Should work, but I haven't tested it.

Intel Compiler v7.x

This compiler is well supported by tvmet and passes the regression tests without any failure - as opposed to the GNU C++ compiler collection.

Intel Compiler v8.x

No regression tests have been done - reports are welcome. I'm not expecting problems. Anyway, this version uses pure macros for the IEEE math functions isnan and isinf. This prevents overriding them with tvmet's functions. Therefore these functions are disabled after tvmet release 1.4.1. Even for examples/ray.cc, the code produced is more compact than that of the GNU C++ Compiler v3.3.

Anyway, here is the code generated from examples/ray.cc using -O2 -DTVMET_OPTIMIZE:

Assembler (IA-32 Intel® Architecture):

        movl      4(%esp), %ecx
        movl      8(%esp), %edx
        movl      12(%esp), %eax
        fldl      (%edx)
        fmull     (%eax)
        fldl      8(%edx)
        fmull     8(%eax)
        fldl      16(%edx)
        fmull     16(%eax)
        faddp     %st, %st(1)
        faddp     %st, %st(1)
        fldl      (%eax)
        fxch      %st(1)
        fadd      %st(0), %st
        fmul      %st, %st(1)
        fxch      %st(1)
        fsubrl    (%edx)
        fstpl     (%ecx)
        fldl      8(%eax)
        fmul      %st(1), %st
        fsubrl    8(%edx)
        fstpl     8(%ecx)
        fldl      16(%eax)
        fmulp     %st, %st(1)
        fsubrl    16(%edx)
        fstpl     16(%ecx)
        ret

Microsoft Visual C++ v7.1

Success with tvmet using Visual C++ v7.1 has been reported. In this release of tvmet there are still some warnings left - work is in progress.

The Microsoft Visual C++ Toolkit 2003 and Visual C++ versions prior to 7.1 do not compile tvmet - unfortunately, you will get an undefined internal error.

Anyway, here is the code generated from examples/ray.cc:

Assembler (IA-32 Intel® Architecture, no SSE2):

  push  ebp
  mov ebp, esp
  and esp, -8         ; fffffff8H
  sub esp, 28         ; 0000001cH
  mov eax, DWORD PTR _ray$[ebp]
  mov ecx, DWORD PTR _surfaceNormal$[ebp]
  fld QWORD PTR [eax+16]
  fmul  QWORD PTR [ecx+16]
  push  ebx
  fld QWORD PTR [eax+8]
  push  esi
  fmul  QWORD PTR [ecx+8]
  push  edi
  mov edi, DWORD PTR $T35206[esp+52]
  faddp ST(1), ST(0)
  mov DWORD PTR $T35027[esp+60], edi
  fld QWORD PTR [eax]
  pop edi
  fmul  QWORD PTR [ecx]
  faddp ST(1), ST(0)
  fadd  ST(0), ST(0)
  fstp  QWORD PTR $T35206[esp+36]
  mov esi, DWORD PTR $T35206[esp+40]
  mov edx, DWORD PTR $T35206[esp+36]
  mov ebx, DWORD PTR $T35265[esp+40]
  mov DWORD PTR $T35027[esp+44], edx
  mov edx, DWORD PTR _reflection$[ebp]
  mov DWORD PTR $T35027[esp+48], esi
  fld QWORD PTR $T35027[esp+44]
  fmul  QWORD PTR [ecx]
  pop esi
  mov DWORD PTR $T35027[esp+36], ebx
  pop ebx
  fsubr QWORD PTR [eax]
  fstp  QWORD PTR [edx]
  fld QWORD PTR $T35027[esp+36]
  fmul  QWORD PTR [ecx+8]
  fsubr QWORD PTR [eax+8]
  fstp  QWORD PTR [edx+8]
  fld QWORD PTR $T35027[esp+36]
  fmul  QWORD PTR [ecx+16]
  fsubr QWORD PTR [eax+16]
  fstp  QWORD PTR [edx+16]
  mov esp, ebp
  pop ebp

See also: Failed regression tests, Build and Installation of tvmet on MS Windows
