Hey all,
I'm currently trying to compile the following Intel intrinsics code for native execution on the MIC:
for ( std::size_t k = 0; k < n; k += 16 ) {
    x_     = _mm512_load_ps( x + k );
    u_     = _mm512_load_ps( u + k );
    xhd_   = _mm512_movehdup_ps( x_ );
    upm_   = _mm512_permute_ps( u_, 177 );
    hdpm_  = _mm512_mul_ps( xhd_, upm_ );
    xld_   = _mm512_moveldup_ps( x_ );
    cmplx_ = _mm512_fmaddsub_ps( xld_, u_, hdpm_ );
    _mm512_store_ps( x + k, cmplx_ );
}
Building it fails with the following undefined-reference (link) errors:
mic_test.cpp:(.text+0x1097): undefined reference to `_mm512_movehdup_ps'
mic_test.cpp:(.text+0x10b1): undefined reference to `_mm512_permute_ps'
mic_test.cpp:(.text+0x10ce): undefined reference to `_mm512_moveldup_ps'
mic_test.cpp:(.text+0x10e8): undefined reference to `_mm512_fmaddsub_round_ps'
I have included immintrin.h, the memory is 64-byte aligned, and I have checked
http://software.intel.com/en-us/comment/1726413#comment-1726413
But the problem is that my program runs in native mode; I don't use pragmas to offload the code to the MIC.
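For reference, the first loop above appears to compute an in-place complex multiplication x = x * u, assuming x and u hold single-precision complex numbers interleaved as (re, im) pairs. The scalar sketch below spells out that reading. The function name complex_mul_interleaved is hypothetical and not part of the original code, and n is assumed to be the (even) number of floats.

```cpp
#include <cstddef>

// Scalar reference sketch (hypothetical helper, not from the original code).
// Assumption: x and u are interleaved as re, im, re, im, ... and n counts floats.
void complex_mul_interleaved(float* x, const float* u, std::size_t n) {
    for (std::size_t k = 0; k + 1 < n; k += 2) {
        const float xr = x[k], xi = x[k + 1];  // moveldup / movehdup broadcast these
        const float ur = u[k], ui = u[k + 1];  // permute with 177 swaps ur and ui
        x[k]     = xr * ur - xi * ui;  // even lane of fmaddsub: a*b - c
        x[k + 1] = xr * ui + xi * ur;  // odd lane of fmaddsub:  a*b + c
    }
}
```

A routine like this could serve as a check on the vectorized result once the build problem is sorted out.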
The strange thing is that similar code:
for ( std::size_t k = 0; k < n; k += 16 ) {
    x_    = _mm512_load_ps( x + k );
    z_    = _mm512_load_ps( z + k );
    u_    = _mm512_load_ps( u + k );
    v_    = _mm512_load_ps( v + k );
    zv_   = _mm512_mul_ps( z_, v_ );
    real_ = _mm512_fmsub_ps( x_, u_, zv_ );
    zu_   = _mm512_mul_ps( z_, u_ );
    imag_ = _mm512_fmadd_ps( x_, v_, zu_ );
    _mm512_store_ps( x + k, real_ );
    _mm512_store_ps( z + k, imag_ );
}
compiles and runs without problems on the MIC.
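For comparison, the second loop looks like the same complex multiplication with the real and imaginary parts kept in separate arrays (x and u as real parts, z and v as imaginary parts; that layout is an assumption read off the variable names real_ and imag_). A scalar sketch of that reading, with the hypothetical name complex_mul_split:

```cpp
#include <cstddef>

// Scalar reference sketch (hypothetical helper, not from the original code).
// Assumption: (x, z) and (u, v) are split real/imaginary arrays of length n.
void complex_mul_split(float* x, float* z,
                       const float* u, const float* v, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k) {
        const float re = x[k] * u[k] - z[k] * v[k];  // fmsub( x, u, z*v )
        const float im = x[k] * v[k] + z[k] * u[k];  // fmadd( x, v, z*u )
        x[k] = re;  // both results are computed before either input is overwritten
        z[k] = im;
    }
}
```

This version needs only multiply and fused multiply-add/subtract, with none of the movehdup, moveldup, permute, or fmaddsub calls that show up in the error messages.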
Thanks,
Patrick