Problem with _mm256_and_ps instruction
Hi, I was trying to do a very simple exercise using vector instructions, but I am getting wrong results. In the following program I am trying to do a bitwise AND operation using _mm256_and_ps...
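The excerpt is cut off before the program, so the actual cause is unknown. One thing that often surprises people is that _mm256_and_ps ANDs the raw 32-bit patterns of the floats, not their numeric values. Below is a minimal scalar sketch of that behavior that can serve as a reference to compare vector results against. The helper names `and_ps_reference` and `float_from_bits` are hypothetical and not part of any Intel API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar reference model (not an Intel API):
// out[i] receives the bitwise AND of the 32-bit patterns of a[i] and b[i].
void and_ps_reference(const float* a, const float* b, float* out, std::size_t n) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    assert(a != nullptr && b != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        std::uint32_t ua, ub;
        std::memcpy(&ua, &a[i], sizeof ua);   // copy the bits, no numeric conversion
        std::memcpy(&ub, &b[i], sizeof ub);
        const std::uint32_t r = ua & ub;
        std::memcpy(&out[i], &r, sizeof r);
    }
}

// Hypothetical helper: build a float whose bit pattern equals `bits`.
// This differs from static_cast<float>(bits), which converts the value.
float float_from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

With this model, ANDing 5.0f with 3.0f gives 2.0f (0x40A00000 & 0x40400000 = 0x40000000), not the integer result 5 & 3 = 1. A mask built with `float_from_bits(0x7FFFFFFF)` clears the sign bit, so it maps -1.5f to 1.5f.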
How to debug mic boot problem?
Since MPSS 3.2, the MICs do not boot reliably; the following error appears on the console: "Initramfs unpacking failed: junk in compressed archive". There are two 7120 cards on the host and boot failure...
How to evaluate flops of MIC card?
Hi everybody, I need to count the flops of a code that runs on the MIC card in native mode, but I don't know the correct way to evaluate the flops. I found a document on the internet, and the...
opensm does not work on OFED-3.5-2-MIC: base lid 65536
The problem that I am having is on the Mellanox InfiniBand HCA side; however, it is specific to the OFED-3.5-2-MIC-beta1 branch of OFED, so I hope this is the right forum. I am trying to configure...
Reg: How to increase Data transfer speed between host and MIC
Hi, I was trying to offload some portion of my code and I set OFFLOAD_REPORT=3. The report is like this. [Offload] [HOST] [Tag 172] [CPU Time] 0.120382(seconds) [Offload] [MIC 0] [Tag 172]...
Problem with Scatter/Gather operations
Hi, I am vectorizing some code using the MIC intrinsics, but I am getting a segmentation fault on the MIC (offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)). Someone please tell...
Calculating prefetches that missed L2
Hi, I am currently doing some performance tests on some offload code for Xeon Phi. I have been calculating performance numbers by measuring hardware counters using PAPI, with the calculation methods...
Which optimizations are applied in O1?
On a Xeon Phi coprocessor, I have obtained a 5x speedup when the following code is compiled with O1 instead of O0, for a single thread. int i; for( i = 0; i < nrows; i++){ int j; double y0 = 0.0; int...
Memory allocated on MIC creates a copy on main memory?
Hi, I found out that main memory also stores a copy of data allocated on the MIC card when handling big data on the card. For example, given the code below, the program uses 1G RAM when pausing at "Press key to...
New case study: shallow water equation solver
We have just published a new paper with a case study of the Intel MIC architecture. http://research.colfaxinternational.com/post/2014/05/12/Shallow-Water.aspx This is a simplified CFD application solving...
Debugger cache in windows
I'm having trouble debugging my offload executable in Windows, and I suspect there may be a cache someplace that needs to be purged. If I create a new solution/project with the same name as my real...
what is the relation between "hardware thread" and "hyperthread"?
Dear Forum, One of the Intel TBB webpages states that "a typical Xeon Phi coprocessor has 60 cores, and 4 hyperthreads/core". But this blog from Intel emphasizes that "The Xeon Phi co-processor utilizes...
Submissions open: High Performance Parallelism Gems
You are invited to contribute to High Performance Parallelism Gems – Successful Approaches for Multicore and Many-core Programming (working title), a contribution-based book that will focus on practical...
weird error of Xeon Phi cards on running IMB
Hi, I am using Intel MPI Benchmark to evaluate my Xeon Phi cards (5110p). In particular, for the Pingpong test, all of my cards but one work well. This card will fail and automatically reboot when the...
[BUG] ipcp 2015 linux -mmic + -rdynamic = segfault
I am beta testing the new 2015 Intel compiler and associated tools. I am currently undertaking development on the Phi. When the -rdynamic flag is passed to the compiler when -mmic is present, ipcp...
New MPSS released today
An update to the MPSS was released today. Users who are running Intel® MPSS 3.2-1 (released March 17 2014) or 3.2.1-1 (released April 10 2014) are strongly advised to update to this release, as it...
installing mpss | mic module issue
Hi Everyone, Need some help installing mpss on my Intel Xeon Phi machine: mpss-3.2.1 with RedHat-6.5. lspci | grep coprocessor 08:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor SE10/7120 series...
hardware_concurrency() returns 0
Just wanted to report a bug: when calling thread::hardware_concurrency(), I get a 0 (zero) instead of 240 :(
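For context, the C++ standard treats the value of std::thread::hardware_concurrency() as a hint only and allows it to return 0 when the count is not computable or not well defined. Whether that explains the behavior reported here is not stated in the post. A minimal sketch of a defensive fallback follows; `worker_count` and `detect_worker_count` are hypothetical helper names, and the fallback value is whatever the caller chooses (for example the 240 mentioned in the post).

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: turn a possibly-zero hint into a usable thread count.
// `reported` is the value from hardware_concurrency(); `fallback` is the
// caller's own choice and must be positive.
unsigned worker_count(unsigned reported, unsigned fallback) {
    assert(fallback > 0);
    return reported != 0 ? reported : fallback;
}

// Hypothetical wrapper: query the runtime, fall back if it reports 0.
unsigned detect_worker_count(unsigned fallback) {
    return worker_count(std::thread::hardware_concurrency(), fallback);
}
```

Calling `detect_worker_count(240)` yields the reported value when it is nonzero and 240 otherwise, so code that sizes a thread pool never ends up with zero workers.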
Temporary arrays in Sparse Matrix Vector Multiply Format Prototype Package
Considering "The Intel® Math Kernel Library Sparse Matrix Vector Multiply Format Prototype Package", I have two questions: 1) The use of a temporary array for each thread may not pay off when the number...
Debugging in native mode: gdb complains about libthread_db
When I try to debug a program in native mode directly on the card using gdb, I get the following message: warning: Unable to find libthread_db matching inferior's thread library, thread debugging will...