I have an MPI application in which I want a multithreaded section that can make use of hyperthreads.
For example, suppose I have 4 MPI ranks running on 4 different cores of a MIC. I want to use two hyperthreads on each of these 4 cores for my multithreaded section (which basically contains just two omp sections).
I'm exporting the following variables:
export MIC_ENV_PREFIX=PHI
export PHI_KMP_AFFINITY=balanced
export PHI_KMP_PLACE_THREADS=4c,2t
export PHI_OMP_NUM_THREADS=2
export OMP_NUM_THREADS=2
When I start my application, the top utility shows the following:
- CPU Command
- 1 my_app
- 62 my_app
- 123 my_app
- 184 my_app
These are the 4 MPI processes.
After switching to thread view, I get:
- CPU Command
- 1 my_app
- 2 my_app
- 5 my_app
- 62 my_app
- 63 my_app
- 65 my_app
- 123 my_app
- 124 my_app
- 125 my_app
- 184 my_app
- 185 my_app
- 186 my_app
What I would like is for the threads to be spawned on the same core as the parent process. (For example, the process running on CPU 1 would spawn its two threads on CPUs 2 and 3, so that all of them are on the same core.)
Also, is there a way to use the parent process as one of the threads and spawn just one extra thread?