Hi,
On Xeon Phi, for 240 iterations, 240 threads give the best time; for 180 iterations, 180 threads give the best time; and likewise for 120 and 60 iterations. Shouldn't we expect 240 threads to give the best times for 120 and 180 iterations as well? If not, why not? Please provide an explanation or pointers to relevant documentation. (Please note that every iteration takes the same amount of time, and the programming language is Fortran.)
Thanks in advance.