Message: Re: Xeon Phi performance
Please note the following: what Makoto suggested can improve the linearity of the speedup (e.g. when you double the number of threads, you should get double the performance, as long as physical cores are available).
See for example: https://twiki.cern.ch/twiki/bin/view/Geant4/MultiThreadingTaskForce#CPU_and_Memory_Performances
Good linearity is a prerequisite for good performance on Xeon Phi (otherwise you are not using all hardware threads efficiently).
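As a rough illustration of how linearity can be checked, the sketch below computes the speedup and the parallel efficiency from throughput measured at several thread counts. The function name `scaling_report` and the numbers in `measured` are hypothetical placeholders, not results from Geant4 or from any real Xeon Phi run; replace them with your own measurements (for example events per second for a fixed job).

```python
def scaling_report(throughput_by_threads):
    """Return (threads, speedup, efficiency) tuples.

    Speedup is measured relative to the smallest thread count in the input.
    Efficiency is speedup divided by the ideal (perfectly linear) speedup,
    so 1.0 means perfectly linear scaling.
    """
    base_threads = min(throughput_by_threads)
    base_rate = throughput_by_threads[base_threads]
    report = []
    for n in sorted(throughput_by_threads):
        speedup = throughput_by_threads[n] / base_rate
        ideal = n / base_threads
        report.append((n, speedup, speedup / ideal))
    return report


# Hypothetical measurements: thread count -> events per second.
measured = {1: 10.0, 2: 20.0, 4: 38.0, 8: 60.0}

for n, speedup, efficiency in scaling_report(measured):
    print(f"{n:3d} threads: speedup {speedup:5.2f}, efficiency {efficiency:4.2f}")
```

An efficiency that stays close to 1.0 up to the number of physical cores indicates good linearity; a clear drop before that point suggests the application is not yet ready to use all hardware threads of the card.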
Even if you have good linearity and you manage to use the maximum number of threads allowed on the card, you may still get performance below the theoretical peak.
This is because we have not yet optimized the Geant4 code for Intel Xeon Phi.
Independently of this, I would like to comment that the "extremely optimized" applications I have seen run about a factor of 2 faster on Xeon Phi than on the host. You should not expect a very large gain, because every improvement made for the Xeon Phi will also benefit the host. If I understand your post correctly, you get more or less the same performance on the host and on the card.
I hope this clarifies things a bit. Please do not hesitate to contact us if you want more information on these aspects. Andrea