Message: Re: How is parallelization implemented using MPI and Geant4?
Could you give a stripped down sample of your MPI implementation, or some form of pseudocode?
In general, MPI can be used for synchronization of different "processes", and passing reasonable amounts of data (or unreasonably large data, but over a fast interconnect).
However, if you launch multiple MPI processes on one computer, the most straightforward implementation means that you end up launching multiple full simulations, each with its own identical geometry, full set of physics tables, etc. The only difference will be that different processes (or ranks) generate primaries from different random number seeds.
This is fine if you are using a distributed computing setup, where multiple standalone computers/nodes, each with dedicated CPU and RAM, can run the simulation starting from different random number seeds and combine their final results.
However, if you are running on one computer, you go from initializing one set of physics tables using one CPU and ~2 GB of RAM (for a voxelized geometry) to initializing, say, four identical sets of physics tables and geometries (4x the RAM consumption) in an attempt to let four CPU cores do the work.
If you exhaust all of your system memory and start writing to the hard drive's swap space, you can see where the slowdown occurs.
OpenMP, on the other hand, is very well suited to running one program while using additional cores/threads to speed up certain portions of the code (such as for loops).
These extra threads can pop in and out of existence as the code runs. This means that you are effectively running one simulation, on one CPU with one set of physics tables, but during any portion that you deem parallelisable, you can enable additional cores to help with the computation.
Was this what you were trying to do? Does this make sense?
Also, please feel free to refer to me as Ming; I am not a PhD yet C:

Ming
On Sun, 01 Jul 2012 08:47:33 GMT, Geng wrote:
> Dear Dr:
>
> I tried to use Geant4 with MPI. While running, the CPU usage is only about 1% on each core, so the time consumed by the application can even be longer than on one core. Could you explain what is wrong in this situation?
>
> Thanks sir. geng