Message: Re: superslow make on a cluster over the network
The computational node where the make takes place is completely free. I even use -j8 (it's an 8-core machine), but all eight cc1plus processes are idling almost all the time, waiting for something.
Our cluster admin sees no problem on the Gb Ethernet, no I/O pressure on the node, and the NFS daemons are not under stress.
I'm not familiar with the intricacies of the Geant4 make system, but if I look into the intermediate make files for the user code in $G4WORKDIR/tmp/Linux/usercode, I see dependency (.d) files in which all the header files of Geant4, as well as those of the OS(!), are listed. Some of those files hold 500 or even 800 dependencies. Now imagine how make builds those: it must open each source or header file, parse it for includes, open those includes, and so on. One poor Geant4 user-code analysis file has 500 dependencies, which is pretty crazy and sounds like overkill. The bottleneck must be I/O on the disk system, since the make process constantly opens hundreds of those files while resolving dependencies.
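To get a feel for how heavy those dependency lists are, the entries in a .d file can be counted directly. This is a minimal sketch over a tiny synthetic .d file; the real ones live under $G4WORKDIR/tmp/Linux/usercode:

```shell
# Create a tiny synthetic .d file in the format the compiler's -M
# dependency generation emits (synthetic stand-in for a real one).
cat > /tmp/demo.d <<'EOF'
main.o: main.cc /usr/include/stdio.h \
 include/G4RunManager.hh include/G4UImanager.hh
EOF
# Split on spaces/backslash continuations and count header entries
# (.h or .hh) -- on a real file this number can reach 500-800.
tr ' \\' '\n\n' < /tmp/demo.d | grep -c '\.hh*$'
```

Each of those listed headers is a file make may stat and the preprocessor may open, which is exactly where NFS latency multiplies.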
There is no point in copying just the user code to a local node and compiling there: I would also need to copy the whole Geant4 system (300 MB for a built system) plus(!) all the system gcc includes etc., because they are all mentioned in those dependency files. That means all the devel files would have to be on the compute node.
Note that our cluster's disk system is remote; it is not on the head node but somewhere outside. When I build on another cluster that has a local disk system, the build takes 1 minute. When I build on this cluster with the remote disk system, make keeps requesting tiny files to be delivered to the local node, and it must be the latency of the remote disk system plus the network that slows everything down. That's why we see the cc1plus processes idling all the time: they are waiting for tiny header and .cc files to be delivered, which are parsed almost instantly, and then further requests fly off to the disk system again.
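One way to check this hypothesis is to count the file opens a build step actually performs (e.g. `strace -c -e trace=openat make`). The access pattern itself, opening and reading many tiny files where each open pays a network round trip on NFS, can be sketched with synthetic files like this:

```shell
# Emulate the access pattern that dominates the build: open and read
# many tiny files. On a local disk this is cheap; over NFS every
# open pays at least one network round trip. Files are synthetic.
D=$(mktemp -d)
i=1
while [ $i -le 100 ]; do
    echo 'x' > "$D/hdr$i.hh"
    i=$((i + 1))
done
# Read one byte from each file, roughly what the preprocessor does
# while chasing an #include chain.
for f in "$D"/*.hh; do head -c1 "$f" > /dev/null; done
echo "opened $(ls "$D" | wc -l) files"
rm -rf "$D"
```

Timing this loop on the NFS-mounted work area versus a local scratch disk should make the per-open latency difference visible directly.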
Prestaging the Geant4 system on the compute node should eliminate about half of the waiting time, since in the dependency files I estimate that about half the files are native Geant4 and the rest come from the Linux OS.
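Prestaging could be as simple as mirroring the shared tree onto node-local scratch before the build and pointing the build at the copy. A minimal sketch with synthetic stand-in paths (the real source would be the NFS-mounted $G4INSTALL tree):

```shell
# Stand-ins: SRC plays the NFS-mounted Geant4 tree, DST the
# node-local scratch disk. Both are synthetic temp directories here.
SRC=$(mktemp -d)
DST=$(mktemp -d)
mkdir -p "$SRC/include"
echo '// header' > "$SRC/include/G4RunManager.hh"
# Mirror the tree locally; on a real cluster this would be e.g.
#   rsync -a "$G4INSTALL/" /scratch/geant4/
cp -a "$SRC/." "$DST/"
ls "$DST/include"
rm -rf "$SRC" "$DST"
```

After prestaging, the build's include paths (G4INSTALL or the relevant CPPFLAGS) would need to point at the local copy; the system headers under /usr/include are usually on the node's local disk already, which matches the estimate that only about half the dependencies can be moved this way.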
Does that make sense?