Message: composite_calorimeter Example Crashing During Decay Process, SubType 201 (out of the box, fresh install)
I am working with the composite_calorimeter example and am getting a segmentation fault at run time. The fault occurs with the completely unmodified example code, as packaged with the install, as well as with my modified code. It occurs soonest when firing 300 GeV pions (pi-) into the calorimeter, via:
/gun/energy 300 GeV
I have tested this with different versions of Geant4 on two separate operating systems, using completely unmodified composite_calorimeter code. The details are as follows:
On Mac OS X 10.6.8 I have tested Geant 4.9.3 and Geant 4.9.3.p02. The CLHEP installed is the recommended version 126.96.36.199, and I have OpenScientist batch version 16.11 installed with the AIDA interface included. The environment is set up as follows, in a bash shell:
Since discovering the fault I have added my own debug output. The crash always occurs when the active process at the PostStepPoint is Decay, SubType 201, and the creator process of the current track is LambdaInelastic, SubType 121. The following is the debug output from CCalSteppingAction at the point of the crash:
Process Name from PostStepPoint is Decay
Process Type from PostStepPoint is 6
Process Type Name from PostStepPoint is Decay
Process Subtype from PostStepPoint is 201
Track creator process from 'aStep' is: LambdaInelastic
Process Type from Track creator process is 4
Process Type Name from Track creator process is Hadronic
Process Subtype from Track creator process is 121
This same combination of processes occurs a handful of times earlier in the run without a crash, so some other factor must be involved.
The fault is caused by a problem involving the TSliceID variable in CCalSteppingAction::UserSteppingAction (located in CCalSteppingAction.cc).
A global time value is returned by PostStepPoint->GetGlobalTime(). This value is converted to an integer, which is then used to index the 200-member timeDeposit array so that the step's energy deposit is added to the corresponding member.
However, I have found that the segmentation fault occurs because this global time suddenly takes an extremely large value, after a number of steps (usually several thousand) that varies with the initial conditions. For example, it returns 2.0177806e+12 ns or 2.525698e+11 ns, when it should lie between 0 and 200 ns.
This somehow causes the integer conversion in the following line to assign a large negative integer to TSliceID. The result is always the same regardless of the global time returned from PostStepPoint: TSliceID = -2147483648, which is the minimum value of a 32-bit signed int. Both example times above are far larger than the maximum such int (2147483647):
TSliceID = static_cast<int>( (PostStepPoint->GetGlobalTime() ) / nanosecond);
The segmentation fault occurs when CCalSteppingAction::UserSteppingAction tries to write to index -2147483648 of the 200-member timeDeposit array, here:
timeDeposit[TSliceID] += aStep->GetTotalEnergyDeposit() / GeV;
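A guard along the following lines would at least avoid the out-of-bounds write, although it does not explain where the large time value comes from. This is only a sketch: it uses no Geant4 headers, and the names kNumTimeSlices, timeSliceIndex and depositInSlice are hypothetical, not part of the example. The only thing taken from the example is the array size of 200. The range check is done on the double before the cast, because converting a double that does not fit in an int is undefined behaviour in C++.

```cpp
#include <array>
#include <cstddef>

// Hypothetical constant: timeDeposit is described above as having 200 members.
constexpr std::size_t kNumTimeSlices = 200;

// Hypothetical helper: returns the slice index for a time given in ns,
// or -1 if the time is NaN, negative, or at or beyond the last slice.
// The check is made on the double, before any conversion to int.
inline int timeSliceIndex(double timeInNs) {
    const double upper = static_cast<double>(kNumTimeSlices);
    if (!(timeInNs >= 0.0) || !(timeInNs < upper)) {
        return -1;  // also catches NaN, for which every comparison is false
    }
    return static_cast<int>(timeInNs);  // safe: value is known to be in [0, 200)
}

// Hypothetical helper: adds the deposit only when the index is valid.
// Returns true if the energy was stored, false if the step was skipped.
inline bool depositInSlice(std::array<double, kNumTimeSlices>& timeDeposit,
                           double timeInNs, double energyInGeV) {
    const int slice = timeSliceIndex(timeInNs);
    if (slice < 0) {
        return false;
    }
    timeDeposit[static_cast<std::size_t>(slice)] += energyInGeV;
    return true;
}
```

In the stepping action, the arguments would presumably be PostStepPoint->GetGlobalTime() / nanosecond and aStep->GetTotalEnergyDeposit() / GeV, and a false return value marks the steps whose global time is out of range, which may help to track down the tracks involved.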
I am posting here to ask whether anyone knows why PostStepPoint->GetGlobalTime() returns such an extremely large value (rather than a value in the range 0 to 200 ns), or can suggest a possible cause of, or solution to, this crash.
To confirm this problem was present with a fresh install, I completely removed Geant 4 from OSX and reinstalled it, as per the instructions in the installation guide, to version 4.9.3.p02. The bug occurred again, unchanged, with completely unmodified example code.
The problem also occurs independently on a Linux system (Fermi Linux lts30 INSTALL for FermiGenericDesktopOffsite) with an install of Geant version 4.9.4.p02. The crash and the debug output are exactly the same with this install. The CLHEP version installed here is 188.8.131.52, and version 16.11.5 of OpenScientist batch is installed.
If this is a problem with my install, environment, supporting software, or build, I would be grateful for any suggestions for a possible solution.