#1 2016-10-28 10:55:18

stephaneberger
Member
From: Strasbourg (France)
Registered: 2012-10-15
Posts: 70

Seg Fault in THER_NON_LINE V12.4 parallel

Hi,

I am getting a segmentation fault during a THER_NON_LINE computation.
I'm using a parallel build of code_aster v12.4 compiled with the GNU C/Fortran compilers.

Below is an extract of the flasheur output.

    Champ stocké <COMPORTHER> à l'instant -6.000000000000e+01 pour le numéro d'ordre 22

 Instant de calcul: -4.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------

  Temps CPU consommé dans ce pas de temps  : 37.390 s



 Instant de calcul: -4.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------



 Instant de calcul: -4.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------
    Champ stocké <COMPORTHER> à l'instant -6.000000000000e+01 pour le numéro d'ordre 22

  Temps CPU consommé dans ce pas de temps  : 37.400 s



 Instant de calcul: -4.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------
|     1       1.09385E-15     2.32746E-12        0      0.00000E+00        OUI      |
-------------------------------------------------------------------------------------
|     1       1.09385E-15     2.32746E-12        0      0.00000E+00        OUI      |
-------------------------------------------------------------------------------------
|     1       1.09385E-15     2.32746E-12        0      0.00000E+00        OUI      |
-------------------------------------------------------------------------------------
|     1       1.09385E-15     2.32746E-12        0      0.00000E+00        OUI      |
-------------------------------------------------------------------------------------

  Archivage des champs
    Champ stocké <TEMP> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23

  Archivage des champs
    Champ stocké <TEMP> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23
    Champ stocké <COMPORTHER> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23
    Champ stocké <COMPORTHER> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23

  Temps CPU consommé dans ce pas de temps  : 42.130 s



 Instant de calcul: -2.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------

  Temps CPU consommé dans ce pas de temps  : 42.130 s



 Instant de calcul: -2.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------

  Archivage des champs
    Champ stocké <TEMP> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23

  Archivage des champs
    Champ stocké <TEMP> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23
    Champ stocké <COMPORTHER> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23

  Temps CPU consommé dans ce pas de temps  : 42.140 s


    Champ stocké <COMPORTHER> à l'instant -4.000000000000e+01 pour le numéro d'ordre 23

 Instant de calcul: -2.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------

  Temps CPU consommé dans ce pas de temps  : 42.150 s



 Instant de calcul: -2.000000000000e+01
-------------------------------------------------------------------------------------
| ITERATION      RESIDU         RESIDU      ITERATION   COEFFICIENT   ACTUALISATION |
|                RELATIF        ABSOLU      RECH. LIN.   RECH. LIN.      MATRICE    |
|            RESI_GLOB_RELA  RESI_GLOB_MAXI                 RHO         TANGENTE    |
-------------------------------------------------------------------------------------
[AsterX:08827] *** Process received signal ***
[AsterX:08827] Signal: Segmentation fault (11)
[AsterX:08827] Signal code: Address not mapped (1)
[AsterX:08827] Failing at address: 0xfffffffe8559d9f0
[AsterX:08827] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330) [0x7fc6a1434330]
[AsterX:08827] [ 1] /opt/aster/PAR12.4/bin/aster(__Compute2WayPartitionParams+0xd7) [0x2f14a87]
[AsterX:08827] [ 2] /opt/aster/PAR12.4/bin/aster(__MlevelNodeBisection+0xe5) [0x2ef20c5]
[AsterX:08827] [ 3] /opt/aster/PAR12.4/bin/aster(__MlevelNestedDissection+0x17e) [0x2ef2c7e]
[AsterX:08827] [ 4] /opt/aster/PAR12.4/bin/aster(__MlevelNestedDissection+0xed) [0x2ef2bed]
[AsterX:08827] [ 5] /opt/aster/PAR12.4/bin/aster(__MlevelNestedDissection+0xed) [0x2ef2bed]
[AsterX:08827] [ 6] /opt/aster/PAR12.4/bin/aster(__MlevelNestedDissection+0xed) [0x2ef2bed]
[AsterX:08827] [ 7] /opt/aster/PAR12.4/bin/aster(METIS_NodeND+0x205) [0x2ef3ba5]
[AsterX:08827] [ 8] /opt/aster/PAR12.4/bin/aster(dmumps_195_+0x228e) [0x2adf6ca]
[AsterX:08827] [ 9] /opt/aster/PAR12.4/bin/aster(dmumps_26_+0x1488) [0x2b23c45]
[AsterX:08827] [10] /opt/aster/PAR12.4/bin/aster(dmumps_+0x1034) [0x2ab7310]
[AsterX:08827] [11] /opt/aster/PAR12.4/bin/aster(amumpd_+0xd5d) [0x133dfdd]
[AsterX:08827] [12] /opt/aster/PAR12.4/bin/aster(amumph_+0x2187) [0x1341d87]
[AsterX:08827] [13] /opt/aster/PAR12.4/bin/aster(tldlg3_+0xde5) [0x66c1f5]
[AsterX:08827] [14] /opt/aster/PAR12.4/bin/aster(prere1_+0x66d) [0x650f2d]
[AsterX:08827] [15] /opt/aster/PAR12.4/bin/aster(prere2_+0x1e5) [0x651875]
[AsterX:08827] [16] /opt/aster/PAR12.4/bin/aster(preres_+0x158) [0x651e18]
[AsterX:08827] [17] /opt/aster/PAR12.4/bin/aster(nxacmv_+0x6f7) [0xa293d7]
[AsterX:08827] [18] /opt/aster/PAR12.4/bin/aster(op0186_+0x1245) [0x15d9dd5]
[AsterX:08827] [19] /opt/aster/PAR12.4/bin/aster(execop_+0x4c8) [0x18e4658]
[AsterX:08827] [20] /opt/aster/PAR12.4/bin/aster(expass_+0x12) [0x18e4892]
[AsterX:08827] [21] /opt/aster/PAR12.4/bin/aster() [0x5747ea]
[AsterX:08827] [22] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4bd4) [0x7fc6a08290d4]
[AsterX:08827] [23] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4b59) [0x7fc6a0829059]
[AsterX:08827] [24] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4b59) [0x7fc6a0829059]
[AsterX:08827] [25] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4b59) [0x7fc6a0829059]
[AsterX:08827] [26] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x80d) [0x7fc6a082a54d]
[AsterX:08827] [27] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(+0x1c37a5) [0x7fc6a085f7a5]
[AsterX:08827] [28] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyObject_Call+0x43) [0x7fc6a07cbd43]
[AsterX:08827] [29] /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0xeb1) [0x7fc6a08253b1]
[AsterX:08827] *** End of error message ***
/mnt/disque2/aster-AsterX-interactif.8656-AsterX/global/mpi_script.sh : ligne 37 :  8827 Erreur de segmentation  (core dumped) /opt/aster/PAR12.4/bin/aster /opt/aster/PAR12.4/lib/aster/Execution/E_SUPERV.py -commandes fort.1 -max_base 200000 -num_job 8656-AsterX -mode interactif -rep_outils /opt/aster/outils -rep_mat /opt/aster/PAR12.4/share/aster/materiau -rep_dex /opt/aster/PAR12.4/share/aster/datg -suivi_batch -memjeveux 1906.25 -tpmax 358200
EXECUTION_CODE_ASTER_EXIT_8656-AsterX=139
Contenu après l'exécution de /mnt/disque2/aster-AsterX-interactif.8656-AsterX/proc.0 :
.:
total 70544
drwx------ 3 aster aster     4096 oct.  27 18:36 .
drwxrwxr-x 7 aster aster     4096 oct.  27 18:36 ..
-rw-rw-r-- 1 aster aster     2170 oct.  27 18:36 8656-AsterX.export
-rw-rw-r-- 1 aster aster     2380 oct.  27 18:36 config.txt
-rw-rw-r-- 1 aster aster    13029 oct.  27 18:36 fort.1
-rw-rw-r-- 1 aster aster    13029 oct.  27 18:36 fort.1.1
-rw-rw-r-- 1 aster aster        0 oct.  27 18:36 fort.15
-rw-rw-r-- 1 aster aster 18779692 oct.  27 18:36 fort.20
-rw-rw-r-- 1 aster aster   112592 oct.  27 19:00 fort.6
-rw-rw-r-- 1 aster aster        0 oct.  27 18:36 fort.8
-rw-rw-r-- 1 aster aster        0 oct.  27 18:36 fort.9
-rw-rw-r-- 1 aster aster    19928 oct.  27 18:36 fort.95
-rw-rw-r-- 1 aster aster 19660808 oct.  27 18:41 glob.1
-rwxr-xr-x 1 aster aster     2180 oct.  27 18:36 mpi_script.sh
drwxr-xr-x 2 aster aster     4096 oct.  27 18:36 REPE_OUT
-rw-rw-r-- 1 aster aster 35225608 oct.  27 18:58 vola.1

REPE_OUT:
total 8
drwxr-xr-x 2 aster aster 4096 oct.  27 18:36 .
drwx------ 3 aster aster 4096 oct.  27 18:36 ..
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 8801 on
node AsterX exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
EXIT_COMMAND_8761_00000016=1
<INFO> Exécution Code_Aster terminée, diagnostic : <F>_ABNORMAL_ABORT

Here is the THER_NON_LINE command:

  # ------------------------------------------------------------------------------------------
  # Commande No :  0067            Concept de type : evol_ther
  # ------------------------------------------------------------------------------------------
  TEMPE = THER_NON_LINE(MODELE=MOTH,
                        CHAM_MATER=MAT,
                        EXCIT=_F(CHARGE=CHTHCHAU,),
                        ETAT_INIT=_F(VALE=20.0,
                                     PRECISION=1.E-06,
                                     CRITERE='RELATIF',),
                        SOLVEUR=_F(SYME='NON',
                                   METHODE='MUMPS',
                                   STOP_SINGULIER='OUI',
                                   ELIM_LAGR='LAGR2',
                                   TYPE_RESOL='AUTO',
                                   GESTION_MEMOIRE='AUTO',
                                   FILTRAGE_MATRICE=-1.0,
                                   RENUM='AUTO',
                                   NPREC=8,
                                   PCENT_PIVOT=20,
                                   RESI_RELA=-1.0,
                                   PRETRAITEMENTS='AUTO',
                                   POSTTRAITEMENTS='AUTO',
                                   MIXER_PRECISION='NON',),
                        NEWTON=_F(ITER_LINE_MAXI=3,
                                  REAC_ITER=1,
                                  RESI_LINE_RELA=1.E-3,),
                        INCREMENT=_F(LIST_INST=LISTAUTO,
                                     INST_INIT=-500.0,
                                     INST_FIN=0.0,
                                     PRECISION=1.E-06,),
                        CONVERGENCE=_F(ITER_GLOB_MAXI=60,
                                       RESI_GLOB_RELA=1.E-06,),
                        PARM_THETA=0.57,
                        ARCHIVAGE=_F(PRECISION=1.E-06,
                                     CRITERE='RELATIF',),
                        COMPORTEMENT=_F(RELATION='THER_NL',),
                        )

Since the backtrace contains MUMPS routines (dmumps_...), I have restarted the computation with MULT_FRONT as the solver. The calculation is still in progress, so I will see whether it works.
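To make it easier to see which routines are involved, the symbol names can be pulled out of the backtrace. This is only a minimal Python sketch, assuming the Open MPI frame format shown in the log above; the helper name `frame_symbols` and the pattern are made up for illustration and are not part of code_aster or Open MPI.

```python
import re

# Matches one backtrace frame, for example:
# [AsterX:08827] [ 7] /opt/aster/PAR12.4/bin/aster(METIS_NodeND+0x205) [0x2ef3ba5]
# (pattern is an assumption based on the log format above)
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+\S+?\((?P<symbol>[^)+]*)(?:\+0x[0-9a-fA-F]+)?\)"
)


def frame_symbols(log_text):
    """Return (frame_index, symbol) pairs; frames with no symbol name are skipped."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group("symbol"):
            frames.append((int(match.group("index")), match.group("symbol")))
    return frames


# A few frames copied from the backtrace above.
SAMPLE = """\
[AsterX:08827] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330) [0x7fc6a1434330]
[AsterX:08827] [ 1] /opt/aster/PAR12.4/bin/aster(__Compute2WayPartitionParams+0xd7) [0x2f14a87]
[AsterX:08827] [ 7] /opt/aster/PAR12.4/bin/aster(METIS_NodeND+0x205) [0x2ef3ba5]
[AsterX:08827] [ 8] /opt/aster/PAR12.4/bin/aster(dmumps_195_+0x228e) [0x2adf6ca]
"""

if __name__ == "__main__":
    for index, symbol in frame_symbols(SAMPLE):
        print(index, symbol)
```

On the frames above, this lists `__Compute2WayPartitionParams`, `METIS_NodeND` and `dmumps_195_`, so the failing address is reached through the METIS ordering called from MUMPS.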

Best Regards

Stephane
