Hi everyone,
I get the following error trying to run Salome in GPU mode:
$ /home/luca/salome_meca-lgpl-2022.1.0-1-20221225-scibian-9
not exist: /etc/krb5.conf
/usr/bin/nvidia-smi
**************************************
INFO : Running salome_meca in GPU mode
**************************************
runSalome running on luca-Precision-7520
Searching for a free port for naming service: 2810 - OK
Searching Naming Service + found in 0.1 seconds
Searching /Kernel/Session in Naming Service +++SALOME_Session_Server: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /.singularity.d/libs/libGLX.so.0)
Traceback (most recent call last):
File "/opt/salome_meca/V2022.1.0_scibian_univ/modules/KERNEL_V9_8_0/bin/salome/orbmodule.py", line 181, in waitNSPID
os.kill(thePID,0)
ProcessLookupError: [Errno 3] No such process
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/salome_meca/appli_V2022.1.0_scibian_univ/bin/salome/runSalome.py", line 694, in useSalome
clt = startSalome(args, modules_list, modules_root_dir)
File "/opt/salome_meca/appli_V2022.1.0_scibian_univ/bin/salome/runSalome.py", line 639, in startSalome
session=clt.waitNSPID("/Kernel/Session",mySessionServ.PID,SALOME.Session)
File "/opt/salome_meca/V2022.1.0_scibian_univ/modules/KERNEL_V9_8_0/bin/salome/orbmodule.py", line 183, in waitNSPID
raise RuntimeError("Process %d for %s not found" % (thePID,theName))
RuntimeError: Process 14762 for /Kernel/Session not found
--- Error during Salome launch ---
I'm on Ubuntu 22.04 LTS
It runs fine in --soft mode.
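For reference, --soft here means launching the same container with the --soft flag (software rendering instead of the GPU), i.e. something like:
$ /home/luca/salome_meca-lgpl-2022.1.0-1-20221225-scibian-9 --soft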
The GLIBC version installed is 2.35, not 2.34:
$ ldd --version
ldd (Ubuntu GLIBC 2.35-0ubuntu3.1) 2.35
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
I've also installed the NVIDIA Container Toolkit:
$ nvidia-container-cli info
NVRM version: 525.60.13
CUDA version: 12.0
Device Index: 0
Device Minor: 0
Model: Quadro M2200
Brand: Quadro
GPU UUID: GPU-88ece61b-9d86-8830-1b8b-bef5d20fa27e
Bus Location: 00000000:01:00.0
Architecture: 5.2
The NVIDIA driver works correctly:
$ nvidia-smi
Fri Jan 20 18:04:26 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.13 Driver Version: 525.60.13 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro M2200 On | 00000000:01:00.0 Off | N/A |
| N/A 32C P8 N/A / N/A | 126MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2249 G /usr/lib/xorg/Xorg 35MiB |
| 0 N/A N/A 5119 C+G ...613623068333722871,131072 83MiB |
+-----------------------------------------------------------------------------+
Can someone point me to a solution for this?
Thanks.
Last edited by lucaf (2023-02-03 20:20:11)
Exactly the same problem!
Hello,
I am having the same problem. On an older workstation, installing the NVIDIA Container Toolkit solved it.
So I tried the same on my main workstation, and now it does NOT work! Aaaaaah!
Does anybody have a solution? I assume many of us are seeing this error, as Ubuntu 22.04 LTS is probably the first choice of most users.
Thank you in advance,
Mario.
Here is my output; it is almost exactly the same:
mario@mario-HP-Z8-G4:~$ ./salome_meca-lgpl-2021.1.0-2-20220817-scibian-9
not exist: /etc/krb5.conf
/usr/bin/nvidia-smi
**************************************
INFO : Running salome_meca in GPU mode
**************************************
runSalome running on mario-HP-Z8-G4
Searching for a free port for naming service: 2817 - OK
Searching Naming Service + found in 0.1 seconds
Searching /Kernel/Session in Naming Service ++++SALOME_Session_Server: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /.singularity.d/libs/libGLX.so.0)
Traceback (most recent call last):
File "/opt/salome_meca/Salome-V2021-s9/modules/KERNEL_V9_7_0/bin/salome/orbmodule.py", line 181, in waitNSPID
os.kill(thePID,0)
ProcessLookupError: [Errno 3] No such process
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/salome_meca/appli_V2021/bin/salome/runSalome.py", line 694, in useSalome
clt = startSalome(args, modules_list, modules_root_dir)
File "/opt/salome_meca/appli_V2021/bin/salome/runSalome.py", line 639, in startSalome
session=clt.waitNSPID("/Kernel/Session",mySessionServ.PID,SALOME.Session)
File "/opt/salome_meca/Salome-V2021-s9/modules/KERNEL_V9_7_0/bin/salome/orbmodule.py", line 183, in waitNSPID
raise RuntimeError("Process %d for %s not found" % (thePID,theName))
RuntimeError: Process 50614 for /Kernel/Session not found
--- Error during Salome launch ---
Last edited by mf (2023-01-26 16:30:37)
OK, I tried the following:
I edited singularity.conf in /etc/singularity and set
use nvidia-container-cli = yes
and the path
nvidia-container-cli path = /usr/bin/nvidia-container-cli
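For reference, a quick way to check that both options actually ended up set (assuming the file is /etc/singularity/singularity.conf; the location depends on how singularity/apptainer was installed):
$ grep "nvidia-container-cli" /etc/singularity/singularity.conf
use nvidia-container-cli = yes
nvidia-container-cli path = /usr/bin/nvidia-container-cli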
The GUI starts in GPU mode, but I am getting a lot of SIGSEGVs, so this is unusable. But somehow I might be on the right track. Output from the CLI:
mario@mario-HP-Z8-G4:~$ ./salome_meca-lgpl-2021.1.0-2-20220817-scibian-9
not exist: /etc/krb5.conf
INFO: Setting 'NVIDIA_VISIBLE_DEVICES=all' to emulate legacy GPU binding.
INFO: Setting --writable-tmpfs (required by nvidia-container-cli)
/usr/bin/nvidia-smi
**************************************
INFO : Running salome_meca in GPU mode
**************************************
runSalome running on mario-HP-Z8-G4
Searching for a free port for naming service: 2818 - OK
Searching Naming Service + found in 0.1 seconds
Searching /Kernel/Session in Naming Service ++++++++libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
+****************************************************************
Warning: module HexoticPLUGIN is improperly configured!
Module HexoticPLUGIN will not be available in GUI mode!
****************************************************************
****************************************************************
Warning: module GHS3DPLUGIN is improperly configured!
Module GHS3DPLUGIN will not be available in GUI mode!
****************************************************************
****************************************************************
Warning: module GHS3DPRLPLUGIN is improperly configured!
Module GHS3DPRLPLUGIN will not be available in GUI mode!
****************************************************************
****************************************************************
Warning: module BLSURFPLUGIN is improperly configured!
Module BLSURFPLUGIN will not be available in GUI mode!
****************************************************************
****************************************************************
Warning: module NETGENPLUGIN is improperly configured!
Module NETGENPLUGIN will not be available in GUI mode!
****************************************************************
****************************************************************
Warning: module HYBRIDPLUGIN is improperly configured!
Module HYBRIDPLUGIN will not be available in GUI mode!
****************************************************************
****************************************************************
Warning: module GMSHPLUGIN is improperly configured!
Module GMSHPLUGIN will not be available in GUI mode!
****************************************************************
found in 4.5 seconds
Start SALOME, elapsed time : 4.9 seconds
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
TKOpenGl.WinSystem | Type: Error | ID: 0 | Severity: High | Message:
glXMakeCurrent() has failed!
TKOpenGl.WinSystem | Type: Error | ID: 0 | Severity: High | Message:
glXMakeCurrent() has failed!
TKOpenGl.WinSystem | Type: Error | ID: 0 | Severity: High | Message:
glXMakeCurrent() has failed!
TKOpenGl.WinSystem | Type: Error | ID: 0 | Severity: High | Message:
glXMakeCurrent() has failed!
Could anybody provide help with this please?
Thank you,
Mario
(screenshot attached)
Is there a way to provide GLIBC_2.34 inside the container instead of the GLIBC_2.35 that ships with Ubuntu 22.04 LTS?
The problem seems to be related to libGLX.so being checked when libGLX_nvidia.so is (presumably) going to be used. I don't know enough about what is going on to be more precise. The check can, it would seem, be avoided by removing the libGLX.so entry in the /usr/local/etc/singularity/nvliblist.conf file. This file might be named/stored somewhere else depending on how singularity/apptainer was installed. For me this allowed salome to run in GPU mode and create a box geometry from a Python script. I have not checked anything further.
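As a sketch (the path is the one mentioned above; locate the file first if your singularity/apptainer installation puts it elsewhere):
$ find /etc /usr/local/etc -name nvliblist.conf 2>/dev/null
$ sudo sed -i 's/^libGLX.so/#libGLX.so/' /usr/local/etc/singularity/nvliblist.conf
Commenting the entry out has the same effect as removing it and is easier to revert.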
Using:
- Ubuntu 22.04.1 LTS
- singularity-ce version 3.10.5
- nvidia-drivers
- nvidia-container-toolkit
- salome_meca-lgpl-2022.1.0-1-20221225-scibian-9
Last edited by honestguvnor (2023-01-27 21:32:24)
Hello,
when I remove libGLX.so, the GUI boots, but when I press the geometry button, I get a fatal error (see image).
When I go directly to AsterStudy, the GUI crashes, sometimes with a segmentation fault.
I also tried commenting out other GLX-related entries in nvliblist.conf, with no success.
Mario.
Last edited by mf (2023-01-28 13:56:42)
Hello again!
Great news: after removing Wayland (NVIDIA users know: Wayland is basically complete garbage), the above solution seems to work! Great!
I tested all relevant modules (GEO, MESH, AsterStudy, ...).
I am currently running an old simulation on an older workstation and will apply the above changes + removing Wayland on my main workstation as well.
Removing Wayland is easy:
sudo nano /etc/gdm3/custom.conf
There, edit the file to:
WaylandEnable=false
basically uncommenting the line. There must be a reason why this line exists; some smart person left it there on purpose.
Reboot.
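For reference, after the edit the relevant part of /etc/gdm3/custom.conf should look roughly like this (in the stock Ubuntu file the line ships commented out under the [daemon] section):
[daemon]
WaylandEnable=false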
And for completeness, editing the file with
sudo nano /etc/singularity/nvliblist.conf
and changing it to
....
libglx.so
#libGLX.so
libnvcuvid.so
....
Hopefully, this will also solve thread ID=26348 and similar threads. Now it's clear why my headless system worked after installing nvidia-container-cli right from the start: no Wayland there! (See thread ID=26341, which for some reason I cannot open anymore.)
Bye for now, thanks honestguvnor,
Mario.
Last edited by mf (2023-01-29 12:30:52)
Wasn't singularity supposed to be a solution to the previous installation problems on different systems? I can't see any benefit here.
regards,
Krzysztof
The main benefit to users is a higher quality salome/aster distribution. This follows from the developers only having to work with a single stable environment rather than a range of gratuitously different ones. That environment is presumably derived from something more stable than Ubuntu. To get an appreciation for some of the issues being avoided, try building the aster source code on Ubuntu 22.04. Ubuntu names libraries and modules differently, requiring data to be changed (which then breaks on other distributions unless you mess about writing scripts); the C++ compiler is newer, requiring the TFEL package to be replaced by a newer one (which may not be compatible...); libc is newer, which shouldn't break anything but has; plus no doubt many other fixes are needed to get it running on a particular gratuitously different Linux distribution. I abandoned the task when it became clear that this traditional way of developing software was not being supported (or at least not on anything other than whatever the salome/aster "distribution" is derived from).
Adopting a container-based approach has made life easier for the high-value salome/aster people (the developers) by transferring most of the work of dealing with distribution differences to the singularity/apptainer software, its dependencies like the NVIDIA driver and container software, Ubuntu's software, and so on. Our problems here seem to be mainly issues with the interaction between those projects, although there might still be an issue with how salome/aster has set up singularity/apptainer. I don't know, because all this container stuff is new to me, but I understand why it has been done. I will accept a bit more messing about as a user as the price for more reliable and faster-developed simulation software.
(Quoting honestguvnor's workaround above: remove the libGLX.so entry from the singularity nvliblist.conf file.)
Thanks a lot! It works!
Hello mf,
may I ask where to find the nvliblist.conf and change it to
....
libglx.so
#libGLX.so
libnvcuvid.so
....
I don't have the directory you specified above.
I am using your container github.com/emefff/Code-Aster-MPI-in-Singularity-of-SM2021 on WSL2.
Setting WaylandEnable=false didn't help.
Last edited by jacob (2023-02-22 08:56:36)
Hello,
I do not know the answer to that. Maybe you have a different installation of singularity ("singularity --version" gives me "singularity-ce version 3.10.3-focal")?
I'd suggest searching further in /etc or maybe in your home directory (~/), for example with the command below.
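As a brute-force sketch (narrow the search root as needed):
$ find / -name nvliblist.conf 2>/dev/null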
Mario.
Last edited by mf (2023-02-22 09:00:39)
OK, I found it, but it didn't help.