#1 2022-05-05 08:45:10

mf
Member
Registered: 2019-06-18
Posts: 324

GUI of SM Container crashing on headless system with Nvidia GPU

Good morning!

Let me first describe my setup: I run SM2021 on an HP DL360 with an Nvidia P400 GPU (OS: Ubuntu 20.04). At the moment no monitor is plugged into this system (I also have an EDID emulator, but it does not seem to work correctly).

Until now I have used SalomeMeca by logging into this server with ssh -X and using the soft rendering option of the container (./salome....... --soft), so that the GUI is forwarded to the X server on my client. This still works, but it is very slow and has graphical errors. I therefore installed a graphics card in this server (I know it is not recommended for this server, but I do not care about that).

Now this happens when I start SalomeMeca (I am logged in via ssh -X, just as described above):

serveruser@hp-dl360-8:~$ ./salome_meca-lgpl-2021.0.0-0-20210601-scibian-9
not exist: /etc/krb5.conf
/usr/bin/nvidia-smi
**************************************
INFO : Running salome_meca in GPU mode
**************************************
runSalome running on hp-dl360-8
Searching for a free port for naming service: 2810 - OK
Searching Naming Service  + found in 0.1 seconds
Searching /Kernel/Session in Naming Service  ++++++++++++ found in 6.0 seconds
Start SALOME, elapsed time :   6.5 seconds

So far, so good. Salome wants to use the GPU.

Also, the X-server seems to use the GPU:

serveruser@hp-dl360-8:~$ nvidia-smi
Thu May  5 07:40:02 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         Off  | 00000000:07:00.0 Off |                  N/A |
| 34%   34C    P8    N/A /  N/A |     13MiB /  2000MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1282      G   /usr/lib/xorg/Xorg                  9MiB |
|    0   N/A  N/A      1571      G   /usr/bin/gnome-shell                1MiB |
+-----------------------------------------------------------------------------+

The GUI of SM starts up correctly. But when I press the ParaVis button, I get the following error and the GUI closes immediately (sorry for the long text :-( ):

Loguru caught a signal: SIGABRT
Stack trace:
78      0x55e9f771f1da _start + 42
77      0x7f99d9ca12e1 __libc_start_main + 241
76      0x55e9f7721b1a int AbstractGUIAppMain<GUIAppOldStyle>(int, char**) + 4394
75      0x7f99ea4f7843 QCoreApplication::exec() + 131
74      0x7f99ea4ee7f2 QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) + 274
73      0x7f99ea54c48c QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) + 92
72      0x7f99d08ffb0c g_main_context_iteration + 44
71      0x7f99d08ffa60 /lib/x86_64-linux-gnu/libglib-2.0.so.0(+0x4aa60) [0x7f99d08ffa60]
70      0x7f99d08ff7f7 g_main_context_dispatch + 679
69      0x7f9977aa9a5a /opt/qt/5.15.2/lib/libQt5XcbQpa.so.5(+0x62a5a) [0x7f9977aa9a5a]
68      0x7f99eab42e5b QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) + 187
67      0x7f99eab66e35 QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) + 261
66      0x7f99eab65cf3 QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) + 2403
65      0x7f99ea4efe38 QCoreApplication::notifyInternal2(QObject*, QEvent*) + 264
64      0x55e9f771fb0d SALOME_Session_Server(+0x8b0d) [0x55e9f771fb0d]
63      0x7f99f954a226 SalomeApp_ExceptionHandler::handle(QObject*, QEvent*) + 54
62      0x7f99f954a514 SalomeApp_ExceptionHandler::handleSignals(QObject*, QEvent*) + 132
61      0x7f99eb454d97 QApplication::notify(QObject*, QEvent*) + 519
60      0x7f99eb44e31c QApplicationPrivate::notify_helper(QObject*, QEvent*) + 156
59      0x7f99eb4a9143 /opt/qt/5.15.2/lib/libQt5Widgets.so.5(+0x1b9143) [0x7f99eb4a9143]
58      0x7f99eb4a6477 /opt/qt/5.15.2/lib/libQt5Widgets.so.5(+0x1b6477) [0x7f99eb4a6477]
57      0x7f99eb45432a QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool, bool) + 506
56      0x7f99ea4efe38 QCoreApplication::notifyInternal2(QObject*, QEvent*) + 264
55      0x55e9f771fb0d SALOME_Session_Server(+0x8b0d) [0x55e9f771fb0d]
54      0x7f99f954a226 SalomeApp_ExceptionHandler::handle(QObject*, QEvent*) + 54
53      0x7f99f954a514 SalomeApp_ExceptionHandler::handleSignals(QObject*, QEvent*) + 132
52      0x7f99eb455cc4 QApplication::notify(QObject*, QEvent*) + 4404
51      0x7f99eb44e31c QApplicationPrivate::notify_helper(QObject*, QEvent*) + 156
50      0x7f99eb62125c QToolButton::event(QEvent*) + 108
49      0x7f99eb48cb08 QWidget::event(QEvent*) + 488
48      0x7f99eb62118a QToolButton::mouseReleaseEvent(QMouseEvent*) + 10
47      0x7f99eb53c465 QAbstractButton::mouseReleaseEvent(QMouseEvent*) + 213
46      0x7f99eb53c26d /opt/qt/5.15.2/lib/libQt5Widgets.so.5(+0x24c26d) [0x7f99eb53c26d]
45      0x7f99eb44a92e QAction::activate(QAction::ActionEvent) + 158
44      0x7f99eb4484b2 QAction::triggered(bool) + 50
43      0x7f99ea52929e /opt/qt/5.15.2/lib/libQt5Core.so.5(+0x2d729e) [0x7f99ea52929e]
42      0x7f99f02f2ff4 /opt/salome_meca/appli_V2021/lib/salome/libqtx.so(+0x182ff4) [0x7f99f02f2ff4]
41      0x7f99f0237d8f QtxActionSet::onActionTriggered(bool) + 63
40      0x7f99f02f2efe QtxActionSet::triggered(int) + 46
39      0x7f99ea52929e /opt/qt/5.15.2/lib/libQt5Core.so.5(+0x2d729e) [0x7f99ea52929e]
38      0x7f99f907550b LightApp_ModuleAction::activate(int, bool) + 875
37      0x7f99f90a4272 LightApp_ModuleAction::moduleActivated(QString const&) + 34
36      0x7f99ea52929e /opt/qt/5.15.2/lib/libQt5Core.so.5(+0x2d729e) [0x7f99ea52929e]
35      0x7f99f9575b5d /opt/salome_meca/appli_V2021/lib/salome/libSalomeAppImpl.so(+0x9db5d) [0x7f99f9575b5d]
34      0x7f99f952dae2 SalomeApp_Application::onModuleActivation(QString const&) + 34
33      0x7f99f9045eb6 LightApp_Application::onModuleActivation(QString const&) + 230
32      0x7f99f9041796 LightApp_Application::activateModule(QString const&) + 262
31      0x7f99f8bc02a3 CAM_Application::activateModule(QString const&) + 163
30      0x7f99f8bbf0cb CAM_Application::addModule(CAM_Module*) + 235
29      0x7f99389ac05d PVGUI_Module::initialize(CAM_Application*) + 909
28      0x7f99f0a2f39d PVViewer_Behaviors::instanciateAllBehaviors(QMainWindow*) + 301
27      0x7f99f0a2f1c2 PVViewer_Behaviors::instanciateMinimalBehaviors(QMainWindow*) + 194
26      0x7f99ee0a1ba5 pqAlwaysConnectedBehavior::pqAlwaysConnectedBehavior(QObject*) + 309
25      0x7f99ee0a19f4 pqAlwaysConnectedBehavior::serverCheck() + 148
24      0x7f99ecbdea4a pqObjectBuilder::createServer(pqServerResource const&, int) + 218
23      0x7f99e9f89439 vtkSMSession::ConnectToSelf(int) + 121
22      0x7f99e50b3e9e vtkProcessModule::RegisterSession(vtkSession*) + 142
21      0x7f99e1ed5e0e /opt/pv/5.9.0/lib/libvtkCommonCore-pv5.9.so.1(+0x2fae0e) [0x7f99e1ed5e0e]
20      0x7f99e1dbebf9 vtkCallbackCommand::Execute(vtkObject*, unsigned long, void*) + 25
19      0x7f99ebb8c51c /opt/pv/5.9.0/lib/libvtkGUISupportQt-pv5.9.so.1(+0x3b51c) [0x7f99ebb8c51c]
18      0x7f99ebb9a8d5 /opt/pv/5.9.0/lib/libvtkGUISupportQt-pv5.9.so.1(+0x498d5) [0x7f99ebb9a8d5]
17      0x7f99ea52929e /opt/qt/5.15.2/lib/libQt5Core.so.5(+0x2d729e) [0x7f99ea52929e]
16      0x7f99ecc64a29 /opt/pv/5.9.0/lib/libpqCore-pv5.9.so.1(+0x15aa29) [0x7f99ecc64a29]
15      0x7f99ecc60962 pqServerManagerObserver::connectionCreated(long long) + 66
14      0x7f99ea52929e /opt/qt/5.15.2/lib/libQt5Core.so.5(+0x2d729e) [0x7f99ea52929e]
13      0x7f99ecc33172 pqServerManagerModel::onConnectionCreated(long long) + 802
12      0x7f99ecc5d712 pqServerManagerModel::serverAdded(pqServer*) + 66
11      0x7f99ea52929e /opt/qt/5.15.2/lib/libQt5Core.so.5(+0x2d729e) [0x7f99ea52929e]
10      0x7f99ee0f96d3 pqDefaultViewBehavior::onServerCreation(pqServer*) + 83
9       0x7f99e9e83912 vtkPVSessionCore::GatherInformation(unsigned int, vtkPVInformation*, unsigned int) + 50
8       0x7f99e9e837b3 vtkPVSessionCore::GatherInformationInternal(vtkPVInformation*, unsigned int) + 163
7       0x7f99d445d2e9 vtkPVRenderingCapabilitiesInformation::CopyFromObject(vtkObject*) + 9
6       0x7f99d445d2ab vtkPVRenderingCapabilitiesInformation::GetLocalCapabilities() + 427
5       0x7f99e8215db5 vtkOpenGLRenderWindow::SupportsOpenGL() + 1349
4       0x7f99e82c5082 vtkXOpenGLRenderWindow::WindowInitialize() + 18
3       0x7f99e82c8b76 vtkXOpenGLRenderWindow::CreateAWindow() + 1926
2       0x7f99d9cb542a abort + 362
1       0x7f99d9cb3fff gsignal + 207
0       0x7f99e17820e0 /lib/x86_64-linux-gnu/libpthread.so.0(+0x110e0) [0x7f99e17820e0]
(  45.117s) [paraview        ]                       :0     FATL| Signal: SIGABRT

I assume some of you have a similar setup. Did you get this error, and do you perhaps know how to solve it?

I might add: when I press the AsterStudy button, the GUI also crashes with a similar error.

Thanks a lot for any hints,

Mario.

Last edited by mf (2022-05-05 09:10:52)


#2 2022-05-08 09:26:01

hberro
Member
From: Palaiseau, France
Registered: 2011-07-05
Posts: 130

Re: GUI of SM Container crashing on headless system with Nvidia GPU

Hi Mario,

Indeed, what you are trying to do is tricky and requires further configuration on your headless server to properly forward the X application (salome) with GL rendering on the GPU.

First of all, you should know that in your case the X server runs on the local machine (a Linux or Windows X server), while the application runs within the container on the server. This in itself can be problematic, since the local machine does not have the same GPU capabilities as the server.
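
A quick way to see which renderer an X application actually gets in a given session is to look at the renderer string reported by glxinfo (part of the mesa-utils package on Ubuntu). The snippet below is only a sketch: renderer_from_glxinfo is a made-up helper name, and it assumes glxinfo is installed and DISPLAY is set in the session being tested.

```shell
#!/usr/bin/env bash
# Print the OpenGL renderer found in glxinfo output read from stdin.
# renderer_from_glxinfo is a hypothetical helper name, not part of any tool.
renderer_from_glxinfo() {
    sed -n 's/^OpenGL renderer string: *//p'
}

# Usage inside an "ssh -X" or VNC session (skipped if glxinfo is missing
# or no display is available).
if command -v glxinfo >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    glxinfo -B 2>/dev/null | renderer_from_glxinfo
fi
```

If the reported renderer is a software rasterizer such as llvmpipe, the GPU is not being used for that session.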

One solution is to use a VNC server instead of ssh X forwarding. Using VNC means that the X server is started and effectively runs **on the server machine**, and the rendered images are then sent to the local machine through the VNC protocol. You should get both better compatibility and better performance with this solution.

If you manage to set up the system using vnc, it could be interesting to share the principle so it can be added natively within the container. For example, to have a --vnc option that runs the vnc server and connects salome to the vnc display.

Good luck and let us know how it goes.

HB

Last edited by hberro (2022-05-08 09:26:31)


#3 2022-05-08 16:03:02

mf
Member
Registered: 2019-06-18
Posts: 324

Re: GUI of SM Container crashing on headless system with Nvidia GPU

Hello,

thank you for taking time to answer. I will try what you suggested and report back, although, at the moment, I have no idea what to do :-).

Mario.


#4 2022-05-09 21:07:56

mf
Member
Registered: 2019-06-18
Posts: 324

Re: GUI of SM Container crashing on headless system with Nvidia GPU

Hello,

so far it has been a nightmare. I tried tightvnc and tigervnc, but I could not get either of them to run.

I was able to get VNC to work following this tutorial for x11vnc:
https://www.youtube.com/watch?v=3K1hUwxxYek

Using vinagre on the client, I am currently getting a very odd resolution that I did not set anywhere (1360*768, see image).

However, the error from the first post persists. So I will keep on digging,

Mario.

Last edited by mf (2022-05-09 21:09:01)


Attachments:
Bildschirmfoto vom 2022-05-09 22-04-00-min.png, Size: 657.69 KiB, Downloads: 39


#5 2022-05-09 21:22:15

hberro
Member
From: Palaiseau, France
Registered: 2011-07-05
Posts: 130

Re: GUI of SM Container crashing on headless system with Nvidia GPU

That's already a good step in the right direction!

If I understood correctly, tightVNC is set up on the headless server (the Ubuntu we see in the image)?

Do you already get better performance in --soft mode using VNC (compared with ssh -X)?


#6 2022-05-10 06:00:00

mf
Member
Registered: 2019-06-18
Posts: 324

Re: GUI of SM Container crashing on headless system with Nvidia GPU

Hello,

no, the image above is a vinagre-window, in it you see the GUI of the server (x11vnc-server, like in the link above). The terminal inside this GUI shows the output of SalomeMeca in GPU mode after I pressed the ParaVis button (same error as in first post, more or less).

tightvnc: I could not get a connection to the server (connection refused, but maybe due to another problem). For viewing on the client I tried vncviewer-java and vinagre; vncviewer-java does not work properly. The same happened with tigervnc. I wanted to use tightvnc because it does not need a connected monitor, but that does not seem to be a problem for x11vnc either, at least at the moment.

I have only just now tried ssh -X into the server with SM in software rendering mode; performance is indeed better. However, the graphical error in the center window of AsterStudy (the Asterworkspace window) persists: as before, everything behind this window shines through. This is what you see in the attached image (I began writing this post earlier, and you can see it shining through the center AsterStudy window, along with the errors in the terminal output of SM).

Mario.

EDIT: Maybe I'll live with --soft via ssh -X for now, performance is OK :-), not as terrible as before.

EDIT2: WEIRD, after some button-pressing in SM (changing modules), the AsterStudy-Workspace window is also shown correctly! :-) Good!

Last edited by mf (2022-05-10 06:26:42)


Attachments:
Bildschirmfoto vom 2022-05-10 06-56-26.png, Size: 717.54 KiB, Downloads: 41


#7 2022-05-10 06:25:46

hberro
Member
From: Palaiseau, France
Registered: 2011-07-05
Posts: 130

Re: GUI of SM Container crashing on headless system with Nvidia GPU

Yes, this is what I had understood earlier.

Regarding the transparency of the mesh or Asterstudy viewer, this is a known issue and it should be enough to right click the window to get it back to the right size. If that doesn't work try switching back and forth to Paravis.

Best regards.


#8 2022-05-10 06:50:15

mf
Member
Registered: 2019-06-18
Posts: 324

Re: GUI of SM Container crashing on headless system with Nvidia GPU

Hello again,

oddly enough, with ssh -X and SM in GPU mode, NOW (I don't know why) ParaVis and AsterStudy do not crash (see image, SalomeSessionServer running on nvidia).

However, again, there are even more graphical errors than before (overlays of windows.... quite unusable).

So, for the moment I will stick to software rendering (P400 is pretty much out of work now :-( ).

I will do a write-up of what I have done with vnc etc. later,

Mario.


Attachments:
Bildschirmfoto vom 2022-05-10 07-45-32.png, Size: 537.58 KiB, Downloads: 37


#9 2022-05-10 08:34:16

mf
Member
Registered: 2019-06-18
Posts: 324

Re: GUI of SM Container crashing on headless system with Nvidia GPU

Hello,

here is what I did yesterday to get at least better performance from software rendering. The ultimate goal would be GPU rendering.

Everything that follows is done on the server, until the very end when we log into it from the client.
I started from a fresh install of Ubuntu 20.04 LTS Desktop. This is more convenient than the server version, as you'll need a GUI anyway. It will also install the Nvidia drivers if you have an Nvidia GPU.

I then installed openssh, which is not included in the desktop version. Be sure to activate the following in /etc/ssh/ssh_config (just uncommenting is enough):

ForwardX11 yes
ForwardX11Trusted yes
PasswordAuthentication yes

After a reboot you should be able to log into your server with ssh -X user@ip_address. Any software with a GUI should then be forwarded; you can test that with nautilus or nvidia-settings.
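
Before starting a GUI program, the forwarding itself can be checked from inside the ssh session by looking at DISPLAY. The sketch below uses a made-up helper name (looks_forwarded) and assumes sshd's default behaviour of setting DISPLAY to localhost:N for forwarded sessions (typically localhost:10.0); with other sshd settings the value can look different.

```shell
#!/usr/bin/env bash
# Return 0 if the given DISPLAY value looks like an ssh-forwarded display.
# looks_forwarded is a hypothetical helper name; the "localhost:N" pattern
# assumes sshd's default X11UseLocalhost setting.
looks_forwarded() {
    case "$1" in
        localhost:[0-9]*) return 0 ;;
        *)                return 1 ;;
    esac
}

if looks_forwarded "${DISPLAY:-}"; then
    echo "X11 forwarding appears active (DISPLAY=$DISPLAY)"
else
    echo "DISPLAY='${DISPLAY:-}' does not look like a forwarded display"
fi
```

If DISPLAY is empty after ssh -X, the server-side X11Forwarding option in /etc/ssh/sshd_config is worth checking; this is general OpenSSH behaviour rather than something verified on the setup described here.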

I then copied my SM container to the server, followed by installing singularity.

Then I bound my GPU to the X server. This is not necessary, but my GPU should do at least SOMETHING. Run nvidia-smi -a and search for the lines under PCI to find the PCI bus address of the GPU. I get this:

PCI
        Bus                               : 0x07
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x1CB310DE
        Bus Id                            : 00000000:07:00.0
        Sub System Id                     : 0x11BE10DE

Only the bus ID is needed. I then edited /usr/share/X11/xorg.conf.d/xorg.conf to:

Section "Device"
    Identifier  "Device0"
    Driver      "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:7:0:0"
EndSection

Section "Monitor"
    Identifier  "Configured Monitor"
    HorizSync 31.5-48.5
    VertRefresh 50-70
EndSection

Section "Screen"
    Identifier  "Default Screen"
    Monitor     "Configured Monitor"
    Device      "Configured Video Device"
    DefaultDepth 24
    SubSection "Display"
    Depth 24
    Modes "1920x1080"
    EndSubSection
EndSection

Only the Device section is important for now; I could not verify whether the Monitor and Screen sections do anything. Note the different notation of the BusID in xorg.conf (decimal PCI:bus:device:function, rather than the hexadecimal form printed by nvidia-smi).
After a reboot, Xorg should run on the GPU. You can verify this by running nvidia-smi (see one of the images above, where this can be seen).
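
The conversion between the two BusID notations can be sketched as a small shell function. nvidia-smi prints domain:bus:device.function in hexadecimal, while xorg.conf expects decimal numbers, so a bus such as 0A would become 10. The function name to_xorg_busid is made up for this illustration.

```shell
#!/usr/bin/env bash
# Convert an nvidia-smi bus ID (hex, "domain:bus:device.function") into the
# decimal "PCI:bus:device:function" form used in xorg.conf.
# to_xorg_busid is a hypothetical helper name.
to_xorg_busid() {
    local id="$1"
    local rest="${id#*:}"       # drop the domain      -> "07:00.0"
    local bus="${rest%%:*}"     # bus number (hex)     -> "07"
    local devfn="${rest#*:}"    # device.function      -> "00.0"
    local dev="${devfn%%.*}"    # device number (hex)  -> "00"
    local fn="${devfn#*.}"      # function number      -> "0"
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

to_xorg_busid "00000000:07:00.0"   # prints PCI:7:0:0
```

For the bus ID shown above both notations happen to look alike; the difference only becomes visible for bus numbers of 0A and higher.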

The next step is installing the VNC server. This can be done directly at the machine or over ssh. Basically, it is all done as in the tutorial linked above: first we install lightdm, reboot, and then install x11vnc:

sudo apt update
sudo apt install lightdm
sudo reboot
sudo apt install x11vnc

Then we edit the file

sudo nano /lib/systemd/system/x11vnc.service

and paste the following text into this file:

[Unit]
Description=x11vnc service
After=display-manager.service network.target syslog.target

[Service]
Type=simple
ExecStart=/usr/bin/x11vnc -forever -display :0 -auth guess -passwd password
ExecStop=/usr/bin/killall x11vnc
Restart=on-failure

[Install]
WantedBy=multi-user.target

Choose your own password. Take care if it contains anything like # or ": then you should write 'pass#"word' (with single quotation marks, the ones next to the return key, shift+#), otherwise you'll get an error!
Save and run:

systemctl daemon-reload
systemctl enable x11vnc.service
systemctl start x11vnc.service
sudo reboot now

To check whether the x11vnc server is running, you can use

systemctl status x11vnc.service
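
Besides the service status, it can help to confirm that something is actually listening on the VNC port. The following is a sketch under assumptions: port_is_listening is a made-up helper name, port 5900 is assumed because it is the usual first VNC port, and the parsing expects the column layout of ss -tln (local address in the fourth column).

```shell
#!/usr/bin/env bash
# Read "ss -tln" output on stdin; succeed if a socket listens on port $1.
# port_is_listening is a hypothetical helper name.
port_is_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }'
}

# 5900 is an assumption (usual first VNC port); adjust if x11vnc reports another.
if command -v ss >/dev/null 2>&1; then
    if ss -tln | port_is_listening 5900; then
        echo "something is listening on TCP port 5900"
    else
        echo "nothing is listening on TCP port 5900"
    fi
fi
```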

On your client you should be able to log in to your server with

ssh -X user@ip_address

and start SalomeMeca the usual way with

singularity run --app install salome_meca-lgpl-2021.0.0-0-20210601-scibian-9.sif
./salome_meca-lgpl-2021.0.0-0-20210601-scibian-9 --soft

Still, as described above, GPU rendering does not work properly. For now, this at least leads to a snappier interface. The attached image shows SM running on the server, with the GUI transferred via ssh -X and VNC. The center 3D model (which I cannot show entirely) is depicted correctly. The text in the title bar shows that it is running on the server ('auf HP-DL360P-G8' just means 'on HP-DL360P-G8').

Thank you hberro for the tips, I didn't know anything about vnc until yesterday. It might be a substitute for teamviewer also, at least in the internal network,

thanks again,

Mario.

Last edited by mf (2022-05-10 09:45:55)


Attachments:
Bildschirmfoto vom 2022-05-10 09-31-26.png, Size: 904.25 KiB, Downloads: 43
