Proposed Fix for GTK3 Crashes - dlopen flags

Hi all,

I've been working on investigating a couple of crashes that occur with wxPython 3.0.2.0 with a GTK+3 backend:
http://trac.wxwidgets.org/ticket/16820

There is a similar crash that occurs when using a wx.Printer.

Anyway, I've spent considerable time investigating these and found that the crash occurs because GTK performs some dlsym() calls to look up symbol names and fails in doing so because its symbols were loaded in the local namespace, not the global namespace (i.e., RTLD_LOCAL vs RTLD_GLOBAL). I've discussed this with the GTK devs and, basically, GTK expects its symbols to be loaded globally.

Thus, I have a solution that I would like to propose for discussion:

diff -up wxPython/src/__init__.py.dlopenflags wxPython/src/__init__.py
--- wxPython/src/__init__.py.dlopenflags 2013-02-27 15:14:01.000000000 -0500
+++ wxPython/src/__init__.py 2015-02-24 20:43:44.143492336 -0500
@@ -42,7 +42,11 @@ __all__ = [
      ]

  # Load the package namespace with the core classes and such
+import dl, sys
+flags = sys.getdlopenflags()
+sys.setdlopenflags(flags|dl.RTLD_GLOBAL)
  from wx._core import *
+sys.setdlopenflags(flags)
  del wx

  if 'wxMSW' in PlatformInfo:

···

===

Basically, this would add RTLD_GLOBAL to the dlopen flags before importing the core module and then restore the original flags afterwards. That way, the core module and its dependencies (including GTK) are loaded into the global namespace.

Comments on this proposal?

It needs more work but I wanted to get feedback before finalizing a patch (i.e., it probably only needs to be done when we're using a GTK+3 backend, it needs to check whether the dl module is available, etc.).

Thanks,
Scott

Okay, here's an updated version of the patch. Unfortunately, it isn't possible to check the backend (and only change the dlopen flags if on GTK+3) because the backend isn't known until _core.so is imported. After that point, it's too late.

It's possible that we could somehow have SWIG put this into the generated _core.py only for GTK+3, but I'm not sure how to do that.

Comments?

Thanks,
Scott

dlopenflags_v2.patch (1.2 KB)

Scott Talbert wrote:

Okay, here's an updated version of the patch. Unfortunately, it isn't
possible to check the backend (and only change the dlopen flags if on
GTK+3) because the backend isn't known until _core.so is imported. After
that point, it's too late.

It's possible that we could somehow have SWIG put this into the
generated _core.py only for GTK+3, but I'm not sure how to do that.

Comments?

I'm curious how pygtk handles this. Have you checked there?

IIRC, in the early days of OSX the default was the OSX equivalent of RTLD_GLOBAL, and it caused some hard-to-diagnose issues. The alternative at the time was a two-level namespace, and using that meant that extension modules couldn't find the Python C API functions, so it was even worse. So setting RTLD_GLOBAL worries me a bit.

--
Robin Dunn
Software Craftsman

I'm curious how pygtk handles this. Have you checked there?

PyGTK is GTK2 only. I don't think they started doing this dynamic UI
builder stuff until GTK3. In PyGObject (the recommended GTK3 replacement
for PyGTK), they basically do exactly this - dlopen all the GTK libraries
into the global namespace.
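
As a rough illustration of that idea (this is not PyGObject's actual code, just a sketch using ctypes), the libraries could be opened explicitly with RTLD_GLOBAL before the extension module is imported. The function name preload_global and the soname "libgtk-3.so.0" are assumptions; the soname in particular varies by distribution.

```python
import ctypes


def preload_global(library_names):
    """Try to dlopen each library with RTLD_GLOBAL; return those that loaded."""
    loaded = {}
    for name in library_names:
        try:
            loaded[name] = ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            # Library not installed under this name; skip it.
            continue
    return loaded


# The soname here is an assumption and may differ on a given system.
handles = preload_global(["libgtk-3.so.0"])
```

Whether opening the library again with RTLD_GLOBAL is enough when it has already been loaded locally depends on the dynamic loader, so this would need to run before _core is imported.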

IIRC in the early days of OSX the default was the OSX equivalent of
RTLD_GLOBAL, and it caused some hard to diagnose issues. The alternative
at the time was a 2 level namespace and using that meant that extension
modules couldn't find the Python C API functions, so it was even worse.
So setting RTLD_GLOBAL worries me a bit.

I too am a bit worried about the impacts of RTLD_GLOBAL. Thus, I suggested a less risky workaround patch to wxWidgets for the Font Chooser crash - posted to the bug report. It could probably be used in the wxPython C++ code if the wxWidgets guys won't take it.

Scott

···

On Tue, 3 Mar 2015, Robin Dunn wrote: