Code Painters The Art of Coding

20 Nov 2012

Gevent monkey patching versus sniffer & nose

In case you don't already know, gevent is a great library built on top of the greenlet module and the libevent event loop that allows for easy coroutine-based cooperative multitasking in networked applications. By the way, you know the C10K paper, don't you?

One of gevent's important features is monkey patching: the ability to patch some standard Python modules to make them cooperative. There's no magic here; thanks to Python's dynamic nature, gevent simply replaces the standard APIs with its own cooperative equivalents. Existing blocking code can then be reused without any modifications. Cool, isn't it? Most notably, gevent patches the socket module (replacing blocking I/O with libevent-based asynchronous I/O) and the thread and threading modules (replacing threads with greenlets).
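To make the mechanism concrete, here is a minimal sketch of the idea using only the standard library. It does not use gevent at all: the `fakesocket` module and both `recv` functions are made-up stand-ins, and the "patch" is just rebinding a module attribute, which is the general technique rather than gevent's actual code.

```python
import types

# A stand-in for a standard module with a blocking call.
# "fakesocket", "blocking_recv" and "cooperative_recv" are hypothetical names.
fakesocket = types.ModuleType("fakesocket")


def blocking_recv():
    return "blocking"


def cooperative_recv():
    return "cooperative"


fakesocket.recv = blocking_recv


def fetch():
    # "Existing code": it looks up fakesocket.recv every time it is called.
    return fakesocket.recv()


before = fetch()

# The monkey patch: rebind the module attribute to the replacement.
fakesocket.recv = cooperative_recv

after = fetch()

print("%s -> %s" % (before, after))
```

Note that `fetch()` itself was never modified; it picks up the replacement simply because it resolves the attribute at call time.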

There are some gotchas, however. Because of the way the patching works, it is important to patch all the necessary modules before they are imported by any third-party library you want to use. For example, this snippet works fine:

import gevent.monkey; gevent.monkey.patch_thread()
import threading

It is enough, however, to swap the lines:

import threading
import gevent.monkey; gevent.monkey.patch_thread()

Now this simple two-line snippet triggers an exception at exit:

Exception KeyError: KeyError(3072570908L,) in <module 'threading' from
 '/usr/lib/python2.7/threading.pyc'> ignored

Why does this happen? When the threading module is imported, it uses the main thread's ID as a key in a module-level dictionary of threads. At exit time, it tries to obtain the thread instance from this dictionary (using the current thread ID) to perform some cleanup, but the ID of the main thread is no longer the same, thanks to patching. If patching is performed before importing the threading module (as in the first example above), all is fine, because the thread ID stored at import time already comes from the patched thread implementation. For more details, see my StackOverflow answer.
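The failure can be modelled in a few lines without gevent. This is only a simplified sketch of the explanation above: `FakeThreading`, its attributes, and the two ID functions are invented for illustration and do not mirror the real threading internals, and the IDs are arbitrary numbers.

```python
def original_get_ident():
    # Stand-in for the real thread ID of the main thread.
    return 1001


def patched_get_ident():
    # Stand-in for the ID reported after patching.
    return 2002


class FakeThreading(object):
    """Toy model of a module that registers the main thread at import time."""

    def __init__(self, get_ident):
        self.get_ident = get_ident
        # "Import time": store the main thread under its current ID.
        self.active = {self.get_ident(): "MainThread"}

    def shutdown(self):
        # "Exit time": look the main thread up by the current ID.
        return self.active[self.get_ident()]


# Case 1: import first, patch afterwards (the failing order).
late = FakeThreading(original_get_ident)
late.get_ident = patched_get_ident
try:
    late_result = late.shutdown()
except KeyError as exc:
    late_result = "KeyError(%r)" % exc.args[0]

# Case 2: patch first, import afterwards (the working order).
early = FakeThreading(patched_get_ident)
early_result = early.shutdown()

print(late_result)
print(early_result)
```

In the first case the dictionary still holds the entry stored under the old ID, so the lookup with the new ID fails; in the second case both the stored and the looked-up ID come from the same patched function.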

The two-line example was trivial; things get nastier in real-life scenarios. I came across this problem again recently when trying to use sniffer for testing. It's enough to patch threading in the test file to trigger the problem. Let's start with the following test file, test.py:

import gevent.monkey; gevent.monkey.patch_thread()
import unittest

class TestTesting(unittest.TestCase):

    def test_nose(self):
        self.assertTrue(True)

Sniffer works fine when started; nothing hints at the problems to come:

Did not find 'scent.py', running nose:
.
----------------------------------------------------------------------
Ran 1 test in 0.009s

OK
In good standing

But it's enough to touch the test file to make it explode (there are more exceptions, but I've copied only the first one for brevity):

Traceback (most recent call last):
  File "/home/czajnik/env/local/lib/python2.7/site-packages/sniffer/runner.py", line 95, in _run
    if self.run():
  File "/home/czajnik/env/local/lib/python2.7/site-packages/sniffer/runner.py", line 177, in run
    return super(ScentSniffer, self).run()
  File "/home/czajnik/env/local/lib/python2.7/site-packages/sniffer/runner.py", line 115, in run
    import nose
  File "/home/czajnik/env/local/lib/python2.7/site-packages/nose/__init__.py", line 1, in <module>
    from nose.core import collector, main, run, run_exit, runmodule
  File "/home/czajnik/env/local/lib/python2.7/site-packages/nose/core.py", line 11, in <module>
    from nose.config import Config, all_config_files
  File "/home/czajnik/env/local/lib/python2.7/site-packages/nose/config.py", line 8, in <module>
    from nose.util import absdir, tolist
  File "/home/czajnik/env/local/lib/python2.7/site-packages/nose/util.py", line 19, in <module>
    log = logging.getLogger('nose')
  File "/usr/lib/python2.7/logging/__init__.py", line 1555, in getLogger
    return Logger.manager.getLogger(name)
  File "/usr/lib/python2.7/logging/__init__.py", line 1022, in getLogger
    _acquireLock()
  File "/usr/lib/python2.7/logging/__init__.py", line 217, in _acquireLock
    _lock.acquire()
  File "/usr/lib/python2.7/threading.py", line 122, in acquire
    me = _get_ident()
  File "/home/czajnik/env/local/lib/python2.7/site-packages/gevent/thread.py", line 28, in get_ident
    return id(getcurrent())
TypeError: 'NoneType' object is not callable

Clearly, the logging module relies on the threading module (fair enough, it tries to make the logging API thread-safe). Unfortunately, not much can be done here, because all these imports are performed before our test module even gets loaded. As a workaround, I've hacked nose/core.py, adding this line just before the logging import:

  import gevent.monkey; gevent.monkey.patch_thread()

Now sniffer works fine, yet it's obviously a hack. For my project it's enough, as I use a separate virtual environment and can thus hack around without consequences. But this problem definitely calls for a clean solution.
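Short of a real fix, a small guard can at least make the ordering problem visible before it bites. The following is only a sketch under my own assumptions: `modules_imported_too_early` is a hypothetical helper, not part of gevent, nose, or sniffer, and it merely reports which of the modules about to be patched are already present in `sys.modules`.

```python
import sys


def modules_imported_too_early(names, loaded=None):
    """Return those module names from `names` that are already imported.

    `loaded` defaults to sys.modules; passing a dict makes the check
    easy to try out in isolation.
    """
    if loaded is None:
        loaded = sys.modules
    return [name for name in names if name in loaded]


# Usage sketch: call this right before patching and complain loudly.
too_early = modules_imported_too_early(["threading", "thread"])
if too_early:
    print("imported before patching: %s" % ", ".join(too_early))
```

It doesn't solve anything by itself, but a message at patch time is easier to act on than a traceback from deep inside logging.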
