
Notes on distributing Cython code

One of the conveniences of Python is its package system, which allows you to install your program and any dependencies smoothly. The package system works very well when the code is pure Python, but can run into trouble when code written in Cython or C is part of the program.

I will illustrate some missteps I made while writing an install script for an example program that is a mixture of Python and Cython. I've put the code up on GitHub, and each step is a commit tag. You can follow along by setting up a virtual environment using virtualenvwrapper:

mkvirtualenv cy-test

And then trying to install the appropriate tag, e.g.:

git clone git@github.com:kghose/cython-example.git
cd cython-example
git checkout ex2
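
followed by the actual install into the virtualenv (I'm assuming the standard route here; adjust to taste):

python setup.py install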

ex1

The module installs without errors, but because I did not indicate the paths of the Cython files properly (I omitted the kgcyex directory from the path) the Cython files do not compile. You will notice this because there are no compilation messages during the install, though the failure is otherwise silent:
kghose$ kgcyex
Traceback (most recent call last):
  File "/Users/kghose/.venvs/blog/bin/kgcyex", line 9, in <module>
    load_entry_point('kgcyex==1.0.0', 'console_scripts', 'kgcyex')()
  File "/Users/kghose/.venvs/blog/lib/python2.7/site-packages/pkg_resources.py", line 356, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/Users/kghose/.venvs/blog/lib/python2.7/site-packages/pkg_resources.py", line 2431, in load_entry_point
    return ep.load()
  File "/Users/kghose/.venvs/blog/lib/python2.7/site-packages/pkg_resources.py", line 2147, in load
    ['__name__'])
  File "/Users/kghose/.venvs/blog/lib/python2.7/site-packages/kgcyex/main.py", line 2, in <module>
    import kgcyex.cy1 as cy1
ImportError: No module named cy1
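
Roughly, the mistake amounts to listing the Cython sources without their package directory. The sketch below is my reconstruction of the kind of setup.py that produces this, not the actual ex1 file:

from setuptools import setup, find_packages
from Cython.Build import cythonize

setup(
    name='kgcyex',
    version='1.0.0',
    packages=find_packages(),
    # These paths omit the kgcyex/ package directory, so the .pyx
    # sources are never picked up and no extension modules get built
    ext_modules=cythonize(['cy1.pyx', 'lib/cy2.pyx']),
)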

ex2

I correctly write out the full paths of the Cython modules, and everything installs and runs fine:
kghose$ kgcyex
foo from kgcyex.mod1
foo from kgcyex.cy1
foo from kgcyex.lib.mod2
foo from kgcyex.lib.cy2
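
The fix amounts to giving the source paths relative to the project root. Roughly (again a sketch rather than the verbatim file), the relevant line becomes:

from Cython.Build import cythonize

# Paths now include the package directories; cythonize() derives the
# dotted module names kgcyex.cy1 and kgcyex.lib.cy2 from these paths
ext_modules = cythonize(['kgcyex/cy1.pyx', 'kgcyex/lib/cy2.pyx'])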

ex3

Suppose the other user does not have Cython? The Cython documentation suggests that we distribute the generated C code along with the source. There is some debate as to whether this is "proper", since the .c files are actually generated from the .pyx files and in principle we should only be distributing files which can not be auto-generated from the "real" source. For now, we put pragmatism over principle. Note that setup.py changes a bit.
If you read the setup.py you will note that I have used a check to test whether the user has Cython or not. This check then tells setup to use either the .pyx files or the .c files. This is standard stuff recommended by the Cython folks, and it is sketched below.
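Something along these lines (the variable names are mine; the repository's setup.py at tag ex3 has the actual code):

try:
    from Cython.Build import cythonize
    USE_CYTHON = True
except ImportError:
    USE_CYTHON = False

# Compile from .pyx if Cython is available, otherwise fall back to
# the shipped, pre-generated .c files
ext = '.pyx' if USE_CYTHON else '.c'

With ext set, look carefully at the line in setup.py where I add the extensions: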
extensions = [Extension("cy1", ["kgcyex/cy1"+ext]), Extension("cy2", ["kgcyex/lib/cy2"+ext])]
Things compile properly because I've remembered to indicate the proper path to the .pyx (or .c) files. When we run setup.py we can see the modules being compiled. But what the #$%@! When we go to run the code it again complains that it can not find the compiled modules! In real life this error cost me about an hour :(
My error was that though I had correctly indicated the path to the source (the second parameter to Extension), I had not given the proper dotted path for the modules themselves (the first parameter). If you look under site-packages of your installation you will note that the two compiled modules, cy1.so and cy2.so, sit directly under site-packages rather than in their proper places under kgcyex and kgcyex/lib. The correct form of this line is given under ex4:

ex4

extensions = [Extension("kgcyex.cy1", ["kgcyex/cy1"+ext]), Extension("kgcyex.lib.cy2", ["kgcyex/lib/cy2"+ext])]




