
Python: using a regexp to do pattern matching on a byte stream

Python regular expressions are a very concise and efficient way of performing pattern matching on strings. Many computing problems involve a similar kind of pattern matching, but on arbitrary data. For my particular application I have a long sequence of one-byte digital codes that record a series of runs of an experiment. Each run starts with the codes 9,9,9, followed by some codes telling me what happened during the experiment, and ends with 18,18,18. I need to split this long sequence of codes into runs and then parse the events in each run.

In the past I would have written a state machine to do this, but I thought, that's a waste: the regexp module already implements logic of this kind. So I came up with the following:

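  import array, re
  # tobytes() replaces tostring(), which was removed in Python 3.9
  ec = array.array('B', [ev & 0xff for ev in event_sequence]).tobytes()
  # bytes pattern (Python 3); DOTALL lets "." match code 10, the newline byte
  separate = re.compile(rb"\x09\x09\x09(.*?)\x12\x12\x12", re.DOTALL)
  trials = separate.findall(ec)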

array.array packs the sequence of codes into a byte string, one byte per code, and the regexp does its usual magic: findall returns the codes between each start and end marker, one entry per run.
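To see the approach end to end, here is a minimal self-contained sketch for Python 3. The sample codes in event_sequence are made up for illustration; only the 9,9,9 and 18,18,18 markers come from the description above. It uses bytes() instead of array.array, which should be equivalent for codes masked to 0..255, and it sets re.DOTALL, because without that flag "." refuses to match code 10 (the newline byte) and any run containing it is silently dropped.

```python
import re

# Hypothetical sample data: two runs, each framed by 9,9,9 ... 18,18,18.
# Code 10 is included on purpose: as a byte it is "\n", which "." only
# matches when re.DOTALL is set.
event_sequence = [1, 9, 9, 9, 4, 10, 7, 18, 18, 18, 2, 9, 9, 9, 5, 18, 18, 18]

# Build the byte string directly; the mask keeps each code in 0..255.
ec = bytes(ev & 0xFF for ev in event_sequence)

# Bytes pattern (rb"...") so it can be applied to a bytes object.
separate = re.compile(rb"\x09\x09\x09(.*?)\x12\x12\x12", re.DOTALL)

# Each match is a bytes object; list() turns it back into integer codes.
trials = [list(run) for run in separate.findall(ec)]
print(trials)  # [[4, 10, 7], [5]]

# The same pattern without DOTALL misses the run that contains code 10.
no_dotall = re.findall(rb"\x09\x09\x09(.*?)\x12\x12\x12", ec)
print(no_dotall)  # [b'\x05']
```

The non-greedy (.*?) matters here: a greedy (.*) would run from the first start marker to the last end marker and merge all runs into one.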
