Python regular expressions are a concise and efficient way of performing pattern matching on strings. Many computing problems involve a similar kind of pattern matching, but on arbitrary data. For my particular application I have a long sequence of one-byte digital codes that record the runs of an experiment. Each run starts with the codes 9,9,9, followed by some codes telling me what happened during the run, and ends with 18,18,18. I need to split this long sequence of codes into runs and then parse the events in each run.
In the past I would have written a state machine to do this, but I thought, that's a waste: the regexp module already implements logic of this kind. So I came up with the following:
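import array, re
# Pack the codes into a bytes object, one byte per code (tostring() was removed in Python 3.9)
ec = array.array('B', [ev & 0xff for ev in event_sequence]).tobytes()
# Bytes pattern for bytes input; DOTALL lets "." also match code 10 (newline)
separate = re.compile(rb"\x09\x09\x09(.*?)\x12\x12\x12", re.DOTALL)
trials = separate.findall(ec)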
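array.array packs the sequence of codes into a byte string, one byte per code, and the regexp does its usual magic: findall returns the codes between each 9,9,9 and 18,18,18 pair, one entry per run.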
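To make the behaviour concrete, here is a small sketch of the same idea run on made-up data. The sample event_sequence and its event codes (1, 10, 3, 7) are hypothetical, chosen only for illustration, and the sketch assumes Python 3, where the packed codes are a bytes object and the pattern must be a bytes pattern too. One of the sample codes is 10, the newline byte, which "." only matches when re.DOTALL is set.

```python
import array
import re

# Hypothetical sample data: two runs framed by 9,9,9 and 18,18,18,
# with stray codes (0 and 5) outside the runs.
event_sequence = [0, 9, 9, 9, 1, 10, 3, 18, 18, 18, 5, 9, 9, 9, 7, 18, 18, 18]

# Pack the codes into a bytes object, one byte per code.
ec = array.array('B', [ev & 0xff for ev in event_sequence]).tobytes()

# Non-greedy match of everything between the start and end markers.
# re.DOTALL is needed so that "." also matches code 10 (newline).
separate = re.compile(rb"\x09\x09\x09(.*?)\x12\x12\x12", re.DOTALL)

# Each match is a bytes object; list() turns it back into integer codes.
trials = [list(run) for run in separate.findall(ec)]
print(trials)  # [[1, 10, 3], [7]]
```

Each entry of trials is then an ordinary list of integer codes, ready for whatever per-run parsing comes next.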