Skip to main content


Showing posts from September, 2014

An annoying thing with Python slices

You of course know that Python slices are awesome:

a = 'ABCDEFG' a[:3] -> 'ABC' a[2:5] -> 'CDE'
And more interestingly:

a[-3:] -> 'EFG'

a[6:4:-1] -> 'GF'
But you can see that the reverse slicing is starting to stretch the fence-post we are familiar with. Python uses zero based, inclusive-exclusive indexing. This corresponds to a C syntax of (for i = n; i < m; i++). When you reverse it the slice goes (for i = m - 1; i > n - 1; i--).

As you can imagine this starts to get ugly and at one point it gets to be wrong:

Say, as is often the case, you are not taking static, pre-determined slices but rather slices determined at runtime. Say you are taking slices between n and m or [n, m).

The forward slice is a[n:m]
The backward slice is a[m-1:n-1:-1] right? Because of the fence posts?

Well yes, except what happens when n = 0? The forward slice is fine but the reverse slice resolves to a[m-1:-1:-1]

This is where Python becomes a littl…

Mac OS + 'cat' + 'sed' + \n = half-assed

You guys all know how Mac OS darwin does everything JUST a little differently from the *nixes. It's close enough to draw you in, and different enough to stab you in the back. Today's case sed and \n

I needed to cat some files together but I needed a newline between them. I asked my colleague Wan-Ping for a command that would do this and she suggested

cat 1.fa 2.fa 3.fa | sed 's/^>/\n>/g'
So I did this and the sucker added the character 'n' wherever I expected a newline.

It turns out that Macs are special little snow flakes and need a special little command:

cat chr*.fa | sed 's/^>/\'$'\n>/g' > hg38.fa
The magic sauce is the '$' that escapes the '\n'


Fixing the big mess with git and case insensitive filesystems

Mac OS X by default is a case insensitive file system. But Mac OS, as in a lot of other things, makes a half-assed job of this. In addition to causing various bits of confusion when creating directories it also leads to a potentially messy situation with git. This is how things happen:

1. You create a directory in your source tree called, say, Plugins with a capital "P".
2. After a few commits you decide that it's better to change this to a lower case "p": plugins
3. When you go to commit this rename (perhaps with a few other changes you implemented) git throws a hissy fit.

After a bit of searching on stack overflow it turns out that this is all related to Mac OS's case-insensitivity.

The cleanest fix I found on stack overflow was:

git mv Plugins temp00 git mv temp00 plugins git commit
Apparently this fools git's index into doing the change, where as git mv Plugins plugins - because the underlying file system does not recognize the difference - tells gi…