
Those geneticists and their Excel

Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics

If you are too lazy to get to the punchline:

"MatchMiner [1] and GoMiner [2] are two bioinformatics program packages we
published recently in another Biomed Central Journal, Genome Biology. When we
were beta-testing those programs on microarray data, a frustrating problem
occurred repeatedly: Some gene names kept bouncing back as "unknown." A little
detective work revealed the reason: Use of one of the research community's most
valuable and extensively applied tools for manipulation of genomic data can
introduce erroneous names. A default date conversion feature in Excel (Microsoft
Corp., Redmond, WA) was altering gene names that it considered to look like
dates. For example, the tumor suppressor DEC1 [Deleted in Esophageal Cancer 1]
[3] was being converted to '1-DEC.' Figure 1 lists 30 gene names that suffer an
analogous fate."
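One way to guard against this before a gene list goes anywhere near a spreadsheet is to flag symbols that read as a month abbreviation followed by a day number. The sketch below is only an illustration: the pattern is a simplified assumption, not Excel's actual date parser, and the names `DATE_LIKE` and `looks_like_date` are made up for this example.

```python
import re

# Assumption: a month abbreviation followed by a day number approximates
# what a spreadsheet might treat as a date. This is NOT Excel's real parser.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")
DATE_LIKE = re.compile(r"^(%s)-?(\d{1,2})$" % "|".join(MONTHS), re.IGNORECASE)


def looks_like_date(symbol):
    """Return True if a gene symbol could be misread as a month and day."""
    match = DATE_LIKE.match(symbol.strip())
    if match is None:
        return False
    return 1 <= int(match.group(2)) <= 31


# DEC1 is the example from the quoted abstract; the others are controls.
symbols = ["DEC1", "TP53", "BRCA1"]
flagged = [s for s in symbols if looks_like_date(s)]
print(flagged)
```

Flagged symbols can then be checked by hand, or the whole column can be imported as text so that no conversion is applied.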


Popular posts from this blog

Flowing text in inkscape (Poster making)

You can flow text into arbitrary shapes in Inkscape. (From a hint here).

Create a text box and type your text into it, then draw a frame with any drawing tool. Select both the text box and the frame (click, then shift-click) and go to Text -> Flow into Frame.


The omnipresent anonymous asked:
Trying to enter sentence so that text forms the number three...any ideas?
The solution:
1. Type '3' using the text tool
2. Convert to path using object->path
3. Size as necessary
4. Remove fill
5. Ungroup
6. Type in actual text in new text box
7. Select the text and the '3' path
8. Flow the text

Drawing circles using matplotlib

Use the pylab.Circle command

    import pylab  # Imports matplotlib and a host of other useful modules

    # Creates a patch that looks like a circle (fc = face color)
    cir1 = pylab.Circle((0, 0), radius=0.75, fc='y')
    # Repeat (alpha=.2 means make it very translucent)
    cir2 = pylab.Circle((.5, .5), radius=0.25, alpha=.2, fc='b')
    # Creates empty axes (aspect=1 means scale things so that circles look like circles)
    ax = pylab.axes(aspect=1)
    ax.add_patch(cir1)  # Add the patch to the axes
    ax.add_patch(cir2)  # Repeat
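For scripts, the same picture can be drawn with the explicit `matplotlib.pyplot` and `matplotlib.patches` imports instead of `pylab`. This is a sketch of an equivalent, not the original post's code: the `Agg` backend and the axis limits of -1 to 1 are assumptions added here so that it runs without a display and shows both circles in full.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, so use a non-interactive backend
import matplotlib.pyplot as plt
from matplotlib.patches import Circle

fig, ax = plt.subplots()
ax.set_aspect("equal")  # equal scaling so that circles look like circles

# Same two circles as above: an opaque yellow one and a translucent blue one
cir1 = Circle((0, 0), radius=0.75, fc="y")
cir2 = Circle((0.5, 0.5), radius=0.25, alpha=0.2, fc="b")
ax.add_patch(cir1)
ax.add_patch(cir2)

# Assumption: explicit limits chosen to contain both circles
ax.set_xlim(-1, 1)
ax.set_ylim(-1, 1)
```

Call `fig.savefig(...)` with a file name of your choice to write the figure to disk.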

Pandas panel = collection of tables/data frames aligned by index and column

Pandas panel provides a nice way to collect related data frames together while maintaining correspondence between the index and column values:

    # Note: pd.Panel was deprecated and later removed from pandas,
    # so this code needs an older pandas release.
    import pandas as pd, pylab

    # Full dimensions of a slice of our panel
    index = ['1', '2', '3', '4']  # major_index
    columns = ['a', 'b', 'c']     # minor_index

    # A full slice of the panel
    df = pd.DataFrame(pylab.randn(4, 3), columns=columns, index=index)
    # A partial slice
    df2 = pd.DataFrame(pylab.randn(3, 2), columns=['a', 'c'], index=['1', '3', '4'])
    # Another partial slice
    df3 = pd.DataFrame(pylab.randn(2, 2), columns=['a', 'b'], index=['2', '4'])
    # Partial slice with a new column and index
    df4 = pd.DataFrame(pylab.randn(2, 2), columns=['d', 'e'], index=['5', '6'])

    pn = pd.Panel({'A': df})
    pn['B'] = df2
    pn['C'] = df3
    pn['D'] = df4

    for key in pn.items:
        print(pn[key])

-> output …
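Since `pd.Panel` is no longer available in current pandas, one possible substitute (a different technique, not what the post above does) is to stack the frames with `pd.concat` and a dict, which gives a single data frame with a two-level row index. The level names `item` and `row` and the seeded random generator are assumptions made for this sketch.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # assumption: seeded only for repeatability

# The same kind of full and partial slices as in the Panel example
frames = {
    "A": pd.DataFrame(rng.standard_normal((4, 3)),
                      columns=["a", "b", "c"], index=["1", "2", "3", "4"]),
    "B": pd.DataFrame(rng.standard_normal((3, 2)),
                      columns=["a", "c"], index=["1", "3", "4"]),
    "C": pd.DataFrame(rng.standard_normal((2, 2)),
                      columns=["a", "b"], index=["2", "4"]),
}

# The dict keys become the outer index level; columns are aligned by name,
# and columns missing from a frame are filled with NaN.
stacked = pd.concat(frames, names=["item", "row"])

# Pull one table back out by its key
print(stacked.loc["B"])
```

Here `stacked.loc["B"]` returns only the rows that `B` actually had, with column `b` filled with NaN, so column alignment is preserved but the row index is not padded out to the full set.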