Saturday, August 10, 2013

Manipulating pandas data structures

I really enjoy using the Pandas Series and DataFrame objects. I find, however, that methods to update the series/frame are clunky. For a DataFrame it's pretty easy to add columns - you create a DataFrame or a Series and you just assign it. But adding rows to a Series or DataFrame is a bit clunky.

I sometimes have the need to modify a certain row with new data or add that row if it does not exist, which in a database would be a 'replace or insert' operation. You can concat or append another Series or DataFrame but I have not found a nice way of handling the 'replace or insert' case.

If the structure is small I simply convert it into a dictionary and manipulate the structure using the dictionary keys and then recreate the pandas structure.

If the structure is large I do an explicit test for the index (row) and then decide whether to append or replace.


No comments:

Post a Comment