Skip to main content

Use numpydoc + sphinx

An ideal code documentation system should allow you to write documentation once (when you are writing the code) and then allow you to display the documentation in different contexts, such as project manuals, inline help, command line help and so on. You shouldn't need to duplicate docs - it is a waste of effort and leads to errors when the code is updated. The documentation you write should be easily readable by human users as they peruse the code, as well as by users as they run your code.

Sphinx is an awesome documentation system for Python. Sphinx can take descriptions in docstrings and embed them in documentation so that we can approach the goals of an ideal documentation system.

However, Sphinx violates the human readability part when it comes to function parameter descriptions. Take the following function, for example.


def _repeat_sequence(seq_len=100, subseq=None, subseq_len=10, base_sel_rng=None, alphabet=['A', 'C', 'T', 'G']):
"""
Create a sequence by repeating a sub-sequence
"""
subseq = base_sel_rng.choice(alphabet, size=subseq_len, replace=True, p=[.3, .2, .2, .3]).tostring()
return subseq * (seq_len / subseq_len) + subseq[:seq_len % subseq_len]


The Sphinx-parsable way of writing the docstring is:

def _repeat_sequence(seq_len=100, subseq=None, subseq_len=10, base_sel_rng=None, alphabet=['A', 'C', 'T', 'G']):
  """Create a sequence by repeating a sub-sequence

  :param seq_len: Length of sequence.
  :type seq_len: int
  :param subseq: Sub-sequence to use as repeat block. Omit to generate a random sub-sequence.
  :type subseq: str
  :param subseq_len: If subseq is omitted this must be provided to indicate desired length of random sub-sequence
  :type subseq_len: Length of random sub-sequence
  :param base_sel_rng: Random number generator e.g. numpy.random
  :param alphabet: List of characters constituting the alphabet
  :type alphabet: list

  :returns:  str -- the sequence.
  """
  subseq = base_sel_rng.choice(alphabet, size=subseq_len, replace=True, p=[.3, .2, .2, .3]).tostring()
  return subseq * (seq_len / subseq_len) + subseq[:seq_len % subseq_len]

It does not matter that the Sphinx output is well formatted: for humans it's rather yucky to look at which defeats the purpose of docstrings.

It turns out that, as is common in Python, this problem has been noted and rectified. In this case the correction comes from the great folks at numpy who have a nice extension called numpydoc. The numpydoc version of this docstring is:

def repeat_sequence(seq_len=100, subseq=None, subseq_len=10, base_sel_rng=None, alphabet=['A', 'C', 'T', 'G']):
  """Create a sequence by repeating a sub-sequence

  Parameters
  ----------
  seq_len : int
                              Length of sequence.
  subseq : str, optional
                              Sub-sequence to use as repeat block. Omit to generate a random sub-sequence.
  subseq_len : int, optional
                              If subseq is omitted this must be provided to indicate desired length of random
                              sub-sequence
  subseq_len : int
                              Length of random sub-sequence
  base_sel_rng : object
                              Random number generator e.g. numpy.random
  alphabet : list, optional
                              List of characters constituting the alphabet

  Returns
  -------
  str
      The sequence.

  .. note:: This is meant to be used internally

  """
  subseq = base_sel_rng.choice(alphabet, size=subseq_len, replace=True, p=[.3, .2, .2, .3]).tostring()
  return subseq * (seq_len / subseq_len) + subseq[:seq_len % subseq_len]

Note: The only glitchy thing is that when you add numpydoc to the list of extensions in Sphinx's conf.py it can't be ahead of standard sphinx extensions - the order matters. For example, my extensions list is:

extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.doctest',
    'sphinx.ext.todo',
    'sphinx.ext.coverage',
    'sphinx.ext.pngmath',
    'sphinx.ext.mathjax',
    'sphinx.ext.viewcode',
    'numpydoc.numpydoc'
]

Putting numpydoc first leads to the error:
Extension error:
Unknown event name: autodoc-process-docstring

Comments

  1. Perhaps coming late here, but the solution (with Sphinx 1.3.5) is :

    extensions = [..., 'sphinx.ext.napoleon']

    Now you have additional options :

    http://www.sphinx-doc.org/en/stable/ext/napoleon.html

    ReplyDelete

Post a Comment

Popular posts from this blog

Flowing text in inkscape (Poster making)

You can flow text into arbitrary shapes in inkscape. (From a hint here).

You simply create a text box, type your text into it, create a frame with some drawing tool, select both the text box and the frame (click and shift) and then go to text->flow into frame.

UPDATE:

The omnipresent anonymous asked:
Trying to enter sentence so that text forms the number three...any ideas?
The solution:
Type '3' using the text toolConvert to path using object->pathSize as necessaryRemove fillUngroupType in actual text in new text boxSelect the text and the '3' pathFlow the text

Pandas panel = collection of tables/data frames aligned by index and column

Pandas panel provides a nice way to collect related data frames together while maintaining correspondence between the index and column values:


import pandas as pd, pylab #Full dimensions of a slice of our panel index = ['1','2','3','4'] #major_index columns = ['a','b','c'] #minor_index df = pd.DataFrame(pylab.randn(4,3),columns=columns,index=index) #A full slice of the panel df2 = pd.DataFrame(pylab.randn(3,2),columns=['a','c'],index=['1','3','4']) #A partial slice df3 = pd.DataFrame(pylab.randn(2,2),columns=['a','b'],index=['2','4']) #Another partial slice df4 = pd.DataFrame(pylab.randn(2,2),columns=['d','e'],index=['5','6']) #Partial slice with a new column and index pn = pd.Panel({'A': df}) pn['B'] = df2 pn['C'] = df3 pn['D'] = df4 for key in pn.items: print pn[key] -> output …

Drawing circles using matplotlib

Use the pylab.Circle command

import pylab #Imports matplotlib and a host of other useful modules cir1 = pylab.Circle((0,0), radius=0.75, fc='y') #Creates a patch that looks like a circle (fc= face color) cir2 = pylab.Circle((.5,.5), radius=0.25, alpha =.2, fc='b') #Repeat (alpha=.2 means make it very translucent) ax = pylab.axes(aspect=1) #Creates empty axes (aspect=1 means scale things so that circles look like circles) ax.add_patch(cir1) #Grab the current axes, add the patch to it ax.add_patch(cir2) #Repeat pylab.show()