Docstring and Doc plans

Robin · November 14, 2003, 6:33pm

Hi all,

I figured that shedding a bit more light on my documentation plans may help with the param with-types/without-types question and other questions for the docstrings, so here it is.

1. (mostly complete) Use SWIG to generate intelligent docstrings based on the information in the parse tree. SWIG already knows about things like renamed overloaded methods, parameter names and types, etc. so the smart thing to do is to use SWIG's knowledge to do the vast majority of the work. That way when something changes in the code the docstring is automatically updated on the next build. The autodoc strings can be added to or replaced with a docstring directive.

2. (ready to be started) The next step is to go through and fix the autodoc strings that need it (for example change "GetViewStart(OUTPUT, OUTPUT) -> None" to "GetViewStart() -> (x,y)") and then add simple one- or two-sentence descriptions to all the functions/methods that give at least a summary of what they do. I think that this is all that needs to be in the docstrings, and is all that needs to be used in call tips, etc.

3. (still just a gleam in the eye) For those items that do need more detailed docs, then those details (written in something like ReST or epydoc format) will also be added to the SWIG parse tree, but will not be used at all for code generation. Instead they will be extracted from an XML dump of the parse tree, along with the actual docstrings from #1/#2 and a bit of additional metadata, and then they will be put into a more simplified XML doc. (I would also like to extract metadata and docstrings from the lib modules and put them in this same XML format.) Then tools can be written to transform that to full reference docs in HTML, PDFs or whatever, or apps like Boa can just use the XML directly.
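The fix-up described in step 2 could be partly scripted for the common case. This is only a minimal sketch: it assumes the autodoc string always has the form "Name(args) -> result" and that the names for the output values are supplied by hand. `fix_output_params` is a hypothetical helper, not part of SWIG or wxPython.

```python
import re

# Matches autodoc signatures of the assumed form "Name(arg, arg) -> Result".
_SIG = re.compile(r"^(?P<name>\w+)\((?P<args>.*)\)\s*->\s*(?P<ret>.+)$")


def fix_output_params(autodoc, output_names):
    """Drop OUTPUT placeholders and report them as a returned tuple.

    Hypothetical helper: output_names must be supplied by hand, one per
    OUTPUT placeholder, because the autodoc string does not carry them.
    """
    match = _SIG.match(autodoc.strip())
    if match is None:
        return autodoc  # Unknown shape: leave the string untouched.
    args = [a.strip() for a in match.group("args").split(",") if a.strip()]
    kept = [a for a in args if a != "OUTPUT"]
    n_outputs = len(args) - len(kept)
    if n_outputs == 0 or n_outputs != len(output_names):
        return autodoc  # Nothing to fix, or the supplied names do not line up.
    # The original return annotation is replaced by the tuple of outputs.
    ret = "(" + ",".join(output_names) + ")"
    return "%s(%s) -> %s" % (match.group("name"), ", ".join(kept), ret)


print(fix_output_params("GetViewStart(OUTPUT, OUTPUT) -> None", ["x", "y"]))
```

Strings that do not match the assumed shape are returned unchanged, so anything unusual would still need a hand edit.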

The detailed docs are where things like parameter types can be documented if they are left out of the autodoc (but then they would be hand-edited and not automatically maintained...), along with things like See Also's, style flags, event types, etc.
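To make step 3 more concrete, here is a rough sketch of the "dump to simplified XML" transformation. The input below is a made-up stand-in, not SWIG's actual XML output, and every element and attribute name (`class`, `method`, `autodoc`, `docstring`, `details`, `internal`) as well as the sample text is an assumption for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a parse-tree dump; SWIG's real XML output differs.
# The docstring and details text are invented sample data.
RAW = """
<module name="core">
  <class name="ScrolledWindow">
    <method name="GetViewStart" autodoc="GetViewStart() -> (x,y)">
      <docstring>Get the view start.</docstring>
      <details>Returns the position in scroll units.</details>
      <internal wrapper="_wrap_ScrolledWindow_GetViewStart"/>
    </method>
  </class>
</module>
"""


def simplify(raw_xml):
    """Keep only what reference docs need: names, signatures, and text."""
    src = ET.fromstring(raw_xml)
    out = ET.Element("docs", module=src.get("name", ""))
    for cls in src.iter("class"):
        cls_out = ET.SubElement(out, "class", name=cls.get("name"))
        for meth in cls.findall("method"):
            m_out = ET.SubElement(cls_out, "method", name=meth.get("name"),
                                  signature=meth.get("autodoc", ""))
            # Copy the short docstring and the longer details if present;
            # code-generation data such as <internal> is dropped.
            for tag in ("docstring", "details"):
                text = meth.findtext(tag)
                if text:
                    ET.SubElement(m_out, tag).text = text.strip()
    return out


simple = simplify(RAW)
print(ET.tostring(simple, encoding="unicode"))
```

Back-end tools (HTML, PDF, or an IDE such as Boa) would then only need to understand the small output format rather than the full parse tree.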

Questions: Is there already a simple XML format for #3 defined that we can use? Does anybody feel like starting work on the lib module extraction and html conversion tools for this?


--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

Kevin_Ollivier4 · November 14, 2003, 7:12pm

Hi Robin,

> Questions: Is there already a simple XML format for #3 defined that we can use? Does anybody feel like starting work on the lib module extraction and html conversion tools for this?

DocBook is the closest thing I can think of to what you're looking for here, in that it does have some HTML, PDF, etc. conversion tools available and has some specific tags for software and API documentation. #3 obviously will need some fleshing out before we make some solid decisions on what to do, but I wouldn't mind helping on the conversion aspect of this. In fact, I'm getting ready to produce some API docs for wxMozilla and since I'll need Python docs anyways this might be a good way to start exploring how that could be automated. (If I can run both with and without types, I could probably use it for C++ docs as well.) However, before that I will be looking at the Mac issues for 2.5.1 so it is still a gleam in my eye as well. If someone wants to take initiative here and look into this before I can, then please do. =)

Will the CVS SWIG generate docstrings as is, or is that part of the patch you mentioned in a previous message?

Thanks,

Kevin


On Nov 14, 2003, at 10:33 AM, Robin Dunn wrote:

Patrick_K_O_Brien · November 14, 2003, 7:44pm

Kevin Ollivier <kevino@tulane.edu> writes:

DocBook is the closest thing I can think of to what you're looking
for here, in that it does have some HTML, PDF, etc. conversion tools
available and has some specific tags for software and API
documentation.

Using DocBook would be killing the fly with a sledgehammer. Better
watch your fingers!

Better yet, look at reST or epydoc.

http://epydoc.sourceforge.net/

Twisted uses epydoc conventions for documenting types and other
parameter requirements in docstrings. I think epydoc is as close to a
Python standard as one will find. And Twisted is one heck of a big
framework. So we would be in good company.

http://twistedmatrix.com/documents/TwistedDocs/current/api/

Epydoc has already invented a wheel that works. I'd like to see
reasons why people would rather not use epydoc before we try to invent
our own wheel. Remember, the goal is to have an entire automobile
that works...
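For readers who have not seen the convention, epytext documents parameters and types with `@param`, `@type`, `@return` and `@rtype` fields inside the docstring. The function below is an invented example, and `parse_fields` is a deliberately simplified matcher written for this illustration; it is not epydoc's own parser and only handles single-line fields.

```python
import re


def set_scroll_rate(x_step, y_step):
    """Set the scrolling increments (invented example function).

    @param x_step: horizontal increment
    @type x_step: int
    @param y_step: vertical increment
    @type y_step: int
    """


# Simplified field matcher for illustration; epydoc's own parser does far more.
_FIELD = re.compile(r"^\s*@(\w+)(?:\s+(\w+))?:\s*(.*)$")


def parse_fields(docstring):
    """Collect (tag, argument, text) triples from single-line epytext fields."""
    fields = []
    for line in (docstring or "").splitlines():
        match = _FIELD.match(line)
        if match:
            fields.append(match.groups())
    return fields


for field in parse_fields(set_scroll_rate.__doc__):
    print(field)
```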


--
Patrick K. O'Brien
Orbtech http://www.orbtech.com/web/pobrien
-----------------------------------------------
"Your source for Python programming expertise."
-----------------------------------------------

Robin · November 14, 2003, 7:58pm

Kevin Ollivier wrote:

Will the CVS SWIG generate docstrings as is, or is that part of the patch you mentioned in a previous message?

No, I haven't even submitted it as a patch yet. When I think it is ready to go I'll do that and also put the patch in the wx CVS in the wxWindows/wxPython/SWIG dir so that you can patch your own copy of SWIG.


--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

Robin · November 14, 2003, 8:40pm

Patrick K. O'Brien wrote:

Kevin Ollivier <kevino@tulane.edu> writes:

DocBook is the closest thing I can think of to what you're looking
for here, in that it does have some HTML, PDF, etc. conversion tools
available and has some specific tags for software and API
documentation.

Using DocBook would be killing the fly with a sledgehammer. Better
watch your fingers!

I agree. Although if the tools took the simpler xml structure that I'll be able to generate from SWIG's xml and converted it to a subset of DocBook that might be okay.

Better yet, look at reST or epydoc.

http://epydoc.sourceforge.net/

Twisted uses epydoc conventions for documenting types and other
parameter requirements in docstrings. I think epydoc is as close to a
Python standard as one will find. And Twisted is one heck of a big
framework. So we would be in good company.

Pat, I have a few goals for the reference docs, some of which I alluded to earlier in #3, maybe you could think about these and see if there is a good fit based on what you know about reST and epydoc.

1. I want the text content to be easy to edit and read as plain text with inline markup that doesn't get in the way. This obviously leads to something like reST or epy.

2. For various reasons I don't want the reference docs to be generated by scanning the real modules in the real wx package. Firstly because then we would have to put way too much text in some of the docstrings, or scrimp and not put enough because they are docstrings too. Secondly tools that I've seen for that will document *everything* and the docs end up bloated with nodes for things like wxNO_BORDER, or wxWindowPtr with no text about it, and things like that are better documented with the class or method where they are supposed to be used anyway. I'm sure there are ways to avoid having the tools do that, but then we (I) would have to maintain or generate yet another list of things...

3. I think that the intermediate XML doc with all the data for the class/method docs is important, but we end up with lots of little snippets of reST or epytext that need to be combined in some way that leads to a coherent set of reference docs with an intuitive organizational structure, cross-links and such. Are any of the existing tools able to handle a situation such as this? I suppose one approach is to take the XML and generate a fake python wx package with just the items that we want documented and then use that with the existing tools, but that sounds icky. The approach I was originally thinking of is that as part of the process of building that XML it will convert each of the little snippets of content text into reST's XML format (it does have one doesn't it?) and then make that structured XML be a part of the overall structure, in addition to the unconverted raw text. Finally, converting the whole thing to whatever format is wanted on the back end really shouldn't be all that hard.

Thoughts?


--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

Patrick_K_O_Brien · November 14, 2003, 9:27pm

Robin Dunn <robin@alldunn.com> writes:

Pat, I have a few goals for the reference docs, some of which I
alluded to earlier in #3, maybe you could think about these and see
if there is a good fit based on what you know about reST and epydoc.

1. I want the text content to be easy to edit and read as plain text
with inline markup that doesn't get in the way. This obviously
leads to something like reST or epy.

And epydoc/epytext recognizes reST conventions.

2. For various reasons I don't want the reference docs to be
generated by scanning the real modules in the real wx package.
Firstly because then we would have to put way too much text in some
of the docstrings, or scrimp and not put enough because they are
docstrings too.

I understand. But since most of these tools expect to work with
valid Python modules, we could think about producing versions of the
modules that existed strictly to produce documentation. Code is just
text, after all. I pulled a variety of tricks to produce the epydocs
for wxPython that are on my website (and are now quite out of date).

Secondly tools that I've seen for that will document *everything*
and the docs end up bloated with nodes for things like wxNO_BORDER,
or wxWindowPtr with no text about it, and things like that are
better documented with the class or method where they are supposed
to be used anyway. I'm sure there are ways to avoid having the
tools do that, but then we (I) would have to maintain or generate
yet another list of things...

It seems to me that we're going to have that challenge no matter what
tool is used, whether reST, epy, DocBook, or custom code.

3. I think that the intermediate XML doc with all the data for the
class/method docs is important, but we end up with lots of little
snippets of reST or epytext that need to be combined in some way
that leads to a coherent set of reference docs with an intuitive
organizational structure, cross-links and such. Are any of the
existing tools able to handle a situation such as this?

Not that I know of. It seems that two things are always needed: a
reference doc that is strictly for looking up information, and a book
that can actually be read. It's very hard to combine the two,
especially with an automated tool. I don't know of any good solution
to this problem.

I suppose one approach is to take the XML and generate a fake python
wx package with just the items that we want documented and then use
that with the existing tools, but that sounds icky.

Replace "icky" with "clever". Then document the approach, call it a
"pattern", and give it a cool-sounding name.
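A rough sketch of what the "fake package" generator might look like, assuming a simplified doc XML already exists. The XML schema and sample text here are invented for the example, and `generate_stub` is a hypothetical name; the point is only that the generated code exists to carry docstrings for tools that insist on importing real Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical simplified doc XML; the element and attribute names are made up.
DOCS = """
<docs module="core">
  <class name="ScrolledWindow">
    <method name="GetViewStart" signature="GetViewStart() -> (x,y)">
      <docstring>Get the view start.</docstring>
    </method>
  </class>
</docs>
"""


def generate_stub(xml_text):
    """Emit Python source whose only purpose is to carry docstrings."""
    root = ET.fromstring(xml_text)
    lines = []
    for cls in root.findall("class"):
        lines.append("class %s(object):" % cls.get("name"))
        methods = cls.findall("method")
        if not methods:
            lines.append("    pass")
        for meth in methods:
            doc = "%s\n\n%s" % (meth.get("signature", ""),
                                (meth.findtext("docstring") or "").strip())
            lines.append("    def %s(*args, **kwargs):" % meth.get("name"))
            # repr() yields a safely quoted string literal for the docstring.
            lines.append("        %s" % repr(doc))
        lines.append("")
    return "\n".join(lines)


source = generate_stub(DOCS)
print(source)

# Check that a documentation tool importing the stub would see the docstring.
namespace = {}
exec(source, namespace)
print(namespace["ScrolledWindow"].GetViewStart.__doc__)
```

Only the items present in the XML end up in the stub, which is one way to keep undocumented names out of the generated reference.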

The approach I was originally thinking of is that as part of the
process of building that XML it will convert each of the little
snippets of content text into reST's XML format (it does have one
doesn't it?) and then making that structured XML be a part of the
overall structure, in addition to the unconverted raw text.

reST has an XML format, but not quite the way you describe. It uses a
tree structure internally, and can produce XML and "pseudo" XML files.
But, as far as I know, it always expects a reST text file as input,
not an XML file. Far be it from me to ever say something couldn't be
done <big wink>, but I think what you describe would require some
hacking on the reST code. The alternative is to simply generate some
kind of text that can be run through reST to produce an XML file.
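The "generate some text" alternative might be no more than stitching the per-method snippets into one reST document with section headings. This is a sketch using only string formatting; the function names and the sample data in the usage lines are assumptions. The resulting text could then presumably be fed to docutils (for instance its `publish_string` function with the XML writer), though that step is not shown or tested here.

```python
def rest_section(title, body, underline="-"):
    """Format one reST section: title, matching underline, then the body."""
    return "%s\n%s\n\n%s\n" % (title, underline * len(title), body.strip())


def build_document(doc_title, snippets):
    """Join per-method snippets into a single reST document.

    `snippets` is a list of (method_name, rest_text) pairs.
    """
    parts = [rest_section(doc_title, "", underline="=")]
    for name, text in snippets:
        parts.append(rest_section(name, text))
    return "\n".join(parts)


# Invented sample data, for illustration only.
doc = build_document("ScrolledWindow",
                     [("GetViewStart", "Get the view start.")])
print(doc)
```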

And, no, my hand was not raised to volunteer for this task. I just
had an itch. A real itch, not a figurative "open source saves the
day" kind of itch.

Finally, converting the whole thing to whatever format is wanted on
the back end really shouldn't be all that hard.

Sounds like you're making good progress so far.


--
Patrick K. O'Brien
Orbtech http://www.orbtech.com/web/pobrien
-----------------------------------------------
"Your source for Python programming expertise."
-----------------------------------------------

Kevin_Ollivier4 · November 14, 2003, 10:19pm

Hi Robin,

Patrick K. O'Brien wrote:

Kevin Ollivier <kevino@tulane.edu> writes:

DocBook is the closest thing I can think of to what you're looking
for here, in that it does have some HTML, PDF, etc. conversion tools
available and has some specific tags for software and API
documentation.

Using DocBook would be killing the fly with a sledgehammer. Better
watch your fingers!

I agree. Although if the tools took the simpler xml structure that I'll be able to generate from SWIG's xml and converted it to a subset of DocBook that might be okay.

Better yet, look at reST or epydoc.
http://epydoc.sourceforge.net/
Twisted uses epydoc conventions for documenting types and other
parameter requirements in docstrings. I think epydoc is as close to a
Python standard as one will find. And Twisted is one heck of a big
framework. So we would be in good company.

Pat, I have a few goals for the reference docs, some of which I alluded to earlier in #3, maybe you could think about these and see if there is a good fit based on what you know about reST and epydoc.

1. I want the text content to be easy to edit and read as plain text with inline markup that doesn't get in the way. This obviously leads to something like reST or epy.

2. For various reasons I don't want the reference docs to be generated by scanning the real modules in the real wx package. Firstly because then we would have to put way too much text in some of the docstrings, or scrimp and not put enough because they are docstrings too. Secondly tools that I've seen for that will document *everything* and the docs end up bloated with nodes for things like wxNO_BORDER, or wxWindowPtr with no text about it, and things like that are better documented with the class or method where they are supposed to be used anyway. I'm sure there are ways to avoid having the tools do that, but then we (I) would have to maintain or generate yet another list of things...

3. I think that the intermediate XML doc with all the data for the class/method docs is important, but we end up with lots of little snippets of reST or epytext that need to be combined in some way that leads to a coherent set of reference docs with an intuitive organizational structure, cross-links and such. Are any of the existing tools able to handle a situation such as this?

Interestingly enough, EClass might help here. It's basically a content management app geared towards documentation, or using the education-friendly buzzword, "learning modules". It lets you take a bunch of files (not necessarily HTML), organize them into sections/subsections, and (if you want) add some metadata. Then it creates an HTML version of the files, and also generates a table of contents in HTML that works much like a treeview, + - buttons and all. The end product can be viewed by any major web browser. It works with frames/non-frames and you could even create your own "Theme" (look and feel) for the wxPython documentation.

For an example (non-technical, it's a course on culture differences), see:

http://www3.uop.edu/sis/culture/index.htm?page=/sis/culture/

And before you ask, yes you can widen that table of contents. Why they chose to make it that small I don't know. =)

The interesting thing is that you aren't required to use HTML as an input format. There's a plugin API so that one can write a plugin to convert, say, .py files, to HTML, and then the plugin just tells EClass the "whatever.html" filename of the converted file so that it can generate appropriate links and such. And thanks to the plugin api, you can mix and match files of any format in your documentation. For example, you could add both .py files (containing automated docstrings) and also plain HTML docs (getting started, tutorial, examples) into your documentation seamlessly. You don't have to 'pick' a format that the entire documentation needs to be written in. In the end, they all blend together as HTML-based documentation for the user. You could even input LaTeX docs from wxWindows so long as you developed an EClass plugin to convert it to HTML (which could simply call tex2rtf or whatever). BTW, the structure of the documentation is stored as a separate XML file, which is how all this becomes possible. Oh, and EClass also creates a full-text index using SWISH-E, so the entire documentation is searchable. =)

PDF docs could be created using a HTML->PDF conversion tool, and same with HTML Help. OpenOffice and PyUNO could probably be used to make RTF/DOC/SXW format docs if we wanted to do that too. =)

EClass lacks documentation as I have been working on many of these features over the past year, but I'm now gearing up for a final 2.5 release and am currently developing documentation for it both at the end user and at the developer level. (i.e. architecture, plugin API) I plan on having the documentation ready within the next week or two. I'm also updating the web site to make it more helpful and less boring. =) Of course I am using EClass for all this, so you will be able to see how it works. Once I have the documentation ready, I am planning to post an announcement on wxPython-users anyways as I thought it might be a useful tool for some folks. =)

Just some thoughts. There is one last trivial benefit - it gives us the bragging rights to say we use a wxPython application to maintain the wxPython documentation. If anyone wants to play with the software, I can send out a link to the latest and greatest Windows version, or otherwise you can wait till I finish the docs and make the announcement on wxPython-users. =)

Thanks,

Kevin


On Nov 14, 2003, at 12:40 PM, Robin Dunn wrote:

I suppose one approach is to take the XML and generate a fake python wx package with just the items that we want documented and then use that with the existing tools, but that sounds icky. The approach I was originally thinking of is that as part of the process of building that XML it will convert each of the little snippets of content text into reST's XML format (it does have one doesn't it?) and then make that structured XML be a part of the overall structure, in addition to the unconverted raw text. Finally, converting the whole thing to whatever format is wanted on the back end really shouldn't be all that hard.

Thoughts?

Grimmtooth · November 15, 2003, 3:43am

> I suppose one approach is to take the XML and generate a fake python
> wx package with just the items that we want documented and then use
> that with the existing tools, but that sounds icky.

Replace "icky" with "clever". Then document the approach, call it a
"pattern", and give it a cool-sounding name.

Without swinging either way on the descriptor (g), I can say I've seen this
approach work quite well in real-time and near-real-time processing. Other
than that I have to keep mum since it's proprietary software in a cutthroat
business sector.

Riaan_Booysen · November 16, 2003, 1:38pm

Hi Robin,

Robin Dunn wrote:

3. (still just a gleam in the eye) For those items that do need more detailed docs, then those details (written in something like ReST or epydoc format) will also be added to the SWIG parse tree, but will not be used at all for code generation. Instead they will be extracted from an XML dump of the parse tree, along with the actual docstrings from #1/#2 and a bit of additional metadata, and then they will be put into a more simplified XML doc. (I would also like to extract metadata and docstrings from the lib modules and put them in this same XML format.) Then tools can be written to transform that to full reference docs in HTML, PDFs or whatever, or apps like Boa can just use the XML directly.

Yeah, I've dreamed about a complete type library for wxPython before.

Introspection only goes as far as method names because of the
*args, **kwargs used for SWIG code, so I've had to manually track
many little collections of various wxPython names in the Boa source.
Much of this could be extracted from such XML, but I also define
groupings like EventCollections that define which events are
KeyEvents or MouseEvents. There are other groupings like styles
and other "enumerations".

Although I don't expect these last mentioned groupings to be in the
XML (because as far as I know it's not explicit in the wxWindows
source), I thought I'd mention it because it would be great to
have and generated documentation would certainly benefit from it.
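The introspection limit mentioned above is easy to reproduce with a stand-in class. The `Window` class and its docstring below are invented for the example and are not the real wx proxy; the sketch uses the standard library's `inspect.signature`.

```python
import inspect


class Window(object):
    """Stand-in for a SWIG-generated proxy class (not the real wx class)."""

    def SetSize(self, *args, **kwargs):
        """SetSize(int width, int height)"""


sig = inspect.signature(Window.SetSize)
print(sig)

# Only the generic names are visible; "width" and "height" appear nowhere
# except in the docstring text.
names = [p.name for p in sig.parameters.values()]
print(names)
```

Anything beyond the method name therefore has to come from the docstring or from an external source such as the XML.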

Any ideas about the best way of exposing such XML data,
IOW Introspection for wxPython?
* An "inspect" equivalent wxPython.lib module?
* Dynamically transforming xml to a Python syntax?
e.g. wxTypeInfo.core.Window.parameters()
* Decorating the real wxPython objects with additional
type info.
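The second option, turning the XML into something that reads like Python attribute access, could be sketched roughly as follows. The XML schema is invented, `Node` and `MethodInfo` are hypothetical names, and this version exposes parameters per method rather than per class as in the `wxTypeInfo.core.Window.parameters()` example.

```python
import xml.etree.ElementTree as ET

# Hypothetical type-info XML; the schema is invented for this sketch.
TYPE_XML = """
<docs module="core">
  <class name="Window">
    <method name="SetSize">
      <param name="width" type="int"/>
      <param name="height" type="int"/>
    </method>
  </class>
</docs>
"""


class MethodInfo(object):
    """Wraps one <method> element."""

    def __init__(self, node):
        self._node = node

    def parameters(self):
        """Return (name, type) pairs in declaration order."""
        return [(p.get("name"), p.get("type"))
                for p in self._node.findall("param")]


class Node(object):
    """Expose child elements as attributes, looked up by their name."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Called only for attributes not found normally, so _element is safe.
        for child in self._element:
            if child.get("name") == name:
                if child.tag == "method":
                    return MethodInfo(child)
                return Node(child)
        raise AttributeError(name)


core = Node(ET.fromstring(TYPE_XML))
print(core.Window.SetSize.parameters())
```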

Questions: Is there already a simple XML format for #3 defined that we can use? Does anybody feel like starting work on the lib module extraction and html conversion tools for this?

I'm very interested in seeing #3 succeed, so I'm volunteering for
any scripting tasks (preferably not involving learning and using SWIG)

Cheers,
Riaan.

Robin · November 17, 2003, 6:27pm

Riaan Booysen wrote:

Any ideas about the best way of exposing such XML data,
IOW Introspection for wxPython?
* An "inspect" equivalent wxPython.lib module?
* Dynamically transforming xml to a Python syntax?
e.g. wxTypeInfo.core.Window.parameters()
* Decorating the real wxPython objects with additional
type info.

To be honest, I haven't gone very far down that path yet in my thought processes, just thinking about reference docs and such. When I get to the point of seeing how much useful info we can get out of SWIG's xml without too much effort I'll keep runtime introspection issues in mind too and see how far down that path we can get.

Questions: Is there already a simple XML format for #3 defined that we can use? Does anybody feel like starting work on the lib module extraction and html conversion tools for this?

I'm very interested in seeing #3 succeed, so I'm volunteering for
any scripting tasks (preferably not involving learning and using SWIG)

Thanks for volunteering! Don't worry about having to use SWIG. I may have to tweak one or two more things within SWIG to make this happen, but after that the bulk of the work will happen outside of it processing the XML files that it produces, and I can send those to you.

BTW: So I don't have to also respond to the rest of the messages on this subject I'll just say it here: a *big thanks* to everybody for thinking through this and giving me the "food for thought." I'm going to let it stew for a little while as I work on replacing the autodoc strings that need it and a few other things, and then I am going to try and get some preview binaries built later this week (maybe not with a full installer but just a zip file of the wx and wxPython packages) so those working on lib and demo modules have something to test with. After that (next week probably) I'll come back to #3 and start working on transforming the XML output of SWIG into something that is more easily usable for the task.


--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!