I am in the process of developing a fairly complex text mining application
(including storage and management of ontologies, tokenizers, indexers,
etc., etc.) and am using the Python Extension Package for the Berkeley DB
(BSDDB). I'm pretty happy with it in general (and a bit surprised at its
performance -- which seems fast to me). This is a Python wrapper for the
underlying DB library written in C. It supports four kinds of DBs (or of
DB access): Hash, BTree, Recno, and Queue, of which I am currently using
Hash and Recno. It also has a dbshelve.py module that implements the
Python shelve interface, and I am using this module, which seems to
work nicely. This package has a *lot* of features and you will need to
check the documentation for all the details:
http://pybsddb.sourceforge.net/
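
To give a rough idea of what working through the shelve interface looks
like, here is a minimal sketch of a tiny token index. It is not code from
my application, and it uses the standard library's shelve module as a
stand-in so that it runs without the extension package installed; since
dbshelve implements the same interface, the dictionary-style access should
look much the same there, but check the documentation for how to open a
shelf with it. The names build_index, lookup, and index_path are made up
for the illustration.

```python
import os
import shelve
import tempfile


def build_index(path, docs):
    """Store a token -> list of document ids mapping in a shelf at `path`.

    Hypothetical helper: `docs` maps a document id to its text.
    """
    with shelve.open(path) as index:
        for doc_id, text in docs.items():
            # Naive whitespace tokenizer, lowercased; each token counted once per document.
            for token in set(text.lower().split()):
                postings = index.get(token, [])
                postings.append(doc_id)
                # Reassign the value so the updated list is written back to the shelf.
                index[token] = postings


def lookup(path, token):
    """Return the sorted document ids containing `token` (empty list if none)."""
    with shelve.open(path, flag="r") as index:
        return sorted(index.get(token.lower(), []))


# Assumed sample data, stored in a throwaway directory.
workdir = tempfile.mkdtemp()
index_path = os.path.join(workdir, "token_index")
docs = {1: "Berkeley DB hash access", 2: "Recno access method"}

build_index(index_path, docs)
hits = lookup(index_path, "access")
print(hits)
```

The values are pickled behind the scenes, so a shelf can hold lists or
other Python objects keyed by string, which is what makes it convenient
for this kind of index.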
--------------------------------------
Gary H. Merrill
Director and Principal Scientist, New Applications
Data Exploration Sciences
GlaxoSmithKline Inc.
(919) 483-8456