OT - Python/Databases/wxPython

Hi All,

    sorry for the off-topic, but I thought that the wxPython list is
much friendlier than any other place to ask for suggestions. Please
feel free to kick me if the question is too much off-topic.
We are trying to keep track of all the modifications we make to some
input files for our reservoir simulator, in order not to lose the
history of our work and to be able to retrieve old results in the
future. The possible modifications are endless, so it happens that
after a few months we are lost because we cannot recover the exact
input file that generated a particular result.
So, noting that the SVN/CVS approach is not an alternative, I am
trying to build a database in which I will store the "father" input
files, and then all their children, with the possibility for the user to
add comments regarding the changes, links to output/result files and
other information. With this in mind, I will populate a wx.TreeCtrl
with all the input file names and display information only on user
request (i.e., the user double-clicks an item in the tree or something
like that).
If this is a viable approach, could someone please suggest a free, not
particularly complicated database with Python support which can
help me in tackling this problem?
If this is not a viable solution or some of you can envisage a better
approach, could you please share some comments about this issue?

Thank you for every hint.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.virgilio.it/infinity77/

Andrea Gavana wrote:

a free, non
particularly complicated to use database with Python support which can
help me in tackling this problem?

I'd consider going with a straight python object database like Durus:

http://www.mems-exchange.org/software/durus/

It's a lot easier to deal with than an SQL database.

If you really want to use a "standard" database, then I'd consider an Object-Relational mapper.

-Chris

···

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov

Andrea,

   My immediate reaction was to ask why subversion would not work to track
changes to your data files. It works with any text files, so you can track
changes to documents, data, and code.

   Regardless, I recommend SQLite3 <http://www.sqlite.org/>, a lightweight
but powerful relational DBMS designed to be embedded in applications. I use
the pysqlite2 library with it, but there's another one, AWSN or something
like that. It's in the public domain so there are no license issues, Richard
Hipp and the rest of the crew on the mail list offer excellent assistance,
and the same goes for the pysqlite mail list.
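
As a rough illustration of how the father/children idea could map onto SQLite, here is a minimal sketch using the `sqlite3` module from the Python standard library. The table name `input_files`, its columns, and the helper functions `open_db`, `add_file`, `children_of` and `lineage` are all hypothetical, invented for this example; nothing in the thread prescribes this layout.

```python
import sqlite3

# Hypothetical schema: one row per input file, with an optional link to its
# "father" file, a free-form comment and a path to the result files.
SCHEMA = """
CREATE TABLE IF NOT EXISTS input_files (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    parent_id INTEGER REFERENCES input_files(id),
    comment   TEXT,
    results   TEXT
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_file(conn, name, parent=None, comment="", results=None):
    """Register an input file; `parent` is the name of its father, if any."""
    parent_id = None
    if parent is not None:
        row = conn.execute(
            "SELECT id FROM input_files WHERE name = ?", (parent,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown father file: %s" % parent)
        parent_id = row[0]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO input_files (name, parent_id, comment, results) "
            "VALUES (?, ?, ?, ?)",
            (name, parent_id, comment, results),
        )


def children_of(conn, name):
    """Return the names of the direct children of `name`, sorted."""
    rows = conn.execute(
        "SELECT c.name FROM input_files AS c "
        "JOIN input_files AS p ON c.parent_id = p.id "
        "WHERE p.name = ? ORDER BY c.name",
        (name,),
    ).fetchall()
    return [r[0] for r in rows]


def lineage(conn, name):
    """Walk from `name` up to its oldest ancestor and return the chain."""
    chain = []
    while name is not None:
        chain.append(name)
        row = conn.execute(
            "SELECT p.name FROM input_files AS c "
            "LEFT JOIN input_files AS p ON c.parent_id = p.id "
            "WHERE c.name = ?",
            (name,),
        ).fetchone()
        name = row[0] if row else None
    return chain
```

A tree control could call something like `children_of` only when a node is expanded, so that nothing is loaded until the user asks for it.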

Rich

···

On Sat, 10 Feb 2007, Andrea Gavana wrote:

If this is a viable approach, could someone please suggest a free, non
particularly complicated to use database with Python support which can
help me in tackling this problem? If this is not a viable solution or some
of you can envisage a better approach, could you please share some
comments about this issue?

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc. | Accelerator(TM)
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

Christopher Barker wrote:

Andrea Gavana wrote:

a free, non
particularly complicated to use database with Python support which can
help me in tackling this problem?

I'd consider going with a straight python object database like Durus:

http://www.mems-exchange.org/software/durus/

It's a lot easier to deal with than an SQL database.

If you really want to use a "standard" database, then I'd consider an Object-Relational mapper.

Another non-SQL possibility is the bsddb package included in the standard Python library (originally written years ago by some guy named Robin) which is a wrapper around the Sleepycat Berkeley DB library, (which is now known as the Oracle Embedded Database.) It can be used as simply as a dictionary mapping string keys to string values or pickled objects, or it can scale up with the use of transactions, locking, cursors, etc. It's pretty powerful but still allows you to have totally unstructured content, if that is what you want.
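
To give a feel for the "dictionary mapping string keys to pickled objects" style, here is a small sketch. Note that it deliberately swaps in the standard-library `shelve` module rather than `bsddb` (which was dropped from the standard library in Python 3), so it shows the general idea only, not the `bsddb` API. The record layout and the function names `record` and `lookup` are assumptions made up for the example.

```python
import shelve


def record(db_path, name, parent, comment):
    """Store one pickled record under the input file's name."""
    with shelve.open(db_path) as db:
        db[name] = {"parent": parent, "comment": comment}


def lookup(db_path, name):
    """Return the stored record for `name`, or None if it is unknown."""
    with shelve.open(db_path) as db:
        return db.get(name)


def ancestors(db_path, name):
    """Follow the `parent` links upwards, oldest ancestor last."""
    chain = []
    with shelve.open(db_path) as db:
        entry = db.get(name)
        while entry is not None and entry["parent"] is not None:
            chain.append(entry["parent"])
            entry = db.get(entry["parent"])
    return chain
```

The content stays completely unstructured: each value is just whatever Python object was stored, which is convenient at first but leaves all searching and consistency checks to the application.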

···

--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

So, noting that the SVN/CVS approach is not an alternative

  Why? Sounds to me like that would be the simplest approach.

I am trying to build a database in which I will store the "fathers" input
files, and then all their children, with a possibility for the user to
add comments regarding the changes, link to output/result files and
other information. With this in mind, I will populate a wx.TreeCtrl
with all the input file names and display information only on user
request (i.e., the user double-clicks an item in the tree or something
like that).
If this is a viable approach, could someone please suggest a free, non
particularly complicated to use database with Python support which can
help me in tackling this problem?

  I'd suggest something like SQLite as the database, and (of course!) a simple Dabo app to manage it. We do lots of data stuff, and this doesn't sound terribly complex.

-- Ed Leafe
-- http://leafe.com
-- http://dabodev.com

···

On Feb 9, 2007, at 6:26 PM, Andrea Gavana wrote:

You'll have to explain why using SVN as a backend doesn't seem to you like a good idea. From what you've described it sounds like it would be perfect, including the need for comments for every change (revision).

If you won't use SVN, I'd go for SQLite3 and I'd work with it via SQLAlchemy, which makes things easier to code. You seem to imply a one-to-many relationship in your description of the tree control, and SQLAlchemy makes handling such things (and loading only what is needed) quite simple.


You'll have to explain why using SVN as a backend doesn't seem to you
like a good idea. From what you've described it sounds like it would be
perfect, including the need for comments for every change (revision).

If you won't use SVN, I'd go for SQLite3 and I'd work with it via
SQLAlchemy, which makes things easier to code. You seem to imply a
one-to-many relationship in your description of the tree control, and
SQLAlchemy makes handling such things (and loading only what is needed)
quite simple.

I agree. I've been meaning to switch to SQLAlchemy for a while now.
Also, a sqlite 3 binding ships with Python 2.5 .

- Josiah

···

Eli Golovinsky <gooli@tuzig.com> wrote:


Andrea Gavana wrote:

Hi All,

   sorry for the off-topic, but I thought that the wxPython list is
much friendlier than any other place to ask for suggestions. Please
feel free to kick me if the question is too much off-topic.
We are trying to keep track of all the modifications we make to some
input files for our reservoir simulator, in order to not lose the
history of our work and to be able to retrieve in the future old
results. The possible modifications are endless, so it happens that
after a few months we are lost because we cannot recover the exact
input file that generated a particular result.

I would think that a DB type application will give you better search functionality than you get with SVN/CVS.

I use Firebird SQL with ORM ( http://orm.nongnu.org/ ), though I am using an older version of ORM as I never got around to upgrading.

One advantage is that you could deploy just the embedded engine if this is a single user application or use the server if you have many users using this application simultaneously.

You can then use validators to load data from/to your widgets or use something like Dabo.

Werner

Hi All,

    thank you for all your useful answers. I will dig into the
database world with your suggestions in mind :-)
I see that most of you would have chosen the SVN/CVS way. I thought
the same thing at the beginning, but when I looked at the
Python wrapper for SVN (PySVN), and when I tried the GUI that comes
with the wrapper (PySVN Workbench), I noticed a couple of strange
things. My knowledge of SVN/CVS is barely above
zero, so I may write silly things along the way: please forgive my
ignorance and correct my stupid assumptions where you can. Overall, the
reasons why I am thinking about databases are:

1) I know next to nothing about databases and SVN, but in some way
databases seem less intimidating to me;
2) We don't have a "central repository" *and* "a working copy" of it:
the two main directories hold about 2 TB of data, and though I am
interested only in input files, we can not afford to have a working
copy of the central directories;
3) The software itself must be as simple and as fast to use as
possible: I am the only "programmer" in our group of reservoir
engineers (though I am only an amateur Python coder), and we
honestly don't have a lot of time to play with the software, or the
work will suffer. The fastest way I have found so far is:
a) Integrate an application-specific menu in the right click Windows
Explorer menu;
b) Select the "father" input file and the child input file with the mouse;
c) Send the selection (via right click menu) to the software, showing
a small window for comments;
d) Silently update the database
Then, if one of us has time to play with it, we can add more comments,
links to output files, results and whatever.
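
Steps c) and d) above could be sketched roughly as follows: a small script that receives the father and child paths plus a comment and silently records them. This is only an illustration under assumptions; the `changes` table, the `--db` and `-m` options, the default file name `history.sqlite` and the idea of storing a SHA-1 of the child file (so the exact input can be identified later) are all made up for the example, and the Explorer right-click integration and the comment window are left out.

```python
import argparse
import hashlib
import sqlite3


def file_digest(path):
    """SHA-1 of the file contents, read in chunks to cope with large files."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def record_change(conn, father, child, comment):
    """Append one father -> child record to the (hypothetical) changes table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changes ("
        "father TEXT, child TEXT, child_sha1 TEXT, comment TEXT, "
        "recorded TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    with conn:  # commit the insert, or roll back if it fails
        conn.execute(
            "INSERT INTO changes (father, child, child_sha1, comment) "
            "VALUES (?, ?, ?, ?)",
            (father, child, file_digest(child), comment),
        )


def main(argv=None):
    """Parse the command line and store a single change record."""
    parser = argparse.ArgumentParser(
        description="Record that CHILD was derived from FATHER.")
    parser.add_argument("father", help="path of the father input file")
    parser.add_argument("child", help="path of the child input file")
    parser.add_argument("-m", "--comment", default="",
                        help="short description of the change")
    parser.add_argument("--db", default="history.sqlite",
                        help="path of the SQLite database file")
    args = parser.parse_args(argv)
    conn = sqlite3.connect(args.db)
    try:
        record_change(conn, args.father, args.child, args.comment)
    finally:
        conn.close()
```

Calling `main(["base.dat", "base_perm.dat", "-m", "changed permeability"])` would add one row; a right-click menu entry would only need to launch the script with the two selected paths.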

I think that's all... I may be missing a lot of things here, so please
forgive my ignorance on this subject.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.virgilio.it/infinity77/

Hi Andrea, how about using an XML file to store the configuration
files? Something like:

<config name="config1" date="1/1/2007">
  <note>The original configuration</note>
  <setting1>bla</setting1>
  <setting2>blabla</setting2>
</config>

If a configuration file is changed, add a 'child' configuration
element that inherits its parent's data, like this:

<config name="config1" date="1/1/2007">
  <setting1>bla</setting1>
  <setting2>blabla</setting2>
  <config name="config2" date="2/1/2007">
    <note>Use a different setting1</note>
    <setting1>foo</setting1>
  </config>
</config>

config2 is a 'child' of config1 and only changes setting1.

Reading and writing XML files is pretty easy in Python and the
tree-like data structure maps very well to the XML structure.
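
A minimal sketch of how such nested `config` elements might be read back, using `xml.etree.ElementTree` from the standard library. The function `effective_settings` is a hypothetical helper written for this example, and it assumes that every child element other than `config` and `note` is a setting.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """
<config name="config1" date="1/1/2007">
  <note>The original configuration</note>
  <setting1>bla</setting1>
  <setting2>blabla</setting2>
  <config name="config2" date="2/1/2007">
    <note>Use a different setting1</note>
    <setting1>foo</setting1>
  </config>
</config>
"""


def effective_settings(root, name):
    """Return the settings of config `name`, inheriting from its ancestors.

    Returns None when no config with that name exists in the tree.
    """
    def walk(node, inherited):
        # Start from the parent's settings, then apply this node's overrides.
        settings = dict(inherited)
        for child in node:
            if child.tag not in ("config", "note"):
                settings[child.tag] = child.text
        if node.get("name") == name:
            return settings
        # Otherwise search the nested (child) configurations.
        for child in node.findall("config"):
            found = walk(child, settings)
            if found is not None:
                return found
        return None

    return walk(root, {})


root = ET.fromstring(EXAMPLE.strip())
```

With this layout, `effective_settings(root, "config2")` yields `setting1` from the child and `setting2` from its parent, which is the inheritance described above.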

HTH, Frank

···

2007/2/10, Andrea Gavana <andrea.gavana@gmail.com>:

So, noting that the SVN/CVS approach is not an alternative, I am
trying to build a database in which I will store the "fathers" input
files, and then all their children, with a possibility for the user to
add comments regarding the changes, link to output/result files and
other information. With this in mind, I will populate a wx.TreeCtrl
with all the input file names and display information only on user
request (i.e., the user double-clicks an item in the tree or something
like that).

OK, then, I think I understand your concerns a bit better. You might want to check out Mercurial.

  I've never used it myself, but another member of our Python user group was talking about it the other day. Apparently it's either written in Python, or has a Python front end, so that you can work with it using Python instead of just a command line and/or unfamiliar language. It is also supposed to be distributed, so that everyone has the 'repository'; more of a P2P setup than a client-server setup, which is what SVN is.

-- Ed Leafe
-- http://leafe.com
-- http://dabodev.com

···

On Feb 10, 2007, at 5:33 AM, Andrea Gavana wrote:

1) I know next to nothing about databases and SVN, but in some way
databases seem less intimidating to me;
2) We don't have a "central repository" *and* "a working copy" of it:
the two main directories hold about 2 TB of data, and though I am
interested only in input files, we can not afford to have a working
copy of the central directories;

Andrea,

Try RapidSVN; it's the best. I used PySVN Workbench for some time, but it
lacks some useful tools, such as choosing the version to match...

···

On Sat, 10-02-2007 at 11:33 +0100, Andrea Gavana wrote:


--
Mario Lacunza <mlacunza@gmail.com>

Rich,

  One big advantage of subversion over any database backend is the avoidance
of conflicts. Suppose two engineers change the same dataset in different
ways, and both try to submit the changes to the repository. If the same
parts are changed, the second person's changes are rejected with a message
to resolve the conflict.

It doesn't work this way. An input file *never* has the same name
as its father input file or other children files. This is done on
purpose, as we identify the *major* changes by appending some (short)
string to the new input file name. So, it is extremely unlikely that
two of us (mostly working in different sub-directories of the 2 main
folders) can assign the same file name in the same directory. Most of
all, because no OS I know of allows that.

  Based on the above, I suggest that your group hire an external consultant
to assist you. It looks like how you proceed is critical to your company's
work -- and future. Since none of you is really sufficiently knowledgeable to
make an informed decision, get outside help.

No, I don't think it is so "mission critical" (or whatever the term is
in English). The main problem for us was that we never had the will to
fill in a file (a spreadsheet, text, whatever) with all the changes we
made for every input file. It is a long, error-prone and boring task,
especially when you are buried under a thousand other things to do. So
my main task here is to transform the problem into something that can
be handled with a couple of clicks and a short message as a comment.
To me, this is much faster than anything SVN can do. I can easily be
wrong, of course.
Knowing nothing about SVN, as we are working across 2 networks, will I
need the support of our IT to set up everything? If the answer is yes,
that will definitely kill the SVN idea. I have yet to find (in
London as in Italy) an IT department competent enough not to screw
up what we have done, or able to make changes fast enough for
our needs.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.virgilio.it/infinity77/

Hi Frank,

Hi Andrea, how about using an XML file to store the configuration
files? Something like:

Yes, this was my first idea at the beginning: going from XML to a
wx.TreeCtrl seems easier than getting data from a database or from
SVN. It seems like every approach suggested has its own advantage,
namely:

- XML: easier Python handling and easier conversion XML <=>
wx.TreeCtrl. I don't think, however, I can have more than one person
working on the same XML file at a time, so this is a limitation (but I
may be wrong);
- SVN: Conflicts are not possible as far as I understand, two or more
of us can work on the same project and changes are committed/merged
(even if the merging will never be used, every file has a different
name once a modification has been applied);
- DB: I have no idea here. I suppose that is possible for two persons
to work on a same database. Is that right?

Thanks to you all for your useful suggestions.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.virgilio.it/infinity77/

···

On 2/10/07, Frank Niessink wrote:

- SVN: Conflicts are not possible as far as I understand, two or more
of us can work on the same project and changes are committed/merged
(even if the merging will never be used, every file has a different
name once a modification has been applied);

Andrea,

   Conflicts will happen, but they won't be permitted to be committed to the
repository until resolved by people.

   Just out of curiosity, if you're modeling the same reservoir using
different variables, constants, and time periods, why not name each data set
something like reservoir0001, reservoir0002, and so on? Then each data set
is distinct and can be retrieved for reuse whenever you want.

- DB: I have no idea here. I suppose that is possible for two persons
to work on a same database. Is that right?

   Yes, as long as the dbms is designed for network use and multiuser,
simultaneous access. SQLite3 is not, primarily because the network
connections (e.g., NFS) are not so solid and building into the dbms engine
correcting factors would make it too large for its intended purpose as an
embedded backend. I use PostgreSQL for multiuser access, local or remote.
And, if you add a spatial component to your models, you can incorporate
those geographic entities in the database using PostGIS. I've done this with
river basin hydrology/sedimentation models.

Rich

···

On Sat, 10 Feb 2007, Andrea Gavana wrote:

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc. | Accelerator(TM)
<http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863

I found PostgreSQL very easy to set up, configure and maintain, and it works
like a charm with Python.

For your purposes, however, it might be overkill, and SQLite or even GNU dbm
might suit you better, with a smaller memory and hard disk footprint.

Horst

···

On Saturday 10 February 2007 10:26, Andrea Gavana wrote:

If this is a viable approach, could someone please suggest a free, non
particularly complicated to use database with Python support which can
help me in tackling this problem?

Hi Andrea,

In reading your original description:

We are trying to keep track of all the modifications we make to some
input files for our reservoir simulator, in order to not lose the
history of our work and to be able to retrieve in the future old
results. The possible modifications are endless, so it happens that
after a few months we are lost because we cannot recover the exact
input file that generated a particular result.

I, like others, thought it sounded pretty much in the ballpark for a modern version control system like SVN. Such systems are designed to work with multiple files being continually modified, by an individual or a team, and have two main objectives: keep track of the history of modifications, and facilitate team development of changes to the file set. They also allow determining just what changed between any two versions, which may be valuable to you.

You say
> 2) We don't have a "central repository" *and* "a working copy" of it:
> the two main directories hold about 2 TB of data, and though I am
> interested only in input files, we can not afford to have a working
> copy of the central directories;

With SVN (for instance), you only have working copies of the files or directories you're working on at any time. If the repository is well organized, it's pretty easy to check out just the parts you need.

> 3) The software itself must be as simple and as fast to use as
> possible: I am the only "programmer" of our group of reservoir
> engineer (though I am only an amateurish-like Python coder), and we
> honestly don't have a lot of time to play with the software, or the
> work will deteriorate.

Under Windows, TortoiseSVN integrates into the Explorer with a right click menu, as you mentioned; it's pretty straightforward, once you have the basic concepts.

Also, no database system automatically gives you the kind of versioning, history, differencing, etc. that you get with a version control system -- you can create it, of course, but _you_ have to create it.
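
To illustrate what "you have to create it" could mean for the differencing part, here is a small sketch built on `difflib` from the standard library. The function name `describe_change` and the idea of keeping the diff text alongside a database record are assumptions for the example, not something a DBMS provides.

```python
import difflib


def describe_change(father_text, child_text,
                    father_name="father", child_name="child"):
    """Return a unified diff between two versions of an input file.

    The result is an empty string when the two texts are identical.
    """
    diff = difflib.unified_diff(
        father_text.splitlines(keepends=True),
        child_text.splitlines(keepends=True),
        fromfile=father_name,
        tofile=child_name,
    )
    return "".join(diff)
```

The returned text could be stored next to the user's comment, so that the exact change between a father and a child file is recorded even if nobody describes it by hand.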

My feeling is that there are pretty sharp tradeoffs between starting with a DB and rolling your own vs. adapting a version control system to your needs. I'd recommend educating yourself a bit more before committing to one door.

> Knowing nothing about SVN, as we are working across 2 networks, will I
> need the support of our IT to set-up everything? If the answer is yes,
> that will definitely kill the SVN idea. I still have to find (in
> London as in Italy) an IT that is competent enough to avoid to screw
> up what we have done or that is able to make changes fast enough for
> our needs.

If both your London and Italy teams will need access to the same set of files, you'll need some competent IT support to set up and maintain your "file base" anyway, whether an SVN repository or a "heavyweight" database system that supports multi-user access with transactions, etc. (In this case, you should be looking at PostgreSQL, Firebird, or maybe a recent version of MySQL.) I wouldn't make that a prime determiner of the direction you go.

One last suggestion: if you have access to Access (or OpenOffice Base), do some prototyping in it. It's a quick way to get a feel for the nature, power, and limitations of the relational model.

···

--
Don Dwiggins
Advanced Publishing Technology

I think that SVN with TortoiseSVN is probably going to be your best
bet. Get used to working from a repository instead of just a file
system.

You obviously can't make working copies of the entire repository, but
TortoiseSVN has a graphical repo browser you can use. So your new
workflow will be:

Open the Repository Browser and select the file you wish to edit.
Checkout that file via the right click menu.
Edit the file.
Right click on the file (From Windows explorer) and check it in.
Enter comments in the box.

SVN+Apache is quite easy to set up, even on Windows, so this shouldn't
be too difficult.

···

On 2/10/07, Andrea Gavana <andrea.gavana@gmail.com> wrote:


Just another suggestion: Bazaar, which is like SVN but, as far as I know,
can also be used in a distributed way. Maybe other people here can
comment on it more knowledgeably.
http://bazaar-vcs.org/

I've never used it myself, as I am happy with SVN.

Stani

···

On 2/10/07, Andrea Gavana <andrea.gavana@gmail.com> wrote:

Thanks to you all for your useful suggestions.

Andrea.

--

http://pythonide.stani.be/screenshots
http://pythonide.stani.be/manual/html/manual.html