Make sure you’ve read the documentation on
intern() and that you are using it correctly. You need to call intern()
each time you want to access the string, of course, since that’s the only
way Python knows to do the lookup. If you do something like
s = intern(‘mystring’)
x = ‘mystring’
the second reference won’t refer to the interned
instance of the string. Perhaps some of your code is doing this sort
of thing. For later (i.e., non-initial) references you need
to do
x = intern(‘mystring’)
since the intern() function either returns
the reference to the already interned string or creates and returns the
reference if the string has not already been interned.
Also, you may want to be careful about exactly
which string you are interning so that you know when to call intern() and
when not to. Otherwise, you may be interning all your strings,
and you might not want to be doing that. Note that interned strings
are not garbage collected.
I’ve used interning very effectively. If
it’s now working well for you, I suspect that something isn’t being done
quite correctly. But there are circumstances in which it won’t be
the best approach. Fundamentally, if you have a set of strings that
you know you’ll be accessing repeatedly over time and that you won’t want
to be deleting or otherwise munging, interning should be a win. Otherwise,
maybe not.
The suggested dict implementation is just
your own implementation of intern() and consequently should be much slower.
But it may expose what your problem is with intern().
···
Gary H. Merrill
Director and Principal Scientist, New Applications
Data Exploration Sciences
GlaxoSmithKline Inc.
(919) 483-8456
“Chris Mellon”
arkanes@gmail.com
11-Aug-2004 22:23
Please respond to wxPython-users@lists.wxwidgets.org
To
wxpython-users@lists.wxwidgets.org
cc
Subject
Re: [wxPython-users] String
pooling
`On Wed, 11 Aug 2004 21:13:12 -0500, Jean Brouwers
jbrouwers@prophicy.com wrote:
One way to achieve this would be to build a dict with each line as
the key. The value can be anything, maybe the number of
instances of that line.
Another thing to try is ‘intern’ each string. That avoids storing
the same string in different objects. Make sure that the
first time each line is created you use
line = intern().
/Jean Brouwers
I tried this using intern() and performance dropped through the floor,
as well as memory usage soaring (more than storing everything
individually), I have no idea why. I hadn’t thought of the dict
implementation, I’ll give that a try.
Thanks a lot.
Chris Mellon wrote:
I have an application that needs to store a log buffer of up
hundreds
of thousands of lines, many of which may be identical. Does anyone
know of any implementations of string pooling I might be able
to use
to cut the memory footprint for this down to something reasonable?
To unsubscribe, e-mail: wxPython-users-unsubscribe@lists.wxwidgets.org
For additional commands, e-mail: wxPython-users-help@lists.wxwidgets.org
To unsubscribe, e-mail: wxPython-users-unsubscribe@lists.wxwidgets.org
For additional commands, e-mail: wxPython-users-help@lists.wxwidgets.org
To unsubscribe, e-mail: wxPython-users-unsubscribe@lists.wxwidgets.org
For additional commands, e-mail: wxPython-users-help@lists.wxwidgets.org
`