Hi all,
I'd like to ask about the differences, and possible limitations, of the
wx.TextCtrl flags regarding rich text (using Python 2.5.4 and 2.6.4;
wxPython 2.8.10.1 on Win XP SP3).
So far I have always used TE_RICH2, as I occasionally needed rich text
styling, and I hadn't noticed any drawbacks in other cases either.
However, I recently noticed a performance problem when working with
somewhat larger texts (~ 2 MB). It turned out that the bottleneck is
displaying the text in the GUI rather than manipulating and analysing
it; I eventually found that using the non-richtext variant instead of
TE_RICH2 reduces the problem greatly. However, the display speed seems
to depend on multiple circumstances:
I tried some simplified timings only using time.time() around the
SetValue() calls with different texts:
    test_text = "".join(unichr(random.randrange(*unichr_range))
                        for _ in range(text_length))
and got the following timings (the string varies in the test, but is
the same for the subsequent calls in both widgets):
There may be some inaccuracies (probably from using time(), and maybe
also due to the nonexistent codepoints present in the random samples),
but the results seem to correspond with the behaviour seen in the real
app: for mainly ASCII text TE_RICH2 is faster, especially with shorter
strings; for higher codepoints and longer texts the richtext variant
performs progressively worse.
(Displaying a 2 MB text indeed takes minutes in TE_RICH2 and seconds in
the non-richtext control.)
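
A minimal sketch of a timing harness along these lines, written for
Python 3 (chr() instead of unichr()). The helper names make_test_text
and time_call are made up for illustration. It uses time.perf_counter()
rather than time.time() and skips the surrogate block, which should
reduce the two sources of inaccuracy mentioned above; unassigned
codepoints can still turn up in the samples.

```python
import random
import time


def make_test_text(length, codepoint_range, seed=None):
    """Return `length` random characters from codepoint_range (start, stop)."""
    start, stop = codepoint_range
    if start >= 0xD800 and stop <= 0xE000:
        raise ValueError("range contains only surrogate codepoints")
    rng = random.Random(seed)  # a fixed seed gives a repeatable sample
    chars = []
    while len(chars) < length:
        codepoint = rng.randrange(start, stop)
        # Skip the surrogate block, which holds no real characters.
        if 0xD800 <= codepoint <= 0xDFFF:
            continue
        chars.append(chr(codepoint))
    return "".join(chars)


def time_call(func, *args, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        started = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - started)
    return best
```

With two existing controls (hypothetically named plain_ctrl and
rich_ctrl here), the same text can then be timed in both via
time_call(plain_ctrl.SetValue, text) and
time_call(rich_ctrl.SetValue, text).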
Can somebody maybe confirm this behaviour (or - even better - show the
mistake I'm possibly making in using and also testing this)?
Are there other important differences depending on the richtext
flag in TextCtrl? (So far I have noticed different handling of newlines
on Windows, with consequences for GetSelection etc.)
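
To illustrate the kind of newline issue meant here, a sketch under a
stated assumption: the control counts each line break as two positions
(CR LF), while the string returned by GetValue() contains a single "\n"
per line break. That assumption and the helper name position_to_index
are illustrative only and should be checked against the actual control.

```python
def position_to_index(text, pos, newline_width=2):
    """Map a control position to an index into `text`.

    `text` uses "\n" for line breaks; the control is ASSUMED to count
    each line break as `newline_width` positions.
    """
    consumed = 0  # positions used up so far in the control's numbering
    for index, char in enumerate(text):
        if consumed >= pos:
            return index
        consumed += newline_width if char == "\n" else 1
    return len(text)
```

With newline_width=1 the mapping is the identity, which would match a
control that counts a line break as one position.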
Should I be using the plaintext controls by default and only take
TE_RICH2 if needed? Or is a text of a few hundred kB near the limit
for this widget, so that one should use e.g. stc for longer texts?
On Windows, wx.TextCtrl uses a completely different native text control depending on which rich flag is used (or not), so I expect that the differences in timing are all due to the different native widgets being used under the covers.
BTW, are you sure that the whole 2 MB of text is present in the non-rich text controls? One of the limitations of the native text control used in that case is that it can only hold something like 32k or 64k of text (although that may have changed in more recent versions of Windows). The rich versions are supposed to be able to handle up to 2**32 bytes, although you would probably run out of memory long before you get there.
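
One way to check this is to read the text back and compare it with what was put in. A rough sketch, with a made-up helper name report_truncation; line endings are normalised first so that a "\r\n" versus "\n" difference is not counted as lost text:

```python
def report_truncation(original, retrieved):
    """Compare text put into a control with text read back from it."""
    # Normalise line endings so "\r\n" vs "\n" is not counted as a loss.
    expected = original.replace("\r\n", "\n")
    actual = retrieved.replace("\r\n", "\n")
    missing = len(expected) - len(actual)
    return {
        "complete": expected == actual,
        "expected_length": len(expected),
        "actual_length": len(actual),
        "missing_characters": max(missing, 0),
    }
```

For a control ctrl, this would be called as report_truncation(text, ctrl.GetValue()) after ctrl.SetValue(text).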
There's no question that the native controls are designed to work well with smaller blocks of text, and that the design doesn't really facilitate dealing with very large amounts of text. Depending on what you need it to do (for example, does it really need to be editable?), other solutions might be better.
Thanks for the information, Robin;
well, if the underlying components are different, it seems natural
that they have their own specific behaviour and capabilities.
I can load 2 MB texts in the TE_RICH2 TextCtrl as well as in the
"plaintext" version; however, for different text samples the loading
times differ greatly: once it took around 4 seconds for the plain
version and almost 9 minutes for the TE_RICH2 TextCtrl ...
On the other hand, for another 2 MB text file the loading times are
much more balanced. Anyway, if the reasons for this are somewhere
deep in the components and not in the Python code, I'll probably
leave it and hope not to encounter such pathological patterns
very frequently ...
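
To see whether a slow file is one with many higher codepoints, as the
earlier random-text timings suggested, a text could be profiled before
loading. A small sketch; the helper name codepoint_profile is made up,
and whether this measure actually predicts the loading time is only a
guess:

```python
def codepoint_profile(text):
    """Summarise how much of `text` lies outside the ASCII range."""
    total = len(text)
    non_ascii = sum(1 for char in text if ord(char) > 0x7F)
    highest = max((ord(char) for char in text), default=0)
    return {
        "length": total,
        "non_ascii_share": non_ascii / total if total else 0.0,
        "highest_codepoint": highest,
    }
```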
Thanks again,
regards
Vlasta
2010/5/15 Robin Dunn <robin@alldunn.com>:
On 5/14/10 4:09 PM, Vlastimil Brom wrote: