RichTextCtrl file sizes

I've got a file of 2000 lines of Chinese text, about 6 characters per
line on average. Doesn't seem like a lot of text, but you can't type
on an empty line at the top of the file - the letters appear one by
one extremely slowly. I've never noticed this before and believe I've
worked with longer and certainly bigger files before - but always in
English. Can anyone comment on what size files they work with? And/or
confirm a problem with long files in other fonts like Chinese? I'm
wondering if the structure matters: lots of short lines like the sample
below. You can copy and paste it a bunch of times into the demo to see
the problem.

-啊
天啊!
今天天气真好啊!
你做的饭这好吃啊!

-矮
你比我矮好多。
七个小矮人
这些人有高有矮

-爱人
你和你爱人最近好么?
你爱人最近身体怎么样了?

I'm using wxPython 2.9.2.4 on Windows 7.
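
For anyone who wants to reproduce this without pasting by hand, here is a minimal sketch that repeats the sample block until it reaches about 2000 lines. It assumes Python 3 (on Python 2 the source would need an encoding declaration and unicode literals), and `build_test_text` is a hypothetical helper name, not part of wxPython or the demo.

```python
# Sketch only: build a repeatable test text from the sample in this post.
# SAMPLE and build_test_text are illustrative names, not wxPython API.
SAMPLE = """-啊
天啊!
今天天气真好啊!
你做的饭这好吃啊!

-矮
你比我矮好多。
七个小矮人
这些人有高有矮

-爱人
你和你爱人最近好么?
你爱人最近身体怎么样了?
"""


def build_test_text(min_lines=2000, block=SAMPLE):
    """Repeat `block` (blank line between copies) until min_lines is reached."""
    block_lines = block.splitlines()
    lines = []
    while len(lines) < min_lines:
        if lines:
            lines.append("")  # blank separator between repeated blocks
        lines.extend(block_lines)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_test_text()
    print(len(text.splitlines()), "lines,", len(text), "characters")
```

The resulting string can be saved to a UTF-8 file or copied to the clipboard and pasted into the RichTextCtrl demo, so that everyone tests with the same input.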

I can't comment on using Chinese, but I can comment on file size.

I use the RichTextCtrl for creating transcripts of video files. I've
created a 20 MB data file that is all the speech for a 2.5 hour long
video and includes about 85 embedded images, screen captures from the
video. I can type in this file just fine, with characters and images
showing up quickly and paragraphs reformatting as needed without
delay.

I also use the RTC for my report engine. I can generate 400 page
reports that take some time to create but that are perfectly responsive
once created.

Because of RTC bugs in the 2.9 series, I'm still using wxPython 2.8.12.0
(I think) on both Windows and OS X.

David

···

On Fri, 2011-11-04 at 19:46 -0700, Mark wrote:

I've got a file of 2000 lines of Chinese text, about 6 characters per
line on average. Doesn't seem like a lot of text,

Is this all English (or at least Latin) text?

I wonder how wx, and RTC in particular, stores Unicode internally. If it is using UTF-8, for example, then English will almost always be one byte per character, and nice and fast, but Chinese will be multi-byte characters, and that could have an impact on performance.

Just a thought.

-Chris

···

On 11/5/11 8:51 AM, David Woods wrote:

On Fri, 2011-11-04 at 19:46 -0700, Mark wrote:

I've got a file of 2000 lines of Chinese text, about 6 characters per
line on average. Doesn't seem like a lot of text,

I can't comment on using Chinese, but I comment on file size.

I use the RichTextCtrl for creating transcripts of video files. I've
created a 20 MB data file that is all the speech for a 2.5 hour long
video and includes about 85 embedded images, screen captures from the
video. I can type in this file just fine, with characters and images
showing up quickly and paragraphs reformatting as needed without
delay.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov

I'm pretty sure that it uses wxStrings internally, and wxString's internal representation will be whatever the native UI APIs prefer. So, IIRC, on Windows it will be wchar_t (16-bit values holding UTF-16) and on GTK and OSX I think it is char arrays holding UTF-8 text.
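
To put rough numbers on the two representations, here is a small sketch using only Python's own codecs (Python 3 assumed). It says nothing about what wxString actually does internally; `encoded_sizes` is a hypothetical helper that just measures how many bytes the same text takes in each encoding.

```python
# Sketch only: compare encoded sizes of Chinese vs. English text.
# encoded_sizes is an illustrative helper, not part of wx.

def encoded_sizes(text):
    """Return the character count and the byte count in UTF-8 and UTF-16."""
    return {
        "chars": len(text),
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" avoids the 2-byte BOM that plain "utf-16" prepends
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    for label, text in [("Chinese", "七个小矮人"), ("English", "seven dwarfs")]:
        print(label, encoded_sizes(text))

    # Back-of-the-envelope estimate for the file described in this thread:
    # 2000 lines * ~6 characters, at most 3 bytes each in UTF-8.
    print("approx. max UTF-8 bytes:", 2000 * 6 * 3)
```

Common Chinese characters take 3 bytes each in UTF-8 and 2 bytes each in UTF-16, so the whole 2000-line file comes to a few tens of kilobytes either way. That suggests, without proving it, that raw storage size alone is unlikely to explain the slowdown.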

···

On 11/5/11 11:04 AM, Chris Barker wrote:

Is this all English (or at least Latin) text?

I wonder how wx, and RTC in particular, stores Unicode internally. If it
is using UTF-8, for example, then English will almost always be one byte
per character, and nice and fast, but Chinese will be multi-byte
characters, and that could have an impact on performance.

--
Robin Dunn
Software Craftsman

I'd appreciate it if someone could copy that Chinese a bunch of times
into the RTC demo on 2.8 or 2.9 XP and let me know if the problem
still exists and I'll open an issue. This is not a lot of text so
there must be some kind of drawing problem with the font perhaps.

Mark

···

On Nov 4, 10:46 pm, Mark <markree...@gmail.com> wrote:

I've got a file of 2000 lines of Chinese text, about 6 characters per
line on average. Doesn't seem like a lot of text,

When I copied and pasted the Chinese text, RichTextCtrl just showed squares instead of the characters. (wxPython 2.8.10, XP).

Che

···

On Sat, Nov 5, 2011 at 6:12 PM, Mark markreed99@gmail.com wrote:

I'd appreciate it if someone could copy that Chinese a bunch of times
into the RTC demo on 2.8 or 2.9 XP and let me know if the problem
still exists and I'll open an issue. This is not a lot of text so
there must be some kind of drawing problem with the font perhaps.

Most probably, your RTC font does not contain these
glyphs. If you see the text in a web browser, use
the same font.

The posted sample text is not very well formed. It contains
a ZWNBSP (U+FEFF, ZERO WIDTH NO-BREAK SPACE) in the middle of the stream.
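
A minimal sketch for checking this, assuming Python 3 and assuming the pasted text really does contain U+FEFF as reported here (it may not survive every copy of the message). `find_zwnbsp` and `strip_zwnbsp` are hypothetical helper names.

```python
# Sketch only: locate and remove stray U+FEFF characters in a text.
ZWNBSP = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE, also used as the byte order mark


def find_zwnbsp(text):
    """Return 1-based (line, column) positions of every U+FEFF in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == ZWNBSP:
                hits.append((lineno, col))
    return hits


def strip_zwnbsp(text):
    """Return text with every U+FEFF removed."""
    return text.replace(ZWNBSP, "")


if __name__ == "__main__":
    demo = "-啊\n天啊!\n\ufeff-矮\n"  # made-up input with one stray U+FEFF
    print(find_zwnbsp(demo))
    print(repr(strip_zwnbsp(demo)))
```

Running the cleaned text through the control would show whether the stray character has anything to do with the slowdown.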

As a wx-ANSI user, I did not test this in wx. I can however copy
and paste this text between applications on Windows 7
without problem.

As Chris Barker said, if there is really a problem,
I would rather suspect a wxWidgets Unicode issue.

jmf

···

On 5 nov, 23:33, C M <cmpyt...@gmail.com> wrote:

On Sat, Nov 5, 2011 at 6:12 PM, Mark <markree...@gmail.com> wrote:
> I'd appreciate it if someone could copy that Chinese a bunch of times
> into the RTC demo on 2.8 or 2.9 XP and let me know if the problem
> still exists and I'll open an issue. This is not a lot of text so
> there must be some kind of drawing problem with the font perhaps.

When I copied and pasted the Chinese text, RichTextCtrl just showed squares
instead of the characters. (wxPython 2.8.10, XP).

Che

Hi Mark,

Sorry it took me a few days to get to this. But that's life.

I created over 2000 lines of Chinese text from your sample text. I can type in my RTC-based control just fine with all that Chinese text there. The only display issues I see are when I paste large amounts of text and my RTF Parser gets behind in processing the paste. Then, I see boxes instead of characters until the processor catches up and the program can render the Chinese characters. I don't think that's wxPython's fault.

I'm using Python 2.6.6 and wxPython 2.8.12.1 (because of RTC bugs in the 2.9 series that I haven't had time to explore and fully document yet.)

David