what about dropping ascii builds?

Kevin_Altis · October 14, 2004, 4:28am

Just out of curiosity, what if there was no such thing as an ascii build of wxPython 2.5.x and 2.6.x, at least not a supported binary package? In other words, what are the downsides to moving to only supporting Unicode? Is it doable on all platforms?

I hadn't thought of this before, but it came up at the Python meeting tonight and I wonder if it would be beneficial for us to just move to Unicode and only Unicode for the future of wxPython?! Of course there would still be ASCII builds of 2.4.x

ka

Robin · October 14, 2004, 7:05am

Kevin Altis wrote:

Just out of curiosity, what if there was no such thing as an ascii build of wxPython 2.5.x and 2.6.x, at least not a supported binary package? In other words, what are the downsides to moving to only supporting Unicode? Is it doable on all platforms?

I hadn't thought of this before, but it came up at the Python meeting tonight and I wonder if it would be beneficial for us to just move to Unicode and only Unicode for the future of wxPython?! Of course there would still be ASCII builds of 2.4.x

I've thought about it off and on for years. The biggest problem is that Microsoft's MSLU library really isn't up to snuff for running arbitrary Unicode apps on Windows 98/Me. Some messages are not translated correctly, some are not sent at all, and so on.

Then of course there would be the typical user issues (string conversions, encodings, etc.) that come up on the mailing list from time to time, except that everybody would be having them.

--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

Matthew_Zaleski · October 14, 2004, 1:33pm

I’m new to wxPython, but as a developer I agree with Robin. One day the world of computers and software will be ready for Unicode-only solutions. I don’t think we are there yet. Until I did development for WinCE a few years back, I didn’t know Microsoft had the foresight to make it a Unicode-only product (they had the consumer’s best interest in mind for a change). Because it was a new platform, they could impose such rules. As Robin pointed out, the pre-Windows 2000 Microsoft OSes are not robust enough for Unicode-only applications, and those still make up a large portion of the computers currently in use.

Out of curiosity, what is the new direction for Unicode and Python/wxPython? WinCE is UTF-16. Currently, UTF-8 and UTF-16 are popular but UTF-32 is lurking in the wings. I’ve never had a firm grasp of whether UTF-16 can provide all of the same encodings that UTF-32 can do (hopefully UCS-4 is dying on the vine); UTF-8 basically has variable character length (when looked at as bytes) but my understanding is that UTF-16 is fixed size, just like ASCII.

Matthew Zaleski

Paul_McNett · October 14, 2004, 1:46pm

Robin Dunn wrote:

I've thought about it off and on for years. The biggest problem is that Microsoft's MSLU library really isn't up to snuff for running arbitrary unicode apps on win98/Me. There are some messages that are not translated correctly, or are not sent at all, or etc.

Then of course there would be the typical user issues (string conversions, encodings, etc.) that come up on the mail list from time to time, except everybody would be having them.

What if it were simply easier to download and install the Unicode version by default, while still keeping the ASCII version available? Call it the 'regular' version versus the 'non-unicode' version. Make a note that older versions of Windows want the 'non-unicode' version.

Then hopefully more people will use the unicode version by default, providing better support for everyone. Then again, it really is dependent on whether they have the unicode Python installed, right?

--
Paul McNett
http://paulmcnett.com

David_Woods4 · October 14, 2004, 2:59pm

Paul McNett wrote:

Robin Dunn wrote:

I've thought about it off and on for years. The biggest problem is that Microsoft's MSLU library really isn't up to snuff for running arbitrary unicode apps on win98/Me. There are some messages that are not translated correctly, or are not sent at all, or etc.

Then of course there would be the typical user issues (string conversions, encodings, etc.) that come up on the mail list from time to time, except everybody would be having them.

What if it was just easier to download/install the unicode version by default while still having the ascii version available. Call it the 'regular' version versus the 'non-unicode' version. Make a note that older versions of Windows want the 'non-unicode' version.

Then hopefully more people will use the unicode version by default, providing better support for everyone. Then again, it really is dependent on whether they have the unicode Python installed, right?

I for one would vote AGAINST dropping the non-unicode version. I wasted several days at one point trying to get my program running under the unicode build before giving up entirely. While I'm sure I will try again at some point, there were just too many odd little things that broke when I changed to Unicode. Weird things, like drag-and-drop. I don't honestly remember all the details, I just remember it was a nightmare, and I concluded that it wasn't worth the hassle for now.

David

Kevin_Altis · October 14, 2004, 3:33pm

Cool, I just wanted to get the issue out of the way. I thought it would be nice if we had fewer variations to support, but clearly we can't do that.

Now I really do have to figure out where the framework and tools should handle conversions, for example when data and resources saved from a Unicode build are loaded in an ASCII build, and so on.
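One way such a conversion point might look, as a rough sketch in present-day Python 3 rather than anything taken from the actual framework: store text in a single fixed encoding and convert only at load time. The function names `save_resource_text` and `load_for_ansi_build` are hypothetical, and the choice of UTF-8 for storage is an assumption.

```python
def save_resource_text(text):
    """Serialize text to bytes in one fixed encoding (UTF-8 is assumed here)."""
    return text.encode("utf-8")


def load_for_ansi_build(data, target_encoding="ascii"):
    """Decode stored UTF-8, then re-encode for a non-Unicode build.

    Characters the target encoding cannot represent are replaced with '?',
    so the conversion is lossy by design instead of raising an error.
    """
    return data.decode("utf-8").encode(target_encoding, errors="replace")


stored = save_resource_text("caf\u00e9")
print(load_for_ansi_build(stored))             # lossy under plain ASCII
print(load_for_ansi_build(stored, "latin-1"))  # preserved under Latin-1
```

Keeping the stored form in one encoding means only the loader has to know which kind of build it is running in.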

ka

Robin · October 14, 2004, 3:56pm

Paul McNett wrote:

Then again, it really is dependent on whether they have the unicode Python installed, right?

Python always has unicode objects available.

Robin · October 14, 2004, 3:56pm

Matthew Zaleski wrote:

Out of curiosity, what is the new direction for Unicode and Python/wxPython? WinCE is UTF-16. Currently, UTF-8 and UTF-16 are popular but UTF-32 is lurking in the wings. I've never had a firm grasp of whether UTF-16 can provide all of the same encodings that UTF-32 can do (hopefully UCS-4 is dying on the vine); UTF-8 basically has variable character length (when looked at as bytes) but my understanding is that UTF-16 is fixed size, just like ASCII.

I think Python can be built such that it is either UTF-16 or UTF-32. wxWidgets uses whatever the wchar_t type is for the compiler it is built with.
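For anyone who wants to check what they have, here is a small sketch. It relies on `sys.maxunicode`, which reports 0xFFFF on a narrow (16-bit) build and 0x10FFFF on a wide one; from Python 3.3 onward it is always 0x10FFFF. The helper name `describe_unicode_build` is made up for illustration, and using `ctypes` to read the `wchar_t` size assumes that wxWidgets was built with a compiler that agrees with the one used for Python.

```python
import ctypes
import sys


def describe_unicode_build(maxunicode=sys.maxunicode):
    """Classify an interpreter by its largest supported code point."""
    if maxunicode == 0xFFFF:
        return "narrow build: 16-bit units, surrogate pairs beyond U+FFFF"
    return "full range: every code point up to U+10FFFF is one item"


print(describe_unicode_build())
# Size of the C wchar_t type as seen by this Python's ctypes module.
print("wchar_t is", ctypes.sizeof(ctypes.c_wchar), "bytes here")
```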

David_Fraser · October 14, 2004, 8:34pm

Robin Dunn wrote:

Kevin Altis wrote:

Just out of curiosity, what if there was no such thing as an ascii build of wxPython 2.5.x and 2.6.x, at least not a supported binary package? In other words, what are the downsides to moving to only supporting Unicode? Is it doable on all platforms?

I hadn't thought of this before, but it came up at the Python meeting tonight and I wonder if it would be beneficial for us to just move to Unicode and only Unicode for the future of wxPython?! Of course there would still be ASCII builds of 2.4.x

I've thought about it off and on for years. The biggest problem is that Microsoft's MSLU library really isn't up to snuff for running arbitrary unicode apps on win98/Me. There are some messages that are not translated correctly, or are not sent at all, or etc.

Then of course there would be the typical user issues (string conversions, encodings, etc.) that come up on the mail list from time to time, except everybody would be having them.

I wonder if we could grab some code from the wine project and implement the functions ...
Too many good hacky projects, not enough time to do them ...

David

Neil_Hodgson1 · October 14, 2004, 8:53pm

Matthew Zaleski:

I've never had a firm grasp of whether UTF-16 can
provide all of the same encodings that UTF-32 can

UTF-16 contains the full set of Unicode characters.

do (hopefully UCS-4 is dying on the vine);

UTF-32 is essentially a synonym for UCS-4.

UTF-8 basically has variable character length (when
looked at as bytes) but my understanding is that
UTF-16 is fixed size, just like ASCII.

UTF-16 is variable length with either one or two 16 bit values per
character.
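A minimal sketch of the difference, assuming a present-day Python 3 interpreter; the helper name `code_units` is made up for illustration. It counts how many code units (8-bit, 16-bit, or 32-bit) one character needs in each encoding.

```python
def code_units(text, encoding):
    """Return how many fixed-width code units `text` occupies in `encoding`."""
    # The "-le" codec names avoid the byte order mark that plain
    # "utf-16" and "utf-32" would prepend.
    width = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}[encoding]
    return len(text.encode(encoding)) // width


samples = {
    "A": "A",              # U+0041, ASCII
    "euro": "\u20ac",      # U+20AC, inside the Basic Multilingual Plane
    "clef": "\U0001d11e",  # U+1D11E, outside the Basic Multilingual Plane
}

for name, ch in samples.items():
    print(name,
          code_units(ch, "utf-8"),
          code_units(ch, "utf-16-le"),
          code_units(ch, "utf-32-le"))
```

Only the last sample needs two 16-bit values in UTF-16, while UTF-32 uses exactly one 32-bit value for all three.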

Neil

Nick_Coghlan1 · October 14, 2004, 9:14pm

Robin Dunn wrote:

Paul McNett wrote:

Then again, it really is dependent on whether they have the unicode Python installed, right?

Python always has unicode objects available.

Unless the platform packagers (*coughRedHatcough*) do something odd. (Although, to be fair, I think the UCS2/UCS4 issues did get fixed - I can't recall seeing any gripes about this for recent versions of RHEL or Fedora)

However, if we're talking about the default Python configuration (i.e. what ./configure + make will give you on most platforms, and what the python.org binaries will provide on Windows), then Robin's right.

Cheers,
Nick.

Ron14 · October 15, 2004, 2:34am

Which, just for the record, makes UTF-16 among the most *stupid* of all the
available encodings. It changes the type of char without fixing
the problems of multibyte characters, and leaves you with the 'worst'
of both worlds. Of course that is necessarily offset by its being the
default encoding on Windows platforms, but ISO platforms appear to have
agreed on the use of UCS-4 as the default wchar_t encoding.

Ron

On Fri, Oct 15, 2004 at 06:53:54AM +1000, Neil Hodgson wrote:

UTF-16 is variable length with either one or two 16 bit values per
character.

Matthew_Zaleski · October 15, 2004, 2:27pm

My understanding is that although the encodings for UCS-4 and UTF-32 are
identical, UTF-32 spec clears up a number of ambiguities that were present in
UCS-4 (i.e. multiple ways of achieving a sort pattern that actually gave
different answers when it shouldn't).

On Fri, 15 Oct 2004 06:53:54 +1000, Neil Hodgson wrote:

UTF-32 is essentially a synonym for UCS-4.

Neil_Hodgson1 · October 16, 2004, 12:05am

Matthew Zaleski:

My understanding is that although the encodings for UCS-4
and UTF-32 are identical, UTF-32 spec clears up a number
of ambiguities that were present in UCS-4 (i.e. multiple
ways of achieving a sort pattern that actually gave
different answers when it shouldn't).

Yes, Unicode adds extra semantics over ISO 10646.

Neil