File output of characters chr(128) through chr(255)

Bob Klahn wrote:

I have literally thousands of flat files which contain "extended ASCII" (ord = 128 to 255 inclusive) characters. I can read them all, but I can't seem to write them, as the typical error I get is "UnicodeEncodeError: 'ascii' codec can't encode character u'\xdb' in position 4: ordinal not in range(128)".

I simply need to write to disk the exact ordinal values, all between 0 and 255 inclusive, as is. E.g., a byte with value hex DB (decimal 219) needs to be written to disk as the byte hex DB.

How do I avoid codec issues? There's no codec out there (that I know of) that will allow ordinal values 0 through 255 to be written "as is."

A short interactive session shows that's not exactly your problem:

In [1]: output = open("test.txt", "wb")
In [2]: for f in range(32, 256):
   ...:     output.write(chr(f))
In [3]: output.close()

Python is perfectly capable of writing bytes to a file. No problems there, and the output is as expected. Your problem really is: "I have a unicode character; how do I convert it to a plain non-unicode character so I can write it with a write function that expects non-unicode strings?"
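
To make that diagnosis concrete, here is a minimal sketch (not part of the original session) of what most likely happens behind the scenes. In Python 2, handing a unicode string to write() on a byte-oriented file triggers an implicit encode with the ascii codec, which only covers ordinals 0 through 127. The explicit .encode('ascii') call below is my stand-in for that implicit step, and the variable names are purely illustrative.

```python
# Sketch: do explicitly what Python 2 does implicitly when a unicode
# string reaches a write() that expects a byte string.
text = u'\xdb'  # a single character with ordinal 219

try:
    text.encode('ascii')  # the ascii codec only covers ordinals 0..127
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print(exc)  # same kind of message as the one quoted in the question

# Characters with ordinals below 128 encode without complaint.
plain = u'A'.encode('ascii')
```

So the file object is not the problem; the failure comes from the unicode-to-bytes conversion that precedes the write.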

Well, if you are really sure that your unicode characters are below 256, you could do:

In [6]: a = u'\xdb'
In [7]: output = open("test.txt", "wb")
In [8]: ord(a)
Out[8]: 219
In [10]: output.write(chr(ord(a)))
In [11]: output.close()

There are many other ways of doing that without the efficiency penalty of translating individual characters; which one fits depends on how your input is formed and how you process it. The real question is most likely: why are you converting those characters to unicode in the first place if you don't need to?
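
One of those other ways, sketched under the assumption that every character in the string has an ordinal below 256: the latin-1 (ISO-8859-1) codec maps code points 0 through 255 to the byte with the same value, so a whole string can be encoded in a single call. The temporary file path and the variable names are made up for the example.

```python
import os
import tempfile

# Build a unicode string holding every code point from 0 to 255.
# Decoding with latin-1 is the inverse mapping: byte N -> code point N.
text = bytearray(range(256)).decode('latin-1')

# Encode the whole string at once; each character becomes the byte
# with the same ordinal. Ordinals above 255 raise UnicodeEncodeError.
data = text.encode('latin-1')

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'test.bin')
with open(path, 'wb') as output:
    output.write(data)

# Read the file back in binary mode to confirm the bytes are unchanged.
with open(path, 'rb') as check:
    roundtrip = check.read()
```

If the data never needs to be treated as text at all, opening the input files in binary mode ('rb') and writing the bytes straight back out should avoid the conversion to unicode entirely.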


--
Rastertech España S.A.
  Grzegorz Adam Hankiewicz
/Jefe de Producto TeraVial/

C/ Perfumería 21. Nave I. Polígono industrial La Mina
28770 Colmenar Viejo. Madrid (España)
Tel. +34 918 467 390 (Ext.17) *·* Fax +34 918 457 889
ghankiewicz@rastertech.es *·* www.rastertech.es <http://www.rastertech.es/>