Is this to do with the way PythonReports defines/uses the fonts?
Any hint on how to make this display correctly?
But, using the Wing
debugger, the unicode strings in the PDF show the e-acute as \u2020 - which
is indeed the Unicode character 'Dagger'
If you're using Wing, then this is the string after it was decoded
into a Python unicode object, yes?
In which case, the wrong encoding is being used to decode it.
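One way to hunt for the mismatch is to brute-force it: encode the character you expected with each candidate codec, decode the bytes with each other one, and see which pair reproduces what you got. This is only a rough sketch (Python 3 syntax); `find_codec_pairs` and the candidate list are made up for illustration, and the real culprit could just as well be a font-specific encoding that no Python codec covers.

```python
def find_codec_pairs(original, seen, codecs):
    """Return (encode_with, decode_with) pairs that turn `original` into `seen`."""
    matches = []
    for enc in codecs:
        try:
            raw = original.encode(enc)
        except UnicodeEncodeError:
            continue  # this codec cannot represent the original text
        for dec in codecs:
            try:
                if raw.decode(dec) == seen:
                    matches.append((enc, dec))
            except UnicodeDecodeError:
                pass  # these bytes are not valid in the decoding codec
    return matches


# Candidate codecs are a guess; extend the list as needed.
candidates = ["utf-8", "latin-1", "cp1252", "mac_roman"]

# The case from this thread: e-acute expected, dagger seen.
print(find_codec_pairs("\u00e9", "\u2020", candidates))
```

An empty list just means none of the candidates explains the substitution, which would point towards the font's own encoding rather than a plain text codec.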
So the question is, how are strings encoded in PDF? From reading this thread:
That's a hard question to answer, but presumably it is either:
All PDF text is encoded with a particular encoding
or
There is a way to specify the encoding in a particular document.
I suspect it's the latter, or you would have this problem all the
time. It could also be that PythonReports is using the wrong encoding
or specifying it incorrectly, but since Adobe Reader is, to some extent,
the reference implementation, if it works in Reader, it's right.
So you need to figure out how Reader determines the encoding, and
emulate that. Maybe the specs will help:
http://www.adobe.com/devnet/pdf/pdf_reference.html
So, I think the viewer is behaving correctly as far as it goes -
Not really -- it's using the wrong encoding to decode the data in the
PDF -- that is not correct (as long as you define correct as "same as
Adobe Reader").
I'd make a tiny pdf with just a bit of non-ascii text in it, and take
a look at it. That may be easier than reading the spec!
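Something like this could produce such a test file without going through PythonReports at all. It's a minimal sketch (Python 3 syntax), not how PythonReports writes its output: `build_tiny_pdf` is a hypothetical helper, and the choice of Helvetica with `/WinAnsiEncoding` (paired here with Python's cp1252 codec) is an assumption made just to have one known encoding to compare against.

```python
def build_tiny_pdf(text, encoding_name="WinAnsiEncoding", codec="cp1252"):
    """Build a one-page PDF showing `text`, with the font encoding named explicitly."""
    raw = text.encode(codec)
    # Backslash and parentheses are special inside a PDF literal string.
    raw = raw.replace(b"\\", b"\\\\").replace(b"(", b"\\(").replace(b")", b"\\)")
    content = b"BT /F1 24 Tf 72 720 Td (" + raw + b") Tj ET"
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica /Encoding /"
        + encoding_name.encode("ascii") + b" >>",
        b"<< /Length " + str(len(content)).encode("ascii") + b" >>\nstream\n"
        + content + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += str(number).encode("ascii") + b" 0 obj\n" + body + b"\nendobj\n"
    xref_at = len(out)
    size = str(len(objects) + 1).encode("ascii")
    out += b"xref\n0 " + size + b"\n0000000000 65535 f \n"
    for offset in offsets:
        out += ("%010d 00000 n \n" % offset).encode("ascii")
    out += b"trailer\n<< /Size " + size + b" /Root 1 0 R >>\n"
    out += b"startxref\n" + str(xref_at).encode("ascii") + b"\n%%EOF\n"
    return bytes(out)


pdf_bytes = build_tiny_pdf("caf\u00e9")
# To view it, write the bytes out, e.g.:
#     with open("tiny.pdf", "wb") as f:
#         f.write(pdf_bytes)
# Show the content stream so the raw byte used for the e-acute is visible.
start = pdf_bytes.index(b"BT ")
print(repr(pdf_bytes[start:pdf_bytes.index(b" ET", start) + 3]))
```

Open the result in Adobe Reader and in the viewer that misbehaves, then compare the font's `/Encoding` entry and the string bytes with what PythonReports writes for the same text.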
-Chris
On Thu, May 23, 2013 at 8:58 AM, David Hughes <dfh@forestfield.co.uk> wrote:
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov