Pretty darn OT, but...
Is there any way to open the rtf file in browser?
Well, you can't open MS Word format in the browser either (though there may be an embedded plugin I don't know about).
In general, people can click on a Word file in a browser, and if their browser has been set up right, it will be brought up in Word. This is the same for any non-native file type. So it should be pretty straightforward to set up your browser to open an RTF file in Word.
In Firefox on OS X, for instance, when I click on a file type it doesn't know what to do with, it brings up a dialog that lets me choose to download it or select an app to open it with, and if I do that, there is a checkbox for "do this every time with this filetype", or something like that.
But it sounds like you have conflicting requirements:
If you want people to be able to simply view it in the browser, you would be better off converting the PDF to HTML, or just leaving it as PDF, as most people have their browsers set up to handle PDF easily.
If you want folks to be able to easily open and then edit, etc. the doc in Word, then converting to Word format is a fine idea. RTF is OK, but I'd be inclined to really give people a Word file (*.doc or *.docx).
Personally, I have major objections to application-specific, proprietary formats. But the truth is that casual users often don't really know what a "file type" is, don't know that Word can open and edit formats other than its native one, and don't know how (or want) to set up their browsers to deal with RTF. So if your requirement is for naive users to get a Word doc, you should give them a Word doc.
So how to generate it? RTF may still be a good way -- but I'd take the next step and run the RTF through either Word or OpenOffice or something to convert it to a *.doc -- that has got to be scriptable somehow.
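A rough sketch of what that scripting could look like, assuming a reasonably current LibreOffice is installed and its `soffice` binary is on the PATH (older OpenOffice builds may use different flags, and the function names here are made up for illustration):

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, outdir, fmt="doc", soffice="soffice"):
    """Return the argument list for a headless LibreOffice conversion."""
    return [soffice, "--headless", "--convert-to", fmt,
            "--outdir", str(outdir), str(src)]


def convert(src, outdir, fmt="doc"):
    """Convert src (e.g. an .rtf file) and return the expected output path."""
    exe = shutil.which("soffice")
    if exe is None:
        # Assumption: the binary is called "soffice"; adjust for your install.
        raise RuntimeError("soffice not found on PATH")
    subprocess.run(build_convert_command(src, outdir, fmt, exe), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(outdir) / (Path(src).stem + "." + fmt)
```

Calling `convert("report.rtf", "out")` should leave `out/report.doc` behind, but treat that as something to verify on your own machine rather than a guarantee.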
Do you need to handle just a particular set of PDF files, or any arbitrary one? The latter is going to be really hard, as PDF is an inherently different data model from a word processor's -- it's really just instructions for how to draw a page, not a structured view of the contents. For simple docs that are mostly text, you can often re-create the text structure, but that doesn't work in general.
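To make the "drawing instructions" point concrete, here is a toy sketch. It assumes you have already pulled positioned text fragments out of a page as hypothetical (x, y, text) tuples (real extractors hand you something messier), and it rebuilds lines with a simple heuristic: fragments whose baselines are within a small tolerance belong to the same line.

```python
def fragments_to_lines(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments into lines of text, top to bottom.

    Assumes PDF-style coordinates: origin at bottom-left, so a larger y
    is higher on the page. y_tolerance is a made-up default.
    """
    ordered = sorted(fragments, key=lambda f: -f[1])
    lines = []
    current, current_y = [], None
    for x, y, text in ordered:
        if current_y is None or abs(y - current_y) <= y_tolerance:
            current.append((x, text))
            if current_y is None:
                current_y = y
        else:
            lines.append(current)
            current, current_y = [(x, text)], y
    if current:
        lines.append(current)
    # Within a line, read left to right.
    return [" ".join(t for _, t in sorted(line)) for line in lines]
```

This works for single-column text and falls apart quickly with columns, tables, or rotated text, which is exactly why the arbitrary case is hard.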
It also depends on how much of the PDF you want to preserve -- just the text is not too hard.
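If only the text matters, then once you have it as a plain string, writing a bare-bones RTF file by hand takes very little code. The sketch below is not PyRTF and does not use its API; it is a minimal hand-rolled writer, and the font and size in the header are arbitrary choices:

```python
def rtf_escape(text):
    """Escape plain text for inclusion in an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)          # RTF control characters
        elif ch == "\n":
            out.append("\\par\n")          # paragraph break
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit decimal, one per UTF-16 code unit,
            # followed by a fallback character for old readers.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def text_to_rtf(text):
    """Wrap escaped text in a minimal RTF header (12 pt Times New Roman)."""
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\\f0\\fs24 "
    return header + rtf_escape(text) + "}"
```

The result is pure ASCII, so it can be written out with `open(path, "w", encoding="ascii")`.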
If you have a particular set of PDFs to convert, you may be able to go straight to MS's XML format -- I understand it's a pretty complicated mess, but it may not be bad to generate just the subset you need, and there are lots of tools for writing XML with Python.
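As an illustration of "just the subset you need", here is a sketch that assumes the target is the newer *.docx format (a zip archive of XML parts) and uses only the standard library. As far as I know these three parts are about the smallest package Word will open, but that is an assumption worth checking against the versions you care about; the function name is made up.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_bytes(paragraphs):
    """Return the bytes of a plain-text-only .docx, one <w:p> per string."""
    body = "".join(
        '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>' % escape(p)
        for p in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>' % (W_NS, body)
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", RELS)
        zf.writestr("word/document.xml", document)
    return buf.getvalue()
```

Writing `docx_bytes(["First paragraph", "Second paragraph"])` to a file opened in binary mode gives you something to try opening in Word; fonts, styles, and layout are all left at their defaults.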
Good luck!
-Chris
On 11/17/10 7:32 AM, usr root wrote:
Thanks,
Usr root
On Wed, Nov 17, 2010 at 11:06 PM, Nat Echols <nathaniel.echols@gmail.com> wrote:
On Wed, Nov 17, 2010 at 6:13 AM, usr root <usr.root@gmail.com> wrote:
I am sorry to be asking a question that is not about wxPython, but I know
most of you are very good in the Python field.
My question is:
I need to convert a PDF file to a Word (MS doc) file for an
open-source project. Is there any open-source project I can use?
Or is there any easy way to do it? I know pyPdf can handle the
PDF part, but for the Word part I have no idea. Since I am a
Linux programmer, I know little about Windows.
Don't bother with .doc format if you can help it - instead, try
using RTF, which I believe is an open standard, and should be
supported by any modern word processor (even MS Office). I started
using PyRTF for this recently, with good results so far. It doesn't
appear to be actively developed any more, but it's open-source
(GPL/LGPL) and written entirely in Python, and looks like it would
be easy to extend if necessary.
http://pyrtf.sourceforge.net/
-Nat
--
To unsubscribe, send email to
wxPython-users+unsubscribe@googlegroups.com
or visit http://groups.google.com/group/wxPython-users?hl=en
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov