[FM Discuss] FM in Safari Books
Andy Oram
andyo at oreilly.com
Mon Mar 23 04:58:15 PDT 2009
The output you showed looks like it will map pretty well to DocBook. As I said in an earlier mail, an XSL transformation will have to handle headings in some fancy way to produce nested sections instead of a flat run of headings.
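To make the heading problem concrete, here is a minimal sketch of the nesting logic, written in Python rather than XSL only because it is shorter to show. It assumes the chapter body is a flat run of <h1> to <h6> and block elements; the function name nest_sections and the sample input are made up for illustration, and a real conversion would also have to rename the blocks themselves (<p> to <para> and so on).

```python
import re
import xml.etree.ElementTree as ET

HEADING = re.compile(r"^h([1-6])$")

def nest_sections(body):
    """Turn a flat run of headings and blocks into nested <section> elements."""
    root = ET.Element("article")
    # Stack of (heading level, element); the root counts as level 0.
    stack = [(0, root)]
    for child in body:
        match = HEADING.match(child.tag)
        if match:
            level = int(match.group(1))
            # Close every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            title = ET.SubElement(section, "title")
            title.text = "".join(child.itertext()).strip()
            stack.append((level, section))
        else:
            # Non-heading blocks go into the innermost open section.
            stack[-1][1].append(child)
    return root

# Hypothetical sample input, not taken from the real FM files.
flat = ET.fromstring(
    "<body><h1>Intro</h1><p>One</p><h2>Detail</h2><p>Two</p>"
    "<h1>Next</h1><p>Three</p></body>"
)
nested = nest_sections(flat)
print(ET.tostring(nested, encoding="unicode"))
```

The stack is what makes this awkward in XSL: each heading has to know which earlier heading it belongs under, which a plain template-per-element stylesheet does not track.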
The <br /> and <br> tags can probably just be ignored. There seem to be some tables of contents that can be ignored, because other tools will generate them from DocBook after the document is submitted.
I think <strong> may have to be interpreted in different ways in different contexts. In text, it should become <emphasis role="bold"> (or you could ignore the bold and just use <emphasis>), whereas in <pre> sections it should become <userinput>.
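A rough sketch of that context-dependent rule, again in Python for brevity instead of XSL. The function name convert_strong and the sample markup are assumptions for illustration; the sketch leaves <pre> itself alone, although a full conversion would have to map it to a DocBook element too.

```python
import xml.etree.ElementTree as ET

def convert_strong(elem, inside_pre=False):
    """Rewrite <strong> in place, choosing the DocBook tag from its context."""
    inside_pre = inside_pre or elem.tag == "pre"
    for child in elem:
        convert_strong(child, inside_pre)
    if elem.tag == "strong":
        if inside_pre:
            # Bold inside preformatted text is taken to be typed input.
            elem.tag = "userinput"
        else:
            elem.tag = "emphasis"
            elem.set("role", "bold")

# Hypothetical sample input, not taken from the real FM files.
doc = ET.fromstring(
    "<body><p>Type <strong>carefully</strong>.</p>"
    "<pre>$ <strong>ls -l</strong></pre></body>"
)
convert_strong(doc)
print(ET.tostring(doc, encoding="unicode"))
```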
And as I said, <li> tags should contain <p> tags; then conversion to DocBook is a one-to-one mapping.
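When each <li> already wraps its text in <p>, the list conversion really is just a renaming of tags, as this small Python sketch suggests. The TAG_MAP table and rename_tags function are illustrative names; the target elements (itemizedlist, orderedlist, listitem, para) are the usual DocBook ones.

```python
import xml.etree.ElementTree as ET

# XHTML tag on the left, DocBook tag on the right.
TAG_MAP = {
    "ul": "itemizedlist",
    "ol": "orderedlist",
    "li": "listitem",
    "p": "para",
}

def rename_tags(root):
    """Rename every mapped element in place; leave anything else untouched."""
    for elem in root.iter():
        elem.tag = TAG_MAP.get(elem.tag, elem.tag)
    return root

# Hypothetical sample input, not taken from the real FM files.
xhtml = ET.fromstring("<ul><li><p>First</p></li><li><p>Second</p></li></ul>")
docbook = rename_tags(xhtml)
print(ET.tostring(docbook, encoding="unicode"))
```

If an <li> held bare text instead, the renaming would produce a <listitem> with no <para> inside it, which DocBook does not allow, so the converter would need an extra wrapping step.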
Andy
----- Original Message -----
From: "Douglas Bagnall" <douglas at paradise.net.nz>
To: discuss at lists.flossmanuals.net
Cc: tech at flossmanuals.net
Sent: Monday, March 23, 2009 6:21:24 AM GMT -05:00 US/Canada Eastern
Subject: Re: [FM Discuss] FM in Safari Books
Anne Gentle wrote:
> Is there a fast/easy way to get the XHTML content out of a book? I was doing
> some test runs with "Save as" html but of course I get all the navigation
> items around the actual content. If there's a Maintainer's way to do that,
> just let me know. Otherwise, please send me some sample FM HTML files that
> don't have all the twiki nav items surrounding the "guts" of the content. I
> think I see a good path from XHTML to DocBook on the DocBook wiki but I want
> to test it out with real files.
The first of these two links is an example of the "HTML" that currently
gets fed to Pisa (the pdf maker). The second has been manipulated so
the images show up.
http://en.flossmanuals.net/~douglas/gnu-latest.html
http://en.flossmanuals.net/~douglas/gnu-2009-03-23-fixed-links.html
This is not quite what you want though, as it is full of non-html
elements like "pdf:toc", "h0", and "fmsection". The actual chapters are
the contents of each <fmsection />, and they should be well-formed xhtml
snippets (unless someone has edited the source directly and carelessly).
I could write a script to output clean html versions of the books, but
*not* this week.
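
In the meantime, a script along these lines might pull the chapters out. This is only a sketch under assumptions: it supposes each chapter sits between an explicit <fmsection> and </fmsection> pair, that the sections are not nested, and that a regular expression is good enough because the surrounding file is not valid XML. The function name extract_chapters and the sample string are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET

FMSECTION = re.compile(r"<fmsection\b[^>]*>(.*?)</fmsection>", re.DOTALL)

def extract_chapters(text):
    """Return (snippet, is_well_formed) for each <fmsection> in the text."""
    chapters = []
    for snippet in FMSECTION.findall(text):
        try:
            # Wrap in a dummy element: a chapter may have several top-level nodes.
            ET.fromstring("<div>" + snippet + "</div>")
            ok = True
        except ET.ParseError:
            ok = False
        chapters.append((snippet.strip(), ok))
    return chapters

# Hypothetical sample, not the real gnu-latest.html.
sample = (
    "<html><pdf:toc /><h0>GNU</h0>"
    "<fmsection><h1>Intro</h1><p>Hello</p></fmsection>"
    "<fmsection><h1>Broken</h1><p>Oops</fmsection>"
    "</html>"
)
for snippet, ok in extract_chapters(sample):
    print(ok, snippet)
```

The well-formedness check is stricter than a browser: HTML entities such as &nbsp; are not defined in plain XML, so a snippet using them would be reported as not well formed.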
Douglas
--
----------------------------------------------------------------------
Andy Oram O'Reilly Media email: andyo at oreilly.com
Editor 10 Fawcett Street, Fourth Floor voice: 617-499-7479
Cambridge, MA 02138-1175 fax: 617-661-1116
USA http://www.praxagora.com/andyo/
----------------------------------------------------------------------