[FM Discuss] FM in Safari Books

Andy Oram andyo at oreilly.com
Mon Mar 23 04:58:15 PDT 2009


The output you showed looks like it will map pretty well to DocBook. As I said in an earlier mail, an XSL transformation will have to handle headings in some fancy way to produce nested sections instead of a flat run of headings.
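
To make the heading problem concrete, here is a minimal sketch in Python (not the XSL mentioned above) of turning a flat run of headings into nested sections. The input shape (a list of (tag, text) pairs), the function name nest_sections, and the choice of <chapter> as the container are all assumptions made for illustration; a real conversion would work on the parsed XHTML.

```python
import xml.etree.ElementTree as ET

# Heading tags treated as section openers (assumption: h1 through h6 only).
HEADINGS = {f"h{n}" for n in range(1, 7)}

def nest_sections(items):
    """Build nested DocBook-style sections from flat (tag, text) pairs.

    `items` is a hypothetical flattened view of a chapter, in document
    order, e.g. [("h1", "Intro"), ("p", "Some text"), ("h2", "Details")].
    """
    root = ET.Element("chapter")
    # Stack of (heading level, element); level 0 is the chapter itself.
    stack = [(0, root)]
    for tag, text in items:
        if tag in HEADINGS:
            level = int(tag[1])
            # Close every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            ET.SubElement(section, "title").text = text
            stack.append((level, section))
        else:
            # Anything else becomes a paragraph in the innermost open section.
            ET.SubElement(stack[-1][1], "para").text = text
    return root

if __name__ == "__main__":
    flat = [("h1", "A"), ("p", "x"), ("h2", "B"), ("p", "y"), ("h1", "C")]
    print(ET.tostring(nest_sections(flat), encoding="unicode"))
```

The stack is what does the "fancy" part: each heading closes any open sections at its own level or deeper before opening a new one, so an h2 after an h1 nests and a later h1 pops back out.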

The <br /> and <br> tags can probably just be ignored. There seem to be some tables of contents that can be ignored, because other tools will generate them from DocBook after the document is submitted.

I think <strong> may have to be interpreted in different ways in different contexts. In text, it should become <emphasis role="bold"> (or you could ignore the bold and just use <emphasis>), whereas in <pre> sections it should become <userinput>.
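
A rough sketch of that context-dependent mapping, in Python using the standard html.parser module. The class name StrongMapper and the helper convert are made up for this example, and mapping <pre> to <screen> is an assumption (the mail does not say what <pre> should become). Tags other than <pre>, <strong>, and <br> are simply dropped here, which a real converter would not do.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

class StrongMapper(HTMLParser):
    """Map <strong> to DocBook differently inside and outside <pre>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.pre_depth = 0
        self.open_strong = []  # DocBook names of currently open <strong> tags

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1
            self.out.append("<screen>")  # assumption: <pre> becomes <screen>
        elif tag == "strong":
            if self.pre_depth:
                self.open_strong.append("userinput")
                self.out.append("<userinput>")
            else:
                self.open_strong.append("emphasis")
                self.out.append('<emphasis role="bold">')
        # <br> and <br /> fall through here and are ignored.

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth:
            self.pre_depth -= 1
            self.out.append("</screen>")
        elif tag == "strong" and self.open_strong:
            self.out.append(f"</{self.open_strong.pop()}>")

    def handle_data(self, data):
        self.out.append(escape(data))

def convert(fragment):
    mapper = StrongMapper()
    mapper.feed(fragment)
    mapper.close()
    return "".join(mapper.out)

if __name__ == "__main__":
    print(convert("Press <strong>Enter</strong>.<br /><pre>$ <strong>ls</strong></pre>"))
```

Remembering which DocBook element each <strong> opened means the matching close tag is right even if the <pre> state changes in between.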

And as I said, <li> tags should contain <p> tags; then conversion to DocBook is a one-to-one mapping.

Andy

----- Original Message -----
From: "Douglas Bagnall" <douglas at paradise.net.nz>
To: discuss at lists.flossmanuals.net
Cc: tech at flossmanuals.net
Sent: Monday, March 23, 2009 6:21:24 AM GMT -05:00 US/Canada Eastern
Subject: Re: [FM Discuss] FM in Safari Books

Anne Gentle wrote:

> Is there a fast/easy way to get the XHTML content out of a book? I was doing
> some test runs with "Save as" html but of course I get all the navigation
> items around the actual content. If there's a Maintainer's way to do that,
> just let me know. Otherwise, please send me some sample FM HTML files that
> don't have all the twiki nav items surrounding the "guts" of the content. I
> think I see a good path from XHTML to DocBook on the DocBook wiki but I want
> to test it out with real files.

The first of these two links is an example of the "HTML" that currently
gets fed to Pisa (the pdf maker).  The second has been manipulated so
the images show up.

http://en.flossmanuals.net/~douglas/gnu-latest.html
http://en.flossmanuals.net/~douglas/gnu-2009-03-23-fixed-links.html

This is not quite what you want though, as it is full of non-html
elements like "pdf:toc", "h0", and "fmsection".  The actual chapters are
the contents of each <fmsection />, and they should be well-formed xhtml
snippets (unless someone has edited the source directly and carelessly).
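
For anyone who wants to try this before a proper script exists, a minimal Python sketch of pulling out the chapter snippets and checking each one for well-formedness. It assumes the chapters are wrapped in paired <fmsection ...> ... </fmsection> tags, which is a guess about the markup; the function name extract_chapters is made up for this example. The check is strict XML, so a snippet that uses named HTML entities such as &nbsp; would be reported as not well-formed.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: each chapter sits between an opening and a closing fmsection tag.
FMSECTION = re.compile(r"<fmsection\b[^>]*>(.*?)</fmsection\s*>", re.DOTALL)

def extract_chapters(html):
    """Return a list of (snippet, is_well_formed) pairs, one per fmsection."""
    chapters = []
    for match in FMSECTION.finditer(html):
        snippet = match.group(1).strip()
        try:
            # Wrap in a single root so a multi-element snippet can be parsed.
            ET.fromstring(f"<div>{snippet}</div>")
            well_formed = True
        except ET.ParseError:
            well_formed = False
        chapters.append((snippet, well_formed))
    return chapters

if __name__ == "__main__":
    sample = (
        "<pdf:toc /><h0>Book</h0>"
        "<fmsection><h1>One</h1><p>ok</p></fmsection>"
        "<fmsection><p>bad<br></fmsection>"
    )
    for snippet, ok in extract_chapters(sample):
        print("well-formed" if ok else "NOT well-formed", "|", snippet)
```

A regular expression is used for the outer split because the file as a whole contains non-html elements and may not parse as XML; only the individual snippets are handed to the XML parser.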

I could write a script to output clean html versions of the books, but
*not* this week.


Douglas
_______________________________________________
Discuss mailing list
Discuss at lists.flossmanuals.net
http://lists.flossmanuals.net/listinfo.cgi/discuss-flossmanuals.net

-- 
----------------------------------------------------------------------
Andy Oram  O'Reilly Media                     email: andyo at oreilly.com
Editor     10 Fawcett Street, Fourth Floor         voice: 617-499-7479
           Cambridge, MA 02138-1175                  fax: 617-661-1116
           USA                         http://www.praxagora.com/andyo/
----------------------------------------------------------------------


