[Booki-dev] Does-booki-have-this-book server

adam hyde adam at flossmanuals.net
Mon Oct 18 04:59:55 PDT 2010


ok! so, it looks good. Raj, can we talk tomorrow about getting the
server we need and the next steps?

:)

adam

On Sat, 2010-10-16 at 20:29 -0700, raj kumar wrote:
> Wow.. awesome! Thanks, Douglas!
> 
> On Oct 16, 2010, at 4:00 PM, Douglas Bagnall wrote:
> 
> > hi,
> > 
> > In discussions between Raj, Adam, and myself, we realised that the
> > Internet Archive servers will often need to ask Booki whether an
> > Archive book has a) been imported into Booki, and b) been changed in
> > Booki.  The Internet Archive will use this to offer people an
> > opportunity to correct books and, if a corrected version exists,
> > download the corrected version.
> > 
> > The trouble with this is that Booki doesn't index its books by Archive
> > ID, nor by whether they have been edited, and even if it did, its
> > framework is more geared toward rich interaction than fast look-ups.
> > Meanwhile the Archive has two million books and goodness knows how
> > many readers.
> > 
> > We decided we needed an intermediate server that periodically asks
> > Booki for information about Archive-sourced books.  That means Booki
> > only needs to trawl through its database every minute or two, while
> > the Archive can get instant answers from an in-memory store.
> > 
> > The new server is provisionally called Boouncer (from gatekeeper ->
> > doorman -> bouncer -> splice with booki; pronunciation is up to you),
> > and is in git on the booki-dev server:
> > 
> > http://booki-dev.flossmanuals.net/git?p=boouncer.git
> > 
> > As this also involves changes to Booki, it is so far only running
> > against my test server. The examples below use
> > does-booki-have-this.halo.gen.nz and booki.halo.gen.nz, which are not
> > permanent URLs.
> > 
> > How it works 1: Epub links
> > ==========================
> > 
> > Every time an IA book details page is loaded, the Archive server asks
> > whether a corrected epub exists.  If it does exist, it wants the URL.
> > "Corrected" means changed in Booki, and "exists" includes epubs
> > generated on the fly (which at this stage means all of them).
> > 
> > The general form for this type of request is:
> > 
> > /<id-scheme>/epub/<id>
> > 
> > For Internet Archive ids, the id-scheme is "archive.org".
> > 
> > If the book exists and has been edited, the reply is a "302 Found"
> > redirection to the epub.  This contains the epub URL in the Location
> > header. For example:
> > 
> > http://does-booki-have-this.halo.gen.nz/archive.org/epub/fairerthandayfo00hugggoog
> > 
> > If the book doesn't exist or hasn't been edited, the result is a "404
> > Not Found" error. E.g.:
> > 
> > http://does-booki-have-this.halo.gen.nz/archive.org/epub/NOT-a-book
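
A minimal sketch of how a caller might use the epub lookup described above, assuming only what the message states: a "302 Found" carries the epub URL in the Location header, and a "404 Not Found" means no corrected epub. The function names (epub_lookup_url, interpret_reply, ask_for_epub) are hypothetical, the base URL is the temporary test host, and the use of GET is an assumption.

```python
import http.client
from urllib.parse import quote, urlsplit

# Temporary test host from the message; not a permanent URL.
BASE = "http://does-booki-have-this.halo.gen.nz"

def epub_lookup_url(book_id, scheme="archive.org", base=BASE):
    """Build a request URL of the form /<id-scheme>/epub/<id>."""
    return "%s/%s/epub/%s" % (base, quote(scheme, safe=""), quote(book_id, safe=""))

def interpret_reply(status, location):
    """Return the corrected epub URL, or None if there is none."""
    if status == 302 and location:
        return location          # edited book: URL is in the Location header
    if status == 404:
        return None              # unknown book, or known but unedited
    raise ValueError("unexpected reply status %r" % (status,))

def ask_for_epub(book_id, scheme="archive.org", base=BASE):
    """Send one request. http.client never follows redirects on its own,
    so the Location header can be read directly from the reply."""
    parts = urlsplit(epub_lookup_url(book_id, scheme, base))
    conn = http.client.HTTPConnection(parts.netloc, timeout=5)
    try:
        conn.request("GET", parts.path)
        reply = conn.getresponse()
        return interpret_reply(reply.status, reply.getheader("Location"))
    finally:
        conn.close()
```

The reply handling is kept separate from the network call so that it can be checked without a running server.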
> > 
> > How it works 2: Edit links
> > ==========================
> > 
> > On every Archive book details page there will be a link to edit the
> > book in Booki.  If that book is already in Booki, the link should take
> > you to the existing edit page.  If more than one copy of the book
> > exists, it should take you to the "best" one, which is defined as the
> > first edited one, or the first overall if none have been edited.
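
The "best" copy rule above can be sketched as a small function. This is an illustration only: the name best_copy and the (edit_url, edited) pair layout are hypothetical, and "first" is assumed to mean first in whatever order Booki lists the copies, which the message does not specify.

```python
def best_copy(copies):
    """Pick the edit URL of the "best" copy of a book.

    copies is a list of (edit_url, edited) pairs in listing order.
    Returns the first edited copy, else the first copy overall,
    else None when the book is not in Booki at all.
    """
    for edit_url, edited in copies:
        if edited:
            return edit_url
    return copies[0][0] if copies else None
```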
> > 
> > If the book is not in Booki, this link should import it for you and
> > send you to the edit page.  At present Boouncer DOESN'T construct a
> > URL to import the book: either it can be changed so it does, or IA
> > and/or Booki can deal with that some other way.  In any case I think
> > Booki will need tweaking to allow people to jump straight into editing.
> > 
> > The form is:
> > 
> > /<id-scheme>/edit/<id>
> > 
> > where, as above, id-scheme is "archive.org" for our purposes.  This
> > redirects to the edit interface (via "302 Found"):
> > 
> > http://does-booki-have-this.halo.gen.nz/archive.org/edit/fairerthandayfo00hugggoog
> > 
> > and this does not, with "404 Not Found":
> > 
> > http://does-booki-have-this.halo.gen.nz/archive.org/edit/no-book-here
> > 
> > Other ID schemes
> > ================
> > 
> > This redirect system works for other kinds of ID.  The id-scheme
> > corresponds to the scheme attribute of a dc:identifier element in an
> > epub's metadata.  So if the original epub had an identifier like this:
> > 
> > <metadata>
> >  <dc:identifier scheme="ISBN">978-0-14-050630-3</dc:identifier>
> >  <!-- possibly other identifiers here too... --> 
> > </metadata>
> > 
> > Then the URL /ISBN/edit/978-0-14-050630-3 would find its edit page in
> > Booki.  This is probably of no use to anyone, and ID schemes are not
> > widely used, but there you go.
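
As a rough illustration of the mapping from identifiers to URLs, the sketch below reads dc:identifier elements and builds /<id-scheme>/edit/<id> paths. The function name edit_paths is hypothetical. The sample adds an xmlns:dc declaration, which the fragment above omits but an XML parser needs, and the check for a namespaced opf:scheme attribute is an assumption about how some epubs spell it.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"
OPF = "{http://www.idpf.org/2007/opf}"

def edit_paths(metadata_xml):
    """Return one /<id-scheme>/edit/<id> path per identifier with a scheme."""
    root = ET.fromstring(metadata_xml)
    paths = []
    for element in root.iter(DC + "identifier"):
        # Accept a plain scheme attribute or a namespaced opf:scheme one.
        scheme = element.get("scheme") or element.get(OPF + "scheme")
        if scheme and element.text:
            paths.append("/%s/edit/%s" % (scheme, element.text.strip()))
    return paths

SAMPLE = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
 <dc:identifier scheme="ISBN">978-0-14-050630-3</dc:identifier>
 <dc:identifier>no-scheme-here</dc:identifier>
</metadata>"""
```

Identifiers without a scheme are skipped, since there is no id-scheme to put in the path.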
> > 
> > Books imported from the Internet Archive are given an implicit
> > scheme="archive.org".  This is new, so previously imported books won't
> > be found.
> > 
> > What Booki provides
> > ===================
> > 
> > The following refers to the "booklist" branch of Booki at
> > http://booki-dev.flossmanuals.net/git?p=booki.git;a=shortlog;h=refs/heads/booklist
> > 
> > URLs like /list-books-by-id/<id-scheme>.json provide a JSON summary of
> > books possessing IDs in that scheme.  For example:
> > 
> > http://booki.halo.gen.nz/list-books-by-id/archive.org.json
> > 
> > gets all the archive books.  The JSON is structured like this:
> > 
> > {
> >  ID : {
> >      "epub": URL or null,
> >      "edit": URL or null
> >  },...
> > }
> > 
> > That is, for each ID there is a mapping from modes ('edit', 'epub') to
> > URLs.  If there is no valid URL (e.g., a corrected epub is unavailable
> > because the book hasn't been changed), then null is used.  Currently
> > an edit link is never null.
> > 
> > Here's an example:
> > 
> > {
> >    "fairerthandayfo00hugggoog": {
> >        "edit": "http://booki.halo.gen.nz/fairer-than-day-for-sunday-school-and-revival-work/edit/", 
> >        "epub": "http://objavi.halo.gen.nz/objavi.cgi?destination=download&book=fairer-than-day-for-sunday-school-and-revival-work&mode=epub&server=booki.halo.gen.nz"
> >    }, 
> >    "bijdragenototde00valegoog": {
> >        "edit": "http://booki.halo.gen.nz/kauri/edit/", 
> >        "epub": null
> >    }
> > }
> > 
> > That shows one book with changes and one book without.
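
A sketch of how an intermediate server could turn that JSON into instant answers from memory. It is not Boouncer's actual code: the class BookStore and its methods are hypothetical names, and fetching the JSON (every minute or two, per the plan above) is left out so the lookup logic stands alone.

```python
import json

class BookStore:
    """In-memory map of id-scheme -> book id -> {"epub": ..., "edit": ...}."""

    def __init__(self):
        self._schemes = {}

    def refresh(self, scheme, json_text):
        """Replace everything known about one id-scheme with a fresh listing."""
        self._schemes[scheme] = json.loads(json_text)

    def lookup(self, scheme, mode, book_id):
        """Answer /<id-scheme>/<mode>/<id> as a (status, location) pair."""
        url = self._schemes.get(scheme, {}).get(book_id, {}).get(mode)
        # A missing book and a null URL both produce a 404.
        return (302, url) if url else (404, None)

LISTING = """{
  "fairerthandayfo00hugggoog": {
    "edit": "http://booki.halo.gen.nz/fairer-than-day-for-sunday-school-and-revival-work/edit/",
    "epub": "http://objavi.halo.gen.nz/objavi.cgi?destination=download&book=fairer-than-day-for-sunday-school-and-revival-work&mode=epub&server=booki.halo.gen.nz"
  },
  "bijdragenototde00valegoog": {
    "edit": "http://booki.halo.gen.nz/kauri/edit/",
    "epub": null
  }
}"""

store = BookStore()
store.refresh("archive.org", LISTING)
```

With this listing loaded, an epub lookup for the unedited book yields a 404 while its edit lookup still yields a 302, matching the two reply types described earlier.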
> > 
> > TODO, questions
> > ===============
> > 
> > 1. Perhaps "epub" should be changed to "corrected-epub", in case at
> > some point in the future we want to know about uncorrected epubs.
> > 
> > 2. Actually, all the URL strings could be reviewed, and maybe the
> > Booki URL needs to move to fit into the API plan.
> > 
> > 3. The import-and-edit mechanism needs to be worked out (see under
> > "edit links" above).
> > 
> > 4. Booki changes need to be merged into the mainline, and Boouncer put
> > into production.
> > 
> > 5. Corrected epubs can be cached in e.g. the Archive S3 servers, but
> > Booki needs to learn about this.
> > 
> > 6. Is a redirection the right thing?  For the edit links, it means
> > there is a single link to follow for all the different cases (book not
> > in Booki, book in Booki, book in Booki in several instances), but for
> > the epub links the URL is more likely to be "unpacked" than followed.
> > I have assumed that reading a URL from an HTTP header is as easy as
> > reading it from an HTTP body, but possibly that is not so.
> > 
> > 7. Silly name. Ideas?
> > 
> > That's all, I think.
> > 
> > Douglas
> > _______________________________________________
> > Booki-dev mailing list
> > Booki-dev at lists.flossmanuals.net
> > http://lists.flossmanuals.net/listinfo.cgi/booki-dev-flossmanuals.net
> 

-- 
vote for booki in the Open Web Awards!

Booki (the latest FM project - http://www.booki.cc - a collaborative
publishing platform) is in the final of the Open Web Awards.
Great! We are 1 of 3 projects. If we win we get 5000 which we will use
to do a code sprint on a tropical island somewhere ;)

please please please register :
http://www.drumbeat.org/user/register

and vote for us:
http://www.drumbeat.org/project/open-web-publishing

and pass this around!!!! :)

Adam Hyde
Founder FLOSS Manuals &
Booki Project Manager 

Contact Information
German mobile : + 49 177 4935122
Email : adam at flossmanuals.net
irc : irc.freenode.net #flossmanuals


"Free manuals for free software"
http://www.flossmanuals.net/about

Free Software for making Free Books
http://www.booki.cc/



