[FM Discuss] [Booki-dev] How to update EPUBs in Internet Archive?

raj kumar rkumar at archive.org
Sat Jan 15 14:27:29 PST 2011


Hi!

James, if you upload a pdf and epub to the Internet Archive at the same time, we won't generate an epub, but will serve the uploaded one instead. Additionally, as an item owner, you can remove the generated epub and replace it with your own. If this doesn't work for you, I can help you.

As Adam says, we are hoping to get first-class booki integration in the near future :)

Also, it would be nice if the IA deriver used the text in the pdf to generate the djvu.xml file if possible, instead of running OCR. The archive.org book scanning process has traditionally been image-based, but we hope to change this in the future.

-raj

On Jan 15, 2011, at 3:20 AM, adam wrote:

> so. the thing is james you are talking about the future :)
> 
> We have a bit of a problem here at the booki dev team. It's a question of
> capacity. the scenario you suggest is _exactly_ what the Internet
> Archive wants to do with booki. We have discussed this with them at
> length and they want to build booki into their system so that people can
> correct epubs just as you have done.
> 
> This is an enormous opportunity for booki but we have so far failed to
> realise it because, simply, we don't have the people power. It is
> _extremely_ frustrating.
> 
> What we need to do is get a few heads together and solve this and solve
> it fast. We have all the code in place; what I think we need to do is
> this:
> * find a good server to host an archive instance of booki
> * work out some basic code to automate the import
> * install 
> * run test projects
> 
> if you would like to be a part of this greater picture i would *love* to
> work with someone to get this cracking...
> 
> adam
> 
> 
> 
> 
> On Fri, 2011-01-14 at 14:51 -0600, James Simmons wrote:
>> When I submitted "Make Your Own Sugar Activities!" to the Internet
>> Archive it created books in multiple forms, including an EPUB.
>> However, the way it makes an EPUB is defective in this case.  It
>> assumes that you will give it a PDF composed of page images, which it
>> then does OCR on to create the EPUB.  Doing OCR on a file that already
>> contains text is not going to give good results, and the resulting
>> EPUB is useless.  However, Booki can create a really good EPUB.  I
>> know I could donate that EPUB to the Internet Archive, but what I
>> really want to do is REPLACE the lousy generated EPUB with my good
>> one.  I know that Booki was created in part to do this very thing, but
>> I can't figure out how to make it happen.  The Internet Archive page
>> does not let you delete the existing EPUB, and when you upload a new
>> one it does not seem to replace the existing one.
>> 
>> I'll ask over at IA but I figured that whoever developed the function
>> in Booki would know the answer.
>> 
>> James Simmons
>> _______________________________________________
>> Discuss mailing list
>> Discuss at lists.flossmanuals.net
>> http://lists.flossmanuals.net/listinfo.cgi/discuss-flossmanuals.net
> 
> 
> _______________________________________________
> Booki-dev mailing list
> Booki-dev at lists.flossmanuals.net
> http://lists.flossmanuals.net/listinfo.cgi/booki-dev-flossmanuals.net



