[Booki-dev] [FM Discuss] How to update EPUBs in Internet Archive?

James Simmons nicestep at gmail.com
Mon Jan 17 07:16:45 PST 2011


Raj,

The attached screen grab shows what I see when I attempt to edit my
own item in IA for "Make Your Own Sugar Activities!"  As you can see,
no EPUB is shown, so there is no way for me to remove the existing
EPUB.  When I have tried to add my own EPUB it doesn't show up in the
list either, and does not get served.

I'm also not sure how I could upload a PDF and an EPUB at the same time.

Just FYI, I have found that the only way to upload an item that works
consistently for me is to use the Flash method in Internet Explorer on
Windows.  When I have used the non-Flash method in Linux the file gets
uploaded but nothing gets derived.  I ended up posting emails to the
support address and someone there ran the derive jobs for me.  I went
through this several times before giving up and using IE on Windows
instead, which has never failed me.

I'd like to be of use to this project.  If you do a search on
"nicestep" in IA you'll see that I've donated a fair number of books
that I've scanned myself.  I've written a FLOSS Manual (E-Book
Enlightenment) on the subject, and I've created an Activity for Sugar
Labs called Get Internet Archive Books which lets children easily
search the Internet Archive catalog and download books in the
different formats.  I've also donated texts to Project Gutenberg and
PG Canada (most recently The Big Sleep by Raymond Chandler).

If you can explain how to replace a derived EPUB with a corrected one
by hand I'll document the process in my book.  I'll also do testing
for Booki (I have an installation in my home office) and make myself
useful in any other way I can.

Thanks,

James Simmons


On Sat, Jan 15, 2011 at 4:27 PM, raj kumar <rkumar at archive.org> wrote:
> Hi!
>
> James, if you upload a pdf and epub to the Internet Archive at the same time, we won't generate an epub, but will serve the uploaded one instead. Additionally, as an item owner, you can remove the generated epub and replace it with your own. If this doesn't work for you, I can help you.
>
> As Adam says, we are hoping to get first-class booki integration in the near future :)
>
> Also, it would be nice if the IA deriver used the text in the PDF to generate the djvu.xml file if possible, instead of running OCR. The archive.org book scanning process has traditionally been image-based, but we hope to change this in the future.
>
> -raj
>
> On Jan 15, 2011, at 3:20 AM, adam wrote:
>
>> So, the thing is, James, you are talking about the future :)
>>
>> We have a bit of a problem here at the Booki dev team. It's a question of
>> capacity. The scenario you suggest is _exactly_ what the Internet
>> Archive wants to do with Booki. We have discussed this with them at
>> length and they want to build Booki into their system so that people can
>> correct EPUBs just as you have done.
>>
>> This is an enormous opportunity for Booki but we have so far failed to
>> realise it because, simply, we don't have the people power. It is
>> _extremely_ frustrating.
>>
>> What we need to do is get a few heads together and solve this, and solve
>> it fast. We have all the code in place. What I think we need to do is
>> this:
>> * find a good server to host an archive instance of booki
>> * work out some basic code to automate the import
>> * install
>> * run test projects
>>
>> If you would like to be a part of this greater picture, I would *love* to
>> work with someone to get this cracking...
>>
>> adam
>>
>>
>>
>>
>> On Fri, 2011-01-14 at 14:51 -0600, James Simmons wrote:
>>> When I submitted "Make Your Own Sugar Activities!" to the Internet
>>> Archive it created books in multiple forms, including an EPUB.
>>> However, the way it makes an EPUB is defective in this case.  It
>>> assumes that you will give it a PDF composed of page images, which it
>>> then does OCR on to create the EPUB.  Doing OCR on a file that already
>>> contains text is not going to give good results, and the resulting
>>> EPUB is useless.  However, Booki can create a really good EPUB.  I
>>> know I could donate that EPUB to the Internet Archive, but what I
>>> really want to do is REPLACE the lousy generated EPUB with my good
>>> one.  I know that Booki was created in part to do this very thing, but
>>> I can't figure out how to make it happen.  The Internet Archive page
>>> does not let you delete the existing EPUB, and when you upload a new
>>> one it does not seem to replace the existing one.
>>>
>>> I'll ask over at IA but I figured that whoever developed the function
>>> in Booki would know the answer.
>>>
>>> James Simmons
>>> _______________________________________________
>>> Discuss mailing list
>>> Discuss at lists.flossmanuals.net
>>> http://lists.flossmanuals.net/listinfo.cgi/discuss-flossmanuals.net
>>
>>
>> _______________________________________________
>> Booki-dev mailing list
>> Booki-dev at lists.flossmanuals.net
>> http://lists.flossmanuals.net/listinfo.cgi/booki-dev-flossmanuals.net
>
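
A minimal sketch of the rule raj describes above: when an EPUB is uploaded
together with a PDF, the uploaded EPUB is served and none is derived. This is
an illustration only, not the Internet Archive's deriver code; the function
name choose_epub and its return shape are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def choose_epub(uploaded: Iterable[str]) -> Tuple[Optional[str], bool]:
    """Return (epub_to_serve, should_derive) for a set of uploaded file names.

    Hypothetical helper: it models the behaviour described in the thread,
    not any real Internet Archive API.
    """
    names = list(uploaded)
    epubs = [n for n in names if n.lower().endswith(".epub")]
    pdfs = [n for n in names if n.lower().endswith(".pdf")]

    if epubs:
        # An uploaded EPUB wins: serve it and skip derivation.
        return epubs[0], False
    if pdfs:
        # Only a PDF was uploaded, so an EPUB would be derived from it.
        return None, True
    # Nothing to serve and nothing to derive from.
    return None, False


print(choose_epub(["book.pdf", "book.epub"]))
print(choose_epub(["book.pdf"]))
```

The second call corresponds to the case James reports, where only a PDF was
submitted and the EPUB came from the derive step.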
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MYOSA.jpg
Type: image/jpeg
Size: 88481 bytes
Desc: not available
URL: <http://lists.flossmanuals.net/pipermail/booki-dev-flossmanuals.net/attachments/20110117/f0e4ba38/attachment-0002.jpg>

