[FM Discuss] Manually creating diffs for translation updates

Daniel James daniel.james at sourcefabric.org
Wed Feb 1 04:11:31 PST 2012


Hi FM'ers,

When updating translations of a manual, it is very useful to have
highlighted diffs between the latest and the previous versions of the
source-language text (we have a feature request ticket open for that).

In the meantime, here is a manual work-around - comments welcome :-)

1. Grab the individual pages from both versions. (I first tried it with
the '_all' chapters on one page, but the diff was very hard to read.) For
example:

wget -r http://en.flossmanuals.net/airtime/
wget -r http://en.flossmanuals.net/airtime-en-2-0/

2. Delete all the files that we don't care about for diff purposes:

rm -r en.flossmanuals.net/_booki/
rm -r en.flossmanuals.net/widgets/
rm -r en.flossmanuals.net/airtime/_*
rm -r en.flossmanuals.net/airtime-en-2-0/_*
rm -r en.flossmanuals.net/airtime/index.html
rm -r en.flossmanuals.net/airtime-en-2-0/index.html

3. (Optionally) Delete any old chapters with chapter number prefixes:

rm -r en.flossmanuals.net/airtime/ch0*
rm -r en.flossmanuals.net/airtime-en-2-0/ch0*

4. Change into the old version directory and diff the chapters:

cd en.flossmanuals.net/airtime
for filename in *; do diff "$filename" "../airtime-en-2-0/$filename" > "$filename-diff.txt"; done

5. Check the output for chapters which don't exist in the new version,
such as:

diff: ../airtime-en-2-0/setting-the-time: No such file or directory

6. Perform the diff the other way round to check for chapters that did
not exist in the old version, or have been renamed:

cd ../airtime-en-2-0/
for filename in *; do diff "$filename" "../airtime/$filename" > "$filename-diff.txt"; done

Missing chapters are reported in the same way, for example:

diff: ../airtime/setting-the-server-time: No such file or directory
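
Steps 5 and 6 rely on spotting diff's error messages as they scroll past. As a rough sketch, the same check could be done by comparing file names directly. The function name only_in is made up for illustration, and it assumes the generated files all end in -diff.txt so that they can be skipped:

```shell
#!/bin/sh
# Hypothetical helper: print the names of files that exist in the first
# directory but not in the second. Generated *-diff.txt files are skipped.
only_in() {
    for path in "$1"/*; do
        [ -e "$path" ] || continue          # directory was empty, glob did not match
        name=${path##*/}                    # strip the directory part
        case "$name" in
            *-diff.txt) continue ;;         # ignore output of earlier diff runs
        esac
        [ -e "$2/$name" ] || printf '%s\n' "$name"
    done
}
```

Run from within en.flossmanuals.net, "only_in airtime airtime-en-2-0" should list the chapters that were removed or renamed in the new version, and "only_in airtime-en-2-0 airtime" the ones that are new.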

Each version directory now contains files called <chaptername>-diff.txt.
Ideally this diff would be done on the body text of the chapter only,
rather than the whole HTML file - maybe there's a way to script that with
curl.
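
One possible way to limit the diff to the body, sketched here with sed on the files already fetched by wget rather than with curl. This is untested against the real pages: it assumes the <body> and </body> tags sit on separate lines, and the function names body_only and diff_bodies are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the lines from the opening <body> tag through
# the closing </body> tag. Assumes the two tags are on different lines.
body_only() {
    sed -n '/<body/,/<\/body>/p' "$1"
}

# Hypothetical helper: diff only the body portions of two HTML files.
# Returns diff's exit status (0 = same, 1 = different).
diff_bodies() {
    old_tmp=$(mktemp)
    new_tmp=$(mktemp)
    body_only "$1" > "$old_tmp"
    body_only "$2" > "$new_tmp"
    status=0
    diff "$old_tmp" "$new_tmp" || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$status"
}
```

In the loop from step 4, the diff call could then be swapped for diff_bodies "$filename" "../airtime-en-2-0/$filename". Navigation or sidebar markup inside the body would still show up in the result.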

Cheers!

Daniel


