Technical Information

The initial input for this project was a set of optical character recognition (OCR) text files derived from scans of the book available on the Internet Archive (archive.org). These text files contain tens of thousands of errors introduced by the OCR process.

Nearly two years of part-time editorial work were required to fix the mistakes in the OCR files by hand. The text was compared against photographs of the original pages to ensure accuracy and recover the original wording. The edited material was saved as ASCII text files using a markup format called “reStructuredText”.

This reStructuredText markup was then used as the input to a program named Sphinx, which converts reStructuredText files into HTML web pages. The reStructuredText source files can be viewed by clicking the “Show Source” link in the left sidebar of each page. One benefit of the conversion by Sphinx is that lists of children are automatically expanded and numbered in the correct sequence. Sphinx has built-in “theme” support, and the current implementation of these pages uses the “default” theme.
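For readers unfamiliar with Sphinx, the setup described above might correspond to a configuration file along the following lines. This is only a minimal sketch, not the project's actual `conf.py`: the project name is a hypothetical placeholder, and the two source-link settings are assumed from the presence of the “Show Source” link rather than taken from the real configuration.

```python
# conf.py -- illustrative Sphinx configuration (a sketch, not the actual file)

# Hypothetical placeholder; the real project name is not stated in the text.
project = "Example Family History"

# Source files are reStructuredText.
source_suffix = ".rst"

# The theme named in the text. Later Sphinx releases renamed this
# theme to "classic", so the exact value may depend on the version used.
html_theme = "default"

# Assumed settings: copy the reStructuredText sources into the build
# and show a "Show Source" link in the sidebar of each page.
html_copy_source = True
html_show_sourcelink = True
```

With a file like this in the source directory, running `sphinx-build -b html . _build/html` would typically produce the HTML pages.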