Friday, September 02, 2011

Polish the EPUB

Polish the EPUB: In EPUB documents, you cannot detect some problems with normal validation methods. As long as the document validates as well-formed XML and follows the EPUB standard, it can appear to be correct but might not read correctly in an e-Reader. Examples include broken paragraphs, bad page numbering, and spelling errors caused by OCR scanning. But you can view and correct errors using two methods: with the EPUB editor Sigil and with PHP in combination with SimpleXML and the Enchant libraries. Regular expressions provide the key to efficient processing.