A collection of tools used to convert html or text files into ebooks.
npm install ebukafy --save
ebukafy [-hv] command [args ...]
For `command`, see the list of commands/tools in the Tools section below. The required arguments depend on the command.
Normal use of `ebukafy` would look something like this:
- Download the html that you want to convert into an ebook
- Run `ebukafy create-skeleton` to create a skeleton of an epub directory
- Put the downloaded html into the `EPUB/text` directory of the skeleton
- Most html files online have a lot of cruft at the beginning of the book, at the end of the book, and between the chapters, so remove everything that you won't need. You essentially just want to keep the paragraph tags: the actual text of the book
- Run `ebukafy split` to split the big html file into multiple files with correct headers
- You can use `ebukafy smarten-quotes` to convert any straight quotes to curly/smart quotes
- After all the xhtml files in the `EPUB/text` directory are as you want them to be, run `ebukafy generate-spine` to populate the manifest and spine tags in the `content.opf` file. While the manifest should be good as is, the spine needs to be reordered into reading order rather than the default alphabetical order (more info in the generate-spine README). This is also a good time to add any extra metadata you might find useful to the metadata tag of the `content.opf` file
- Run `ebukafy generate-toc` to generate the table of contents from the spine completed in the previous step
- Optionally replace the cover image. The one provided in the skeleton is just an all-black 1400 x 2100 jpg. To change it, replace the cover in the `EPUB/images` directory. Keep the name as `cover.jpg`, or manually edit `content.opf` if you know what you're doing
- After everything is done, run `ebukafy build` to build this epub folder into an actual epub file
- Before reading, run `ebukafy epubcheck` to make sure everything within the epub conforms to the epub specification, and fix any errors it reports
- To read on a Kobo or Kindle, use a tool like Calibre to convert the book to the appropriate format (kepub and azw3 respectively)
Note that to run any of these tools you need to precede them with `ebukafy` (e.g. if you want to run `create-skeleton`, you need to run `ebukafy create-skeleton`).
NAME
    build -- builds the epub directory into an epub file

SYNOPSIS
    build [-h] [-o output_file] epub_directory

DESCRIPTION
    This tool builds the epub directory into an actual epub file. Ignoring a few important details, this is essentially done by zipping up the folder. Additionally, this updates the mandatory 'dcterms:modified' value in the content.opf file.

    The options are as follows:

    -h      Display usage statement.
    -o      Optional. Set the name of the output epub file. The default is 'output.epub'.

EXAMPLES
    Note that the 'epub directory' is actually the parent directory of the 'EPUB' directory. In other words, the command should be run in this directory:

        epub-directory/
            mimetype
            META-INF/
                container.xml
            EPUB/
                [etc]

    To build just go into 'epub-directory' and run

        build -o homer_the-odyssey.epub .
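The "zipping up the folder" step can be sketched in Python. This is not ebukafy's implementation: `zip_epub` is a hypothetical helper, and the one detail it handles comes from the EPUB container rules rather than from ebukafy's source, namely that the `mimetype` file must be the first entry in the archive and must be stored uncompressed. Updating 'dcterms:modified' is left out.

```python
import os
import zipfile


def zip_epub(epub_directory: str, output_file: str = "output.epub") -> None:
    """Zip epub_directory into output_file, writing 'mimetype' first and uncompressed."""
    output_path = os.path.abspath(output_file)
    with zipfile.ZipFile(output_file, "w") as archive:
        # EPUB containers expect 'mimetype' as the first entry, stored as-is.
        archive.write(
            os.path.join(epub_directory, "mimetype"),
            arcname="mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        for root, dirs, files in os.walk(epub_directory):
            dirs.sort()  # keep the archive order deterministic
            for name in sorted(files):
                path = os.path.join(root, name)
                if os.path.abspath(path) == output_path:
                    continue  # do not zip the archive into itself
                arcname = os.path.relpath(path, epub_directory).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                archive.write(path, arcname=arcname, compress_type=zipfile.ZIP_DEFLATED)
```

Calling `zip_epub(".", "homer_the-odyssey.epub")` from inside 'epub-directory' would mirror the example above, minus the metadata update.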
NAME
    create-skeleton -- creates a skeleton epub directory

SYNOPSIS
    create-skeleton [-h] [-a author] [-u uid] -l language -t title target_directory

DESCRIPTION
    This tool creates a new skeleton of an epub directory at target_directory. The contents of this skeleton folder can be found at the GitHub repository. According to the epub standard the uid, title, and language are all required. Including the author is just good practice.

    The options are as follows:

    -h      Display usage statement.
    -a      Optional. Set the author of the book.
    -u      Optional. Set uid manually. If not set, it will be randomly generated. A uid is mandatory according to the epub standard.
    -l      Set the language of the book. This should be the IANA language tag. If you don't know the tag for a language look up 'IANA Language Subtag Registry'.
    -t      Set the title of the book.

EXAMPLES
    To create an epub skeleton in the current working directory you can run

        create-skeleton -a Homer -l grc -t 'The Odyssey' .

    Note the '' around the title since it's two words.
NAME
    epubcheck -- checks for any errors regarding the epub specification

SYNOPSIS
    epubcheck [-h] target_epub

DESCRIPTION
    epubcheck is a tool which 'validates the conformance of EPUB publications against the EPUB specifications.' The GitHub repository for epubcheck can be found here: https://github.com/w3c/epubcheck

    Note that this tool is written in Java, so you need Java to run it. It will output any errors or just say that your epub is up to standard.

    The options are as follows:

    -h      Display usage statement.

EXAMPLES
    To check if your epub conforms to the epub standards just run

        epubcheck homer_the-odyssey.epub
NAME
    generate-manifest -- generates the manifest part of the content.opf file

SYNOPSIS
    generate-manifest [-hi] epub_directory

DESCRIPTION
    This tool generates the manifest part of the content.opf file. It goes through the text, css, and images folders and the toc.ncx and toc.xhtml files and adds them as items between the two manifest tags. Without the 'i' option it just prints it to stdout. With the 'i' option it replaces the manifest in-place.

    The options are as follows:

    -h      Display usage statement.
    -i      Instead of outputting the manifest into stdout, it overwrites the manifest in the actual content.opf file, in-place.

EXAMPLES
    Note that the 'epub directory' is actually the parent directory of the 'EPUB' directory. In other words, the command should be run in this directory:

        epub-directory/
            mimetype
            META-INF/
                container.xml
            EPUB/
                [etc]

    To generate the manifest just go into 'epub-directory' and run

        generate-manifest -i .
NAME
    generate-spine -- generates the spine part of the content.opf file

SYNOPSIS
    generate-spine [-hi] epub_directory

DESCRIPTION
    This tool generates the spine part of the content.opf file. It does this simply by going through the text folder. Without the 'i' option it just prints it to stdout. With the 'i' option it replaces the spine in-place.

    NOTE: You will most likely want to manually reorder the spine. By default it will order the spine in alphabetical order. So, for example, 'chapter-10.xhtml' will come before 'chapter-2.xhtml'. However, the spine needs to be in the order that the book is supposed to be read, so 'chapter-2.xhtml' should come before 'chapter-10.xhtml'.

    The options are as follows:

    -h      Display usage statement.
    -i      Instead of outputting the spine into stdout, it overwrites the spine in the actual content.opf file, in-place.

EXAMPLES
    Note that the 'epub directory' is actually the parent directory of the 'EPUB' directory. In other words, the command should be run in this directory:

        epub-directory/
            mimetype
            META-INF/
                container.xml
            EPUB/
                [etc]

    To generate the spine just go into 'epub-directory' and run

        generate-spine -i .

    And don't forget to manually reorder it into reading order!
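The difference between alphabetical order and reading order is easy to see in a few lines of Python. The sketch below is only an illustration of the ordering problem, not something ebukafy does for you: `natural_key` is a hypothetical helper that compares the digits in a file name as numbers. It can help you work out the intended order for numbered chapters, but files such as a preface or an epilogue still have to be placed by hand.

```python
import re


def natural_key(filename: str) -> list:
    """Split a name into text and integer parts so 'chapter-2' sorts before 'chapter-10'."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", filename)
    ]


files = ["chapter-10.xhtml", "chapter-2.xhtml", "chapter-1.xhtml"]

# Alphabetical order, as in the generated spine.
print(sorted(files))
# ['chapter-1.xhtml', 'chapter-10.xhtml', 'chapter-2.xhtml']

# Number-aware order, which matches reading order for numbered chapters.
print(sorted(files, key=natural_key))
# ['chapter-1.xhtml', 'chapter-2.xhtml', 'chapter-10.xhtml']
```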
NAME
    generate-toc -- generates the toc.ncx and toc.xhtml files

SYNOPSIS
    generate-toc [-hio] epub_directory

DESCRIPTION
    This tool generates toc.ncx and toc.xhtml files. Without the 'i' option it just prints it to stdout. With the 'i' option it replaces the files in-place. This command looks at the <spine> tag in the content.opf file so make sure to run generate-spine before generating the table of contents.

    NOTE: The titles of each chapter are set as whatever is in the <title> tag. If the <title> tag is not present, the toc will simply say "MISSING TITLE".

    The options are as follows:

    -h      Display usage statement.
    -i      Instead of outputting the toc into stdout, it overwrites the tocs in the toc.xhtml and toc.ncx files, in-place.

EXAMPLES
    Note that the 'epub directory' is actually the parent directory of the 'EPUB' directory. In other words, the command should be run in this directory:

        epub-directory/
            mimetype
            META-INF/
                container.xml
            EPUB/
                [etc]

    You can generate the toc files like so:

        generate-toc -i .

    And make sure that the spine in the content.opf file is complete before you run generate-toc!
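If you want to check your chapter titles before generating the toc, the title lookup described in the NOTE can be approximated as follows. This is a sketch under assumptions, not ebukafy's code: `chapter_title` is a hypothetical helper, it expects well-formed XHTML in the standard XHTML namespace, and treating an empty `<title>` the same as a missing one is a choice made here rather than documented behaviour.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def chapter_title(xhtml_source: str) -> str:
    """Return the text of the <title> tag, or 'MISSING TITLE' if there is none."""
    root = ET.fromstring(xhtml_source)
    title = root.find(f".//{XHTML_NS}title")
    # Assumption: an empty <title> is reported the same way as an absent one.
    if title is None or not (title.text or "").strip():
        return "MISSING TITLE"
    return title.text.strip()
```

Running this over every file in the text folder and looking for "MISSING TITLE" is a quick way to find chapters that still need a title.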
NAME
    smarten-quotes -- convert straight quotes to smart quotes

SYNOPSIS
    smarten-quotes [-h] target_file ...

DESCRIPTION
    Online transcriptions often use straight quotes since they're easier to type on the keyboard. However, books look bad if you just use straight quotes so this tool, for purposes of better typography, allows you to convert them to smart quotes (aka curly quotes). Note that this tool looks specifically between <p> tags, so if you have any quotes outside of a <p> tag that you want to smarten, you will have to do so manually.

    Also, if your book is in a language with a different quote system you can first use this tool to convert them to the standard English ones and then do a search and replace to your language's quotes („…“, «…», »…«, etc.) using a tool like sed.

    The options are as follows:

    -h      Display usage statement.

EXAMPLES
    Say you have a bunch of text files in the text/ folder. You can use this tool like so:

        ebukafy smarten-quotes EPUB/text/*
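The second step mentioned above, swapping English curly quotes for another language's quotes, can also be done in Python instead of sed. The sketch below is a minimal illustration for German-style double quotes; `to_german_quotes` is a hypothetical helper, and it deliberately leaves single quotes alone because the right single quote (’) doubles as the apostrophe.

```python
# English opening/closing double quotes mapped to German-style ones.
GERMAN_DOUBLE_QUOTES = str.maketrans({
    "\u201c": "\u201e",  # “ becomes „ (opening)
    "\u201d": "\u201c",  # ” becomes “ (closing)
})


def to_german_quotes(text: str) -> str:
    """Replace English curly double quotes with German-style double quotes."""
    # translate() maps every character in one pass, so the two replacements
    # cannot interfere with each other.
    return text.translate(GERMAN_DOUBLE_QUOTES)


print(to_german_quotes("\u201cSing, goddess,\u201d he said."))
# „Sing, goddess,“ he said.
```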
NAME
    split -- split file at every '<!--split-->' into separate files

SYNOPSIS
    split [-h] target_file

DESCRIPTION
    This tool splits a file at every '<!--split-->' into chapter-n.xhtml, where n is just an index starting at 1. Oftentimes you start making your ebook with a gigantic text/html file. It is much better practice to split your file into separate chapters. This tool will also put in the appropriate header information into each chapter-n.xhtml file such as the chapter's roman numeral.

    The options are as follows:

    -h      Display usage statement.

EXAMPLES
    Say you have a file like this and you put in the splits like so

        foofoo
        <!--split-->
        barbar
        <!--split-->
        bazbaz

    This will result in three files, chapter-1.xhtml to chapter-3.xhtml. Besides the header, the content of chapter-1.xhtml will be 'foofoo', the content of chapter-2.xhtml will be 'barbar', and the content of chapter-3.xhtml will be 'bazbaz'. The best way to see exactly what this does is to try it with an example such as the one above.
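The splitting rule itself can be modelled in a few lines of Python. This is a rough sketch of the behaviour described above, not the tool's source: `split_chapters` is a hypothetical helper, it returns the pieces instead of writing files, it omits the header that the real tool adds, and trimming whitespace around each piece is an assumption.

```python
SPLIT_MARKER = "<!--split-->"


def split_chapters(source: str) -> dict:
    """Map chapter-n.xhtml file names (n starting at 1) to the text between markers."""
    parts = [part.strip() for part in source.split(SPLIT_MARKER)]
    return {f"chapter-{n}.xhtml": body for n, body in enumerate(parts, start=1)}


chapters = split_chapters("foofoo\n<!--split-->\nbarbar\n<!--split-->\nbazbaz")
for name, body in chapters.items():
    print(name, "->", body)
# chapter-1.xhtml -> foofoo
# chapter-2.xhtml -> barbar
# chapter-3.xhtml -> bazbaz
```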