README

ebukafy

A collection of tools used to convert html or text files into ebooks.

Installation

npm install ebukafy --save

Usage

ebukafy [-hv] command [args ...]

To use ebukafy, pick a command from the Tools section below. The required arguments depend on the command.

Normal use of ebukafy would look something like this:

1. Download the html that you want to convert into an ebook with a tool like wget.
2. Run ebukafy create-skeleton to create a skeleton of an epub directory.
3. Put the downloaded html into the EPUB/text directory of the skeleton.
4. Remove everything that you won't need. Most html files online have a lot of cruft at the beginning of the book, at the end of the book, and between the chapters. You essentially want to keep only the paragraph tags: the actual text of the book.
5. Mark each chapter break with '<!--split-->', then run ebukafy split to split the big html file into multiple files with correct headers.
6. Optionally run ebukafy smarten-quotes to convert any straight quotes to curly/smart quotes.
7. Once all the xhtml files in the EPUB/text directory are as you want them, run ebukafy generate-manifest and ebukafy generate-spine to populate the manifest and spine tags in the content.opf file. The manifest should be good as is, but the spine is generated in alphabetical order and needs to be reordered into reading order (more info in the generate-spine section below). This is also a good time to add any extra metadata you find useful to the metadata tag of the content.opf file.
8. Run ebukafy generate-toc to generate the table of contents from the spine completed in the previous step.
9. Optionally replace the cover image. The one provided in the skeleton is an all-black 1400 x 2100 jpg. To change it, replace the cover in the EPUB/images directory. Keep the name as cover.jpg, or manually edit content.opf if you know what you're doing.
10. When everything is done, run ebukafy build to build the epub folder into an actual epub file.
11. Before reading, run ebukafy epubcheck to make sure the epub conforms to the epub specification, and fix any errors it reports.
12. To read on a Kobo or Kindle, use a tool like Calibre to convert the book to the appropriate format (kepub and azw3 respectively).

Tools

Note that to run any of these tools you need to precede them with ebukafy (e.g. if you want to run create-skeleton, you need to run ebukafy create-skeleton).

`ebukafy build`

NAME
     build -- builds the epub directory into an epub file

SYNOPSIS
     build [-h] [-o output_file] epub_directory

DESCRIPTION
     This tool builds the epub directory into an actual epub file. Ignoring
     a few important details, this is essentially done by zipping up 
     the folder. Additionally, this updates the mandatory 'dcterms:modified'
     value in the content.opf file.

     The options are as follows:

     -h       Display usage statement. 
 
     -o       Optional. Name the output epub something other than
              'output.epub', which is the default.

EXAMPLES
     Note that the 'epub directory' is actually the parent directory of the
     'EPUB' directory. In other words, the command should be run in this
     directory:
     
     epub-directory/
        mimetype
        META-INF/
          container.xml
        EPUB/
          [etc]

     To build just go into 'epub-directory' and run 
     
          build -o homer_the-odyssey.epub .
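
The description above does not spell out the "important details". One that the EPUB container format does require is that the 'mimetype' file is the first entry in the zip archive and is stored uncompressed. The following is a minimal Python sketch of that zipping step, not ebukafy's actual implementation. The names zip_epub and modified_timestamp are hypothetical, and the sketch only produces a timestamp in the format 'dcterms:modified' expects; it does not edit content.opf.

```python
import os
import zipfile
from datetime import datetime, timezone


def modified_timestamp():
    """Return the current UTC time in the CCYY-MM-DDThh:mm:ssZ form."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def zip_epub(epub_directory, output_file="output.epub"):
    """Zip an epub directory, writing 'mimetype' first and uncompressed."""
    output_path = os.path.abspath(output_file)
    with zipfile.ZipFile(output_path, "w") as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.write(
            os.path.join(epub_directory, "mimetype"),
            arcname="mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        for root, dirs, files in os.walk(epub_directory):
            dirs.sort()  # visit subdirectories in a stable order
            for name in sorted(files):
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, epub_directory).replace(os.sep, "/")
                # Skip mimetype (already written) and the output file itself.
                if arcname == "mimetype" or os.path.abspath(path) == output_path:
                    continue
                archive.write(path, arcname=arcname, compress_type=zipfile.ZIP_DEFLATED)
    return output_path
```

Calling zip_epub('.', 'homer_the-odyssey.epub') from inside 'epub-directory' would mirror the example above, under the assumptions stated.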

`ebukafy create-skeleton`

NAME
     create-skeleton -- creates a skeleton epub directory

SYNOPSIS
     create-skeleton [-h] [-a author] [-u uid] -l language -t title 
     target_directory

DESCRIPTION
     This tool creates a new skeleton of an epub directory at target_directory.
     The contents of this skeleton folder can be found at the GitHub 
     repository. According to the epub standard the uid, title, and language
     are all required. Including the author is just good practice.

     The options are as follows:

     -h       Display usage statement. 
 
     -a       Optional. Set the author of the book. 

     -u       Optional. Set uid manually. If not set, it will be
              randomly generated. A uid is mandatory according to the
              epub standard.

     -l       Set the language of the book. This should be the IANA language
              tag. If you don't know the tag for a language look up 'IANA 
              Language Subtag Registry'.

     -t       Set the title of the book.

EXAMPLES
     To create an epub skeleton in the current working directory you can run
     
          create-skeleton -a Homer -l grc -t 'The Odyssey' .

     Note the quotes around the title, which are needed because it
     contains a space.
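
If you would rather pass your own value to -u than rely on the generated one, a random UUID is a common choice for an epub identifier. The format ebukafy generates is not documented here, so the 'urn:uuid:' form below is an assumption, and make_uid is a hypothetical helper name.

```python
import uuid


def make_uid():
    """Return a random identifier in the common 'urn:uuid:' form."""
    return "urn:uuid:" + str(uuid.uuid4())


# Print a value that could be passed to create-skeleton with -u.
print(make_uid())
```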

`ebukafy epubcheck`

NAME
     epubcheck -- checks for any errors regarding the epub specification

SYNOPSIS
     epubcheck [-h] target_epub

DESCRIPTION
     epubcheck is a tool which 'validates the conformance of EPUB 
     publications against the EPUB specifications.' The GitHub repository
     for epubcheck can be found here: https://github.com/w3c/epubcheck
     Note that this tool is written in Java, so you need Java to run it.
     It will output any errors or just say that your epub is up to standard.

     The options are as follows:

     -h       Display usage statement. 

EXAMPLES
     To check if your epub conforms to the epub standards just run
     
          epubcheck homer_the-odyssey.epub

`ebukafy generate-manifest`

NAME
     generate-manifest -- generates the manifest part of the content.opf file

SYNOPSIS
     generate-manifest [-hi] epub_directory

DESCRIPTION
     This tool generates the manifest part of the content.opf file. It goes
     through the text, css, and images folders and the toc.ncx and toc.xhtml
     files and adds them as items between the two manifest tags. Without the
     'i' option it just prints it to stdout. With the 'i' option it
     replaces the manifest in-place.

     The options are as follows:

     -h       Display usage statement. 

     -i       Instead of outputting the manifest into stdout, it overwrites
              the manifest in the actual content.opf file, in-place.

EXAMPLES
     Note that the 'epub directory' is actually the parent directory of the
     'EPUB' directory. In other words, the command should be run in this
     directory:
     
     epub-directory/
        mimetype
        META-INF/
          container.xml
        EPUB/
          [etc]

     To generate the manifest just go into 'epub-directory' and run 
     
          generate-manifest -i .

`ebukafy generate-spine`

NAME
     generate-spine -- generates the spine part of the content.opf file

SYNOPSIS
     generate-spine [-hi] epub_directory

DESCRIPTION
     This tool generates the spine part of the content.opf file. It does
     this simply by going through the text folder. Without the 'i' option
     it just prints it to stdout. With the 'i' option it replaces 
     the spine in-place.

     NOTE: You will most likely want to manually reorder the spine. By 
     default it will order the spine in alphabetical order. So, for example,
     'chapter-10.xhtml' will come before 'chapter-2.xhtml'. However, 
     the spine needs to be in the order that the book is supposed to be read,
     so 'chapter-2.xhtml' should come before 'chapter-10.xhtml'.

     The options are as follows:

     -h       Display usage statement. 

     -i       Instead of outputting the spine into stdout, it overwrites
              the spine in the actual content.opf file, in-place.

EXAMPLES
     Note that the 'epub directory' is actually the parent directory of the
     'EPUB' directory. In other words, the command should be run in this
     directory:
     
     epub-directory/
        mimetype
        META-INF/
          container.xml
        EPUB/
          [etc]

     To generate the spine just go into 'epub-directory' and run 
     
          generate-spine -i .

     And don't forget to manually reorder it into reading order!
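
To see why alphabetical order goes wrong, and to work out the order you want before editing the spine by hand, a numeric-aware sort helps. This is a small Python sketch and not part of ebukafy; reading_order_key is a hypothetical helper, and it only handles names that carry their position as a number, so front matter and similar files still need to be placed manually.

```python
import re


def reading_order_key(filename):
    """Split a name into text and integer chunks so 2 sorts before 10."""
    return [
        int(chunk) if chunk.isdigit() else chunk.lower()
        for chunk in re.split(r"(\d+)", filename)
    ]


files = ["chapter-10.xhtml", "chapter-2.xhtml", "chapter-1.xhtml"]
print(sorted(files))                         # alphabetical: 1, 10, 2
print(sorted(files, key=reading_order_key))  # numeric-aware: 1, 2, 10
```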

`ebukafy generate-toc`

NAME
     generate-toc -- generates the toc.ncx and toc.xhtml files

SYNOPSIS
     generate-toc [-hio] epub_directory

DESCRIPTION
     This tool generates the toc.ncx and toc.xhtml files. Without the 'i'
     option it just prints them to stdout. With the 'i' option it replaces
     the files in-place. This command reads the <spine> tag in the
     content.opf file, so make sure to run generate-spine before generating
     the table of contents.

     NOTE: The titles of each chapter are set as whatever is in the <title>
     tag. If the <title> tag is not present, the toc will simply say
     "MISSING TITLE".

     The options are as follows:

     -h       Display usage statement. 

     -i       Instead of outputting the toc into stdout, it overwrites
              the tocs in the toc.xhtml and toc.ncx files, in-place.

EXAMPLES
     Note that the 'epub directory' is actually the parent directory of the
     'EPUB' directory. In other words, the command should be run in this
     directory:
     
     epub-directory/
        mimetype
        META-INF/
          container.xml
        EPUB/
          [etc]

     You can generate the toc files like so:
     
          generate-toc -i .

     And make sure that the spine in the content.opf file is complete
     before you run generate-toc!
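
Since a chapter without a <title> tag ends up as "MISSING TITLE" in the toc, it can be worth checking your files first. The Python sketch below illustrates that fallback rule with a simple regular expression. It is an approximation for checking your own files, not the parser generate-toc uses, and chapter_title is a hypothetical name.

```python
import re

TITLE_PATTERN = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def chapter_title(xhtml_text):
    """Return the text of the first <title> tag, or the fallback label."""
    match = TITLE_PATTERN.search(xhtml_text)
    if match is None:
        return "MISSING TITLE"
    return match.group(1).strip()


print(chapter_title("<html><head><title>Book I</title></head></html>"))
print(chapter_title("<html><body><p>No head here</p></body></html>"))
```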

`ebukafy smarten-quotes`

NAME
     smarten-quotes -- convert straight quotes to smart quotes

SYNOPSIS
     smarten-quotes [-h] target_file ...

DESCRIPTION
     Online transcriptions often use straight quotes since they're easier
     to type on the keyboard. However, books look bad if you just use
     straight quotes so this tool, for purposes of better typography, allows 
     you to convert them to smart quotes (aka curly quotes). Note that this
     tool looks specifically between <p> tags, so if you have any quotes
     outside of a <p> tag that you want to smarten, you will have to do so
     manually.
       Also, if your book is in a language with a different quote system,
     you can first use this tool to convert them to the standard English
     ones and then do a search and replace to your language's quotes
     („…“, «…», »…«, etc.) using a tool like sed.

     The options are as follows:

     -h       Display usage statement. 

EXAMPLES
     Say you have a bunch of files in the EPUB/text/ folder. You can use
     this tool like so:

          ebukafy smarten-quotes EPUB/text/*
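
The search-and-replace step for other quote systems can also be done in one pass. The Python sketch below maps English double curly quotes to German-style ones; it is not part of ebukafy, and localize_quotes and GERMAN_QUOTES are hypothetical names. It leaves single quotes alone, because the closing single curly quote is also used as an apostrophe.

```python
# Map English double curly quotes to German-style ones:
# opening U+201C becomes U+201E, closing U+201D becomes U+201C.
GERMAN_QUOTES = str.maketrans({"\u201c": "\u201e", "\u201d": "\u201c"})


def localize_quotes(text, table=GERMAN_QUOTES):
    """Replace each quote character according to the translation table."""
    return text.translate(table)


print(localize_quotes("<p>\u201cHallo\u201d, sagte er.</p>"))
```

Because str.translate looks at each character once, the two replacements cannot interfere with each other. With chained sed substitutions the order matters: replacing the closing quote first would turn it into an opening quote that the second substitution then rewrites again.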

`ebukafy split`

NAME
     split -- split file at every '<!--split-->' into separate files

SYNOPSIS
     split [-h] target_file

DESCRIPTION
     This tool splits a file at every '<!--split-->' into chapter-n.xhtml, 
     where n is just an index starting at 1. Oftentimes you start making
     your ebook with a gigantic text/html file. It is much better practice 
     to split your file into separate chapters. This tool will also put in
     the appropriate header information into each chapter-n.xhtml file
     such as the chapter's roman numeral.

     The options are as follows:

     -h       Display usage statement. 

EXAMPLES
     Say you have a file like this, with the splits put in like so:

          foofoo
          <!--split-->
          barbar
          <!--split-->
          bazbaz

     This will result in three files, chapter-1.xhtml to chapter-3.xhtml.
     Besides the header, the content of chapter-1.xhtml will be 'foofoo',
     the content of chapter-2.xhtml will be 'barbar', and the content of
     chapter-3.xhtml will be 'bazbaz'. The best way to see exactly what
     this does is to try it with an example such as the one above.
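
The splitting and numbering described above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: it leaves out the xhtml header that split writes, and the names split_chapters and to_roman are hypothetical.

```python
SPLIT_MARKER = "<!--split-->"

ROMAN_NUMERALS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number):
    """Convert a positive integer to a roman numeral."""
    result = ""
    for value, symbol in ROMAN_NUMERALS:
        while number >= value:
            result += symbol
            number -= value
    return result


def split_chapters(text):
    """Return (filename, roman numeral, body) for each piece of the text."""
    pieces = [piece.strip() for piece in text.split(SPLIT_MARKER)]
    return [
        (f"chapter-{n}.xhtml", to_roman(n), piece)
        for n, piece in enumerate(pieces, start=1)
    ]


sample = "foofoo\n<!--split-->\nbarbar\n<!--split-->\nbazbaz\n"
for filename, numeral, body in split_chapters(sample):
    print(filename, numeral, body)
```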
