eBook readers support many formats, but their software and hardware are not very powerful. It is highly recommended to use tagged formats, since scrolling might not work as desired.

eBook readers are slow and are made for reading a document sequentially and at a leisurely pace. They are not (yet) suitable for browsing and searching through all kinds of files.

To get free eBooks see: http://www.gutenberg.org/wiki/Main_Page


Epub from http://idpf.org/about-us is the recommended open format for eBook readers. The specifications are freely available, however not in epub format, so I ended up reading the epub spec on my desktop PC instead of on my ebook reader. Maybe the epub authors were not able to write epub? Epub is a container format that can hold different items. You can unzip an epub file (you may need to rename it to *.zip first) to get a directory with the items described in the following sections.
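The container can also be inspected without renaming or unpacking it. The following is a minimal Python sketch using the standard zipfile module; the function name list_epub_entries and the file name book.epub in the comment are made up for this illustration.

```python
import zipfile


def list_epub_entries(path):
    """Return the names of all items stored in the epub (ZIP) container."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Example call, assuming a file book.epub exists in the current directory:
# for name in list_epub_entries("book.epub"):
#     print(name)
```

The names come back in the order in which they were stored in the archive.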

Epub is specified using three specifications: the Open Publication Structure (OPS), the Open Packaging Format (OPF) and the Open Container Format (OCF).

Open Container Format

This specification http://www.idpf.org/epub/20/spec/OCF_2.0.1_draft.doc describes how the document is prepared as a directory (File System Container or Abstract Container) that is then compressed into a single file (ZIP Container or Physical Container) using zip.

The file system container (in simple words, the directory where the epub is created) contains a file named mimetype with a single line of text, application/epub+zip, that identifies the file as epub.
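As a rough sketch of the packing step, assuming Python and its standard zipfile module: the function name build_epub and its files argument are invented for this illustration. The OCF specification additionally asks for the mimetype entry to be the first one in the archive and to be stored without compression, which the sketch tries to respect.

```python
import zipfile


def build_epub(epub_path, files):
    """Pack files (a dict: archive name -> bytes) into an epub container.

    The mimetype entry is written first and uncompressed, so that a reader
    can recognise the file type from the start of the archive.
    """
    with zipfile.ZipFile(epub_path, "w") as archive:
        archive.writestr(
            "mimetype",
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        # All other items may be compressed.
        for name, data in files.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
```

Writing the entries by hand instead of zipping the whole directory in one go keeps control over the order and the compression of the mimetype entry.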


There is also the mandatory META-INF subdirectory, in which the only mandatory file is container.xml. Its content looks like this:

<?xml version="1.0" encoding='utf-8'?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
    <rootfiles>
        <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
    </rootfiles>
</container>

The important part is the rootfile element (inside rootfiles) with its attribute full-path. It names the subdirectory and the file that lists the contents of the epub and serves as the entry point of the epub document.
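Looking up this entry point can be sketched in Python with the standard library. The function name find_rootfile is an assumption made for this example, and only the first rootfile entry is considered.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by container.xml, as shown in the listing above.
CONTAINER_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_rootfile(epub_path):
    """Return the full-path attribute of the first rootfile in container.xml."""
    with zipfile.ZipFile(epub_path) as archive:
        root = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = root.find("ocf:rootfiles/ocf:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no rootfile element")
    return rootfile.get("full-path")
```

For a container.xml like the one shown above, the function would return OEBPS/content.opf.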

Open Packaging Format

The Open Packaging Format is used to define the contents of an epub file. This file resides in the directory defined in the OCF data (container.xml). A good name for such an XML file is therefore content.opf. For details see: http://www.idpf.org/epub/20/spec/OPF_2.0.1_draft.htm

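<?xml version='1.0' encoding='UTF-8'?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
  <metadata><!-- title, language, identifier --></metadata>
  <manifest><!-- one item per file --></manifest>
  <spine toc="ncx"><!-- one itemref per file, in reading order --></spine>
</package>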

Metadata follows Dublin Core (https://en.wikipedia.org/wiki/Dublin_Core) and describes the document itself. It is used to find and classify the book and for everything else apart from what is seen when reading it. The mandatory metadata are the title, the language and a unique identifier, which can be of different kinds, such as an ISBN number or a UUID (see: https://en.wikipedia.org/wiki/UUID). Using Python, a UUID can easily be created from the time and computer info.

The manifest lists all files (html, css, jpeg, ncx, ...) that are part of the epub, together with their location. One of the items inside the manifest is:
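Creating such an identifier could look as follows. This is a small sketch with the standard uuid module; the variable name book_id is chosen freely, and writing the identifier in its URN form is just one common convention, not a requirement stated here.

```python
import uuid

# uuid1() combines the current time with the host's network address
# (or a random node id when no address can be determined).
book_id = uuid.uuid1()

# URN form of the identifier, starting with "urn:uuid:".
print(book_id.urn)
```

If the host address should not leak into the book, uuid.uuid4() creates a purely random identifier instead.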

 <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>

The toc attribute of the spine element refers to the id of this item, which in turn points to the file toc.ncx. The spine element contains a list of ids defined in the manifest element and thereby defines the sequence in which the individual files form the document. The toc.ncx file is another XML file that defines the global structure used for navigation.

The guide element is optional and allows setting a cover, a table of contents and other features.
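How manifest and spine work together can be illustrated with a short Python sketch. It assumes that each spine entry is an itemref element whose idref attribute names a manifest id, as laid out in the OPF specification; the function name reading_order is invented for this example.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def reading_order(opf_data):
    """Return the hrefs of the spine items in reading order."""
    package = ET.fromstring(opf_data)
    # Map every manifest id to the file it points to.
    hrefs = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", OPF_NS)
    }
    # The spine only lists ids; look each one up in the manifest.
    return [
        hrefs[itemref.get("idref")]
        for itemref in package.findall("opf:spine/opf:itemref", OPF_NS)
    ]
```

The order of the items in the manifest does not matter. Only the order of the entries in the spine decides how the book is read.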

Open Publication Structure

The Open Publication Structure http://www.idpf.org/epub/20/spec/OPS_2.0.1_draft.htm has the most pages of all the epub specifications, however it seems that it is not necessary to bother with it. Make sure the files inside the epub are clean xhtml. If you want to know what this means, read the mentioned spec.
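A very coarse first check for clean xhtml is whether a file parses as XML at all. The sketch below, with the invented function name xhtml_problem, tests only this well-formedness and not conformance to the OPS specification. Named entities such as &nbsp; are typically reported as errors by this parser, because no DTD is loaded.

```python
import xml.etree.ElementTree as ET


def xhtml_problem(data):
    """Return None if data parses as XML, else a short error description."""
    try:
        ET.fromstring(data)
    except ET.ParseError as error:
        return str(error)
    return None
```

A file for which this function returns an error message most likely needs cleaning up before it goes into the epub.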


Sigil https://sigil-ebook.com/ is a WYSIWYG ebook editor. It does not force you to put the ebooks in a library; it simply behaves as users like. However, it can modify epub files when it reads them, for example by moving all xhtml files into the directory Text and modifying the content.opf file.

Sigil can open html and text files and save them as epub. Since Sigil is an editor, it allows adding metadata, cover pictures, a table of contents, and so on. It is worth editing the html files once they are imported and converted to epub, especially to delete table cells or frames that surround the actual text.

To convert from other formats such as odt, doc or pdf, it is recommended to first export or convert them to html and then import the html in a second step.

Sigil shows nicely that epub is a container format: when going to code view, html pops up. Next to html, different items can be added, such as a cover page, metadata and css.


Calibre, see https://calibre-ebook.com/, is in Portage but masked. Calibre can convert from CBZ, CBR, CBC, CHM, EPUB, FB2, HTML, LIT, LRF, MOBI, ODT, PDF, PRC, PDB, PML, RB, RTF, SNB, TCR, TXT to CBZ, CBR, CBC, CHM, EPUB, FB2, HTML, LIT, LRF, MOBI, ODT, PDF, PRC, PDB, PML, RB, RTF, SNB, TCR, TXT.

Calibre is proud to offer its own database, into which files can be imported and from which they can be exported. But for users like me, this is a pain. Files need to be added to this database and are therefore copied. Luckily this database is a folder where the files remain accessible. The aim behind this database is to allow Calibre to search better for document attributes such as authors.
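Conversions can also be scripted. The sketch below assumes that Calibre's command line tool ebook-convert is installed and on the PATH, and that it is called with an input file followed by an output file. The function names are invented for this illustration, and no further options of the tool are used.

```python
import shutil
import subprocess


def convert_command(source, target):
    """Build the argument list for Calibre's ebook-convert tool."""
    return ["ebook-convert", source, target]


def convert(source, target):
    """Run the conversion; raises if ebook-convert is missing or fails."""
    if shutil.which("ebook-convert") is None:
        raise FileNotFoundError("ebook-convert not found on PATH")
    subprocess.run(convert_command(source, target), check=True)
```

A call such as convert("book.html", "book.epub") would then let the tool pick the formats from the file extensions.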


FBReader is a simple ebook reader from https://fbreader.org/. However, it also forces you to keep a library of your books.

Unmasking and emerging fbreader might fail, depending on your USE flag settings, when gtk and qt4 are set at the same time. In that case run USE="-qt4" emerge fbreader.


There is also ebook-tools (emerge ebook-tools), which has conversion tools to convert .lit (Microsoft Reader format) files to epub using the clit package from http://www.convertlit.com/. Be aware that using clit might cause legal issues due to copyright laws.
