What is this WP/PWP/EPUB4 hydra?

In fact, it is not a multi-headed creature but a nested set of specifications, like a Russian doll.

A vision paper has been written by the Interest Group and I’ll just summarize the key concepts here:

WP stands for Web Publication

A Web Publication is “a collection of one or more constituent resources, organized together in a uniquely identifiable grouping that may be presented using standard Open Web Platform technologies”.

In other words, a Web Publication is a closed set of web resources (HTML pages with CSS styles and JavaScript, images, …), plus what is necessary to express the logical ordering of these resources, plus metadata, including a global identifier for the Publication. The constituents of a Web Publication can be fetched from a web server over HTTP(S).

A structured manifest is the most efficient way to represent the metadata and constituents of a Web Publication. The Readium community has developed such a manifest format using JSON, as part of the Readium-2 project; this work has been proposed to the W3C Publishing Working Group as a blueprint for the W3C Web Publication Manifest. Another W3C Working Group is working on a similar manifest for Web Apps, and a convergence between the Web Publication Manifest and the Web App Manifest will be studied.
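To make the idea of a structured manifest more concrete, here is a minimal sketch in Python. It is not the Readium-2 manifest nor any W3C draft: every key name (metadata, reading_order, resources), every file name and the identifier value are assumptions chosen for illustration. It only shows the three ingredients described above: metadata with a global identifier, a logical ordering, and the remaining constituent resources.

```python
import json

# Hypothetical manifest: all key names and values below are illustrative
# assumptions, not taken from the Readium-2 or W3C manifest formats.
manifest = {
    "metadata": {
        "identifier": "urn:uuid:00000000-0000-0000-0000-000000000000",
        "title": "Example Publication",
        "language": "en",
    },
    # Logical ordering of the primary resources.
    "reading_order": [
        {"href": "chapter1.html", "type": "text/html"},
        {"href": "chapter2.html", "type": "text/html"},
    ],
    # Other constituents needed to render the publication.
    "resources": [
        {"href": "style.css", "type": "text/css"},
        {"href": "cover.jpg", "type": "image/jpeg"},
    ],
}


def constituent_hrefs(manifest):
    """Return every constituent resource, reading order first."""
    items = manifest["reading_order"] + manifest["resources"]
    return [item["href"] for item in items]


def next_in_reading_order(manifest, href):
    """Return the href that follows `href`, or None at the end.

    Raises ValueError if `href` is not part of the reading order.
    """
    order = [item["href"] for item in manifest["reading_order"]]
    index = order.index(href)
    return order[index + 1] if index + 1 < len(order) else None


# The manifest would travel as JSON alongside the resources it describes.
serialized = json.dumps(manifest, indent=2)
```

With such a structure, a reading system could list everything it needs to fetch with constituent_hrefs(manifest) and move from one chapter to the next with next_in_reading_order(manifest, "chapter1.html").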

A Web browser becomes a “reading system” in two cases: the user may download a Web Application (e.g. Readium Cloud Reader) which is ready for Web Publications; or the browser itself may have been adapted to handle such Publications. In the first case, one can imagine that a Web Application (described by a Web App Manifest) references a Web Publication (described by a Web Publication Manifest). This would constitute a clever combination of the two technologies.

PWP stands for Packaged Web Publication

Streamed access to Web Publications is great, but:

  • The Publishing Industry needs a B2B interchange format for publications; this is what EPUB was all about initially.
  • Users want to be able to store Publications on a hard drive, transfer them on a USB key, and side-load them onto their preferred e-ink reader.

Therefore, it is necessary to define a file format for Packaged Web Publications. The constituents of a Web Publication are then “packaged” as a single file for easy storage and transfer.

Because the W3C promotes the reuse of existing W3C specifications, the classic ZIP format used for EPUB 2 and 3 will be challenged by the package format that a W3C Working Group drafted in 2015 as Packaging on the Web, and that a group of people is still developing as Web Package. It will be the responsibility of the Publishing WG to choose one solution or the other.
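As an illustration of the ZIP option only, the sketch below packs a manifest and its resources into a single archive using Python's standard zipfile module. The function names, the manifest.json entry name and the sample content are assumptions for the example; the actual PWP container format and layout had not been decided at the time of writing, and this sketch ignores the specific container rules that EPUB adds on top of ZIP.

```python
import io
import json
import zipfile


def package_publication(manifest, resources):
    """Pack a manifest and its resources into one ZIP archive (as bytes).

    `resources` maps each path inside the package to its content as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # "manifest.json" is a hypothetical entry name, not a specified one.
        archive.writestr("manifest.json", json.dumps(manifest))
        for path, content in resources.items():
            archive.writestr(path, content)
    return buffer.getvalue()


def list_package(data):
    """Return the entry names stored in a packaged publication."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


def read_manifest(data):
    """Extract and parse the manifest from a packaged publication."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return json.loads(archive.read("manifest.json"))


# Illustrative content only.
example_manifest = {
    "metadata": {"title": "Example Publication"},
    "reading_order": [{"href": "chapter1.html"}],
}
example_package = package_publication(
    example_manifest,
    {"chapter1.html": b"<html><body>Chapter 1</body></html>"},
)
```

The resulting bytes can be written to a single file, stored on a drive or side-loaded onto a device, and the manifest can be recovered from it with read_manifest(example_package).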

As of June 2017, several participants in the new WG believe that PWP should be a concrete implementation, which defines a container format and a specific file format, plus detailed specifications; .pwp files would therefore be exchanged in many B2B or B2C use cases, e.g. for business reports. Other participants believe that PWP should be an abstract specification, which defines blueprints for concrete implementations like EPUB 4, but does not define a complete interchange format. The position of EDRLab is that PWP should define a simple but complete interchange format, which should be marketed as EPUB 4, so that end users never face two almost identical formats (.pwp and .epub). The Interest Group did not resolve this issue, so the Business Group and Working Group will have this responsibility.

EPUB 4 will be a profile of PWP

EPUB 3 was created in 2011, but so far it has not replaced EPUB 2 on most ebook distribution channels.

The WG charter states that EPUB 4 will be a profile of PWP, i.e. a specialization of PWP, with some additional features specific to the publishing industry (if any). EPUB 4 should be the ultimate interchange format for ebooks and other kinds of publications. It will keep most features of EPUB 3 (if not all) and will make use of HTML5, CSS 3, JavaScript, media overlays, etc.

With some care and duplication of internal structures, it will be possible for a publisher to release EPUB files simultaneously compatible with versions 2, 3 and 4 of the format.

Changes to this internal plumbing will not make much difference for publishers of simple ebooks, and round-trip transformation between EPUB 3 (or EPUB 2) and EPUB 4 will be made available by the Readium community.

But EPUB 4 would be of little interest to publishers and users if it were only a matter of plumbing. EDRLab will therefore push two innovations:

  • A solution for Web comics (and manga); an internal EDRLab Working Group was created in June 2017 to prepare proposals to the W3C for the corresponding concepts and structures; this will include page transitions and much more.
  • A solution for audiobooks, which are currently never published using EPUB; an internal EDRLab Working Group was also created in June 2017 on this subject.

There may be other profiles of PWP

Other profiles of PWP may be created by different companies. This option is currently (June 2017) promoted only by Adobe Systems, so that they can build their “Next Generation PDF” on PWP. From the public information available, one can imagine this Adobe format as a large package containing a manifest, the HTML/CSS/JS resources of a Web Publication, plus a set of PDF 2.0 documents, each optimized for a specific screen resolution.

Conclusion

As of June 2017, the Publishing Working Group has just begun its work on these three specifications. Currently, no representative of the browser vendors has joined the group. This must be addressed quickly, as issues like a clean pagination mechanism (CSS Fragmentation?) and a great layout both depend on the integration of paged content in multiple browsers.

Web standardization should be agile and based on software prototypes. We hope that the developments already made by the Readium-2 community will foster a rapid pace of development for Web Publications and the EPUB 4 format.

Copyright © 2016 EDRLab.