The Web is the format, but will books be in HTML?

Aaron Miller, of the Book Glutton, has an interesting post on books, file formats and the web at the TeleRead blog. The comments are also worth reading. Aaron makes the point that both Google and Amazon, two of the web's giants, are heavily committed to books and to putting books straight on the web, page by page (people tend to overlook the extent to which Amazon is doing this in all the hoo-ha about the still-experimental Kindle).

Aaron has some cogent criticisms of the IDPF and its ePub format ("... the IDPF is peddling a format that search engines can’t index, browsers can’t natively open, and experienced web developers can’t figure out.") and he smacks Google ("...one of the ugliest interfaces book readers have ever known") and Amazon ("...despite Amazon’s offering of the Kindle they are still a Web company") just as hard.

But he misses some key points. It is hardly correct to say that Google's Book Search has us reading PDFs. PDFs are available from Google, but with GBS we are reading JPEGs with the help of a database; the PDFs are there purely for the convenience of those who want to print. He also seems to be suggesting that going with the Web means going with texts as HTML. This is just not what Google and Amazon are doing. Google and Amazon get their power and their reach by putting all texts into a database system. This point is perhaps missed in some of the comments. Books are going onto the Web in the cloud, but the literary cloud comprises JPEG droplets, not a shower of HTML.

Aaron also has a rant against ISBNs and 'retrofitting' old problems. But this is to miss the continuing and lasting importance of our print legacy. Google is creating a huge advantage for its Book Search approach by retrofitting the literature of the past. And the ISBN system has its advantages. Google and Amazon know this, and they are both very good at resolving ISBNs to their appropriate target.

Pages, books and ISBNs: these are elements of our print legacy that will still be useful when all literature is primarily digital. Who knows: we may still have the concept of recto and verso. And it will not mean flipping the eBook reader.

Hate to be OUP, CUP and Sage

Dorothy Salo, who pens an insightful blog, Caveat Lector (what a brilliant name for a blog), has the headline "Hate to be Georgia State". She is referring to the controversy surrounding the action taken against that university for blatant copyright infringement. The action has been brought by the two biggest university presses, OUP and CUP, together with Sage, one of the most respected publishers of high-level social science. It is backed, perhaps underwritten, by the AAP.

If you read the formal Complaint for Declaratory Judgement and Injunctive Relief, you may feel a tremor of concern for Georgia State's officers and librarians. As I read the documents prepared by the publishers' lawyers, the case looks pretty strong. But of course that is what lawyers do. They prepare suits that make a strong case. We have not yet heard the Georgia State side of the case. I suspect that the dispute, even if it is won or, more probably, settled out of court, will in the long run be as bad for the publishers as it appears to be for the university. No publisher wants to sue good customers, and Georgia State is certainly a good customer for many publishers. It cannot be a happy sight for two prestigious British university presses to be suing an American state university. Questions will be asked.

This story was broken in the New York Times a week ago, and their reporter elicited an interestingly specific comment from CUP's editorial director, a well-respected publisher:

Frank Smith, editorial director for academic books at Cambridge University Press, said that for electronic use in a course, Cambridge typically charges 17 cents a page for each student, and generally grants permission for use of as much as 20 percent of a book.
If you do the math on 17c per page, per student, you get an astronomical price for a course pack in a popular subject. A thousand-page e-reserve for a course taken by 100 students is going to cost the university $17,000 per annum. This pricing policy (per student, per page, per annum) may once have been appropriate, when universities produced local print anthologies in lieu of buying books, but it is inherently unreasonable for the kind of ambient access that web-based teaching requires. The libraries for the most part already have the books, and publishers absolutely need to find ways of encouraging and facilitating reasonable access to those books and periodicals. Suing the university is a terrible idea, and it is no more sensible to price to the limit on what students might be recommended to read. E-reserves and digital course packs are good ideas, and publishers should not be pricing in such a way as to prevent them from working.
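The arithmetic above can be checked with a minimal sketch. It assumes the simplest reading of the quoted policy: a flat charge per page, per student, per year. The function name and the 300-page book in the second example are hypothetical, chosen only for illustration; the 17 cent rate and the 20 percent cap come from the New York Times quote.

```python
def e_reserve_cost(pages: int, students: int, rate_cents: int = 17) -> float:
    """Annual permissions cost in dollars: pages x students x per-page rate.

    The rate is held in whole cents so the multiplication stays in integers.
    Assumption: a flat rate with no volume discount or cap on the total.
    """
    return pages * students * rate_cents / 100


# The example in the text: a 1,000-page e-reserve for a course of 100 students.
print(e_reserve_cost(1000, 100))  # 17000.0

# Hypothetical: the full 20 percent allowance of a single 300-page book.
max_pages = 300 * 20 // 100  # 60 pages
print(e_reserve_cost(max_pages, 100))  # 1020.0
```

On these assumptions even one book used to its permitted limit costs over a thousand dollars a year for a single course, which is why the total climbs so quickly once a reading list draws on many titles.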

The London Book Fair

Last week we had the London Book Fair, and the book publishing business seemed to be in fine shape. The organisers, Reed Exhibitions, do a pretty good job and the web site is getting better (more usable and useful year by year). But it is a fact that exhibition and conference organisers have not yet done a really good job of putting the key documentation online in an easily searchable and linkable format: why not? Still, there were many signs of digital progress by the publishers themselves. Here are a few random comments I noted in the course of the week (each followed by my riposte):

"The London Book Fair is now the ideal size for a fair in which you can do business. But we wonder what Google are planning to do next." (an American HarperCollins executive). Google will surely make a surprising and bold move in the books market soon. They will not wait for the copyright law cases to reach judgement.

"It is easy for Wiley, they just decide to put out digital editions of all their books. It is not so easy for us, we have to decide which titles to invest in first and how much to invest at this stage." (a middle-sized academic publisher commenting on the way the digital tide is turning). But Wiley are surely right. Making a mistake at this stage is less expensive than not making a move at all.

"I am not convinced that digital editions of our books will sell, Cambridge tell me that their humanities titles are only getting 5% of the revenues from digital sales." (a small to middle-sized publisher of humanities titles). Five per cent sounds to me like take-up. I am not sure that the second-hand comment was very clear about what this was 5% of... It's when the take-up is less than 1% that one has to say that the jury is still out.

This was a show in which Penguin, Macmillan, Random House and other large publishers made big statements about putting many or all of their new books into digital format. One has the impression that digital publishing is now assumed to be a must for all mainstream publishers. The smaller academic publishers and small to medium-sized trade publishers do not yet quite see how to implement the new wave, but the future is clearly going to be digital. There is going to be a lot of devilment in the detail.

Faster Page Turning

Windows is Collapsing?

Really? Well, two Gartner analysts think that it is. The codebase is too vast and unwieldy. It will not run on small systems. Customers are reluctant to upgrade to Vista. Three or four years ago, this would have seemed like really important news. Now it looks somewhat unsurprising. Exact Editions is a young business (three years old) and only typical in the way that canaries are typical in coal mines, but as it happens we scarcely use Windows in our operation.

Our accounts are run through Excel. We word process contracts in Word (but we hanker after a 'click through' alternative). MS Office running on the Mac may be more important to us than Windows. But all the crucial day-to-day software running our platform is web-based (email and web-accessed internal processes). The operating systems that matter to us are Linux first, Mac second, and Windows nearly 'nowhere' -- just ahead of Symbian, which is running our Nokia phones. But all these operating systems are just 'tools'. What overwhelmingly matters to us as we look forward, is the web, and the emergent cloud computer that we all use. Maybe the concept of a monopolistic global operating system has had its day?

Single Track

Virago, Sphere, Abacus, Piatkus et al

Music and Print

Fred Wilson, of the blog A VC, has some very interesting things to say about music in his post "Something Important is On the Horizon in the Music Business". He may be right: our musical culture may be embracing the web as a system for streaming access and not for file-based distribution (Apple has won that war and the big music publishers are unhappy about that). I am not so sure about his espousal of the 'Radio Station' as the new/new thing, with the consequence that "Everyone who wants to be a radio station will be one".
But his manifesto has a convincing ring:

Here’s what we need. We need someone to create an easy to search streamable library of all the recorded music in the world. We need to be able to grab a track and embed it on our blog. We need to be able to see how many people played it. We need others to be able to crawl these user pages with the embedded music and create algorithms based on who posted it, how often it was played, and how often it was reblogged and linked to. The services that do all of that need to be able to play the music that flows out of these social algorithms in the same way. This all has to be licensed and legal and it has to result in money flowing to the artists...

It surely doesn't follow that books must go the same way as music. On the other hand, that could happen. You could rattle through the same litany of requirements for books as he does for music, and it makes a lot of sense. The system has to be legal and it has to result in money flowing to the artists, and I would add that it has to be 'open' in the sense that anyone can become an artist, a publisher, or a librarian. Open also in the sense that the system will not be owned by a monopolist.

There is a good deal of anxiety amongst book publishers that Amazon, with its $30 billion market cap, is going to be the books/printed matter monopolist. There has been much handwringing recently about the seemingly obscure issue of Amazon insisting on a proprietary and monopolistic source for the supply of POD (print on demand) books: its subsidiary company BookSurge. Outrage from WritersWeekly, Marxist analysis from PersonaNonData, and a useful review from ToolsOfChange of the various forms of lock-in that Amazon could be aiming at.

But Andrew Savikas in his ToC posting does not point out that playing for lock-in is tricky: it is always potentially two-way. It can have the effect of tying a business to a losing strategy. You may end up being Sony with Blu-ray, or you may end up being Sony with Betamax. If I were an Amazon shareholder (I am not) I would be worried that the real strength of that business (in logistics and physical warehouses) is going to be a source of complacency and weakness in the digital world. Its other conspicuous strengths -- scale and customer service -- will certainly be crucial. What if Kindle and BookSurge are losers, more Betamax than Blu-ray? If you really cared about the long tail in books, would you not be embracing all the print on demand suppliers? Henry Blodget voices similar doubts, and yesterday's New York Times has an article, 'Amazon Accelerates its Move to Digital', which suggests that Amazon knows that it has its work cut out.

One of their managers (Steven Kessel, in overall charge of digital strategy) reckons that it may take 5 to 7 years to build their digital businesses to maturity. But the lock-in (or correlative lock-out) will happen sooner than that. Amazon will know within the next 18 months whether or not they are going to win big in digital print and digital music. It is by no means a foregone conclusion.

Hearst: First E-Ink and now DigitalPaper?

Paid Content has a story about a Hearst-financed venture called FirstPaper, which has big ambitions. It is operating in stealth mode, and little is being said about what is up, but here is a glimpse from one of the recruitment ads:

A startup company which is well-funded by a huge media conglomerate located near Columbus Circle is looking for highly skilled, professional C# developers to help build software services around an innovative hardware product which will revolutionize the media publishing industry.

This is truly an exciting opportunity to get into a well-funded startup and help define the next generation of media products....
Most of the big magazine and newspaper publishing companies have been somnolent in their attitude towards web innovations. What little they have done has generally not worked too well. Hearst would seem to be an exception, with early investments in E-Ink and now FirstPaper.

I like the name, I wonder what it will do? It does sound as though it might be a new kind of digital reading system (Hearst/First, E-Ink and now E-Paper), a competitor for the Kindle perhaps (noticed via the Read20 list).

WiFi in the air and on the tracks

The idea of seamless web access from a plane or a train has been 'round the corner' for years. But it is not yet quite there. Perhaps 2008 is the year in which things really will change. There has been a flurry of reports which suggest that the dam could soon break.

Section 108 Copyright report

A band of experts has spent a lot of time constructing a thoughtful report on possible reforms to the US law of copyright. The excellent Open Access News blog gives you the essential links.

I have not read it all; it is a lot of reading -- 150 dense pages. And it is all recommendations for the Copyright Office to consider before possibly asking Congress to change the law. But whoever was charged with finalising the report for publication did not think very carefully about how it would be read. Perhaps the publishers on the panel were not closely enough involved in the final stages. Under the normal default settings for Preview/PDF on my Mac, the light blue box in which the recommendations appear completely occludes the text. It is wearisome to have to cut and paste the text out of its blue boxes in order to read it.

It would be mischievous to suggest that they did not want their recommendations to be read (so they put them on a blue background which disagrees with monitors and photocopying systems). The specific recommendation on this page will give scant encouragement to Google in its arguments with publishers and others on the Library project. The recommendation rather explicitly disowns the Google Book Search modus operandi and the Google grab of copies which have not been explicitly sought from rights holders.

Here is the text of the invisible recommendation that will be disagreeable to Google -- each of the three sub-clauses is enough to give GBS some heartburn.

. Section 108 should be amended to allow a library or archives to authorize
outside contractors to perform at least some activities permitted under
section 108 on its behalf, provided certain conditions are met, such as:
a. The contractor is acting solely as the provider of a service for which
compensation is made by the library or archives, and not for any
other direct or indirect commercial benefit.
b. The contractor is contractually prohibited from retaining copies
other than as necessary to perform the contracted-for service.
c. The agreement between the library or archives and the contractor
preserves a meaningful ability on the part of the rights holder to
obtain redress from the contractor for infringement by the contractor.
(Section 108 Study Group Report p. iv of the Executive Summary)

If this is the way copyright experts are thinking, it would seem to be very clear that Google needs to find a way of backing out of its aggressive stance on rights in library copies.

Hart Publishing

