Friday, December 19, 2008

How 2008 Changed the Market for Digital Editions

Keeping the list to a manageable length, there have been five really important developments this year. Let us work back through the year, which as it happens is pretty much the order of importance:

  1. October/November: The depression/recession. Hardly anybody saw this coming and, according to one who did, it may be even worse than we have yet grasped. But the depression is a shot in the arm for digital editions. For exactly the same reason that it is now a good time to be in the business of designing wind farms, building solar panels or battery powered vehicles. Terrible news for Detroit, for airlines, for big city newspapers and dire for the advertising budget of consumer magazines: but there is a substantial new business for the magazines, newspapers and book publishers that can deliver efficient digital services. Digital editions tick all the boxes: energy efficiency, global access, low investment, consumer facing, distance learning, transparency, economies of scale, network enhancing etc.
  2. October. Google settles with the Authors Guild and AAP. A completely game-shifting agreement. It has still to be finalised and approved by a judge, but it seems likely that the Google system and the new Book Rights Registry will define the shape of the publishing industry for years to come. Digital books will be database-driven, page-oriented, url-guaranteed, access-managed and completely searchable for free. Books will not be freely readable but how they will be sold and subscribed to has yet to be determined. Although the agreement was limited to books in the USA, it seems very likely that the same general regime will apply to books, newspapers and magazines (and other print objects) everywhere. Google Book Search has always been scalable. Game defining and game just started.... Publishers of all stripes have still not grasped how much this changes their market, and how big the potential opportunities are for digital publishing in which every publication can be searched, read and referenced through the web. Book publishers are beginning to get it; magazine and newspaper publishers are straggling and struggling.
  3. July. Apple launches the 3G iPhone in 22 countries. Perhaps as important the first Android phone appeared in October of this year. The iPhone and the Android phones have been designed to be completely web-capable and this takes the digital edition into the pockets of billions of people who will in the next three years be buying mobile phones with instant web connectivity. If books are on the web as digital editions (see point 2 above) billions of people will be able to read them.
  4. May. Amazon S3. Amazon's cloud computing was launched in 2006, but this is the year in which it really took off. In May and again in October they reduced charges at all tiers. Cloud computing is the next shape for collective computing and Amazon's cloud computing infrastructure is increasingly attractive to information providers and web services of all kinds. Digital libraries and digital editions are moving into the cloud, and in the same move these buckets of books become shareable and potentially usable by other web applications.
  5. January. Amazon's Kindle was one of the must-have presents for Christmas 2007 (except that it was out of stock in December, remained unavailable until April 08, and is unavailable again until March 09). But since that broadly successful launch the device has gone rapidly sideways. Amazon has not been able to launch the product outside the US. They have been very coy about their sales and their plans for the future of the device. This has definitely not been the year of the Kindle. One could say that it has been sidelined by the iPhone -- why buy a dedicated device to read books when you can read any web-deliverable object through the device already in your pocket? But they have also been scrunched by the Google Book Search proposition -- do we need downloadable books when they are all, always, in the cloud?

Institutional Licenses

We started offering Institutional, or Site, Licenses a year ago. This was not a part of our original business model, but the Exact Editions founders know that market from previous experience. We have, even so, been surprised at the strength of interest for consumer magazines. To judge by the range of libraries subscribing to our services (government departments, international organisations and schools as well as universities) there is a large global market for subscriber content delivered via IP to private networks. It is a 'rule of thumb' in STM publishing that there are 2,000 to 3,000 universities worldwide that constitute the market for periodical literature. The market for digital books and consumer magazines in libraries worldwide is potentially two or three orders of magnitude higher (think schools and public libraries).

There is a large market, but there must be real pressure on the budgets to acquire information. At the high end, STM academic databases are very, very expensive (individual universities are often spending millions on their scientific periodical budgets). It seems likely that new entrants in the books and digital magazines space will thrive if they keep their prices down (by which I mean significantly less than $1,000 per title, per institution). Digital books may need to be priced in the region of $100 pa, per site, if they are to achieve widespread adoption.

Perhaps much lower? When Google Book Search starts selling its massive archive of mainly 'out of print but in copyright' books it will be fascinating to see how that is priced. On a per title basis the collection will need to be priced at much less than $1 a title if individual universities and colleges are to subscribe for access (there are already 7 million titles in the aggregate).
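To make the arithmetic concrete, here is a back-of-envelope sketch. Only the figure of 7 million titles comes from the discussion above; the library budgets are hypothetical numbers chosen for illustration, and nothing here reflects any actual Google pricing.

```python
# Back-of-envelope pricing for a very large digitised collection.
# TITLES is the figure quoted in the post; every budget below is hypothetical.
TITLES = 7_000_000


def annual_cost(price_per_title: float, titles: int = TITLES) -> float:
    """Yearly cost to one institution at a flat per-title price."""
    return price_per_title * titles


def affordable_price_per_title(budget: float, titles: int = TITLES) -> float:
    """Highest flat per-title price that fits inside a yearly budget."""
    return budget / titles


if __name__ == "__main__":
    print(f"At $1 per title: ${annual_cost(1.00):,.0f} per year")
    for budget in (10_000, 50_000, 250_000):  # hypothetical library budgets
        cents = affordable_price_per_title(budget) * 100
        print(f"Budget ${budget:,}: about {cents:.2f} cents per title")
```

Even a generous hypothetical budget implies a price of a few cents per title at most, which is why 'much less than $1 a title' looks like an understatement.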

It is at least possible that the advent of the Google system will lead to a two-tier market. A huge pile of mainly little-used books in the Google mass, and slivers of high value and highly current content which will be marketed and promoted direct by the publishers, or perhaps by Google in a 'premium stream'. That will pose problems of channel contention and regulation. It may not be too long before authors, publishers and even Google are saying that books in the 'slush pile' should be free. One suspects that this is what Google may have wanted as an outcome from the beginning.

PS 'Slush pile' is here merely a technical term. Slush piles contain many books of outstanding quality. The Google collection will be very valuable and useful, though, of course, with plenty of rubbish in it!

Thursday, December 18, 2008

Phone Numbers and Books

Yesterday one of our publishing partners asked that the phone numbers in their catalogue should not be 'live'. Incidentally, they were very pleased to have the ISBNs targeting their e-commerce engine, but for some reason not so keen on the phone numbers being immediately usable. Ours not to reason why..... the numbers were switched off. We have had such requests before, and it is not difficult for the database to switch phone numbers 'off' (i.e. to 'non-clickable') if this is requested. It can be done for a whole publication, or even for individual phone numbers within an otherwise 'live to call' publication (my colleagues will not thank me if this leads to a torrent of requests for shielded phone numbers).
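For the curious, here is a rough sketch of how such a switch might work. It is an illustration only, not the Exact Editions code: the function names, the `live` flag and the `shielded` set are all hypothetical, and the pattern is deliberately loose (a real system would need to avoid catching ISBNs and other long digit strings). The only standard piece is the `tel:` link scheme that phones and Skype-style applications understand.

```python
import re
from html import escape

# Loose illustrative pattern: an optional "+", then digits separated by
# spaces, dots, hyphens or brackets. It will also match other long digit runs.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def dialable(number: str) -> str:
    """Strip spaces and punctuation, keeping digits and any '+' sign."""
    return re.sub(r"[^\d+]", "", number)


def link_phone_numbers(text: str, live: bool = True, shielded=frozenset()) -> str:
    """Return HTML in which phone numbers become tel: links.

    live=False switches every number off for a whole publication;
    shielded holds individual numbers (in dialable form) to leave as plain text.
    """
    if not live:
        return escape(text)
    parts, pos = [], 0
    for match in PHONE_RE.finditer(text):
        number = match.group(0)
        target = dialable(number)
        parts.append(escape(text[pos:match.start()]))
        if target in shielded:
            parts.append(escape(number))  # visible on the page, but not clickable
        else:
            parts.append(f'<a href="tel:{target}">{escape(number)}</a>')
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)
```

Calling `link_phone_numbers(page_text, live=False)` would correspond to switching off a whole publication, while passing a `shielded` set would correspond to shielding individual numbers.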

It seems to me absolutely right that authors or publishers should decide for themselves whether links are to be live in their publications. This is not a decision that should be taken by other players in the process. But the reason for having the numbers in the image but not clickable, not immediately usable, struck me as strange: "I honestly do not think anyone will be calling us from Skype or from an iPhone".

I am probably a minority in using Skype (Skype Out) and the iPhone every day, but I know plenty of academics who Skype a lot, and I am sure there are lots of librarians and booksellers getting into the habit of requesting books, even placing orders, from their iPhone. There will be a LOT more next year when the Android phones start landing.

It is surprising to me how often one goes to web sites that have lots of phone numbers and there is no automatic way of using the phone numbers -- except 'cut and paste' (which is not a lot of good to iPhone users). The idea that phone numbers should be dead is by no means limited to book publishers. Web publishers make the same mistake. I guess not supporting 'cut and paste' on the iPhone is another example of underestimating the way that users will use resources that we make available to them. Users will always do more with whatever is given to them than we can anticipate.

Usage Statistics from Exact Editions Institutional Accounts

Librarians who subscribe to our institutional content licenses now have some convenient statistical tools which summarize monthly page use and search terms.

The statistical reports are only available to Librarians who have a personal management account with Exact Editions (typically the librarian who places the original order). When such registered librarians log in to Exact Editions with their username and password they will find a new "librarian: stats" link on the toolbar.

This link takes them to our stats page, with traffic reports on each available issue (we stop at this level - we don't show exactly which pages of each issue are being read). We also display the principal search terms used by the library's members in any period (this data can be quite informative and is not intrusive -- no individual data is logged).

The figures are available by month, by issue, and by publication and can be exported to Excel.
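As a rough illustration of the kind of aggregation involved (a sketch under stated assumptions, not our actual reporting code), the example below counts page views by month, publication and issue and writes the result as CSV, which Excel can open. The log records, the issue labels and the function names are all hypothetical. Note that the records carry no reader identity and no page number, in keeping with the level of detail described above.

```python
import csv
import io
from collections import Counter

# Hypothetical page-view log: (month, publication, issue).
# No user identifier and no page number is stored.
PAGE_VIEWS = [
    ("2008-11", "Prospect", "issue-A"),
    ("2008-11", "Prospect", "issue-A"),
    ("2008-11", "The Wire", "issue-B"),
    ("2008-12", "Prospect", "issue-C"),
]

FIELD_INDEX = {"month": 0, "publication": 1, "issue": 2}


def count_by(views, *fields):
    """Count page views grouped by the named fields, e.g. ('month', 'issue')."""
    return Counter(tuple(view[FIELD_INDEX[f]] for f in fields) for view in views)


def to_csv(counts, fields):
    """Render grouped counts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([*fields, "page_views"])
    for key, total in sorted(counts.items()):
        writer.writerow([*key, total])
    return buffer.getvalue()


if __name__ == "__main__":
    monthly = count_by(PAGE_VIEWS, "month", "publication")
    print(to_csv(monthly, ("month", "publication")))
```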

Wednesday, December 17, 2008

Gift Subscriptions

We are seeing a surge in gift subscriptions. Could this be a sign of Christmas? Or is it the first loosening of consumer budgets as we climb out of the recession? Your guess is as good as mine.

If Christmas shopping is on your agenda, here are some last minute ideas (we are 24x7 so you can even delay your shopping to Christmas morning if you expect to wake up before your nearest and dearest are online): Whitelines for that troublesome snowboarding nephew. Taste Italia for the brother-in-law who wants to open an Italian restaurant. Prospect for your intellectual friends. Opera or The Wire or Jazzwise for your musical buddies. Finally, the Ecologist for anyone who cares about the environment and Red Pepper for the anti-capitalists in your network.

We don't yet cater for every special interest and taste (I wish we had something on Tropical Fish for Uncle Fred), but we are getting there.

Tuesday, December 16, 2008

Does XML really matter?

There is a new burst of enthusiasm for XML amongst book publishers. Mike Shatzkin, who often has cogent things to say, has produced a little encomium for XML in Publishers Weekly.

Here's what we call the Copernican Change. We have lived all our lives in a universe where the book is “the sun” and everything else we might create or sell was a “subsidiary right” to the book, revolving around that sun.

In our new universe, the content encased in a well-formed XML file is the sun. The book, an output of a well-formed XML file, is only one of an increasing number of revenue opportunities and marketing opportunities revolving around it. It requires more discipline and attention to the rules to create a well-formed XML file than it did to create a book. But when you're done, the end result is more useful: content can be rendered many different ways and cleaved and recombined inexpensively, unlocking sales that are almost impossible to capture cost-effectively if you start with a “book.” (What the Hell Is XML? Publishers Weekly, 15 Dec 08)

At the risk of being taken to be the kind of oaf who burps loudly in the presence of royalty (questioning the supreme value of XML is a bit like breathing garlic all over her majesty), I am inclined to pour cold water over this.

XML has been with us for 10 years. It certainly has its uses, especially in managing large complex texts and integrating text databases. But XML has not been and is not the be-all and end-all of digital publishing. XML is a property of texts, a style of handling them for flexible representation. In the last five years (especially since Google Book Search started motoring) it has become increasingly apparent that the book-as-book is the critical output of book publishers. Indeed PDFs are still a crucial component of the book publishing process and for many of the most useful applications of the digital book, the PDF file is the crucial starting point. Copernicus, after all, was right: the sun is the centre of the solar system. Books really do matter and they are at the centre of the GBS system.

In one crucial respect XML has been and is a damagingly misleading tool for publishers (as deleterious in its effects on newspapers and magazines as on books): it has encouraged the mistaken view that text objects can only be used on the web if they are repurposed. XML was invented primarily because it was seen as a flexible way of 'marking up' the incredibly diverse world of print in ways that could be reconciled with HTML and the web. Everything printed would be repurposed for the web and XML would facilitate this step. This now looks like an inefficient way to look at things. Google Book Search and other digital representation platforms are showing us that repurposing a book or a magazine is not necessary and usually results in the loss of important information. It is certainly a mistake to suppose that XML is necessary if books are to be effectively used on the web or in databases -- as Google Book Search, the largest print database, demonstrates. Above all, XML, and any particular implementation of XML, is only as good as the design for which it was crafted. XML is not future-proof, and it is highly misleading of Shatzkin to recommend:

"You'll save the most money right away if you create many books that are similar in structure and thus can be rendered from the same “style sheet.”
Books should only be similar in structure, and their texts should only share the same style sheet, if they are similar in purpose. A rigid XML style sheet for the whole of a publisher's list is for many publishers a lousy idea. Designing, or selecting, your books to fit your style sheet is putting the cart before your horse.

Thursday, December 11, 2008

Google Books (and Magazines)

I had a wry chuckle on noticing the form of the urls that Google is using in its magazine service. They are something like this: (http) // which is more or less gobbledygook because it doesn't need to be anything else. But the chuckle was over the way that Google give magazines a 'book' id.... Our system is not too dissimilar and having worked from 'magazines' towards books, we were until a week ago putting a 'magazine' moniker in the url for the books. We must have had a thorough parse and replace, a spring-clean, of all our code, because we now label book pages and magazine pages in a neutral style. This is a small 'presentational' issue; nothing really hangs on the fact that a 'magazine' url looks a bit odd with book ids in it, or vice versa, but I will not be surprised to see Google evolve their nomenclature, both in the code and in the way they characterise the service. Perhaps, when they have a good archive of newspapers and magazines, they will change the name back to Google Print?

Since the Exact Editions architecture and our method of presenting books and magazines is very similar to Google's in Google Book Search, some people ask us whether we are worried by having a model and a business proposition that could be 'blown out of the water' by GBS. Maybe we should be more worried than we are, but here are some of the reasons why not:

  • Any web-based business could be blown out of the water by Google. Being Google-incompatible is not a good choice. Having a completely contrary model for book/magazine representation to Google is a far worse place to be in at the moment.
  • Google is not going to be a monopoly distribution route for books, magazines and other titles, because it's not too difficult to distribute digital books in that way. It's just hard to do it well. Google Book Search will certainly not be the only distribution game in town.
  • Google is not going to be a preferred commercial distribution partner for many publishers because their commercial terms are quite stiff: 37% is a significant margin chunk to allocate to a distribution partner. Premium content will often not want to go that way.
  • Google is going to be used by everybody. Google Book Search should be used by everybody, but that does not mean that publishers will not want to control their own markets and manage their own quality of service. Alternative distribution is an 'and' choice, not an exclusive 'or'. Google will accept that, and Google would be, should be, very worried at the prospect of becoming an exclusive or sole distribution resource for digital content.
There are a few more reasons for not being too worried about Google competition, which is a fact of life but not a death-ray, but those are enough to be getting along with.

Wednesday, December 10, 2008

A Big Cloud Descends Over Europe

And this is not about the recession. This is Amazon's news that they are now putting some of their cloud in Europe. I have to say that I had always thought that it was in Europe. But I guess it is more in the EU now than it was (probably so that businesses that have to say that they host stuff in the EU can stay with that). Even so, it seems to be going against the grain a bit (if clouds have grains). I thought that half the point of cloud computing was that you were never too certain where the computing was. I guess we will move our buckets to the European cloud one of these days (that is the other thing about clouds, no need to be too precise about when they are wherever they are).

Google Does Magazine Search

Google blogs the news and Danny Sullivan has an excellent summary of what the service does and currently doesn't do.

Here is a 2-page spread from New York Magazine on Tom Stoppard.

Google is mainly (entirely?) working from scans (yes I know that there were no PDFs in the 50's, 60's and 70's). I am not sure that there are any current issues in the archive; I couldn't see anything yet from the noughties (correction: Popular Science is there up to Feb 2008). In this respect the magazine service is rather like the historic newspaper archive that Google has also been working on. It is building up a large 'long tail' whilst the magazine and newspaper publishers dither about what to do with the short head (hint: think about selling subscriptions -- that is what Google is soon going to be doing for new books). Isn't this the strongest possible wake-up call for magazine publishers? Hey folks, time to get your current issues up and running on the web. Make your magazines searchable through Google and sell subscriptions to them through Exact Editions!

Thanks to Google for pointing out the way things should go. Thanks to Google for once again creating a proper web version of print objects.

Footnote: Google is now indexing magazines in the Exact Editions service where the pages have been made open access by the publisher. Google reads the ASCII version of the pages, which is also there to aid the print-impaired audience. It would be practical for Google to index all our pages and produce search results for the whole content, even the stuff behind subscription barriers. If any Google publisher-liaison person reads this, perhaps they can tell us how to link up with Google on that!

Monday, December 08, 2008

Government and Innovation

Why does this Guardian article about the Government putting aside £1 Billion to fund technology startups give me a gloomy feeling?

Somehow one knows that if the government does this, the bureaucracy will kill or at least stifle too many of the innovative companies that go with it. When we started in 2005/6 we wasted too much time (not a great deal thank heavens) talking to and negotiating with VCs and getting a small firms loan guarantee. That was a particularly dispiriting experience because we found out in the end that although our bank pretended to be part of the scheme, it really was not. It turned us down because it had done hardly any such guarantees in the previous two years but remained officially 'part of' the scheme because the government expected the scheme to have the support of the main banks. We managed our start up funding by keeping our commitments very low and at least we do not have to worry about VCs and second or third rounds of lending/funding.

Here is a really good blog by Paul Graham (who is behind the very early stage and light-weight investor Y Combinator) on why innovators in information technology now don't need VCs. Here is his argument:

......many Internet startups don't need VC-scale investments anymore. For many startups, VC funding has, in the language of VCs, gone from a must-have to a nice-to-have.

This change happened while no one was looking, and its effects have been largely masked so far. It was during the trough after the Internet Bubble that it became trivially cheap to start a startup, but few realized it because startups were so out of fashion. ......
VCs and founders are like two components that used to be bolted together. Around 2000 the bolt was removed. Because the components have so far been subjected to the same forces, they still seem to be joined together, but really one is just resting on the other. A sharp impact would make them fly apart. And the present recession could be that impact.
If the UK Government wants to give a real impetus to IT innovation it should emulate Barack Obama's commitment to 100% broadband availability so that all American kids have access to state-of-the-art broadband. The UK Government could make that happen next year. In schools, colleges, libraries and the streets next year that would make a real difference to thousands of innovators, inventors and startups, a much bigger difference than scores of subsidised VC-style startups that would only get going in 2010 (if you haven't negotiated with them you have no idea how slow VCs can be, even when they are moving).

Friday, December 05, 2008

Information Sources -- so many Sources

This week I have been dabbling with Twitter and it's a very good source of topical references. If you would like to know what Tim O'Reilly or John Battelle are thinking about, then you will get lots of good ideas following them on Twitter. Or you can pick up JAFurtado's tweets, which scatter plenty of topical technology and publishing references (I like the fact that some of them are in French or Spanish and Portuguese, although I can only cope with the French).

Twitter is on a roll, partly because Obama had 140,000 followers on Twitter at the end of his campaign and partly because lots of Mumbai news came from tweets. Tim O'Reilly recently had a good blog posting about Why I love Twitter. He is right to emphasise that the model of social interaction that Twitter encourages is beguiling and simple (followers and following). Tim does not make much of the fact that Twitter's brevity (its messages are restricted to 140 characters, the SMS limit), while an obvious limitation, in fact has a lot to do with its deep appeal. That and the compressed syntax of tinyurl and similar services.

Tweets in their allusive conversational style reinforce the deeply 'referential' character of the web, just as much as Google's page rank algorithms for search. But twittering is just another layer, and if you want deeper analysis we are still pretty reliant on blogs to pick up on what is happening in the technical areas that are of special interest. For me PersonaNonData, OpenAccessNews, Techmeme and Ars Technica continue to be among the best news sources, in the area between publishing, the web and technology that most concerns me.

I have also been experimenting with a free trial of the daily news briefing Insights service provided by Outsell. This week, I found particularly interesting their notes on Twitter (again), and a summary of a company unknown to me, Ringgold, that is specialising in information about institutional IP ranges and open identity solutions. These are matters of considerable interest to libraries, aggregators and periodical publishers. Outsell's service is commercial and its coverage of big commercial publishers is particularly good. A free private email list with a similar remit, Read 2.0, is run by Peter Brantley. Brantley has a good eye for new trends and developments (broad view -- almost a fish-eye lens), but you have to be asked to join his now quite large list, and if you join it you will then see a lot of email from sometimes over-loquacious participants. Some of the best parts of the Read 2.0 conversation appear at O'Reilly's Tools of Change. I wonder if Brantley's talents would shine better if he Twittered?

When we have done with twittering, blogging and email lists we come back to personal meetings and it is still true that the traditional Trade Show has a lot to offer. This year's Online show in London was smaller than I can recall, but there was certainly a buzz of innovation and a lot of useful information came my way. Some of it will get twittered or blogged about.

Wednesday, December 03, 2008

Google's Gravitational Pull

Google were not much in evidence at the Online show at Olympia this week. But they were much in the mind of publishers and aggregators that I met. It's rather odd to be at a trade show where by far the biggest actor is not present, but the influence is pervasive. This got me to wondering how one might measure the Gravitational Pull of Google Book Search. One measure would be the number of published titles available through GBS compared to the Kindle, Sony's eReader and other systems such as the iPhone. Here is a guess:

The Google figure of 7 million titles is based on the claims that have recently been made about the number of titles which they have added to Google Book Search through their library project ("and we're just getting started"). While it is true that the figures are not strictly comparable (and my guess for the iPhone is highly conjectural) the graph is telling. Google has a huge arsenal of titles in digital format and once they have a functioning way of licensing and selling access to their resources the Google Book Search library will dwarf all others.

Some people might read this graph and, shrugging shoulders, say that Google is the only game in town. But that does not seem quite right. A more balanced response would be to start with the proposition that any good digital books strategy is going to work alongside Google Book Search. The best strategies will work with it, even take advantage of the fact that it works to everyone's advantage.

Tuesday, December 02, 2008

Naming of Parts

We now have a flexible system of web access management that allows a publisher to select areas of a book which can be assessed and sampled in full view before a purchase. For example here are some full size pages from the Time Out City Guide to London.

From the book's home page, there are some named links which allow the user to grasp the context of sample pages that might be of interest. Bloomsbury and Fitzrovia is a better handle in a city guide to London than pages 106-113.
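A minimal sketch of how named parts might be represented follows; it is illustrative only and not a description of our system. The class, the field names and the 'Index' entry are hypothetical. The 'Bloomsbury and Fitzrovia' label and its page range are the example given above, and treating that part as open for sampling is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NamedPart:
    """A named run of pages that a publisher may open for sampling."""
    name: str
    first_page: int
    last_page: int
    open_access: bool = False

    def contains(self, page: int) -> bool:
        return self.first_page <= page <= self.last_page


PARTS = [
    NamedPart("Bloomsbury and Fitzrovia", 106, 113, open_access=True),  # assumed open
    NamedPart("Index", 400, 416),  # hypothetical entry, closed to sampling
]


def part_for_page(parts, page: int) -> Optional[NamedPart]:
    """Return the named part containing this page, if any."""
    return next((part for part in parts if part.contains(page)), None)


def can_sample(parts, page: int) -> bool:
    """A page can be sampled only if it lies in a part opened by the publisher."""
    part = part_for_page(parts, page)
    return part is not None and part.open_access
```

The point of the structure is that the name, not the page range, is what the reader sees and what other programs could refer to.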

So now we need a good methodology for encouraging publishers to name and open relevant parts of their books for sample access. The obvious solution, the one we have adopted, is probably the right one: Chapter 1, or Chapter 1's title, Bibliography or Table of Illustrations etc.... Is there a way of extending this nomenclature to readers and users? Is a vocabulary for chunked reference in books something that they will want? When every print page is a web page it's a simple enough matter to provide names to groups of pages. Will each web-published book acquire its own patina of user-generated tags in the way that Flickr and now have clouds of very helpful handles? It's an intriguing possibility, especially since the handles would be used by other programs and resources.

Henry Reed's Naming of Parts

To-day we have naming of parts. Yesterday,
We had daily cleaning. And to-morrow morning,
We shall have what to do after firing. But to-day,
To-day we have naming of parts......

Monday, December 01, 2008

The Warehouse Test

Thirty years ago, when I was a rookie editor at Oxford University Press, there were quite serious discussions within that august organisation about electronic publishing. I remember being astonished when the then Finance Director, now deceased, wondered aloud whether it was wise to be building a new warehouse facility at Corby if the whole market for books was to be computerised by digital editions within five years.

At the time (this was when the standards for CD ROMs were still being defined) this struck me as a wildly extravagant concern. I am not so sure now. Are many publishers planning this year to build large warehouses for books? It seems pretty likely that the market for printed books will be transformed and perhaps for educational and academic books, largely replaced by digital books within 5 or 10 years. That is the kind of time horizon that impacts on warehouse investments.