Tuesday, April 29, 2008

The Web is the format, but will books be in HTML?

Aaron Miller of the Book Glutton has an interesting post on books, file formats and the web at the TeleRead blog. The comments are also worth reading. Aaron makes the point that both Google and Amazon, two of the web's giants, are heavily committed to books and to putting books straight on the web, page by page (people tend to overlook the extent to which Amazon is doing this in all the hoo-ha about the still-experimental Kindle).

Aaron has some cogent criticisms of the IDPF and its ePub format ("... the IDPF is peddling a format that search engines can’t index, browsers can’t natively open, and experienced web developers can’t figure out.") and he smacks Google ("...one of the ugliest interfaces book readers have ever known") and Amazon ("...despite Amazon’s offering of the Kindle they are still a Web company") just as hard.

But he misses some key points. It is hardly correct to say that Google's Book Search has us reading PDFs (PDFs are available from Google, but with GBS we are reading JPEGs with the help of a database; the PDFs are there purely for the convenience of printers), and he seems to be suggesting that going with the Web means going with texts as HTML. This is just not what Google and Amazon are doing. Google and Amazon get their power and their reach by putting all texts into a database system. This point is perhaps missed in some of the comments. Books are going onto the Web in the cloud, but the literary cloud comprises JPEG droplets, not a shower of HTML. Aaron also has a rant against ISBNs and 'retrofitting' old problems. But this is to miss the continuing and lasting importance of our print legacy. Google is creating a huge advantage for its Book Search approach by retrofitting the literature of the past. And the ISBN system has its advantages. Google and Amazon know this, and they are both very good at resolving ISBNs to their appropriate target.

Pages, books and ISBNs: these are elements of our print legacy that will still be useful when all literature is primarily digital. Who knows: we may still have the concept of recto and verso. And it will not mean flipping the eBook reader.

2 comments:

Anonymous said...

Hey Adam,

You make a good point. Saying Google is using PDF in their Reader is imprecise at best. My point was that Google is doing harm to books on-line by focusing solely on data and failing to address two of the crucial reasons that people read books: escape and knowledge. Those are different vectors, and while you could argue that data is useful for knowledge, there's a distinction between the data the book gives you and the data about the book, which is not essential. Metadata is where Google excels. Google Maps is a great example. That kind of data is more than handy, but that doesn't mean we'll always want all of that data when we look at a map. We want to access some of it when it's convenient. The same goes for books. One day it will be invaluable for some people to be able to correlate screen passages with original physical page data, but for now, the search is on for good ways to read on screen.

Aaron

Adam Hodgkin said...

Aaron, I would agree with you much more if you said that two of the crucial reasons that people NEED (not read) books are escape and knowledge. I think that the way that we read is changing enormously as we move to the web library, and in a paradoxical sense we don't need to read as much (certainly not in the same way), but we definitely need books, scholarship, history and science organised in the digital library. That is one reason why metadata is becoming more important, as you suggest, precisely because we will be less able to read what we need to know. My thinking about 'reading' was greatly changed by reading (yes I did read it) the brilliant book by Pierre Bayard, How to Talk About Books You Haven't Read (it's funny and wise). Google has been much better about its metadata for Maps than it has been about metadata for Books (so far -- we have been promised improvements).