Friday, October 21, 2005

Google Library

I just read John Battelle's post on the AAP's lawsuit. The comments are particularly interesting, with a couple of very strident ones criticising Google. I have a theory about how Google plans to justify their actions:
  1. Libraries are allowed, under copyright law, to make a single copy of any work in their possession. This is called the Library Exemption. There is a nice outline of the terms here. The libraries themselves can't get in trouble for contracting with Google to do this for them, because they are receiving no commercial advantage from it. Google clearly is receiving a competitive advantage from it, BUT:
  2. They may be able to make a good case for Fair Use, depending on the nature of what they keep from the book. There are four aspects to be weighed in any Fair Use defense (see Wikipedia):
    1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
    2. the nature of the copyrighted work;
    3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
    4. the effect of the use upon the potential market for or value of the copyrighted work.
Clearly Google hopes for commercial advantage from the use of the scanned books, so they might fail the first test. The second doesn't really apply: these are clearly books subject fully to copyright law. It's the third and fourth aspects that I think are the center of Google's defense.

A copy is a copy, but a searchable index created from a scanned copy is arguably a transformative use of the book. A human being can neither read the index nor reconstruct the original from it, so Google may be able to successfully defend themselves on aspect #3. Their main weakness is the existence of page images from the original scan. These may or may not be stored and accessible in such a way that a whole copy of the original could be reconstructed and read.

Aspect #4 is another winner for Google. The clear effect of this system will be to sell more copies of the publishers' books. The only (theoretical) commercial harm caused to the publishers is that they are effectively prevented from rolling a Google Print of their own, which might bring them more money than simply selling their books.

So Google wins on at least two of the four counts, and the act of copying itself is protected under the Library Exemption.
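
To make the index argument a little more concrete, here is a minimal sketch of a searchable index in Python. It assumes the simplest possible design: an inverted index that records only which pages a word appears on, not where on the page. I have no idea how Google actually builds theirs; the function names and the two-page "book" are made up for illustration. An index that also stored word positions could be used to rebuild the text, so the "can't reconstruct the original" claim depends on that design choice.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the sorted list of page numbers containing it.

    Word order and word counts are discarded, so the page text cannot be
    rebuilt from the result.
    """
    index = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(page_number)
    return {word: sorted(page_numbers) for word, page_numbers in index.items()}


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z']+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


# Hypothetical two-page "book", used only for illustration.
pages = {
    1: "The quick brown fox jumps over the lazy dog",
    2: "The dog sleeps while the fox runs",
}
index = build_index(pages)
print(search(index, "fox dog"))  # pages containing both words
print(search(index, "lazy"))     # pages containing "lazy"
```

The index answers "which pages mention this?" without keeping any readable copy of the pages themselves, which is the distinction the argument rests on.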

I suspect the AAP would have an uphill battle in winning this one. I wouldn't be surprised if they wanted Google to license their books for their index at some fairly exorbitant rate, and Google refused to pay because they're doing the publishers a favor. That would make the lawsuit a negotiating tactic.

Thursday, April 28, 2005

When SEOs Attack

Search Engine Foo: iUniverse Book Publishing: Book Publisher for Self Publishing and Print on Demand. Care to guess what terms they're optimizing for? It does seem to work: they show up #1 for a Google search on "self publishing." But this sort of spamming leads to pretty hilarious prose:

iUniverse, the leading online book publisher, offers the most comprehensive book publishing services in the self-publishing industry—awarded the Editor's Choice award by PC Magazine and chosen by thousands of satisfied authors as the leading print-on-demand book publisher.

We help authors to prepare a manuscript, design and self-publish a book of professional quality, publicize and market their book, and print copies of their book for sale online and in bookstores around the world.

As an innovative book publisher, we also offer exclusive services such as our acclaimed Editorial Review and our revolutionary Star Program, designed to discover and nurture exceptional new talent within our growing author community.

Don't wait any longer to get that manuscript off your desk and into the marketplace. With iUniverse as your book publisher, you can become a published author in a matter of weeks. Why not get started today?

Yes, indeed. Publish your book with a publishing publisher and be published. Ouch. Not sure I'd pay them an exorbitant fee to edit my book.

Friday, March 18, 2005

Writing Code

I'm coming to the conclusion that writing code, as an activity, really is like writing prose. I find myself treating code projects just like writing projects:
  1. I spend the first part of the project thinking about it and being (apparently) very unproductive. (25-30%)
  2. After I reach some sort of critical mass in my thinking, I very quickly pour out everything into code/onto the page. The project is 80% done as far as volume goes at this point. (10-20%)
  3. I spend the rest of the time editing, bugfixing, refining, etc. (50-60%)
For larger projects, this cycle gets repeated for each component of the project. This is precisely the pattern I followed when writing my dissertation. I don't know if this kind of working method is in any way typical, but it does seem to produce the desired results. It makes giving project completion estimates next to impossible, though. I really have no idea how long the project will take until I enter the hyper-productive phase, and when that's complete, I often still have a lot of work to do, even though the bulk of the code/writing is done.

This is why it's best for me if I've got some variety in a job. The hyper-productive phase really can't overlap with anything else: if I'm interrupted then, I'll get off track and it may blow the whole day. In phase 1 or 3, though, I'm better off not spending all my time focusing on the project, because I'll just end up web surfing. Or blogging.

Wednesday, February 23, 2005

Tagging Notes

From a conversation this afternoon: the tags used in folksonomies are deliberately stupid. They are atomic units of information. So a tag can be any atomic unit--it doesn't have to be a word, it could be a URL, a zip code, or anything else you can think of that isn't reducible.
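
A rough sketch of what "atomic" means in practice, assuming a toy in-memory store (the `TagStore` class, the item names, and the example tags are all made up for illustration): the store never parses or interprets a tag, so a word, a URL, and a zip code are all handled identically.

```python
from collections import defaultdict


class TagStore:
    """Treats every tag as an opaque string: no parsing, no hierarchy."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)

    def tag(self, item, *tags):
        """Attach any number of tags to an item."""
        for t in tags:
            self._items_by_tag[t].add(item)

    def items(self, tag):
        """Return the items carrying exactly this tag, sorted for stable output."""
        return sorted(self._items_by_tag.get(tag, set()))


store = TagStore()
# A word, a URL, and a zip code are all just tags.
store.tag("post-1", "ajax", "http://example.com/", "22903")
store.tag("post-2", "ajax", "22903")
print(store.items("22903"))
print(store.items("http://example.com/"))
```

Lookup is by exact match only. Anything smarter, like treating "22903" as a place, would have to live outside the tag itself.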

adaptive path » ajax: a new approach to web applications

adaptive path » ajax: a new approach to web applications. Web application development is starting to get really exciting again. The funny thing is that a lot of this technology has been around for a while, and even though IE supported it, you didn't see tools like these. I wonder whether the development of Firefox is really what's pushed it. Certainly nearly all the developers I know shun IE. So perhaps having a capable Free browser was what sparked all this innovation.

Tuesday, February 22, 2005


Recently, I've been seeing a number of companies and projects springing up around the idea of publishing blogs as books. The examples I'm aware of are Blogbinders, qoop, LJBook, and most recently, book this blog, but I'd bet there are more. What I'm wondering about is how useful the blog-directly-to-book pathway is. Wouldn't an application that aggregates your blog posts into an editing environment (like Word or OpenOffice) be more useful? Can we really smooth over the formatting differences between web and print well enough to produce (automatically) a nice-looking book 100% of the time? I'm a little skeptical.

From my (admittedly cursory) browsing, it looks like Blogbinders has a human in the middle of the process, and qoop certainly did for their only title so far, John Battelle's SearchBlog. Requiring a human being in the loop raises costs and introduces scaling issues.

There's a fine tradition of publishing diaries, going back at least to Caesar, but unless you happen to be famous, or have a blog that's truly interesting a high percentage of the time, the way to monetize blogs is more likely to be on the Hardball Times model, where you build up an audience, and then sell them work they're interested in.

But if blogs really are the ultimate vanity presses, then there may indeed be money in printing them, if you charge the blogger enough up front. It will be interesting to see how all this shakes out.