Tuesday, February 18, 2003

If you'd like to comment on the article below, you can write to me.

G o o g l e   &   B l o g s

I just saw that Google bought Pyra. Interesting ... Google and blogs have been an issue for a while. The intensely cross-linked nature of blogs serves to elevate the PageRank of blogs, leading to them appearing increasingly often in the first page of Google search results (see article on Slate for a discussion of this). And for most users the first page is all that matters.

For bloggers, I guess this is a good thing ... but lately, when I've asked people what's good and bad about Google, several have told me that they really, really don't like getting blogs in their search results.

These users were mostly high school students, and their irritation was that they wanted hits from “authoritative sites,” not from some random individual's personal thoughts on the topic. When prompted, they couldn't define what they meant by “authoritative,” and that's the crux of the issue. To them it was just “you know, authoritative—someone whose opinion really counts, someone who knows.” Of course, that's essentially what PageRank is (or can be)—a voting system that defines authoritative references.
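To make the "voting" idea concrete, here is a minimal sketch of a PageRank-style computation on a made-up four-page web. It is the textbook power-iteration form, not Google's production algorithm; the damping factor of 0.85, the page names, and the link structure are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page gets a small baseline share, then collects "votes".
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical web: three blogs that link heavily to each other,
# and one reference site that only a single blog links to.
toy_web = {
    "blog_a": ["blog_b", "blog_c", "reference_site"],
    "blog_b": ["blog_a", "blog_c"],
    "blog_c": ["blog_a", "blog_b"],
    "reference_site": [],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the cross-linked blogs all end up ranked above the reference site, simply because they vote for each other more often than anyone votes for it.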

But who are the voters today? They're a group that is not representative of the whole user base of the web; they are by definition a self-selected group that has been computer literate enough, with resources enough, to be content authors. The growing number of blogs has altered that mix, a change which has had (at least in the view of my small, statistically non-significant sample of high school students) a deleterious effect on Google's ability to pinpoint “authoritative sites.” As yet unrepresented are those who are primarily or solely content consumers; these are the digitally disenfranchised, whose participation is prevented by the poll tax of technological knowledge. But that's going to change now that Google's bought Pyra. More on this in a moment ... but first:

G o o g l e   T o d a y

Let's take a look at where Google is today: the most used web search engine, consistently one of the top properties on the web in almost every geographical region, and just voted the best brand. Sounds great, right?

But Google is going to be faced with significant challenges to its business model over the next two years. Every major search site is aping Google's clean, minimalist design (which is ironic, since the design is itself an accident, a marvelous piece of serendipity: according to an anecdote related by Marissa Mayer, a Google Product Manager, at one of Terry Winograd's PCD Seminars at Stanford, Sergey Brin didn't know much HTML when coding the page, so he kept it very simple).

The things that make Google a success are being copied (at least superficially), and for many people the other search engines will be ‘good enough’ that going to Google won't matter anymore.

And, I don't know if it just happens to me, or if other folks experience it too ... over the last six months I've seen increased incidence of occasions where Google just doesn't get back to me on a query ... I get a browser message that the site is inaccessible. The only time I ever use another search engine, e.g., FastWeb or Teoma (or even AltaVista) is when the Google connection goes off into limbo-land and is gone-gle. But lately that's more and more, which is a dangerous trend; it was just such a situation that led me to use Google back in the twentieth century in the first place. The switching cost for Google users is extraordinarily low.

Then there are the portals, especially Yahoo. It bought Inktomi (nice timing, BTW ... cancel your contract with them, ruining their revenue flow, then buy them), will presumably not squander this new asset (it has already hired the pay-for-placement ad technology VP from Overture, thus signaling its intent to roll its own in this space), and may well sever its relationship with Google (which it presumably views as a major competitor for user mindshare). Google's being voted the #1 brand will probably hasten such a move.

So what's Google to do, with its friends starting to view it as a competitor? It's not a portal ... although it's certainly a very popular site. If you look at the A.C. Nielsen NetRatings for Google, it's number four in the US, and way up there in every other geography, but with one notable difference from everyone else in the top 10: the amount of time that people spend on Google properties is extraordinarily low.

Yes, I know many of the others on this list are ISPs and/or portals, but you have to drop down all the way to number 16 (Gator, for goodness' sake!) to find a property with a lower time per person.

Part of that is by design; Google wants the query processing to be very quick, with results so relevant that you'll find what you're looking for in the first page of 10 results. But it's wasting the potential impact of the Google brand, at this instant the best brand in the world. The recent brand survey shows that; so does my informal polling of friends—people really love Google.

<digression>

See the results of a search for "I love Google" vs. "I hate Google". You get a Google-LOHR (Love Over Hate Ratio) of 1270/55 = 23.09. That's pretty darn good, as Google-LOHRs go. Try it yourself: even “Santa Claus” only gets a Google-LOHR of 65/68 = 0.95+ (Google is not as good as “chocolate,” though, which rates a Google-LOHR of over 28).

It's interesting that Google-LOHRs are different from the results of the brand survey through the top five places (as far as I bothered to check); although Google is still strongly in first place at 23.09, #3 Coca Cola at 6.45 ranks above #2 Apple's 4.73, and #5 Ikea's 3.91 beats #4 Starbucks' 1.08.
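If you'd rather not do the division by hand, the ratio is trivial to script. This is just a sketch: the function name `lohr` is made up, and the hit counts are the ones quoted above for Google and Santa Claus, typed in manually rather than fetched from any search API.

```python
def lohr(love_hits, hate_hits):
    """Love Over Hate Ratio: hits for "I love X" divided by hits for "I hate X"."""
    if hate_hits == 0:
        raise ValueError("no 'I hate' hits, so the ratio is undefined")
    return love_hits / hate_hits


# (love hits, hate hits), copied by hand from the searches described above.
hit_counts = {
    "Google": (1270, 55),
    "Santa Claus": (65, 68),
}

for term, (love, hate) in hit_counts.items():
    print(f"{term}: Google-LOHR = {lohr(love, hate):.2f}")
```

Anything above 1.0 means more love than hate; anything below means the reverse.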

</digression>

But sic transit gloria googli. Google needs to find a way to occupy more of each user's time. All without turning itself into another “me-too” portal. But how to do that? Here's how:

G o o g l e   T o m o r r o w

The Google experience today is mostly about an individual and his/her quest for information; and about the super-quick self-gratification that Google offers, which only reinforces the good feelings that most people have about Google. But Google (except for Google Groups, which is of course just Usenet News) is mainly about the individuals and their heroic quest for info. Google needs to become an experience that the user takes with them after they click on a search result, and an experience they can share with others.

I've found from experience that the best way to evolve the feature set of a software product or a service is with an intensely customer-centric focus, one that strives to truly improve usability, which is defined as the quality of use of a program by a specific individual with specific goals, while performing a particular task in a particular environment (this is my take on the standard definition). This definition, BTW, encompasses the Googlers' own distinction between useful and usable.

The practical way to do this is to look at how users employ your product/service, and then discover what it is they do just before, during, and after they use your product. For Google, one fruitful area to examine is what users do after they click away from the Google site.

A user will typically acquire some piece of information, and then either file it away for future personal use or share it with someone else. Why else were they looking it up in the first place? That aspect of sharing is something that Google has to become part of, that it needs to enable. And I don't mean just emailing the info. I mean collaboration. I mean being able to automate pulling that info into any form of collaboration that the individual employs: email, IM, document creation/distribution, blogging, etc.

There's another element of this, too; everyone stores the info they have gathered differently, in different taxonomic/hierarchical/chronological filing mechanisms. These all represent different “places” that will increase the ability to re-find and re-use the information gathered. But the web itself is a collection of virtual “places” that can be used as the basis for a filing mechanism. By this, I don't mean that people should store their files on somebody else's website, but rather that the act of visiting a website should “activate” the awareness of files in the user's personal information store that are logically connected with that web location.

You can implement this in a variety of ways, but if the technical details are properly hidden, the net effect to a user is that it looks like you can store stuff in different places on the web and just go there to retrieve it. Even better, it could become possible to share this information with others, so (with proper authentication and authorization) others can see this just by going there.
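One of the simplest of those ways is a personal index keyed by web location, sketched below. Everything here is hypothetical: the `PlaceIndex` class, the choice to treat host plus path as the "place" (ignoring case in the host, trailing slashes, and query strings), and the example URLs and file names. Sharing, authentication, and authorization are left out entirely.

```python
from collections import defaultdict
from urllib.parse import urlsplit


class PlaceIndex:
    """Personal filing system that uses web locations as its 'places'."""

    def __init__(self):
        self._by_place = defaultdict(list)

    @staticmethod
    def _place_key(url):
        # Assumption: a "place" is the host plus the path, so the same page
        # reached with a different query string still counts as one place.
        parts = urlsplit(url)
        path = parts.path.rstrip("/") or "/"
        return (parts.netloc.lower(), path)

    def file_under(self, url, item):
        """Associate a note or local file with a web location."""
        self._by_place[self._place_key(url)].append(item)

    def visit(self, url):
        """Return whatever the user has filed at this location."""
        return list(self._by_place.get(self._place_key(url), []))


index = PlaceIndex()
index.file_under("http://www.example.com/docs/", "C:/downloads/spec-v2.pdf")
index.file_under("http://www.example.com/docs/", "note: section 3 is the useful part")

# Coming back later, even via a search-result link with a query string,
# "activates" the items filed at that place.
for item in index.visit("http://WWW.Example.com/docs?ref=search"):
    print(item)
```

The files themselves never leave the user's own store; only the association between a location and an item is recorded.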

There are some superficial similarities to the failed ThirdVoice, but I'm not really talking about the same thing. It's closer in some ways to annotation and markup products like the one from iMarkup, a San Diego-based company with a nifty little product.

It will be tricky to do this without “doing evil” (in the Sergey Brin sense) and spawning a wave of opposition from outraged webmasters and lawsuits from content owners. But it can be done. Just avoid ever altering the appearance of a webpage, place annotations at the side, and implement a PageRank-like voting system to rate/block comments and/or commentators (hand waving here).

Only Google has the worldwide presence to make such a thing truly effective. Of course, to do this you'd have to have a really easy way to allow people to create “content/annotations” ... not unlike what Pyra affords with their blogging product. And, since one of the best ways to collaborate with people is by having a well-known place that is associated with you, where people can come to work/comment/talk, the concept of offering everyone a free home page (a blog) that can become interactive via an annotation version of the blogging tool is an essential element.

In fact, it is the key element in a strategy that will allow Google to increase the amount of time that people spend on Google properties, using Google tools, by several orders of magnitude. It can even become an effective IM tool; although it's not realtime, it's close enough for most purposes. An alliance with AOL around AIM could make this even more effective. Although I personally like Yahoo! IM better (and the MindAlign product even more), AIM is what most folks use, especially the generation of kids coming out of high school and college.

G a g g l e

gag·gle n. A cluster or group (from dictionary.com).

Google already has Google Groups; what does it need blogs/annotation for? One word: CONTEXT. Let me explain.

Google Groups (really Usenet groups) are contextual by topic. They are a threaded or time-sequential series of posts on a major topic, and a minor subject/thread. But not every kind of discussion/collaboration is well suited to what is fundamentally just a broadcast email type of medium.

E.g., Google, according to public reports, doesn't use newsgroups for internal communication; it uses Sparrow, an edit-in-place mechanism that goes beyond simple annotation to shared content creation. More importantly, tools like this support the context of a workgroup, whether a formal one like a project team or an ad hoc one like a special interest group. They are typed in the sense that they are not just a chronological series of comments, but rather a collection of content elements, each of which can have a time-based editing stream associated with it.

Gaggle (my made-up name for what Google could do in terms of group support) affords the capability to be contextual by group. Imagine a scenario in which groups of people can collaborate as easily as using a search engine, just with a web browser. But wait, doesn't this sound like Yahoo! Groups, or even Groove or Lotus Notes? Maybe, but even though all of them may have useful products, they have (personal opinion) missed the mark on making them usable.

They all make the group the thing, the organizing paradigm for their products. But the key is the group as context ... the things and content and tools and collaboration that are there with you, as you do other things, as you search on the web, and as you utilize the information you've obtained from searching, in filing or sharing it.

Gaggle could have three modes of operation: personal, private group, and public. The greatest opportunities to monetize the service could come from the private group, but there would be opportunities for premium service on the personal mode and ad placement opportunities in the public mode.

The personal mode is one in which you get to make your own notes on the web, using the structure of the web as your own personal filing system, and your blog is the head end of that, and your home base for keeping track of all your notes. It becomes your own portable content system, which you can access from anywhere. If Google gets it right, then it will be so simple (and easy and reliable) to use that people will stop keeping notes on their hard disk and keep them on Personal Gaggle.

I don't know if this has ever happened to you, but it has to me, and frequently. I need to refer to a copy of a document that I downloaded from the web, but I can't remember exactly where I filed it on my PC; and I find that it's faster to search on Google and find another copy than to find the one I already have. There's something seriously wrong with this scenario, and it points up not my personal deficiencies in organizing information, but a fundamental flaw in how people store and organize information today.

The principal advantage to Google here is that the user is spending much more time on/with a Google property than before. Instead of searching, and then navigating away from Google, the Gaggle piece stays with you, supporting the rest of your task of information foraging and gathering and sharing. A huge advantage to the user is that it's now faster to do not just searching, but also the rest of the task that searching was only a small part of. An opportunity to monetize this with premium service comes out of another form of context, the context of task, role, and interest.

If the search engine already knew more about you, in the sense that it knows what kind of job you do, what kind of industry you're in, what kind of interests you have, then it ought to be able to do a better job of retrieving results for you that are truly relevant ... not just relevant in matching the query terms (basic search relevance), or even in the sense of popularity/authoritativeness (PageRank), but also in terms of you—in terms of what you're interested in, in terms of what you're trying to accomplish at the moment.

Now I'm sure that privacy advocates have just thrown up their arms in alarm on reading that last paragraph, as well they should. The potential for abuse of such a system is enormous. Look at the controversy that exists around the tracking that the Google Toolbar does, even though Google has a very clear privacy policy in place and even allows you to turn off that feature.

But if Google does it right, with no sale or use of individual information, no tying of this personal profile to anything that identifies a user, rigorous auditing by an independent third party, and the use of a strong security architecture, I can certainly see people signing up for this kind of service, even for a fee.

Why? Because it gets you one step closer to the “queryless query”, where the system knows what you want to find out, before you even ask the question. It's what human assistants have done for people for countless years, employing their knowledge of an individual's needs and interests to have just the right thing for them at the right time. You can do this with intense user modeling as well as explicit user preference setting. That is, you don't have to tell the system everything; you can let it infer a lot via machine learning from your behavior.

Now, Google has already experimented with a limited form of such context. There are multiple search sites that have an implied context of Microsoft, Linux, BSD, and Apple. And even Froogle and Catalogs are other forms of contextually-oriented search, although in this case it's task-oriented. But imagine how much more effective Google could be for you if Google really knew you. Conceivably you could even have a separate computation for PageRank that was uniquely suited to your interests.
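What might a per-user PageRank look like? One well-known variant, usually called personalized PageRank, biases the baseline share of rank toward pages the user cares about instead of spreading it evenly. The sketch below is only that idea on a toy graph. I'm not claiming Google computes anything this way; the interest weights, page names, and damping factor are all assumptions.

```python
def personalized_pagerank(links, interests, damping=0.85, iterations=50):
    """Toy personalized PageRank.

    `links` maps each page to the pages it links to; `interests` maps pages
    to non-negative weights describing what this particular user cares about.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    total = sum(interests.get(p, 0.0) for p in pages)
    if total <= 0:
        raise ValueError("the user must be interested in at least one page")
    # The baseline share of rank goes to the user's interests, not to all pages.
    baseline = {p: interests.get(p, 0.0) / total for p in pages}
    rank = dict(baseline)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * baseline[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Pages with no outgoing links hand their rank back to the baseline.
                for p in pages:
                    new_rank[p] += damping * rank[page] * baseline[p]
        rank = new_rank
    return rank


# Hypothetical web: one hub page that links to two equally linked tip sites.
toy_web = {
    "hub": ["mac_tips", "linux_tips"],
    "mac_tips": ["hub"],
    "linux_tips": ["hub"],
}

mac_user = personalized_pagerank(toy_web, {"mac_tips": 1.0})
linux_user = personalized_pagerank(toy_web, {"linux_tips": 1.0})
print("Mac user:  ", {p: round(r, 3) for p, r in mac_user.items()})
print("Linux user:", {p: round(r, 3) for p, r in linux_user.items()})
```

The link structure is identical for both users; only the interest profile differs, and that alone is enough to put a different tip site on top for each of them.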

The private group mode is the one that could be most easily monetized. Most corporations I know would never consent to having their private discussions hosted or physically residing outside their walls; so you sell them a Gaggle Appliance that supports intracompany (or close partners) groups in the same way that the Google Appliance supports searching company intranets.

The public mode is problematic. How do you keep this from being ruined the way that Usenet News has deteriorated over the last decade and a half, by public rudeness? I suppose you can implement some kind of “BozoRank” or something that lets people's opinions and personal preferences about viewing levels create some sort of online reputation system, but this sounds like a lawsuit attractor. I don't know the answer to this problem.

H u r d l e s

Standards and patents.

The W3C has some work in this space, namely Annotea. And Microsoft Research's publication a year and a half ago of several papers on the topic implies not only that Microsoft has an interest in this space, but also that it has probably filed patents covering CAF (the Common Annotation Framework). Google is going to have to figure out how to do all this in a standards-based way and negotiate the minefields of intellectual property.

B o t t o m   L i n e

Google's acquisition of Pyra was brilliant; they now have the chance to create the next killer app for the web, and build upon their brand strength to ensure their survival in the coming years. I'm waiting to hear what Eric, Larry, and Sergey have to say about their plans.