What does the future hold for
search engine technology companies? What about larger companies
like Yahoo which are still heavily identified with search?
Can they survive on their own? How will they prosper? What
technologies loom on the horizon? These are the kinds of
open-ended questions I'm asked fairly often. If I had to
answer them at a holiday cocktail party, I'd probably keep
it short and snappy (taking a cue from our long-forgotten
Cocktail Party Crib Sheet).
But for the slightly more attentive reading public, this
is what I would offer as a tentative answer in short essay
format.
The stock market boom made search engines famous before
there was very much to brag about. That's unfortunate because
it obscures the deep changes Internet search and retrieval
technology has introduced into our daily routines.
Our love of simple definitions and recognizable brand names
sometimes leads analysts to describe the technological status
quo with a sense of finality. A few years back, everyone
"knew" that about ten search engines like Infoseek,
Lycos, Excite, Webcrawler, AltaVista, and Open Text were
"important." Most engines (and directories like
Yahoo) had cute names, so it was a natural tendency to talk
about them almost as if they were people. But there were
not many important differentiators amongst these technologies.
Many of us knew less about them than we thought we did.
What's more, we continued to waste our time talking about
some of them as if they were living, breathing entities,
failing to notice that some of them were all but dead, and
that new technologies, as yet without cute names, were emerging.
Inktomi rose, fell, stabilized. Google displaced Inktomi
as Yahoo's spider engine of choice. The Open Directory and
LookSmart came along to challenge, but not defeat, the Yahoo
directory.
The end user's persistent uncertainty about the reliability
of results from any particular search engine fed into the
popularity of the Internet's first major "metasearch"
engine - Metacrawler, developed by Oren Etzioni, a University
of Washington scientist. Metasearch is still popular. What's
more important than the presence of specific brand name
metasearch engines (other popular ones include Dogpile,
Ixquick, and Mamma), however, is the concept behind metasearch:
polling a variety of search technologies so that one need
not be committed to a single method of ranking results.
There really can be no single best way of classifying or
ranking information culled from a huge database such as
the World Wide Web. In a similar vein, Steve Thomas, CEO
of Wherewithal, a taxonomy software company, has argued
that there is a "fixed taxonomy problem" associated
with Yahoo-style categorized directories. Metasearch is
one way of addressing the problem; another is a rethinking
of the technology and assumptions behind categorized directories.
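
To make the metasearch idea concrete, here is a minimal sketch in
Python of one simple way to pool the rankings of several engines.
It uses reciprocal rank fusion, which is my own choice for
illustration and not a description of how Metacrawler or any other
engine named above actually works. The engine names, the URLs, and
the constant k are all hypothetical.

```python
from collections import defaultdict


def merge_rankings(rankings, k=60):
    """Pool several ranked result lists with reciprocal rank fusion.

    rankings: dict mapping an engine name to its ordered list of URLs.
    k: damping constant; 60 is an assumed default, not a tuned value.
    """
    scores = defaultdict(float)
    for results in rankings.values():
        for position, url in enumerate(results, start=1):
            # A page earns more credit the higher an engine places it.
            scores[url] += 1.0 / (k + position)
    # Highest pooled score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists from three unnamed engines.
sample_rankings = {
    "engine_a": ["a.example", "b.example", "c.example"],
    "engine_b": ["b.example", "a.example", "d.example"],
    "engine_c": ["b.example", "c.example", "a.example"],
}
merged = merge_rankings(sample_rankings)
```

No single engine's ordering wins outright here: a page rises in
the merged list only to the extent that several engines agree on it.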
Lurking somewhere underneath the surface appeal of brand
names and user interfaces, then, is the core functionality
of search and retrieval technology in key applications in
the scientific, business, and consumer worlds. Most of us
are now sophisticated enough to know the difference between
a directory and a "crawler-based" search engine,
and more debate now takes place about what constitutes a
relevant search result.
Google has captured the world's imagination with a results
ranking technique that might be referred to as reputation
analysis. The presence or absence of the user's key phrases
still determines which pages will contend for a high ranking
in the search results. But the ranking within those results
depends on an analysis of the linking structure of the web,
on the assumption that a link to a page from another important
page is a measure of importance or external reputation.
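
The two-step process described above (keywords decide which pages
contend, links decide their order) can be sketched roughly as
follows. This is a simplified, textbook-style link analysis written
for illustration; it is not Google's actual algorithm, and the
damping value, iteration count, page names, and page text are all
assumptions.

```python
def link_reputation(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing reputation along links.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline share of reputation.
        new = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its reputation among the pages it links to.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outbound links spreads its share evenly.
                share = damping * score[page] / n
                for other in pages:
                    new[other] += share
        score = new
    return score


def rank_matches(query, page_text, links):
    """Keep pages whose text contains the query, ordered by reputation."""
    scores = link_reputation(links)
    matches = [p for p, text in page_text.items() if query.lower() in text.lower()]
    return sorted(matches, key=lambda p: (-scores.get(p, 0.0), p))


# Hypothetical four-page web: three pages link to c.example.
sample_links = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
sample_text = {
    "b.example": "Search tips for beginners",
    "c.example": "A guide to search engines",
    "d.example": "Cooking at home",
}
ranked = rank_matches("search", sample_text, sample_links)
```

Both b.example and c.example mention the query, but c.example is
listed first because more pages link to it.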
No one will disagree that Google is awesome. Consumers
loved it for a host of reasons; one of them was a drastic
reduction in search engine "spam." But Google
is far from the last word on the subject. They're pioneering
the idea of reputability measurement. It's such an important
trend that most of their major competitors - Inktomi, FAST
Search, AltaVista, Teoma - are also measuring "off-page
factors." Google's feat is particularly impressive given
that it is measuring reputation in the wild west atmosphere of the
whole Internet - a very public, very large database with
plenty of incentives for spoofing and spamming on the part
of particular web site owners. Google is cool, much in the
same way a Model T Ford was cool. As long as you want it
in black.
Search engines aren't yet very customizable. The Google
I'd like to see would offer all manner of dials and switches
so that I could test out different versions of the algorithm.
If today's "power searcher" is a bit like yesterday's
librarian (intimately familiar with Boolean searches),
tomorrow's will be more like a cross between a mathematician
and a forensic scientist - someone who wants to pick and
choose amongst five different ways of measuring page reputability.
Custom metasearch will make its presence felt in the coming
years. We'll also see more options in terms of our ability
to retrieve multimedia and other varied file types, and
to access the so-called "invisible web."
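
As a rough picture of what "dials and switches" might look like, the
sketch below lets a searcher set their own weights for a few
ranking signals and re-sort the same results accordingly. No search
engine named here exposes such controls as far as I know; the signal
names, the scores, and the weights are entirely hypothetical.

```python
def blended_score(signals, weights):
    """Weighted average of a page's signals, using the searcher's weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # A signal missing for a page counts as zero.
    return sum(w * signals.get(name, 0.0) for name, w in weights.items()) / total


def rerank(pages, weights):
    """Order pages by blended score, highest first; ties alphabetical."""
    return sorted(pages, key=lambda url: (-blended_score(pages[url], weights), url))


# Hypothetical per-page signals, each already scaled to the range 0..1.
sample_pages = {
    "a.example": {"link_reputation": 0.9, "keyword_match": 0.2},
    "b.example": {"link_reputation": 0.3, "keyword_match": 0.8},
}

# Two settings of the same "dials".
favor_links = rerank(sample_pages, {"link_reputation": 3, "keyword_match": 1})
favor_keywords = rerank(sample_pages, {"link_reputation": 1, "keyword_match": 3})
```

Turning the dial toward link reputation puts a.example first;
turning it toward keyword matching puts b.example first.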
There is unwarranted concern today about the economic viability
of search engine companies, perhaps because conglomerates
like ExciteAtHome (to shut down forever on Feb. 28) and
Yahoo (struggling to replace advertising revenues with fee-based
services) have had financial struggles. But even in a time
of severe recession and limited access to capital, there
are a growing number of success stories at various stages
of development. Google is reportedly profitable already,
and much of its revenue derives from a stream that is supposed
to be dead: CPM-based advertising. Bob Thomas, Director
of Marketing for the I-Business Unit of FAST Search, notes
that IBM's public web sites saw significant increases in
page views following the installation of improved site search
functionality. If their recently rejuvenated stock market
valuations are any indication, even struggling companies
like Ask Jeeves, Inktomi, and LookSmart may be turning the
corner.
Corporations are beginning to realize that whether it's
on a public e-commerce web site or behind the corporate
firewall, the Internet is primarily a mechanism for classifying,
searching for, and retrieving information. The search technology
sector is neither a basket case nor a charity case, given
the ROI that is associated with allowing users to access
the information they need.
Search companies will do better if they and their venture
backers can remain fiercely independent against the forces
that want to turn them into something they aren't. In the
last wave, AltaVista fell victim to the pressure to compete
with AOL and Yahoo as a shopping mall and/or media company.
There will be more of these pressures, and they should be
resisted. Yahoo itself might even want to resist some of
them.
Open Text, founded in 1991, exited the consumer search
engine business in 1996 to focus on developing integrated
corporate intranet software, in which search played a key
but not exclusive role. Today, they're a profitable example
of how a low-key approach and a sustained process of discovering
the needs of corporate customers can pay off in the long
run.
I recently spoke with Jason Liebman, President of Naming
Solutions, a division of Applied Semantics. Since its inception,
Applied Semantics has been focused on one aspect of search
engine technology - the ability to discern the meanings
of search queries based on a "concept map" or
proprietary lexicon rather than merely recognizing keywords.
Last March, Applied Semantics co-founder Gil Elbaz expressed
a refreshing degree of surprise that so many of the people
interested in search engines are coming at things from a
marketer's perspective, seeing search technology as interesting
only insofar as it can help sell their products. "I've
had my head so much into meaning-based search technology,"
offered Elbaz, "I hadn't realized how many people were
so focused on the marketing-to-search-engines side of things."
Good for him for being so oblivious. The fact is, search
engines were born to help users find stuff. In the context
of the whole web, search engines work best when they cater
to marketers' interests only indirectly by playing the role
of neutral referee amongst the many sources of information
clamoring for the search engine user's attention. If companies
like Inktomi or Google start spending too much time figuring
out how to make money for their advertisers, they'll forget
what excited consumers about search engines in the first
place. As in any technological field, the right balance
needs to be struck between basic and applied research. Too
little focus on basic research, and you might not wind up
with anything innovative or interesting to sell.
As it turns out, there have been plenty of applications
for Applied Semantics' technology. Their Naming Solutions
division, which offers a variety of products to help domain
name registrars provide name suggestions to their clients,
and thereby increase their sales, is exceeding revenue targets.
98% of domain name searches on registrars' sites "don't
result in a sale," according to Liebman. Applied Semantics
wants to help registrars improve on that figure, and has
already partnered with most of the major players, including
Register.com, Verisign, and Yahoo.
Maybe the lesson here echoes the title of a popular career-planning
guide: do what you love, and the money will follow.
Large diversified media companies such as AOL, MSN, and
Terra Lycos, admittedly, do not necessarily need cutting-edge
search technology as long as they can keep their databases
relatively free of spam. There is even a theory that erratic,
unreliable search engine and directory technology helps
companies like these make more money by forcing more advertisers
to pay for placement if they want to be seen. Thankfully,
companies like Google and Applied Semantics are showing
that you can not only make the world a better place by building
a better mousetrap, but you can also thrive financially
without having to dumb down your pursuit of new horizons
in search.