Wednesday, June 23, 2004

The problem with search and how to fix it
To answer this question I think we first need to take a step back and understand what people use search engines for. Most people will tell you something along the lines of "I use search to look for information on X". At one time this seemed pretty straightforward. You typed in a string and the search engine would look for occurrences of that string in indexed websites. Then came the web spammers, who would sneak huge blocks of unrelated keywords into their webpages so that, say, a porn site would come up when you searched for "equestrian training".
In an ongoing tug-of-war, the search engines would implement new features to neutralize the web spammers' tactics, and the web spammers would then find another way to cheat their way up the results pages. Thus was born the concept of "relevance". Google pioneered this space with its PageRank system, a series of algorithms that attempts to predict a website's relevance to a search string by how many other pages link to it, thus taking legitimacy into account as a factor of its relevance.
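To make the link-counting idea a bit more concrete, here is a minimal sketch of the simplified, textbook form of PageRank. It is not Google's production system. The page names, the damping value of 0.85, and the fixed iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "honda.com".
example_links = {
    "dealer-a": ["honda.com"],
    "dealer-b": ["honda.com"],
    "fan-site": ["honda.com"],
    "honda.com": ["dealer-a"],
}
example_ranks = pagerank(example_links)
```

In this toy web, "honda.com" ends up with the highest score because the most pages point at it, which is exactly why a keyword-stuffed page with no inbound links stops floating to the top.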
This is a great way to come up with relevant search results when everyone defines something the same way. For instance, if you're looking up information on Honda car dealerships in California, all you need to type into Google is "Honda California" and the first result is Honda's own website, followed by a slew of local dealerships. The problem arises if you happen to live in Honda, California and you are looking for a map of your town. Even a refined search for "Honda California Map" merely results in a listing of maps with directions to various dealerships around California.
The problem here has to do with how you define what you are looking for. With most products or concepts that have a common cultural definition, today's search engines work great. But when you start delving into searches involving something more nebulous, where the terminology may vary or may be eclipsed by some greater social meaning, search engines are actually pretty bad at serving up what you are looking for.
So, short of making a brainwave interface that connects to a computer using some advanced form of AI to determine what you are looking for, how will we ever have truly relevant searches? Simple... Build a personal context database for each search user that takes into account linguistic idiosyncrasies, historical search successes, and personal preferences, and use it to predict a webpage's relevance. The database could be built based on a user's past searches, what they tended to click on, and, more importantly, how they rated the pages' relevance in relation to what they were actually looking for. Over time this database would create a user's search profile, which could then be compared to those of other users with similar search profiles.
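Here is a rough sketch of what such a personal context database might look like for a single user. Everything in it is hypothetical: the `SearchProfile` class, the 0.0 to 1.0 rating scale, the `weight` parameter, and the example URLs are assumptions made for illustration, not how any real search engine stores this.

```python
from collections import defaultdict


def normalize(query):
    """Lowercase the query and collapse whitespace so repeat searches match."""
    return " ".join(query.lower().split())


class SearchProfile:
    """Hypothetical per-user store of how relevant past results turned out to be."""

    def __init__(self):
        # ratings[query][url] -> list of relevance ratings from 0.0 to 1.0
        self.ratings = defaultdict(lambda: defaultdict(list))

    def record(self, query, url, rating):
        """Remember how the user rated `url` for `query`."""
        self.ratings[normalize(query)][url].append(rating)

    def personal_score(self, query, url):
        """Average past rating for this query and URL, or None if never rated."""
        scores = self.ratings.get(normalize(query), {}).get(url)
        return sum(scores) / len(scores) if scores else None


def rerank(profile, query, results, weight=0.5):
    """Blend the engine's base score with the user's own history.

    `results` is a list of (url, base_score) pairs with scores in 0.0 to 1.0.
    """
    def blended(item):
        url, base_score = item
        personal = profile.personal_score(query, url)
        if personal is None:
            return base_score  # no history for this page: trust the engine
        return (1.0 - weight) * base_score + weight * personal

    return sorted(results, key=blended, reverse=True)


# A resident of Honda, California rates two results for the same search.
profile = SearchProfile()
profile.record("Honda California", "town-of-honda.example/map", 1.0)
profile.record("Honda California", "honda-dealers.example", 0.0)

engine_results = [
    ("honda-dealers.example", 0.9),
    ("town-of-honda.example/map", 0.4),
]
personalized = rerank(profile, "honda california", engine_results)
```

With this history in place, the town map moves ahead of the dealership listing for this one user, while someone with an empty profile still sees the engine's original order.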
Applying this to our earlier example, a person who lives in Honda, California would be likely to rate the pages relating to their city as highly relevant, so the next time the search is performed the results are more targeted. The next step would be to use this information to predict future search results. At a micro level, by putting Bob in a mathematical "neighborhood" with Joe and Mary, because they all found pages about Honda, California relevant, the search engine can now reference Joe's and Mary's preferences as they relate to Honda, California for Bob's searches, thus increasing the likelihood of Bob finding something he wants the next time he searches. At the macro level this would be happening on billions of searches conducted by millions of users, so it's statistically sound as well.
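The "neighborhood" idea can be sketched as a small user-based collaborative filter. This is one common way to do it, assuming cosine similarity between rating profiles and a similarity-weighted average of the closest neighbors' ratings. The user names come from the example above, but the ratings, the page keys, and the choice of `k=2` are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two {item: rating} profiles."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(user, item, profiles, k=2):
    """Guess how `user` would rate `item` from the k most similar users."""
    target = profiles[user]
    neighbors = []
    for other, ratings in profiles.items():
        if other == user or item not in ratings:
            continue  # only users who actually rated the item can help
        similarity = cosine_similarity(target, ratings)
        if similarity > 0.0:
            neighbors.append((similarity, ratings[item]))

    # Keep the k closest neighbors and average their ratings by similarity.
    neighbors.sort(reverse=True)
    closest = neighbors[:k]
    if not closest:
        return None
    total = sum(similarity for similarity, _ in closest)
    return sum(similarity * rating for similarity, rating in closest) / total


# Hypothetical ratings, keyed by (query, page), on a 0.0 to 1.0 scale.
TOWN_SITE = ("honda california", "town-site")
DEALER = ("honda california", "dealer-list")
TOWN_MAP = ("honda california map", "town-map")

profiles = {
    "Bob": {TOWN_SITE: 1.0, DEALER: 0.1},
    "Joe": {TOWN_SITE: 1.0, DEALER: 0.0, TOWN_MAP: 0.9},
    "Mary": {TOWN_SITE: 0.9, DEALER: 0.2, TOWN_MAP: 1.0},
    "Dan": {TOWN_SITE: 0.0, DEALER: 1.0, TOWN_MAP: 0.1},
}
bob_map_guess = predict_rating("Bob", TOWN_MAP, profiles)
```

Bob has never rated the town map, but Joe and Mary rated the same pages the way he did, so their high ratings for the map dominate the prediction. Dan, who was shopping for a car, barely counts.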
I can almost picture the day when I'll be able to type "crack pipe" into a Google search box and have it pull up a list of plumbers instead of drug paraphernalia sites ;-P
Posted by Marc @ 4:56:00 PM