Ethanz on Google Coop
Ethan Zuckerman discovers that Google Coop’s roll-your-own search engine has high precision but poor recall, i.e., it returns few irrelevant results but misses stuff it should find. He explains:
A little poking solves the mystery pretty quickly. Google Coop Search works by searching against the main Google search catalog, retrieving 1000 results and filtering them against the sites you’ve included in your catalog. This makes sense, computationally – these searches are fast, almost as fast as normal Google searches. Rather than conducting 3000 “site:” searches and collating and reranking the results, Google is sacrificing recall, getting 1000 results and discarding those not in your set of chosen sites, which requires one call to the index and a really big regular expression match.
…
…In other words, the little engine I’ve built is useful only if the sites I’ve chosen are relatively high ranking and authoritative sites on the topics I’m searching on.
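To make the trade-off Ethan describes concrete, here is a rough Python sketch of the "grab the top results, then filter" approach. It is an illustration only, not Google's code: the function names, the `.example` hostnames, and the synthetic ranking are all made up for the demo, and it checks hostnames against a set rather than using the big regular expression Ethan mentions. Only the cutoff of 1000 comes from his description.

```python
from urllib.parse import urlparse

TOP_N = 1000  # cutoff taken from Ethan's description; everything else is hypothetical


def host(url):
    """Return the lowercased hostname of a URL."""
    return urlparse(url).netloc.lower()


def coop_style_search(ranked_urls, chosen_sites, top_n=TOP_N):
    """Keep only those of the first top_n general results that live on a chosen site."""
    chosen = {site.lower() for site in chosen_sites}
    return [url for url in ranked_urls[:top_n] if host(url) in chosen]


def precision_recall(returned, relevant):
    """Precision = hits / returned, recall = hits / relevant (0.0 when a set is empty)."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Synthetic general ranking: 3000 results, mostly from sites we did not choose.
ranked = [f"http://popular{i}.example/page" for i in range(3000)]
ranked[4] = "http://chosen-a.example/post"     # a chosen site that ranks well (5th)
ranked[2499] = "http://chosen-b.example/post"  # a chosen site buried at rank 2500

chosen = ["chosen-a.example", "chosen-b.example"]
relevant = [ranked[4], ranked[2499]]  # what a per-site search would have found

results = coop_style_search(ranked, chosen)
precision, recall = precision_recall(results, relevant)
print(results)            # only the page from chosen-a.example survives the cutoff
print(precision, recall)  # 1.0 0.5
```

Everything returned comes from a chosen site, so precision stays high, but the page sitting at rank 2500 never makes it into the filtered set, which is the poor recall Ethan ran into. Raising `top_n` to 3000 in this toy ranking would recover it, at the cost of pulling and filtering three times as many results.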
Categories: Uncategorized dw