Up until a few days ago, I thought that there were a lot of alternatives to Google.
I’ve seen Bing commercials, heard about the no-data-collecting policies of DuckDuckGo, and I remember the days when I could ask “Jeeves” a question and P.G. Wodehouse’s famous butler would scour the internet for the answer.
But as I talked to Princeton computer science professor David Blei, I began to realize that these weren’t really ‘alternatives.’
All of these search engines (Google, Yahoo, Ask.com) share the same general premise: you type in terms for what you’re looking for, and the engines use more or less refined techniques to find information matching those terms.
When you search through an archive, it’s the same kind of deal: you type your search term into the library catalog or the database search, and related documents get pulled.
But what if you don’t know what you’re looking for?
What if you want to know the general trend or gist of an archive that is thousands or millions of documents strong?
That’s the kind of problem Professor Blei is researching. He’s putting together algorithms that can comb through archives and databases and develop their own ‘topic models,’ word-cloud-type structures that can be broadened or narrowed.
It’s a kind of independent machine intelligence that might revolutionize research… or just make the NSA’s job a little bit easier.
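For the curious, here is a rough idea of what "developing a topic model" can look like in code. This is only a toy sketch, not the method discussed in the interview: it assumes latent Dirichlet allocation (one widely used topic model) fitted with a simple Gibbs sampler, and the function names, the settings (`n_topics`, `alpha`, `beta`) and the six tiny "documents" are all made up for illustration.

```python
import random
from collections import Counter


def fit_topics(docs, n_topics=2, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # total tokens in topic k
    assign = []                                        # topic of every token

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    # Repeatedly re-draw each token's topic given all the other assignments.
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # A topic is likely if this document uses it a lot
                # and if it already "owns" this word elsewhere.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + beta * vocab_size)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """The n most frequent words of each topic: the 'word cloud' view."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]


# A made-up miniature archive; real archives hold thousands of documents.
docs = [
    "court judge law trial".split(),
    "judge trial verdict law".split(),
    "law court verdict judge".split(),
    "game team score coach".split(),
    "team coach season game".split(),
    "score season team game".split(),
]

doc_topic, topic_word = fit_topics(docs)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

Nobody tells the program what the documents are about. It only sees which words tend to show up together, and the "topics" are whatever groupings fall out of that. Raising or lowering `n_topics` is one crude way to narrow or broaden them.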
Listen to my conversation with David Blei, above, to learn more.