Joho the Blog » [MS] Microsoft Research

[MS] Microsoft Research

[NOTE: I have to catch the plane and am posting this without rereading it for typos or thinkos. Sorry.]

Susan Dumais is from Microsoft Research, a group of 700 people working in 55 scientific areas across 5 labs, from Redmond to Beijing. Susan says that each group does some search research. And she tells us that her presentation isn’t under the 20-ton Microsoft NDA.

“What’s missing with search?” she asks. Right now it consists of a query box and a results list. You can do processing in between, but you can also do user modeling, domain modeling, and the context of information use. What is the user’s intent? What are the relationships among people and information in the world? And what is the user trying to accomplish? (“Despite what you informavores believe, most people don’t search for the sake of searching. They search to get answers.” — rough paraphrase.)

She and others have been working on “Stuff I’ve Seen” (SIS). Right now, each domain has its own way of searching — we search blogs differently than we search email. SIS wants to give “unified access to heterogeneous, distributed content” (mail, files, rss, etc.). It has to be fast and flexible. And you should be able to search from whatever context you’re in, preferably with “implicit queries”: “Queries are generated in context and results are shown in context.” “We’ve learned that people, time and metadata are important.”

She talks about “memory landmarks.” Cognitive science has shown that we organize memory around landmarks, e.g., not “It happened on Oct. 5” but “It happened when Mt. St. Helens blew up.” They want to use this to facilitate search. In one prototype, there’s a set of memory landmarks, both general (world and calendar info) and personal (appointments and photos) they show to the right of the results list to provide cues to help you home in on the information you’re looking for. “I think we can get computationally at some of these things that make information memorable.”
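The idea can be sketched in a few lines: annotate each dated result with the landmark event nearest it in time. All the data here is made up for illustration; the prototype's actual landmark sources (calendar, photos, world events) and matching logic weren't described in detail.

```python
from datetime import date

# Hypothetical landmarks: general (world events) and personal (holidays, etc.)
landmarks = [
    (date(1980, 5, 18), "Mt. St. Helens erupts"),
    (date(2004, 7, 4), "Independence Day"),
]

# Hypothetical search results, each with a last-seen date.
results = [
    {"title": "trip-notes.doc", "date": date(1980, 5, 20)},
    {"title": "budget.xls", "date": date(2004, 7, 1)},
]

def nearest_landmark(result_date, landmarks):
    """Return the landmark closest in time to the result's date."""
    return min(landmarks, key=lambda lm: abs((lm[0] - result_date).days))

# Annotate each result with a memory cue to show beside the results list.
for r in results:
    lm_date, lm_label = nearest_landmark(r["date"], landmarks)
    r["landmark"] = lm_label
```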

With personalized search, users don’t all get the same results, she says. How can you personalize search without requiring people to maintain a profile, she asks. They can compute similarity based on a variety of parameters, combine the personalized and standard results, and redisplay. In the prototype, as she moves a slider toward the “personalized” side, the results change, so that a query on “bush” starts to show more results about Vannevar Bush. (The personalization is based on looking at the user’s email, etc., and doing some form of ) It is not changing the query; it is re-ordering the results. [It’s a cool demo, but anything with a slider just has to be cool.]
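The slider behavior amounts to a weighted re-ranking. Here's a minimal sketch: the scores and titles are invented, and the real similarity computation (over the user's email, etc.) is stood in for by a precomputed `personal` score — only the mixing step is shown.

```python
def blend(results, alpha):
    """Re-order results by mixing the engine's score with a personal score.

    alpha=0.0 -> pure standard ranking; alpha=1.0 -> fully personalized.
    The query itself is untouched; only the ordering changes.
    """
    return sorted(
        results,
        key=lambda r: (1 - alpha) * r["standard"] + alpha * r["personal"],
        reverse=True,
    )

# Hypothetical scores for a query on "bush": a profile mined from this
# user's mail would rate Vannevar Bush as more personally relevant.
results = [
    {"title": "George W. Bush", "standard": 0.9, "personal": 0.1},
    {"title": "Vannevar Bush", "standard": 0.4, "personal": 0.95},
]
```

Dragging the slider just changes `alpha` and redisplays the same result set in a new order.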

She talks about novelty analysis, using the “Difference Engine” they’ve built. E.g., how do east and west coast newspapers cover Mt. St. Helens, or how do blogs cover the election vs. how the media do. They have a prototype called News Junkie that personalizes news using information novelty: What’s interesting and different? Given a lead article, how different are the new news articles coming out?
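A crude way to see the novelty idea: score an incoming article by how many of its words the lead article hasn't already covered. This word-overlap measure is my stand-in; the talk didn't say what measure News Junkie actually uses.

```python
def novelty(lead, article):
    """Fraction of the article's distinct words that don't appear in the lead.

    0.0 means the article repeats the lead; 1.0 means it's all new material.
    """
    lead_words = set(lead.lower().split())
    art_words = set(article.lower().split())
    if not art_words:
        return 0.0
    return len(art_words - lead_words) / len(art_words)
```

Articles that merely restate the lead score near zero and can be filtered out; high scorers are the "interesting and different" ones worth surfacing.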


Eric Brill heads the Text Mining, Search and Navigation Group of Microsoft Research. He talks about the “paradigm shift” from thinking that documents are the chief object of search to thinking that information is. That is, a document may contain the information you’re after without the document itself being the thing that matters. Or, instead of getting a list of documents in response to a query trying to decide whether the invasion of Iraq was justified, suppose you were to get a page that displayed the relevant information.

They have a question-answering system called “AskMSR” that is 70-75% accurate. So, if you ask “Who did Britney Spears marry in Las Vegas?”, you get a list of answers with probabilities; click on the suggested answer and you get the documents from which it’s derived. It gets right “Where was Steve Ballmer born” but wrong “Who ran against Bush in 1988,” to which it answers “Bush,” with “Dukakis” as the third suggestion. But, as Eric says, if you have a grain of intelligence, you’ll actually figure out what the right answer is.

It works by mining a text base looking for strings that look like potential answers. But then they run into natural language issues. But information redundancy helps. E.g., the answer to the question “Who killed Lincoln?” shows up tens of thousands of times on the Web, so they don’t have to disambiguate “Booth altered history with a single shot at Lincoln”; they can pick their answer. Or, “How many times did Bjorn Borg win Wimbledon?” You can find lots of strings on the Net like “Bjorn Borg blah blah blah wimbledon blah blah blah 5 blah blah,” that, in their redundancy, make the answer (5) more probable. [Cool way to harness the wisdom of the web, at a statistical level.]
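The statistical trick reduces to voting: pull candidate answer strings out of many snippets and let frequency pick the winner. The snippets below are invented, and a digit-matching regex stands in for AskMSR's real answer-extraction patterns, which are surely more elaborate.

```python
from collections import Counter
import re

def redundancy_answer(snippets, pattern):
    """Pick the most frequent match of `pattern` across web snippets.

    Redundancy stands in for understanding: the right answer simply
    shows up more often than any misparse does.
    """
    counts = Counter()
    for s in snippets:
        counts.update(re.findall(pattern, s))
    return counts.most_common(1)[0] if counts else None

# Hypothetical snippets for "How many times did Bjorn Borg win Wimbledon?"
snippets = [
    "Bjorn Borg won Wimbledon 5 times in a row",
    "Borg took 5 straight Wimbledon titles",
    "Bjorn Borg: Wimbledon champion 5 years running",
    "some page wrongly says Borg won 3 Wimbledons",
]
answer, votes = redundancy_answer(snippets, r"\b\d+\b")
```

One wrong page can't outvote the redundant right answer — which also explains the "Bush ran against Bush" failure above: "Bush" co-occurs with the question terms even more often than "Dukakis" does.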

If you have a source like Encarta that is reliable but not redundant (i.e., it only has one entry that answers the question “When did the Titanic sink?”), you can combine it with the words found (statistically) on the Web around the Titanic to do precise queries against Encarta.

Finally, he talks about noisy channel information finding. For example, in spelling correction the system tries to figure out what the original (intended) spelling was based on the misspellings introduced by a “noisy channel.” We’re moving from “does the page contain the query terms” to “does the page satisfy the information need.”
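The noisy-channel framing can be sketched concretely for spelling: pick the dictionary word maximizing P(word) × P(typo | word). The tiny lexicon and the crude channel model (each edit makes the observation four times less likely) are my assumptions; a real system would use learned error probabilities.

```python
def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,            # deletion
                dp[j - 1] + 1,        # insertion
                prev + (ca != cb),    # substitution (free if chars match)
            )
    return dp[-1]

def correct(typo, lexicon):
    """Noisy-channel correction: argmax over words of P(word) * P(typo | word).

    P(typo | word) is crudely modeled as 0.25 ** edit_distance, so each
    extra edit makes that source word four times less likely.
    """
    return max(
        lexicon,
        key=lambda w: lexicon[w] * (0.25 ** edit_distance(typo, w)),
    )

# Hypothetical unigram probabilities standing in for a language model.
lexicon = {"the": 0.05, "them": 0.002, "then": 0.003}
```

The same decomposition — a prior over what users mean times a channel model of how it got garbled — is what lets the framing generalize from spelling to "does the page satisfy the information need."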


Lili Cheng heads the Social Computing Group. Her group created “the personal map,” an app that models your social relationships as exhibited in your inbox/outbox, clustering based on who you email together. Her widget also has a slider that moves people into groups based on their groups. [Also cool. I tell ya, that slider thing never fails.] Then they took all of Microsoft’s public email lists and did the same sort of charts, a sort of “Six Degrees of Bill Gates.”
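A rough sketch of the clustering idea: treat people who are co-addressed on the same messages as linked, and take connected components as groups. The mailbox data is made up, and raising the co-occurrence threshold is my guess at what the slider effectively does.

```python
from itertools import combinations
from collections import defaultdict

def co_recipient_groups(messages, min_together=2):
    """Group contacts co-addressed on at least `min_together` messages.

    Connected components of the co-occurrence graph stand in for the
    personal map's clusters.
    """
    together = defaultdict(int)
    for recipients in messages:
        for a, b in combinations(sorted(set(recipients)), 2):
            together[(a, b)] += 1

    # Union-find over contacts, merging pairs that clear the threshold.
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in together.items():
        if n >= min_together:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for recipients in messages:
        for person in recipients:
            groups[find(person)].add(person)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical inbox: ann+bob are co-addressed twice; carol+dave only once,
# so they stay in separate groups at the default threshold.
mail = [["ann", "bob"], ["ann", "bob"], ["carol"], ["carol", "dave"]]
```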

She shows us Wallop: “Socializing and sharing media in the context of your social network.” It’s a slick social network app with a bit of an emphasis on photo sharing. So far, only 424 have been invited into Wallop and 219 have added content. Wallop pays attention to implicit as well as explicit data, inferring your social network. People want to control it, but not have to control it all the time.

Wallop aims to integrate with the rest of your desktop, especially email.

They’re currently looking at how to scale it while maintaining privacy, and how to enable the system to adapt to users’ behavior patterns rather than vice versa.


8 Responses to “[MS] Microsoft Research”

  1. More From The Microsoft Search Champs

    Microsoft’s Search Champs have been hearing from Microsoft Research today about various projects, none of which is under NDA. And that’s not surprising, given that nothing I’ve read blogged so far hasn’t already been mentioned in a variety of stories…

  2. Did Eric Brill give you any indication of when askMSR will be available? I did a post on it a while back and thought the concept was pretty compelling.

  3. Gary, nope, no indication of when. Or even if.

  4. Maybe I missed it, but I did not see any mention of a search engine that can learn from how a user searches and the text arrangement of the search criteria. Is there any AI (Artificial Intelligence) integration with search planned by Microsoft?

  5. Nardo, the MSR folks did talk about ways to gather implicit metadata in order to guide searches and ways to mine the Web to guide answers. That’s AI-ish. Of course, MSR is a research arm, so what they do may or may not show up in products eventually.

  6. How do you search?

    David Weinberger gives a nice description of the search tools that Microsoft is working on. It clearly indicates that Microsoft has given a great deal of thought to the problems of search. However, it is also clear that they are…

  8. Could you tell me what type of research is performed at the Microsoft building in Reston, VA?

    I am researching the design of an indoor classroom amphitheater. I was given information that the Microsoft Center in Reston, VA could help me with my research.

    Thanks ! ! !
