Rank Merging for Type X Queries: A First-Pass Solution - Prof. Nigel Ward, Assignments of Computer Science

This in-class exercise asks students to find a way to order results for type X queries, which are important to an information exploitation company. Such queries are generally handled better by mamesearch.com, but google still returns some useful documents that mamesearch does not. The exercise supplies data for a typical type X query and asks for a first-pass solution in 20 minutes, a rough analysis of how likely the solution is to work for type X queries in general, and possible improvements if more time were available.

Uploaded on 08/19/2009

Search Engine Technologies name ______________
CS 5319, Spring 2007
Nigel Ward
Inclass Exercise 3: Rank Merging
Suppose you are working for an information exploitation company, and one day your boss comes to
you excited. “Look at this,” she says. “We’ve discovered a new algorithm that is good at classifying
user queries: it can reliably separate them into type X and type not-X. This is good news, because
type X queries are important to our business, and we also know that queries of type X are
generally handled better by mamesearch.com than by any other search engine!”
“Great,” you say. “Then we can just route all X-type queries to mamesearch.com and make our
analysts happy. It seems too good to be true.”
“It is,” she replies. “On most type X queries, google still returns some useful documents not seen in
mamesearch, and ranks some good ones higher. We think that google is doing a better job of finding
value in long documents. So here’s your next assignment: find a way to order the results for type X
queries that uses all the information. Here’s some data to get you started. Give me a first-pass
solution in 20 minutes. Oh, and by the way, whatever solution you come up with, give me a rough
analysis of how likely it is to work for general type X queries, and tell me what you could do to
improve the solution if we had more time.”
She leaves, dropping the following data on your desk, and saying over her shoulder “sometimes
people say that utility is proportional to the reciprocal of rank, for some queries on some engines, in
case that helps.”
query: “snowboarding vacations near West Texas” (a typical type X query)
rank   mamesearch   google    desired
       results      results   results
 1.    A            V         A
 2.    B            Y         B
 3.    C            D         V
 4.    D            A         D
 5.    E            F         Y
 6.    F            G         F
 7.    G            B         G
 8.    H            Q         E
 9.    I            N         Q
10.    J            Z         C

document lengths (kilobytes)
A   1
B   2
C   9
D   6
E   1
F   1
G   2
H   2
I   1
J   1
N   4
Q   3
V   9
Y  26
Z   1
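
One way to act on the boss's reciprocal-rank hint is sketched below in Python. It is only an illustrative first pass, not the intended answer to the exercise: each engine gives a document a utility of 1/rank, the two utilities are added, and documents are sorted by the total. The function names and the `secondary_weight` parameter (set to 0.5 here for no better reason than that mamesearch is said to be generally better on type X queries) are assumptions introduced for this sketch. The document-length data is not used at all.

```python
# Rankings copied from the exercise data for the sample query.
MAMESEARCH = list("ABCDEFGHIJ")
GOOGLE = list("VYDAFGBQNZ")
DESIRED = list("ABVDYFGEQC")


def reciprocal_rank_scores(ranking):
    """Map each document to a utility of 1/rank, with ranks starting at 1."""
    return {doc: 1.0 / rank for rank, doc in enumerate(ranking, start=1)}


def merge(primary, secondary, secondary_weight=0.5):
    """Order documents by a weighted sum of reciprocal-rank utilities.

    secondary_weight is a hypothetical tuning knob, not part of the exercise;
    a document missing from one engine's list gets utility 0 from that engine.
    """
    p = reciprocal_rank_scores(primary)
    s = reciprocal_rank_scores(secondary)
    docs = set(p) | set(s)
    score = {d: p.get(d, 0.0) + secondary_weight * s.get(d, 0.0) for d in docs}
    # Highest score first; ties broken alphabetically for a deterministic order.
    return sorted(docs, key=lambda d: (-score[d], d))


def overlap_at_k(merged, desired, k=10):
    """Count documents shared by the merged top-k and the desired top-k."""
    return len(set(merged[:k]) & set(desired[:k]))


merged = merge(MAMESEARCH, GOOGLE)
print("merged top 10:", merged[:10])
print("overlap with desired top 10:", overlap_at_k(merged, DESIRED))
```

A sketch like this is fitted to a single query, so it says little about type X queries in general. With more time, one could tune the weight on several labeled queries, or try using document length as an additional signal, since the boss suspects google does better on long documents.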
