Friday, March 21, 2014

Week 9: Reading Notes

IIR Chapter 19
19.2 Web Characteristics
1. Web graph
We can view the static Web, consisting of static HTML pages together with the hyperlinks between them, as a directed graph in which each web page is a node and each hyperlink is a directed edge.
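To make the directed-graph view concrete, here is a minimal sketch in Python. The page names, the `links` list, and the helper functions `build_web_graph` and `in_degrees` are all made up for illustration; they do not come from the book.

```python
# Hypothetical pages and hyperlinks, invented for this sketch.
# Each pair (src, dst) means "page src contains a hyperlink to page dst".
links = [
    ("a.html", "b.html"),
    ("a.html", "c.html"),
    ("b.html", "c.html"),
    ("c.html", "a.html"),
]


def build_web_graph(edges):
    """Return an adjacency mapping: page -> set of pages it links to."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        # Make sure pages that are only link targets still appear as nodes.
        graph.setdefault(dst, set())
    return graph


def in_degrees(graph):
    """Count the in-links (edges pointing at) each page."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for dst in targets:
            counts[dst] += 1
    return counts


graph = build_web_graph(links)
for page in sorted(graph):
    # Out-degree is the number of out-links; in-degree the number of in-links.
    print(page, "out:", len(graph[page]), "in:", in_degrees(graph)[page])
```

Storing out-links in a set means duplicate hyperlinks between the same two pages count as a single edge, which is one possible modeling choice among several.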
2. Spam
19.3 Advertising as the economic model
19.4 The search user experience
1. User query needs
19.5 Index size and estimation

Chapter 21
21.1 The Web as a graph
1. Anchor text and the web graph
21.2 PageRank
1. Markov chains
2. The PageRank computation
3. Topic-specific PageRank
21.3 Hubs and Authorities
1. Choosing the subset of the Web

Week 9: Muddiest Point

I have one question from this lecture. When the document surrogate was discussed, the length and complexity of the URL were mentioned. But as we know, many URLs are generated automatically from database data, and users generally just copy and paste them; few people type a whole URL or edit its query string. In that case, how would the length of a URL influence the document surrogate?