CSCI 2531: Internet and Web Algorithms
Instructor: Eli Upfal
When: TuTh 1-2:20
Where: CIT 506
Description
Graduate course/seminar focuses on the mathematical foundations of
algorithms for handling large amounts of data over networks. We'll read and
discuss recent papers in information retrieval, search engines, link
analysis, probabilistic modeling of the web and social networks, and
more.
Presented Papers
Last Update: 04/22
If you are presenting a paper not listed here, please send an email with
the title of the paper and a link to it to Matteo, matteo@cs.brown.edu
Most titles and links of proposed papers can be found in the Collection
below, but check whether the link is working before sending it. Thanks.
Searching, Ranking, and Crawling
- [1/27] Kleinberg, J. M. (1999) Authoritative sources in a hyperlinked
environment J. ACM 46, 5 (Sep. 1999), 604-632
- [1/29] Page, L. and Brin, S. (1998) The
Anatomy of a Large-Scale Hypertextual Web Search Engine
Proceedings of the 7th International WWW conference, 1998
- [1/29] Page, L. and Brin, S. and Motwani, R. and Winograd, T. (1999) The PageRank Citation Ranking: Bringing
Order to the Web
- [2/3] Dwork, C. and Kumar, R. and Naor, M. and Sivakumar, D. Rank Aggregation Methods for the Web
(or An alternative version) 10th
International World Wide Web Conference, May 2001
- [2/5] Borodin, A. and Rosenthal, J. S. and Roberts, G. O. and
Tsaparas, P. Link Analysis Ranking:
Algorithms, Theory and Experiments , ACM Transactions on Internet
Technologies (TOIT), Vol 5, No 1, February 2005
- [2/10] Azar, Y. and Fiat, A. and Karlin, A. and McSherry, F. and Saia,
J. Spectral Analysis of
Data. 33rd ACM Symposium on Theory of Computing, 2001
- [2/12-17] Broder, A. Z. Identifying and
Filtering Near-Duplicate Documents CPM 2000
- [2/17] Gong, C. and Huang, Y. and Cheng, X. and Bai, S. Detecting Near-Duplicates in
Large-Scale Short Text Databases PAKDD 2008
- [2/19-24] Dasgupta, A. and Ghosh, A. and Kumar, R. and Olston, C. and
Pandey, S. and Tomkins, A. The
Discoverability of the Web WWW 2007
- [2/24] Bar-Yossef, Z. and Broder, A. Z. and Kumar, R. and Tomkins, A. Sic Transit Gloria Telae: Towards an
understanding of the web's decay WWW 2004
The Structure of the Web Graph
- [2/26] Mitzenmacher, M. A
Brief History of Generative Models for Power Law and Lognormal
Distributions Internet Mathematics Vol. 1, No. 2: 226-251
- [3/3] Broder, A. and Kumar, R. and Maghoul, F. and Raghavan, P. and
Rajagoplan, S. and Stata, R. and Tomskin, A. and Wiener, J. Graph Structure in the
Web Proceedings of the 9th International WWW conference,
2000
- [3/5] Kleinberg, J. and Kumar, R. and Raghavan, P. and Rajagopalan, S. and
Tomkins, A. S. The WEB as a graph:
measurements, models, and methods Intntl. Conf. on
Combinatorics and Computing, 1999
- [3/5] Kumar, R. and Raghavan, P. and Rajagopalan, S. and Sivakumar, D. and
Tomkins, A. and Upfal, E. Stochastic
models for the web graph 41th IEEE Symp. on Foundations of
Computer Science, 2000, pp. 57-65
- [3/10-12] Fetterly, D. and Manasse, M. and Najork, M. and Wiener, J. L. A large-scale study of the evolution
of web pages WWW 2003
- [3/10-12] Ntoulas, A. and Cho, J. and Olston, C. What's new on the web? The evolution
of the web from a search engine perspective WWW 2004
Small World
- [3/17] Kleinberg, J. The small-world phenomenon: an algorithmic
perspective In Proceedings of the 32nd ACM Symposium on Theory of
Computing, 2000
- [3/19] Liben-Nowell, D. and Novak, J. and Kumar, R. and Raghavan, P. and
Tomkins, A. Geographic routing in social networks
In Proceedings of the National Academy of Science, volume 102, pages
11623-11628, 2005
- [3/31] Kempe, D. and Kleinberg, J. and Tardos, E. Maximizing the Spread of Influence through
a Social Network Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge
Discovery and Data Mining, 2003
- [4/2] Talk by Matt Lease
- [4/7] See [3/31]
- [4/7] Sandberg, O. Neighbor Selection and
Hitting Probability in Small-world Graphs Annals of Applied
Probability, 2008
- [4/7] Mogren, 0. and Sandberg, O. and Verendel, V. and Dubhashi, D. Adaptive
dynamics of realistic small-world networks ArXiv
- [4/9] No class
- [4/14] See [4/7]
- [4/16] Zhang, H. and Goel, A. and Govindan, R. Using
the small-world model to improve Freenet performance In Proc.
IEEE Infocom, 2002, and Clarke, I. and Sandberg, O. and Wiley, B. and Hong, T. W. Freenet: A distributed anonymous information
storage and retrieval system
Deep Web
Collection of Papers
(Some links may be broken. Working on that)
Searching, Ranking, and Crawling
- Kleinberg, J. M. (1999) Authoritative sources in a hyperlinked
environment J. ACM 46, 5 (Sep. 1999), 604-632
- Page, L. and Brin, S. (1998) The
Anatomy of a Large-Scale Hypertextual Web Search Engine
Proceedings of the 7th International WWW conference, 1998
- Page, L. and Brin, S. and Motwani, R. and Winograd, T. (1999) The PageRank Citation Ranking: Bringing
Order to the Web
- Dwork, C. and Kumar, R. and Naor, M. and Sivakumar, D. Rank Aggregation Methods for the Web
(or An alternative version) 10th
International World Wide Web Conference, May 2001
- Borodin, A. and Rosenthal, J. S. and Roberts, G. O. and Tsaparas, P.
Link Analysis Ranking: Algorithms, Theory
and Experiments , ACM Transactions on Internet Technologies
(TOIT), Vol 5, No 1, February 2005
- Azar, Y. and Fiat, A. and Karlin, A. and McSherry, F. and Saia,
J. Spectral Analysis of
Data. 33rd ACM Symposium on Theory of Computing, 2001
- Achlioptas, A. F. and Karlin, A. and McSherry, F. Web
Search via Hub Synthesis 42nd IEEE Symposium on Foundations of
Computer Science, 2001, p.611-618
- Kleinberg, J. and Tardos, E. Approximation
Algorithms for Classification Problems with Pairwise Relationships: Metric
Labeling and Markov Random Fields Proc. 40th IEEE Symposium on
Foundations of Computer Science, 1999
- Broder, A. and Krauthgamer, R. and Mitzenmacher, M. Improved
Classification via Connectivity Information ACM-SIAM Symposium on
Discrete Algorithms, 2000
- Kleinberg, J. and Raghavan, P. Query
Incentive Networks Proc. 46th IEEE Symposium on Foundations of
Computer Science, 2005
- Broder, A. Z. Identifying and
Filtering Near-Duplicate Documents CPM 2000
- Gong, C. and Huang, Y. and Cheng, X. and Bai, S. Detecting Near-Duplicates in
Large-Scale Short Text Databases PAKDD 2008
- Dasgupta, A. and Ghosh, A. and Kumar, R. and Olston, C. and
Pandey, S. and Tomkins, A. The
Discoverability of the Web WWW 2007
- Bar-Yossef, Z. and Broder, A. Z. and Kumar, R. and Tomkins, A. Sic Transit Gloria Telae: Towards an
understanding of the web's decay WWW 2004
- Aizen, J. and Huttenlocher, D. and Kleinberg, J. and Novak, A. Traffic-Based
Feedback on the Web Proceedings of the National Academy of
Sciences 101(Suppl.1):5254-5260, 2004
- Bharat, K. and Broder, A. A
technique for measuring the relative size and overlap of public Web search
engines Proc. 7th International World Wide Web Conference,
1998
- Henzinger, M. and Heydon, A. and Mitzenmacher, M. and Najork, M. On
Near-Uniform URL Sampling 9th International World Wide Web
Conference, May 2000
Sampling and the Web
- Lovasz, L. Random
Walks on Graphs: A Survey Combinatorics: Paul Erdos is Eighty
(vol. 2), 1996, pp. 353-398
- Bar-Yossef, Z. and Berg, A. and Chien, S. and Fakcharoenphol, J. and
Weitz, D. Approximating Aggregate
Queries about Web Pages via Random Walks 26th International
Conference on Very Large Databases (VLDB), 2000, pages 535-544
- Chakrabarti, S. and Joshi, M. and Punera, K. and Pennock, D. M.
The
structure of broad topics on the Web 11th World Wide Web
conference, May 2002
Small World
- Kleinberg, J. The small-world
phenomenon: an algorithmic perspective In Proceedings of the 32nd
ACM Symposium on Theory of Computing, 2000
- Liben-Nowell, D. and Novak, J. and Kumar, R. and Raghavan, P. and
Tomkins, A. Geographic
routing in social networks In Proceedings of the National Academy
of Science, volume 102, pages 11623-11628, 2005
- Newman, M. and Watts, D. Renormalization
group analysis of the small-world network model Phys. Lett. A,
263:341-346, 1999
- Watts, D. J. and Dodds, P. Newman, M. Identity and
search in social networks Science 296:1302-305, 2002
- Watts, D. J. and Strogatz, S. Collective
dynamics of small world networks Nature 393:440, 1998
- Zhang, H. and Goel, A. and Govindan, R. Using
the small-world model to improve Freenet performance In Proc.
IEEE Infocom, 2002
- Menczer, F. Growing
and Navigating the Small World Web by Local Content Proc. Natl.
Acad. Sci. USA 99(22): 14014-14019, 2002
Networks
- Mitzenmacher, M. A Brief History of
Generative Models for Power Law and Lognormal Distributions
Internet Mathematics Vol. 1, No. 2: 226-251
- Lieberman, E. and Hauert, C. and Nowak, M. A. (2005) Evolutionary
Dynamics on Graphs Nature 433: 312-316
- Kumar, R. and Raghavan, P. and Rajagopalan, S. and Sivakumar, D. and
Tomkins, A. and Upfal, E. Stochastic
models for the web graph 41th IEEE Symp. on Foundations of
Computer Science, 2000, pp. 57-65
- Kleinberg, J. and Kumar, R. and Raghavan, P. and Rajagopalan, S. and
Tomkins, A. S. The WEB as a graph:
measurements, models, and methods Intntl. Conf. on
Combinatorics and Computing, 1999
- Fabrikant, A. and Koutsoupias, E. and Papadimitriou, C. Heuristically
Optimized Trade-offs: A New Paradigm for Power Laws in the
Internet 29th International Colloquium on Automata, Languages,
and Programming (ICALP), 2002
- Ohtsuki, H. and Hauert, C. and Lieberman, E. and Nowak, M. A. (2006)
A
simple rule for the evolution of cooperation on graphs and social
networksNature 441: 502-505
- Vazquez, A. and Gama Oliveira, J and Dezso, Z. and Goh, K.-I. and
Kondor, I. and Barabasi, A.-L. Modeling bursts and heavy
tails in human dynamics Phys. Rev. E 73, 036127 (2006)
- Gibson, D. and Kleinberg, J. and Raghavan, P. Inferring Web Communities from Link
Topology Proc. 9th ACM Conference on Hypertext and Hypermedia,
1998
- Kempe, D. and Kleinberg, J. and Tardos, E. Maximizing the Spread of Influence through
a Social Network Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge
Discovery and Data Mining, 2003
- Clarke, I. and Miller, S. G. and Hong, T. W. and Sandberg, O. and
Wiley, B. Protecting Free Expression
Online
- Clarke, I. and Sandberg, O. and Wiley, B. and Hong, T. W. Freenet: A distributed anonymous information
storage and retrieval system
- Fetterly, D. and Manasse, M. and Najork, M. and Wiener, J. L. A
large-scale study of the evolution of web pages WWW 2003
- Ntoulas, A. and Cho, J. and Olston, C. What's new on the web? The evolution
of the web from a search engine perspective WWW 2004