Publications

 

Posters

PeerCrawl Poster for ICDE 2007

 

Related Publications

[1]     L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Computer Science Department, Stanford University, 1998.

[2]     R. Miller and K. Bharat. SPHINX: A framework for creating personal, site-specific web crawlers. In Proceedings of the 7th World-Wide Web Conference (WWW7), 1998.

[3]     S. Chakrabarti, M. van den Berg, and B. Dom. Focused crawling: A new approach to topic-specific web resource discovery. In Proceedings of the Eighth International Conference on The World-Wide Web, 1999.

[4]     A. McCallum, K. Nigam, J. Rennie, and K. Seymore, “Building domain-specific search engines with machine learning techniques,” in Proc. AAAI Spring Symposium on Intelligent Agents in Cyberspace, 1999.

[5]     J. Rennie and A. McCallum, “Using reinforcement learning to spider the web efficiently,” in Proc. International Conference on Machine Learning (ICML), 1999.

[6]     Vladislav Shkapenyuk and Torsten Suel. Design and implementation of a high-performance distributed web crawler. In IEEE International Conference on Data Engineering (ICDE), 2002.

[7]     J. Cho and H. Garcia-Molina. Parallel crawlers. In Proceedings of the 11th International World Wide Web Conference, 2002.

[8]     Paolo Boldi, Bruno Codenotti, Massimo Santini, and Sebastiano Vigna. UbiCrawler: A scalable fully distributed Web crawler. Software: Practice and Experience, 34(8):711–726, 2004.

[9]     James Caverlee and Ling Liu. Resisting Web Spam with Credibility Based Link Analysis.

[10]  A. Heydon and M. Najork. Mercator: A scalable, extensible web crawler. World Wide Web, 2(4):219–229, 1999.

[11]  A. Singh, M. Srivatsa, L. Liu, and T. Miller. Apoidea: A Decentralized Peer-to-Peer Architecture for Crawling the World Wide Web. Lecture Notes in Computer Science, 2924, 2004.

[12]  Martijn Koster. The Robots Exclusion Standard. http://www.robotstxt.org/

[13]  Gnutella Network. http://www.gnutella.com

[14]  V. J. Padliya and L. Liu. PeerCrawl: A decentralized peer-to-peer architecture for crawling the world wide web. Technical report, Georgia Institute of Technology, May 2006.

[15]  Jialun Qin, Yilu Zhou, and Michael Chau. Building domain-specific web collections for scientific digital libraries: A meta-search enhanced focused crawling method. In Proceedings of the 4th ACM/IEEE-CS Joint Conference on Digital Libraries, June 7-11, 2004, Tucson, AZ, USA.

[16]  Bergmark, D., Lagoze, C., and Sbityakov, A. (2002). “Focused Crawls, Tunneling, and Digital Libraries,” in Proc. of the 6th European Conference on Digital Libraries, Rome, Italy.