
A Faceted Crawler for the Twitter Service

Researchers now have at their disposal valuable data from social networking applications, of which Twitter and Facebook are the most prominent examples. To retrieve this content, the Twitter service provides two distinct Application Programming Interfaces (APIs), a probe-based one and a streaming one, each of which imposes different limitations on the data collection process. In this paper, we present a general architecture that facilitates faceted crawling of the service and thereby simplifies retrieval. We give implementation details of our system and provide a simple way to express the crawling process, i.e., the crawl flow. We experimentally evaluate the system on a variety of faceted crawls, demonstrating its efficacy for the online medium.

George Valkanas, Antonia Saravanou, Dimitrios Gunopulos, "A Faceted Crawler for the Twitter Service", Web Information System Engineering (WISE 2014), 2014
Published at
Web Information System Engineering