Density-Contour Based Framework for Spatio-Temporal Clustering and Event Tracking in Twitter






Advances in remote sensors and sensor networks have made many types of spatio-temporal datasets increasingly available. Discovering interesting spatio-temporal patterns in such datasets matters for a broad range of applications, such as understanding climate change, detecting epidemics, and analyzing earthquakes. The main focus of this research is the development of spatio-temporal clustering frameworks.

In this dissertation, we introduce a density-contour based framework for spatio-temporal clustering that includes several novel serial, density-contour based spatio-temporal clustering algorithms: ST-DCONTOUR, ST-DPOLY, and ST-COPOT. They all rely on a three-phase clustering approach. First, the point cloud stream is taken as input and divided into batches based on fixed-size time windows. Next, a density estimation approach and contouring algorithms are employed to obtain spatial clusters as polygon models. Finally, spatio-temporal clusters are formed by identifying continuing relationships between spatial clusters in consecutive batches. The framework was successfully applied to New York City (NYC) taxi trip data. The experimental results show that all the algorithms can effectively discover interesting spatio-temporal patterns in taxi-pickup-location streams.
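The three phases can be illustrated with a deliberately simplified sketch. It is not ST-DCONTOUR, ST-DPOLY, or ST-COPOT: grid-cell counts stand in for the density estimate, connected regions of dense cells stand in for contour polygons, and Jaccard overlap of cells stands in for the continuity test. All function names, parameters, and thresholds below are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def split_into_batches(points, window):
    """Phase 1: group (x, y, t) points into fixed-size time windows.

    Windows that contain no points are skipped in this sketch.
    """
    batches = defaultdict(list)
    for x, y, t in points:
        batches[int(t // window)].append((x, y))
    return [batches[k] for k in sorted(batches)]


def spatial_clusters(batch, cell=1.0, threshold=2):
    """Phase 2 (stand-in): dense grid cells grouped into connected regions.

    A real implementation would estimate a density function and extract
    contour polygons; here a region is just a set of grid cells.
    """
    counts = Counter((int(x // cell), int(y // cell)) for x, y in batch)
    dense = {c for c, n in counts.items() if n >= threshold}
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        region, stack = set(), [start]
        seen.add(start)
        while stack:  # flood fill over 4-connected dense cells
            cx, cy = stack.pop()
            region.add((cx, cy))
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(frozenset(region))
    return clusters


def link_clusters(prev, curr, min_overlap=0.5):
    """Phase 3: link clusters of consecutive batches by Jaccard overlap."""
    links = []
    for i, a in enumerate(prev):
        for j, b in enumerate(curr):
            if len(a & b) / len(a | b) >= min_overlap:
                links.append((i, j))
    return links


# Hypothetical toy stream of (x, y, t) points covering two time windows.
points = [
    (0.1, 0.2, 0), (0.3, 0.4, 1), (0.2, 0.9, 2), (1.1, 0.5, 3), (1.5, 0.5, 4),
    (0.5, 0.5, 10), (0.6, 0.1, 11), (5.5, 5.5, 12), (5.1, 5.2, 13),
]
batches = split_into_batches(points, window=10)
per_batch = [spatial_clusters(b) for b in batches]
print(link_clusters(per_batch[0], per_batch[1]))
```

Each returned pair (i, j) says that cluster i of one batch continues as cluster j of the next; chaining such pairs across batches yields spatio-temporal clusters.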

Recently, Twitter, one of the fastest-growing microblogging services, has prompted a great deal of research, and event detection from tweets has been one active topic. Since geo-tagged tweets can be viewed as location streams with time tags and tweet content, we propose a novel two-stage system to detect and track events in Twitter by integrating an LDA-based (Latent Dirichlet Allocation) approach with the density-contour based spatio-temporal clustering approach introduced earlier. In the proposed system, events are first identified as topics in tweets using an LDA-based topic discovery step, and each tweet is then assigned an event label. After the locations belonging to each event are extracted, the spatio-temporal clustering approach is employed to obtain event clusters and track their temporal continuity. Through several case studies, we demonstrate the effectiveness of the proposed system. In summary, we aim to capture not only the semantic aspect of events, but also their geographic distribution and their continuity over time. Such information can help individuals, corporations, and government organizations stay informed of "what is happening now" and acquire actionable knowledge.
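The hand-off between the two stages can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the per-tweet topic distributions are assumed to come from some LDA implementation that is not shown, the event label is assumed to be the highest-probability topic, and the tweet field names `lon`, `lat`, and `time` are hypothetical.

```python
from collections import defaultdict


def assign_event_labels(doc_topic):
    """Label each tweet with the index of its highest-probability topic.

    doc_topic is assumed to hold one topic distribution per tweet,
    as produced by an LDA step that is not part of this sketch.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in doc_topic]


def locations_by_event(tweets, labels):
    """Group (lon, lat, time) triples by event label.

    Each group is a location stream that a spatio-temporal
    clustering step could then process on its own.
    """
    grouped = defaultdict(list)
    for tweet, label in zip(tweets, labels):
        grouped[label].append((tweet["lon"], tweet["lat"], tweet["time"]))
    return dict(grouped)


# Hypothetical inputs: three geo-tagged tweets and their topic distributions.
tweets = [
    {"lon": -95.36, "lat": 29.76, "time": 0},
    {"lon": -74.00, "lat": 40.71, "time": 5},
    {"lon": -95.40, "lat": 29.70, "time": 9},
]
doc_topic = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]

labels = assign_event_labels(doc_topic)
streams = locations_by_event(tweets, labels)
print(labels, sorted(streams))
```

Keeping one location stream per event lets the clustering stage stay unaware of tweet content; it only sees points with timestamps.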



Spatio-temporal clustering, Event tracking


Portions of this document appear in:

Y. Zhang and C. F. Eick. Novel clustering and analysis techniques for mining spatio-temporal data. In Proc. 1st ACM SIGSPATIAL PhD Workshop, page 2. ACM, 2014.

Y. Zhang and C. F. Eick. ST-DCONTOUR: a serial, density-contour based spatiotemporal clustering approach to cluster location streams. In Proc. ACM SIGSPATIAL International Workshop on GeoStreaming, page 5, San Francisco, CA, USA, October 31 - November 3 2016.

Y. Zhang, S. Wang, A. M. Aryal, and C. F. Eick. "Serial" versus "parallel": a comparison of spatio-temporal clustering approaches. In Proc. International Symposium on Methodologies for Intelligent Systems, pages 396-403, Warsaw, Poland, June 26-29 2017.

Y. Zhang and C. F. Eick. ST-COPOT: Spatio-temporal clustering with contour polygon trees. In Proc. ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pages 84:1-84:4, Redondo Beach, CA, USA, November 7-10 2017.

Y. Zhang and C. F. Eick. A novel two-stage system for detecting and tracking events in Twitter. In Proc. IEEE International Conference on Artificial Intelligence and Knowledge Engineering, pages 77-84, Laguna Hills, CA, USA, September 26-28 2018.

R. Banerjee, K. Elgarroussi, S. Wang, Y. Zhang, and C. F. Eick. Tweet emotion mapping: Understanding US emotions in time and space. In Proc. IEEE International Conference on Artificial Intelligence and Knowledge Engineering, pages 93-100, Laguna Hills, CA, USA, September 26-28 2018.