TRACLUS Trajectory Clustering in C++ and R

Trajectory clustering is an important, non-trivial operation in the spatial analysis of movement. The main challenge is that sensible distances between trajectories are very difficult to compute. Trajectories are not spatially local, so even basic building blocks of clustering, such as nearest-neighbor queries or the aggregation of multiple trajectories into a representative one, are difficult to define and compute.

A classical approach called TRACLUS (Lee, Han, & Whang, 2007) avoids these problems in three steps. It first splits trajectories into sufficiently local objects, namely line segments, which can in practice be treated much like points. It then applies a density-based clustering to these objects (DBSCAN, with special distances proposed for the purpose). Finally, it cleans up the result in an aggregation and representation phase.
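To make the "special distances" more concrete, here is a minimal sketch of the segment distance as defined in Lee et al. (2007): a weighted sum of a perpendicular, a parallel, and an angular component between two line segments. The type and function names (`Point`, `Segment`, `traclus_components`, `traclus_distance`) are hypothetical and chosen for illustration only; they are not the interface of our implementation or of the trajcomp package. The sketch assumes planar coordinates and a non-degenerate longer segment, and the weights default to 1 here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical types for illustration; not the trajcomp API.
struct Point { double x, y; };
struct Segment { Point s, e; };

static double dist(Point a, Point b) { return std::hypot(a.x - b.x, a.y - b.y); }
static double length(const Segment& l) { return dist(l.s, l.e); }

// Orthogonal projection of p onto the infinite line through l.
static Point project(Point p, const Segment& l) {
    const double dx = l.e.x - l.s.x, dy = l.e.y - l.s.y;
    const double len2 = dx * dx + dy * dy;
    assert(len2 > 0.0 && "the longer segment must not be degenerate");
    const double t = ((p.x - l.s.x) * dx + (p.y - l.s.y) * dy) / len2;
    return {l.s.x + t * dx, l.s.y + t * dy};
}

struct Components { double perpendicular, parallel, angular; };

// The three distance components; the longer segment serves as reference.
Components traclus_components(Segment a, Segment b) {
    if (length(a) < length(b)) std::swap(a, b);  // a is now the longer one
    const Point ps = project(b.s, a), pe = project(b.e, a);

    // Perpendicular: Lehmer mean (order 2) of the two offsets from the line.
    const double l1 = dist(b.s, ps), l2 = dist(b.e, pe);
    const double perp = (l1 + l2 > 0.0) ? (l1 * l1 + l2 * l2) / (l1 + l2) : 0.0;

    // Parallel: smallest distance from a projected point to an endpoint of a.
    const double par = std::min(std::min(dist(ps, a.s), dist(ps, a.e)),
                                std::min(dist(pe, a.s), dist(pe, a.e)));

    // Angular: |b| * sin(theta) below 90 degrees, |b| otherwise.
    const double ax = a.e.x - a.s.x, ay = a.e.y - a.s.y;
    const double bx = b.e.x - b.s.x, by = b.e.y - b.s.y;
    const double dot = ax * bx + ay * by;
    const double cross = ax * by - ay * bx;
    const double ang = (dot >= 0.0) ? std::fabs(cross) / length(a) : length(b);

    return {perp, par, ang};
}

// Weighted sum of the three components.
double traclus_distance(const Segment& a, const Segment& b,
                        double w_perp = 1.0, double w_par = 1.0, double w_ang = 1.0) {
    const Components c = traclus_components(a, b);
    return w_perp * c.perpendicular + w_par * c.parallel + w_ang * c.angular;
}
```

For example, for the segments (0,0)-(10,0) and (2,1)-(5,1), the perpendicular component is 1, the parallel component is 2, and the angular component is 0, since the segments are parallel. Because the longer segment is always taken as the reference, the function is symmetric in its arguments.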

As we did not find a clear and efficient implementation, we decided to create one. It is written in modern C++ and uses OpenMP for multiprocessing. It does not include spatial indices for speeding up the queries, as these are only sensible when you know the properties of the segmentation. However, boost::geometry can provide you with nicely bulk-loaded R*-trees should you need them.
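To illustrate what index-free neighborhood queries parallelized with OpenMP can look like, here is a minimal sketch of a brute-force ε-neighborhood computation of the kind DBSCAN needs. It is an assumption about the general pattern, not an excerpt from our code: the function name `epsilon_neighborhoods` and its signature are hypothetical, and the distance is passed in as a callable so that any segment distance can be plugged in. Each loop iteration writes only to its own result slot, so no locking is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of trajcomp.
// Returns, for every item i, the indices j with d(items[i], items[j]) <= eps.
// Following the usual DBSCAN convention, an item is its own neighbor.
template <class Item, class Distance>
std::vector<std::vector<std::size_t>>
epsilon_neighborhoods(const std::vector<Item>& items, double eps, Distance d) {
    assert(eps >= 0.0);
    const long n = static_cast<long>(items.size());
    std::vector<std::vector<std::size_t>> result(items.size());

    // Quadratic scan without any spatial index. The pragma is ignored
    // when the compiler is not invoked with OpenMP support.
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        for (long j = 0; j < n; ++j) {
            if (d(items[i], items[j]) <= eps) {
                result[i].push_back(static_cast<std::size_t>(j));
            }
        }
    }
    return result;
}
```

With GCC or Clang, OpenMP is enabled with `-fopenmp`; without the flag the same code runs sequentially. The quadratic cost is the price of not using an index, which is why an R*-tree becomes attractive once the segment layout is known.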

In addition, we encapsulate this in the trajcomp R package, which is under active development and not yet ready for CRAN, but ready to be used by scientists.

Downloads / Further Information

  1. Lee, J.-G., Han, J., & Whang, K.-Y. (2007). Trajectory clustering: a partition-and-group framework. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data (pp. 593–604). ACM.