TrajMesa: A Distributed NoSQL-Based Trajectory Data Management System

IEEE Transactions on Knowledge and Data Engineering (2023)

Abstract
With the development of positioning technology, a large number of trajectories have been generated, which are very useful for many urban applications. However, trajectory data is challenging to manage because of its spatio-temporal dynamics and high volume. Existing trajectory data management frameworks suffer from efficiency or scalability problems and support only a limited set of trajectory query types. This paper makes the first attempt to build a holistic distributed NoSQL trajectory query engine, named TrajMesa, based on GeoMesa, an open-source indexing toolkit for spatio-temporal data. TrajMesa can manage a prohibitively large number of trajectories and supports many query types efficiently. Specifically, we first design a novel trajectory storage schema, which reduces the storage size tremendously. We then devise a novel indexing key schema for time ranges, based on which the ID temporal query (where ID is the moving object identifier) can be supported efficiently. To reduce the amount of trajectory data retrieved for a spatial range query, we propose a position code that indicates the spatial location of trajectories accurately. We also propose a set of pruning strategies for the similarity query and the $k$-NN query in the NoSQL environment. Extensive experiments are conducted using two real datasets and one synthetic dataset, verifying the query efficiency and scalability of TrajMesa. The results show that TrajMesa is about $100 \sim 1000$ times faster than the state-of-the-art trajectory management frameworks in our experimental settings. TrajMesa is currently deployed at JD, processing over 1T trajectories of JD Logistics every day.
Keywords
Trajectory data management, distributed NoSQL storage, spatio-temporal indexing and query processing
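
The abstract mentions an indexing key schema for time ranges that makes ID temporal queries efficient, but it does not give the schema itself. As a rough illustration of the general idea only, the sketch below shows one common way a sorted key-value store can serve such a query: the row key concatenates the object ID with a zero-padded time-bin index, so one contiguous key-range scan returns the candidates. This is not TrajMesa's actual key design. `BIN_SECONDS`, `MAX_BINS_PER_TRAJ`, `ToyStore`, and the `|` separator are all hypothetical names and values chosen for the example, and the scan is only complete under the assumption that no trajectory lasts longer than `MAX_BINS_PER_TRAJ` bins.

```python
from bisect import bisect_left, bisect_right

BIN_SECONDS = 3600       # hypothetical time-bin width (one hour)
MAX_BINS_PER_TRAJ = 24   # assumption: no trajectory lasts longer than 24 bins


def row_key(oid: str, start_ts: int) -> str:
    """Sortable key: object ID, then the zero-padded bin of the start time."""
    return f"{oid}|{start_ts // BIN_SECONDS:010d}"


class ToyStore:
    """In-memory stand-in for a sorted key-value table (not a real NoSQL API)."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = []   # (oid, start_ts, end_ts), parallel to _keys

    def put(self, oid: str, start_ts: int, end_ts: int) -> None:
        key = row_key(oid, start_ts)
        i = bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._rows.insert(i, (oid, start_ts, end_ts))

    def scan(self, lo_key: str, hi_key: str):
        """Return all rows whose key lies in [lo_key, hi_key]."""
        i = bisect_left(self._keys, lo_key)
        j = bisect_right(self._keys, hi_key)
        return self._rows[i:j]


def id_temporal_query(store: ToyStore, oid: str, q_start: int, q_end: int):
    """Trajectories of `oid` whose time range overlaps [q_start, q_end]."""
    # A trajectory that overlaps the query may have started before q_start,
    # so the scan begins MAX_BINS_PER_TRAJ bins earlier.
    first_bin = max(0, q_start // BIN_SECONDS - MAX_BINS_PER_TRAJ)
    last_bin = q_end // BIN_SECONDS
    candidates = store.scan(f"{oid}|{first_bin:010d}", f"{oid}|{last_bin:010d}")
    # Exact filter: keep only rows whose time range really overlaps the query.
    return [r for r in candidates if r[1] <= q_end and r[2] >= q_start]
```

With this layout, a query for one object touches a single contiguous key range, and the final filter discards candidates that fall in the scanned bins but do not overlap the query window. Widening the scan by a fixed number of bins is the simplest workaround for long trajectories; a dedicated time-range encoding, as the abstract describes, would aim to avoid that extra scanning.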