TrajParquet: A Trajectory-Oriented Column File Format for Mobility Data Lakes

31st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2023), 2023

Abstract
Columnar data formats, such as Apache Parquet, are increasingly popular for scalable storage and querying in data lakes, owing to compressed storage and efficient data access via data skipping. However, when applied to spatial or spatio-temporal data, advanced solutions are required that go beyond pruning over single attributes, towards multidimensional pruning. Although solutions exist for geospatial data, such as GeoParquet and SpatialParquet, they fall short when applied to trajectory data (sequences of spatio-temporal positions). In this paper, we propose TrajParquet, a highly efficient and scalable format for columnar storage of trajectory data. We also present a query processing algorithm that supports spatio-temporal range queries over TrajParquet. We evaluate TrajParquet on real-world data sets, comparing it against extensions of GeoParquet and SpatialParquet that are suitable for handling spatio-temporal data.
Keywords
Apache Parquet, spatio-temporal, trajectories, mobility data