Range Partitioning Within Sublinear Time in the External Memory Model.

AAIM（2020）

引用 0|浏览6

暂无评分

摘要

Range partitioning is a popular method for processing massive data, whose task is to divide the input N data items into k ranges of the same size. To avoid accessing the whole input, in the RAM model, sampling based \\((\\epsilon ,\\delta )\\)-approximation algorithms with \\(O(\\frac{k\\log (N/\\delta )}{\\epsilon ^2})\\) time cost have been well studied. However, massive data may be too large to be maintained in the main memory. Usually, they are stored in the external memory devices and need to design I/O efficient algorithms in the external memory model. Then, a natural question is whether or not there are efficient range partitioning algorithms with \\(O(\\frac{k\\log (N/\\delta )}{B\\epsilon ^2})\\) I/O cost. To answer the above question, this paper studies the range partitioning problem in the external memory model. Two lower bounds of the sampling cost required by the external sublinear range partitioning algorithms are proved, which show that it needs to make a full scan of the input in the worst case. Motivated by the hard instances utilized in the proof of lower bounds, a model for describing the inputs of the range partitioning problem in practical applications is proposed. Finally, for the special case that input data are generated by the proposed model, a nearly optimal algorithm with \\(O(\\frac{k\\log (N/\\delta )}{wB\\epsilon ^2})\\) I/O cost is introduced.

查看译文

关键词

sublinear time,memory

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要