Space-Partitioning-Based Bulk-Loading for the NSP-Tree in Non-ordered Discrete Data Spaces

Gang Qian, Hyun-Jeong Seok, Qiang Zhu, Sakti Pramanik

Database and Expert Systems Applications, Proceedings (2008)


Abstract

Properly designed bulk-loading techniques are more efficient than the conventional tuple-loading method for constructing a multidimensional index tree over a large data set. Although a number of bulk-loading algorithms have been proposed in the literature, most of them were designed for continuous data spaces (CDSs) and cannot be directly applied to non-ordered discrete data spaces (NDDSs). In this paper, we present a new space-partitioning-based bulk-loading algorithm for the NSP-tree, a multidimensional index tree recently developed for NDDSs. The algorithm constructs the target NSP-tree by repeatedly partitioning the underlying NDDS for a given data set until the input vectors in every subspace fit into a leaf node. Several strategies are employed to increase the efficiency of the algorithm, including multi-way splitting, memory buffering, and balanced space partitioning. Histograms that characterize the data distribution in a subspace are used to decide the space partitions. Our experiments show that the proposed bulk-loading algorithm is more efficient than both the tuple-loading algorithm and a popular generic bulk-loading algorithm that could be used to build the NSP-tree.
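The partitioning idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes a simple binary split (the paper uses multi-way splitting), holds everything in memory (no buffering), and uses a greedy heuristic to balance the two sides. The names `balanced_split`, `choose_dimension`, `bulk_load`, and `collect_leaves`, as well as the dictionary-based node layout, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def balanced_split(histogram):
    """Greedily divide the letters of one dimension into two groups
    whose total vector counts are as close as the heuristic allows."""
    left, right = set(), set()
    left_count = right_count = 0
    # Place the most frequent letters first; break ties by letter so the
    # result is deterministic.
    for letter, count in sorted(histogram.items(), key=lambda kv: (-kv[1], kv[0])):
        if left_count <= right_count:
            left.add(letter)
            left_count += count
        else:
            right.add(letter)
            right_count += count
    return left, right


def choose_dimension(vectors):
    """Return (imbalance, dimension, left_letters) for the dimension whose
    histogram yields the most balanced split, or None if no dimension
    contains at least two distinct letters."""
    best = None
    for d in range(len(vectors[0])):
        hist = Counter(v[d] for v in vectors)
        if len(hist) < 2:
            continue  # every vector has the same letter here; cannot split
        left, right = balanced_split(hist)
        imbalance = abs(sum(hist[a] for a in left) - sum(hist[a] for a in right))
        if best is None or imbalance < best[0]:
            best = (imbalance, d, left)
    return best


def bulk_load(vectors, leaf_capacity):
    """Recursively partition the discrete space until each subspace holds
    at most leaf_capacity vectors (assumption: binary splits only)."""
    if len(vectors) <= leaf_capacity:
        return {"leaf": list(vectors)}
    best = choose_dimension(vectors)
    if best is None:
        # All vectors are identical, so no partition exists; this sketch
        # simply keeps them in one oversized leaf.
        return {"leaf": list(vectors)}
    _, d, left = best
    left_vecs = [v for v in vectors if v[d] in left]
    right_vecs = [v for v in vectors if v[d] not in left]
    return {
        "dim": d,
        "left_letters": left,
        "children": [bulk_load(left_vecs, leaf_capacity),
                     bulk_load(right_vecs, leaf_capacity)],
    }


def collect_leaves(node):
    """Return the list of leaf contents in left-to-right order."""
    if "leaf" in node:
        return [node["leaf"]]
    return [leaf for child in node["children"] for leaf in collect_leaves(child)]
```

For example, calling `bulk_load(["AC", "AG", "CA", "CT", "GA", "GT", "TA", "TC"], 2)` on two-letter vectors over the alphabet {A, C, G, T} yields a tree whose leaves each hold at most two vectors. The per-dimension `Counter` plays the role of the histogram mentioned in the abstract; a real implementation would also need the disk-aware buffering and multi-way splits that this sketch leaves out.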


Keywords

multidimensional index tree, large data, proposed bulk-loading algorithm, new space-partitioning-based bulk-loading algorithm, space-partitioning-based bulk-loading, popular generic bulk-loading algorithm, properly-designed bulk-loading technique, bulk-loading algorithm, continuous data space, data distribution, non-ordered discrete data spaces, tuple-loading algorithm, indexation
