Space-Partitioning-Based Bulk-Loading for the NSP-Tree in Non-ordered Discrete Data Spaces

Database and Expert Systems Applications, Proceedings (2008)

Abstract
Properly designed bulk-loading techniques are more efficient than the conventional tuple-loading method for constructing a multidimensional index tree over a large data set. Although a number of bulk-loading algorithms have been proposed in the literature, most of them were designed for continuous data spaces (CDSs) and cannot be directly applied to non-ordered discrete data spaces (NDDSs). In this paper, we present a new space-partitioning-based bulk-loading algorithm for the NSP-tree, a multidimensional index tree recently developed for NDDSs. The algorithm constructs the target NSP-tree by repeatedly partitioning the underlying NDDS for a given data set until the input vectors in every subspace fit into a leaf node. Several strategies are employed to increase the efficiency of the algorithm, including multi-way splitting, memory buffering, and balanced space partitioning. Histograms that characterize the data distribution in a subspace are used to decide the space partitions. Our experiments show that the proposed bulk-loading algorithm is more efficient than both the tuple-loading algorithm and a popular generic bulk-loading algorithm that could be used to build the NSP-tree.
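The core idea of the abstract, recursive partitioning guided by per-dimension histograms until each subspace fits in a leaf, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a binary (not multi-way) split, a greedy balancing heuristic, no memory buffering, and a plain nested-dictionary tree. The names `balanced_split`, `bulk_load`, `collect_leaves`, and `leaf_capacity` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def balanced_split(histogram):
    """Greedily divide letters into two groups with near-equal vector counts.

    Assumption: a simple largest-first heuristic, not the paper's strategy.
    """
    left, right = set(), set()
    left_count = right_count = 0
    # Visit letters from most to least frequent; ties are broken by letter.
    for letter, count in sorted(histogram.items(), key=lambda kv: (-kv[1], kv[0])):
        if left_count <= right_count:
            left.add(letter)
            left_count += count
        else:
            right.add(letter)
            right_count += count
    return left, right, abs(left_count - right_count)


def bulk_load(vectors, leaf_capacity):
    """Recursively partition the vectors until each subspace fits in a leaf."""
    if len(vectors) <= leaf_capacity:
        return {"leaf": list(vectors)}
    best = None
    for dim in range(len(vectors[0])):
        # Histogram of letter frequencies on this dimension of the subspace.
        histogram = Counter(v[dim] for v in vectors)
        if len(histogram) < 2:
            continue  # only one letter here, so this dimension cannot split
        left, right, imbalance = balanced_split(histogram)
        if best is None or imbalance < best[0]:
            best = (imbalance, dim, left, right)
    if best is None:
        # All vectors are identical; keep them in one oversized leaf.
        return {"leaf": list(vectors)}
    _, dim, left, right = best
    return {
        "dim": dim,
        "left_letters": left,
        "right_letters": right,
        "left": bulk_load([v for v in vectors if v[dim] in left], leaf_capacity),
        "right": bulk_load([v for v in vectors if v[dim] in right], leaf_capacity),
    }


def collect_leaves(node):
    """Return the list of leaf buckets in left-to-right order."""
    if "leaf" in node:
        return [node["leaf"]]
    return collect_leaves(node["left"]) + collect_leaves(node["right"])


if __name__ == "__main__":
    # Vectors over the non-ordered alphabet {A, C, G, T}.
    data = ["AAGT", "ACGT", "CAGT", "GTTA", "TTAA", "TGCA", "CCCC", "GGGG"]
    tree = bulk_load(data, leaf_capacity=2)
    for bucket in collect_leaves(tree):
        print(bucket)
```

Because the letters of an NDDS have no natural order, the sketch splits a dimension by assigning letters to sets rather than by choosing a cut point, which is the main difference from partitioning a continuous data space.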
Keywords
multidimensional index tree, large data, proposed bulk-loading algorithm, new space-partitioning-based bulk-loading algorithm, space-partitioning-based bulk-loading, popular generic bulk-loading algorithm, properly-designed bulk-loading technique, bulk-loading algorithm, continuous data space, data distribution, non-ordered discrete data spaces, tuple-loading algorithm, indexation