String join using precedence count matrix

Xia Cao, Anthony K. H. Tung, Beng Chin Ooi, Kian-Lee Tan, Shuai Cheng Li

SSDBM '04: Proceedings of the 16th International Conference on Scientific and Statistical Database Management (2004)

Abstract

In this paper, we propose a filter-and-refine string join algorithm. The filtering phase rapidly prunes strings that are not joinable, and the refinement phase applies a comprehensive algorithm to remove the remaining false alarms. The efficiency of the proposed scheme lies in its use of the precedence count matrix (PCM) for computing the edit distance between two sequences. With PCM, sequence comparison takes constant time. We also evaluate the proposed sequence join algorithm, and our study shows that it outperforms known techniques.
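The filter-and-refine pattern described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's PCM method: the abstract does not define the matrix, so the sketch swaps in a simple length-difference filter (two strings within edit distance k can differ in length by at most k) and refines with the classic dynamic-programming edit distance. The names `edit_distance`, `length_filter`, `string_join`, and the threshold `k` are hypothetical, chosen only for illustration.

```python
from itertools import product


def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming (Levenshtein) edit distance."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def length_filter(a: str, b: str, k: int) -> bool:
    """Cheap stand-in filter (NOT the PCM): lengths may differ by at most k."""
    return abs(len(a) - len(b)) <= k


def string_join(left, right, k):
    """Return all pairs (a, b) whose edit distance is at most k."""
    result = []
    for a, b in product(left, right):
        if not length_filter(a, b, k):  # filtering phase: prune cheaply
            continue
        if edit_distance(a, b) <= k:    # refinement phase: remove false alarms
            result.append((a, b))
    return result


pairs = string_join(["ACGT", "TTTT"], ["ACGA", "ACGTACGT"], k=1)
```

The filter never discards a joinable pair, so only false alarms reach the more expensive refinement step; a tighter filter, such as the one the paper builds from the PCM, would reduce how many pairs need the full comparison.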

Keywords

relational databases, genomic applications, string matching, sequence comparison, string pruning, precedence count matrix, genetics, proposed scheme lies, scientific information systems, query languages, string similarity, false alarm removal, string refinement, sequence edit distance computing, constant time complexity, filter-and-refine string join algorithm, proposed sequence, string data manipulation, remaining false alarm, sequence join algorithm, prune away string, dna, comprehensive algorithm, string join, distributed databases, dna sequences, bioinformatics, assembly, genomics, finance, computer science, dynamic programming, edit distance
