Declarative generation of synthetic XML data

SOFTWARE-PRACTICE & EXPERIENCE(2006)

引用 15|浏览44
暂无评分
摘要
Synthetic data can be extremely useful in testing and evaluating algorithms, tools and systems. Most synthetic data generators available today are the result of individual benchmarking efforts. Typicallly, these are complex programs in which the specifications of both the structure and the contents of the data are hard-coded. As a result, it is often difficult to customize these tools for producing synthetic data tailored for specific needs. In this article, we describe the ToXgene synthetic data generator, which is a declarative tool for generating realistic XML data for benchmarking as well as testing purposes. We present our template specification language, which consists of augmenting XML Schema with probabilistic models that guide the data-generation process. We discuss the architecture of our current implementation and we argue about ToXgene's usefulness by discussing experimental results as well as describing two projects that use our tool. Copyright (C) 2006 John Wiley & Sons, Ltd.
更多
查看译文
关键词
XML,synthetic data,benchmarking,probabilistic generative models
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要