Bilevel Relations and Their Applications to Data Insights.
CoRR (2023)
Abstract
Many data-insight analytic tasks in anomaly detection, metric attribution,
and experimentation analysis can be modeled as searching in a large space of
tables and finding important ones, where the notion of importance is defined in
some ad hoc manner. While various frameworks have been proposed (e.g., DIFF,
VLDB 2019), a systematic and general treatment is lacking. This paper describes
bilevel relations and operators. While a relation (i.e., table) models a set of
tuples, a bilevel relation is a dictionary that explicitly models a set of
tables. Each "value" table is identified by a "key" consisting of a (region,
features) pair: region specifies the key attributes of the table, and
features specifies its columns. Bilevel relational operators are
BilevelRelation-to-BilevelRelation transformations and directly analyze a set
of tables. Bilevel relations and operators provide higher level abstractions
for creating and manipulating a set of tables, and are compatible with the
classic relational algebra. Together, they allow us to construct bilevel
queries, which can express succinctly a range of insight-analytical questions
with a "search+eval" character. We have implemented and deployed a query engine
for bilevel queries as a service, which is the first of its kind. Evaluating
bilevel queries efficiently opens a rich algorithm and system design space,
including query optimization and data format. We describe our current
designs and lessons, and report empirical evaluations. Bilevel queries have
found many useful applications, and have attracted more than 30 internal teams
to build data-insight applications with them.
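
To make the dictionary-of-tables idea concrete, here is a minimal Python sketch. It is an illustration under our own assumptions, not the paper's implementation or the deployed engine's API: the type aliases and the helpers `build_bilevel` and `select_tables` are hypothetical names, tables are modeled as plain lists of row dictionaries, and "importance" is stood in for by a simple revenue threshold.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical types for illustration only.
Row = Dict[str, object]
Table = List[Row]
Region = FrozenSet[Tuple[str, object]]  # key-attribute values, e.g. {("country", "US")}
Features = Tuple[str, ...]              # columns kept in the value table
Key = Tuple[Region, Features]
BilevelRelation = Dict[Key, Table]


def build_bilevel(rows: Sequence[Row], region_attrs: Sequence[str],
                  features: Sequence[str]) -> BilevelRelation:
    """Group rows by their region attributes; each group becomes a value table."""
    out: BilevelRelation = {}
    for row in rows:
        region = frozenset((a, row[a]) for a in region_attrs)
        key = (region, tuple(features))
        out.setdefault(key, []).append({f: row[f] for f in features})
    return out


def select_tables(br: BilevelRelation,
                  predicate: Callable[[Table], bool]) -> BilevelRelation:
    """A BilevelRelation-to-BilevelRelation operator: keep tables passing the predicate."""
    return {key: table for key, table in br.items() if predicate(table)}


rows = [
    {"country": "US", "device": "ios", "revenue": 10.0},
    {"country": "US", "device": "android", "revenue": 5.0},
    {"country": "DE", "device": "ios", "revenue": 2.0},
]

# "Search": enumerate one table per country. "Eval": keep tables with high total revenue.
br = build_bilevel(rows, region_attrs=["country"], features=["revenue"])
important = select_tables(br, lambda t: sum(r["revenue"] for r in t) > 10.0)
```

Because `select_tables` both consumes and returns a bilevel relation, such operators can be chained, which loosely mirrors the composition of operators into bilevel queries described above.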
Keywords
data insights