A Fast Method for Checkpointing Asynchronously in Arbitrarily Configured Main-Memory OLTP Systems

Semantic Scholar (2012)

Abstract
Traditional database orthodoxy has sought to ensure that any database system maintains so-called ACID compliance. This model requires that all transactions processed in a storage system be atomic, consistent, isolated, and durable [2]. The final characteristic, durability, refers to the fact that any transaction that has been committed to the database must be recoverable in the event of a node failure [10]. The increased availability and dramatically reduced cost of high-speed random-access memory, which is generally several orders of magnitude faster than hard disk storage, has resulted in the widespread use of database systems that execute mostly or entirely in main memory [8]. To avoid the data loss that necessarily occurs when volatile memory is reset during a node failure, several checkpointing protocols have been developed to periodically write the contents of memory to disk. ARIES [16], often considered the gold standard for checkpointing, uses write-ahead logging along with redo logging and logical undo operations to recover a node that has experienced some form of failure. Recent improvements on this highly generalized method for database recovery have focused on leveraging specific aspects of the system they operate in to reduce the amount of time spent capturing a global snapshot. For example, Cao et al. discuss Ping-Pong and Zig-Zag [3], algorithms that achieve extremely short checkpoint periods in frequently consistent applications. However, this approach relies heavily on the assumption that the database is guaranteed several instants in time at which all transactions are committed and no effects of uncommitted transactions are reflected in the data layer. These instants are referred to as “physical points of consistency.” Although they are often found in common applications such as massively multiplayer online games, relying on them limits the frequency with which checkpoints can be captured.
Simultaneously, several popular distributed storage systems have begun to depart from consistency guarantees across replicated data centers. These products, including Google’s BigTable [5], Amazon’s Dynamo [6], and Facebook’s Cassandra [13], cite the CAP theorem [9] to explain their non-compliance with desired ACID properties. This theorem states that reduced guarantees in cross-replication consistency are the only manner in which the system can remain globally available around the clock. Reduced guarantees of consistency in a distributed, multiply replicated system further complicate the ability to capture a global snapshot. However, recent work has signaled a return to traditional views on the need for databases, even those replicated and distributed, to be ACID-compliant. Calvin [18][19], the distributed and synchronously replicated storage system in which this checkpointing scheme is implemented, achieves global consistency through a replication of inputs rather than effects, avoiding the prohibitively expensive contention costs that had previously impeded the prevalence of systems supporting distributed transactions. Our protocol is based loosely on work developed on multi-versioned “historical queries” in the HARBOR [14] recovery and failover system, as well as the notion of points of consistency exploited by Ping-Pong and Zig-Zag [3]. We present herein a method where, when a serial ordering of transaction inputs is guaranteed, global system checkpoints can be captured without stopping the database’s execution, while requiring at most a duplication of the storage layer. Furthermore, because the protocol relies on guarantees of a serial transaction ordering, only a “virtual” point of consistency is required, rather than a precise moment in time at which the entire data layer is consistent. The rest of this paper proceeds as follows.
In Section II we discuss some background that we will leverage in accomplishing the goal of capturing cheap, fast snapshots in a database system. In Section III, we describe CALC, our contribution to snapshot capture in OLTP systems. Section IV presents our experimental setup, and Section V describes the results observed in our experiments. Finally, in Sections VI and VII we examine related work and conclude.
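
To make the idea of a “virtual” point of consistency concrete, the following is a minimal illustrative sketch in Python. It is not the CALC algorithm described in Section III; it only assumes, as the text above does, that transactions arrive in a known serial order. The class `VersionedStore`, its methods, and the sequence numbers are hypothetical names introduced for this example. The store keeps at most two values per key (a live value and a pre-checkpoint value), so its memory overhead is bounded by one extra copy of the data, and writes continue while the checkpoint is being captured.

```python
# Toy sketch only: all names here are hypothetical, not taken from the paper.

_ABSENT = object()  # marks "key did not exist at the checkpoint point"


class VersionedStore:
    """Key-value store that can snapshot at a virtual point of consistency."""

    def __init__(self):
        self.live = {}              # current value of every key
        self.stable = {}            # value as of the checkpoint point, saved on first overwrite
        self.checkpoint_seq = None  # serial position of the checkpoint, or None if inactive

    def begin_checkpoint(self, seq):
        # Declare that the checkpoint reflects exactly the writes with
        # sequence number <= seq. No pause of execution is needed.
        self.checkpoint_seq = seq
        self.stable = {}

    def apply(self, seq, key, value):
        # Writes are assumed to arrive in serial order of seq.
        in_progress = self.checkpoint_seq is not None
        if in_progress and seq > self.checkpoint_seq and key not in self.stable:
            # First overwrite after the checkpoint point: preserve the old value.
            self.stable[key] = self.live.get(key, _ABSENT)
        self.live[key] = value

    def capture(self):
        # Build the snapshot as of checkpoint_seq, then drop the extra copies.
        snapshot = {}
        for key, value in self.live.items():
            old = self.stable.get(key, value)
            if old is not _ABSENT:
                snapshot[key] = old
        self.checkpoint_seq = None
        self.stable = {}
        return snapshot


store = VersionedStore()
store.apply(1, "a", 10)
store.apply(2, "b", 20)
store.apply(3, "a", 11)
store.begin_checkpoint(3)   # virtual point of consistency after write 3
store.apply(4, "a", 12)     # execution continues during the checkpoint
store.apply(5, "c", 30)
store.apply(6, "a", 13)
snapshot = store.capture()  # {"a": 11, "b": 20}
```

In this sketch the snapshot reflects the state after write 3 even though writes 4 through 6 were applied before capture, while `store.live` holds the latest values. A real system would also have to handle multi-key transactions, concurrency control, and writing the snapshot to disk, none of which is modeled here.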