Imbalanced Hypergraph Partitioning and Improvements for Consensus Clustering

Tools with Artificial Intelligence (2013)

Cited 10 | Views 6
Abstract
Hypergraph partitioning is typically defined as an optimization problem wherein vertices are placed in separate parts (of a partition) such that as few hyperedges as possible span multiple parts. To ensure that parts have sizes satisfying user requirements, constraints are typically imposed. Under such constraints, the problem is known to be NP-Hard, so heuristic methods are needed to find approximate solutions in reasonable time. Circuit layout has historically been one of the most prominent application areas and has seen a proliferation of tools designed to satisfy its needs. Constraints in these tools typically focus on equal-sized parts, allowing the user to specify a maximum tolerance for deviation from that equal size. A more generalized constraint allows the user to define fixed sizes and tolerances for each part. More recently, other domains have mapped problems to hypergraph partitioning and, perhaps due to their availability, have used existing tools to perform partitioning. In particular, consensus clustering fits a hypergraph representation naturally: each cluster of each input clustering is represented by a hyperedge. Authors of such research have reported that partitioning tends to yield good results only when clusters can be expected to be roughly the same size, an unsurprising outcome given the tools' focus on equal-sized parts. Thus, even though many datasets have "natural" part sizes that are mixed, the current toolset is ill-suited to find good solutions unless those part sizes are known a priori. We argue that the main issue rests in the current constraint definitions and their focus on measuring imbalance on the basis of the largest or smallest part. We further argue that, due to its holistic nature, entropy best measures imbalance and can best guide the partitioning method toward the natural part sizes with the lowest cut for a given level of imbalance. We provide a method that finds good approximate solutions under an entropy constraint and further introduce the notion of a discount cut, which helps overcome the local optima that frequently plague k-way partitioning algorithms. In comparison to today's popular tools, we show that our method returns sizable improvements in cut size as the level of imbalance grows. In consensus clustering, we demonstrate that good solutions are more easily achieved even when part sizes are not roughly equal.
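The abstract does not spell out the entropy measure, but a standard reading (an assumption here, not the paper's stated definition) is the Shannon entropy of the part-size distribution: for a k-way partition of n vertices into parts V_1, ..., V_k,

    H(\Pi) = -\sum_{i=1}^{k} \frac{|V_i|}{n} \log \frac{|V_i|}{n}

H(\Pi) attains its maximum, \log k, exactly when all parts are equal and decreases as the sizes skew, so a single lower bound on H(\Pi) constrains imbalance across all parts at once, unlike a tolerance that bounds only the largest or smallest part.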
Keywords
best measures imbalance, smallest part, k-way partitioning algorithm, multiple part, part size, good solution, natural part size, imbalanced hypergraph partitioning, consensus clustering, hypergraph partitioning, equal size part, separate part, graph theory
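To make the consensus-clustering mapping described in the abstract concrete, the following is a minimal Python sketch (illustrative only; the function names and label-vector input format are assumptions, not the paper's implementation). Each cluster of each input clustering becomes one hyperedge, and the cut of a candidate consensus partition counts hyperedges that span more than one part.

from collections import defaultdict

def clusterings_to_hypergraph(clusterings):
    """Each input clustering is a list of cluster labels, one per object.
    Returns the hyperedges as frozensets of object indices: one hyperedge
    per cluster of each input clustering."""
    hyperedges = []
    for labels in clusterings:
        clusters = defaultdict(set)
        for obj, lab in enumerate(labels):
            clusters[lab].add(obj)
        hyperedges.extend(frozenset(members) for members in clusters.values())
    return hyperedges

def cut_size(hyperedges, part_of):
    """Count hyperedges spanning more than one part, given a map from
    object index to part index (the candidate consensus clustering)."""
    return sum(1 for e in hyperedges if len({part_of[v] for v in e}) > 1)

# Example: three input clusterings of six objects, one candidate consensus.
clusterings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
H = clusterings_to_hypergraph(clusterings)
consensus = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(cut_size(H, consensus))  # prints 2: two input clusters are split

Under this mapping, a low-cut partition keeps as many input clusters as possible intact within a single part, which is why hypergraph partitioning tools can serve as consensus-clustering engines.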