CommunityDiff: Visualizing Community Clustering Algorithms
Srayan Datta and Eytan Adar

Community detection is an oft-used analytical function of network analysis but can be a black art to apply in practice. Grouping of related nodes is important for identifying patterns in network datasets but is also notoriously sensitive to input data and algorithm selection. This is further complicated by the fact that, depending on domain and use case, the ground truth knowledge of the end-user can vary from none to complete. In this work, we present CommunityDiff, an interactive visualization system that combines visualization and active learning (AL) to support the end-user's analytical process. As the end-user interacts with the system, a continuous refinement process updates both the community labels and visualizations. CommunityDiff features a mechanism for visualizing ensemble spaces, weighted combinations of algorithm output, that can identify patterns, commonalities, and differences among multiple community detection algorithms. Among other features, CommunityDiff introduces an AL mechanism that visually indicates uncertainty about community labels to focus end-user attention, and it supports end-user control that ranges from explicitly indicating the number of expected communities to merging and splitting communities. Based on this end-user input, CommunityDiff dynamically recalculates communities. We demonstrate the viability of our approach through a study of the speed of end-user convergence on satisfactory community labels. As part of building CommunityDiff, we describe a design process that can be adapted to other Interactive Machine Learning applications.

Srayan Datta and Eytan Adar. 2018. CommunityDiff: Visualizing Community Clustering Algorithms. ACM Trans. Knowl. Discov. Data 12, 1, Article 11 (January 2018)