Task

List of Benchmark Tasks

Tasks are list below with implemented baselines.

Influence Maximization

Given the graph structure information $G = (V, E, A)$ as the input and a pre-defined seed budget $k\in \mathbb{N}^{+}$, IM methods attempt to optimize the selection of a $k$-sized source node set $\Omega$ to maximize the expected influence spread $|\sigma(\Omega)|$ which is the size of the final activated subgraph $\sigma(\Omega)$. Except for the heuristic methods such as PageRank, degree, and eigen-centrality, the simulation-based and proxy-based methods also require a known diffusion model $D$ such as IC, LT, SI, or SIR. Thus, when the budget requirement is met ($|\Omega| = k$), IM aims at finding a seed set $\Omega$ such that: $$ \Omega = argmax_{\Omega^*}|\sigma(\Omega^*|G, D)|. $$

Baselines

simulation: greedy, CELF, and CELF++,
proxy: PI ($\pi$), sigma, degree, and eigen-centrality
sketch: RIS, SKIM, IMM

Blocking Maximization

IBM problem can be either against a set of known diffusion seeds or to prevent any future diffusion on the network. On the one hand, when graph structure $G = (V, E, A)$, diffusion model $D$, and the seed set $\Omega$ are known inputs, an IBM method aims to find part of $G$, namely $\Delta G$ such that the expected influence spread $|\sigma(\Omega)|$ is minimized if $\Delta G$ is blocked before the diffusion starts. Note that $\Delta G$ can be a set of nodes or edges, and there is usually a budget constraining the size of $\Delta G$. On the other hand, if the diffusion seed set is unknown, the goal of an IBM method is to minimize the expected influence spread of any diffusion starting from any set of seeds. This is usually achieved by removing $\Delta G$ from $G$ to weaken its connectivity. Annotating $G_{new} = G - \Delta G$, we have $$ \Delta G = argmin_{\Delta G^*}|\sigma(\Omega|G_{new}, D)| $$

Baselines

Source Localization

Given a snapshot of graph $G = (V, E, A)$ as the observation of the diffusion status at a certain time, the activated subgraph $\sigma$ of the snapshot is the propagation result of an unknown seed set $\Omega$. The goal of an SL method is to find out $\Omega$ based on the activated subgraph $\sigma$, the graph structure $G$, and the diffusion model $D$. It is achieved by finding the most likely seed set resulting in the activated subgraph $\sigma$. $$ \Omega = argmax_{\Omega^*}Pr(\Omega^*|\sigma, G, D) $$

Baselines

NETSLEUTH (Legacy and Fast versions)
Jordan Centrality
LISN

Planned Extension

Comparative Influence Maximation
Dynamic Influence Maximization
Percolation and Synchronization
Computational Fluid Mechanism
Multilayer Dynamics
Higher-order Dynamics (hypergraph and simplicial complex)
Multi-Flow

Last updated on Jun 14, 2023