Task
List of Benchmark Tasks
Tasks are list below with implemented baselines.
Influence Maximization
Given the graph structure information $G = (V, E, A)$ as the input and a pre-defined seed budget $k\in \mathbb{N}^{+}$, IM methods attempt to optimize the selection of a $k$-sized source node set $\Omega$ to maximize the expected influence spread $|\sigma(\Omega)|$ which is the size of the final activated subgraph $\sigma(\Omega)$. Except for the heuristic methods such as PageRank, degree, and eigen-centrality, the simulation-based and proxy-based methods also require a known diffusion model $D$ such as IC, LT, SI, or SIR. Thus, when the budget requirement is met ($|\Omega| = k$), IM aims at finding a seed set $\Omega$ such that: $$ \Omega = argmax_{\Omega^*}|\sigma(\Omega^*|G, D)|. $$
Baselines
- simulation: greedy, CELF, and CELF++,
- proxy: PI ($\pi$), sigma, degree, and eigen-centrality
- sketch: RIS, SKIM, IMM
Blocking Maximization
IBM problem can be either against a set of known diffusion seeds or to prevent any future diffusion on the network. On the one hand, when graph structure $G = (V, E, A)$, diffusion model $D$, and the seed set $\Omega$ are known inputs, an IBM method aims to find part of $G$, namely $\Delta G$ such that the expected influence spread $|\sigma(\Omega)|$ is minimized if $\Delta G$ is blocked before the diffusion starts. Note that $\Delta G$ can be a set of nodes or edges, and there is usually a budget constraining the size of $\Delta G$. On the other hand, if the diffusion seed set is unknown, the goal of an IBM method is to minimize the expected influence spread of any diffusion starting from any set of seeds. This is usually achieved by removing $\Delta G$ from $G$ to weaken its connectivity. Annotating $G_{new} = G - \Delta G$, we have $$ \Delta G = argmin_{\Delta G^*}|\sigma(\Omega|G_{new}, D)| $$
Baselines
Source Localization
Given a snapshot of graph $G = (V, E, A)$ as the observation of the diffusion status at a certain time, the activated subgraph $\sigma$ of the snapshot is the propagation result of an unknown seed set $\Omega$. The goal of an SL method is to find out $\Omega$ based on the activated subgraph $\sigma$, the graph structure $G$, and the diffusion model $D$. It is achieved by finding the most likely seed set resulting in the activated subgraph $\sigma$. $$ \Omega = argmax_{\Omega^*}Pr(\Omega^*|\sigma, G, D) $$
Baselines
- NETSLEUTH (Legacy and Fast versions)
- Jordan Centrality
- LISN
Planned Extension
- Comparative Influence Maximation
- Dynamic Influence Maximization
- Percolation and Synchronization
- Computational Fluid Mechanism
- Multilayer Dynamics
- Higher-order Dynamics (hypergraph and simplicial complex)
- Multi-Flow