# Estimating Percolation in a Matrix Network

Percolation is a concept in statistical physics and probability theory that deals with the behavior of connected clusters in random networks. It has applications in various fields, including material science, hydrology, and computer science. Let’s discuss how to estimate the percolation using dynamic connectivity algorithms.

###### What is Percolation?

Percolation refers to the movement and filtering of fluids through porous materials, such as soil, rocks, or networks of interconnected nodes. In the context of percolation theory, we often consider a grid of sites arranged in an N by N matrix, where each site can be either open or blocked. The objective is to study the emergence of a path of open sites from the top to the bottom of the grid, which signifies percolation.

###### Estimating Percolation
1. Setup: Create an N by N grid of sites, where each site can be either open (allowing fluid flow) or blocked (preventing fluid flow).
2. Percolation: Determine if there exists a path of open sites from the top row to the bottom row of the grid. If such a path exists, we say that percolation has occurred.
3. Estimation: Estimate the percolation, which is the probability at which the system transitions from being mostly blocked to mostly percolated.
###### Dynamic Connectivity Algorithm

Dynamic connectivity algorithms provide an efficient way to model and analyze percolation in a grid of sites. One such algorithm is the union-find (or disjoint-set) data structure, which efficiently maintains a collection of disjoint sets and supports operations to merge sets and determine connectivity between elements.

Union-Find Algorithms are discussed in detail in these posts:

Note: The percolation problem can also be solved by using a BFS (Breadth-First Search) like algorithm, similar to solving the “rat in a maze” problem. It involves traversing the grid using BFS, exploring neighboring cells and checking if there is a path from the top row to the bottom row.

While BFS is an efficient algorithm, its time complexity can be higher compared to the union-find algorithm, especially for larger grids and use-cases where the grid may change dynamically. This is due to the need to explore all possible paths. In general, BFS might be better for smaller, more static scenarios or where you need simple implementations without additional data structures.

###### Pseudocode
Percolation Simulation Pseudocode:

1. Initialize a grid of size N by N, where each cell represents a site.

2. Create a Union-Find data structure to manage connectivity between sites.

3. Initialize all sites as blocked.

4. Repeat the following steps to detect percolation:
a. Randomly choose a site to open.
b. Open the chosen site.
c. Connect the opened site to its neighboring open sites.
d. Check if there is a path from any site in the top row to any site in the bottom row using the Union-Find data structure.
Union-Find Pseudocode:

Initialize(N * N):
Create an array id[] of size N x N to represent each site.
Initialize each site's id to its index (i.e., id[i] = i).

Find(p):
Return the root (id of the set p belongs to) of the
component containing site p.

Union(p, q):
Connect the root of the component containing site p
to the root of the component containing site q.
(Set id[root(p)] = root(q)).

isOpen(row, col):
Return true if the site at (row, col) is open.

percolates():
Check if any site in the top row is connected
to any site in the bottom row.
###### Virtual Top Site & Virtual Bottom Site Optimisation

In the original approach where we don’t use virtual sites, the percolates() method checks if there exists a path of open sites from the top row to the bottom row. For this we iterate through each site in the top row and check if it is connected to any site in the bottom row.

percolates() method for this approach would be:

bool percolates (int N) {

for (int i = 0; i < N; ++i) {
for (int j = 0; j < N; ++j) {
// Check if a cell in the top row
// is connected to any cell in the bottom row
if (sitesSet.connected(i, (N - 1) * N + j)) {
return true;
}
}
}

return false; // Doesn't percolate if no such path exists
}

In the optimised approach, we connect all top row sites to a virtual top site (identified by the index number: 0) and all bottom row sites to a virtual bottom site (identified by the index number: N*N + 1).
And to check for percolation, we only need to know if the virtual top site is connected to the virtual bottom site. This approach reduces the number of iterations needed to check for percolation, making it more efficient compared to iterating through all sites in the top and bottom row to find if the grid percolates.

Here’s how the percolates() method would look for the optimised approach:

bool percolates() {
// Percolates if virtual top site is
// connected to virtual bottom site
return sitesSet.connected(0, N * N + 1);
}

In terms of implementation, the modified approach simplifies the code by requiring only one connect call to each virtual site, thus reducing the complexity of the code and potentially improving performance.

Code Implementation

//
//  main.cpp
//  Percolation
//
//  Created on 21/03/24.
//

#include <iostream>
#include <vector>
using namespace std;

class UnionFind {
private:
vector<int> id;

int find (int p) {
while (p != id[p]) {
id[p] = id[id[p]]; // path compression
p = id[p];
}

return p;
}

public:
UnionFind (int n) {
id.resize(n);

for (int i = 0; i < n; i++) {
id[i] = i;
}
}

bool connected (int p, int q) {
return find(p) == find(q);
}

void makeUnion (int p, int q) {
int rootP = find(p);
int rootQ = find(q);

if (rootP == rootQ) return;

id[rootP] = rootQ;
}
};

class Percolation {
private:
int N;
vector<bool> openSites;
UnionFind sitesSet;

public:
Percolation (int n) : N(n), sitesSet(n * n + 2) { // Initialization List
openSites.resize(n * n + 2, false);
openSites[0] = true;           // setting virtual top site as open
openSites[n*n + 1] = true;     // setting virtual bottom site as open
}

int toIndex (int row, int col) {   // flattening the 2D-matrix into a 1D-array
return row * N + col + 1;
}

bool isValid (int row, int col) {
return row >= 0 && row < N && col >= 0 && col < N;
}

void open (int row, int col) {

if (!isValid(row, col)) {
throw invalid_argument("Invalid row or column indices.");
}

int index = toIndex(row, col);

if (!isOpen(row, col)) {
openSites[index] = true;

if (row > 0 && isOpen(row - 1, col)) {
sitesSet.makeUnion(index, toIndex(row - 1, col));
}
if (row < N - 1 && isOpen(row + 1, col)) {
sitesSet.makeUnion(index, toIndex(row + 1, col));
}
if (col > 0 && isOpen(row, col - 1)) {
sitesSet.makeUnion(index, toIndex(row, col - 1));
}
if (col < N - 1 && isOpen(row, col + 1)) {
sitesSet.makeUnion(index, toIndex(row, col + 1));
}
if (row == 0) {
sitesSet.makeUnion(index, 0); // connecting site to the virtual top site
}
if (row == N - 1) {
sitesSet.makeUnion(index, N * N + 1); // connecting site to the virtual bottom site
}
}
}

bool isOpen (int row, int col) {
if (!isValid(row, col)) {
throw invalid_argument("Invalid row or column indices.");
}
return openSites[toIndex(row, col)];
}

bool percolates() {
return sitesSet.connected(0, N * N + 1);
}
};

int main() {
int N = 5;
Percolation perc(N);

perc.open(0, 0);
perc.open(0, 1);
perc.open(0, 3);
perc.open(1, 3);
perc.open(2, 3);
perc.open(2, 4);
perc.open(3, 1);
perc.open(3, 2);
perc.open(4, 0);
perc.open(4, 1);
perc.open(4, 3);
perc.open(4, 4);

cout << "Percolates: " << (perc.percolates() ? "Yes" : "No") << endl;

return 0;
}

Output

Percolates: No

Time complexity: O(N2 * N2), N x N is the size of the grid.
Auxiliary Space: O(N2), for Grid and Union-Find data structure

Explanation

Time Complexity
Initialization:
O(N2) – Initializing the grid of size N by N.
Finding the Root: O(N2) – Finding the root of a component (grid cell) in the Union-Find data structure typically takes linear time with respect to the number of components.
Union Operation: O(N2) – Connecting two components (grid cells) by updating their roots takes linear time with respect to the number of components.

Thus, overall Time Complexity is O(N2 * N2). Since, the number of grid cells is N2. This could be reduced to O(N2 * log N) if we use Weighted Quick Union algorithm, which is discussed in the above mentioned article.

Space Complexity
Grid: O(N2) – The grid itself requires space proportional to the square of the grid size.
Union-Find Data Structure: O(N2) – The Union-Find data structure also requires space proportional to the number of sites, which is N2.

Thus, overall Space Complexity is O(N2).