Matrix-chain multiplication

We are given a sequence (chain) (M1, M2, M3…Mn) of n matrices to be multiplied and we need to find the most efficient way to multiply these matrices together. Now, if the chain of matrices is M1*M2*M3*M4, then it could be evaluated in various distinct ways without affecting the final product:

(M1(M2(M3*M4)))
(M1((M2*M3)M4))

As we know, that the cost to multiply two matrices A [p][q] and B[q][r] is pqr. Also matrix-multiplication could only be performed if number of columns of A is equal to number of rows of B.

[1] Now, the way we parenthesize a chain of matrices can have dramatic impact on the cost of evaluating the product. Consider the following chain of matrices (M1, M2, M3) with dimensions as M1(10 x 100), M2(100 x 5) and M3(5 x 50):

  1. Let’s multiply in this order (M1(M2*M3)). It leads to:

100*5*50 + 10*100*50 = 25000 + 50000 = 75000 computational operations

2. Let’s multiply in the following order ((M1*M2)*M3). It leads to:

10*100*5 + 10*5*50 = 5000 + 2500 = 7500 operations

Now, we have to find the most efficient way to multiply a chain of matrices so that we have minimum number of computational operations. This problem could be solved by Dynamic programming paradigm.

Optimal substructure

Let M[i][j] = min (i<=k<=j) {M[i, k] + M[k+1, j] + p_i-1*p_k*p_j},
where M[i][j] is the minimum number of multiplication operations needed for matrices from i to j 
Let the matrices be:
M1 = 5 x 4
M2 = 4 X 6
M3 = 6 x 2
M4 = 2 x 7
Matrix Chain Multiplication

Pseudocode and Iterations

Let p = {5, 4, 6, 2, 7}, for matrices M1(5×4), M2(4×6), M3(6×2), M4(2×7). Also M[i][i] = {0} ie M[1][1] = M[2][2] = M[3][3] = M[4][4] = 0 (Base case, no operation for matrix multiplying with itself).

Referencing the image above, we need to compute M[1][4]:

Using M[i][j] = min (i<=k<j) {M[i, k] + M[k+1, j] + p_i-1*p_k*p_j}
1. M[1][2] = min (1<=k<2) {M[1, 1] + M[2, 2] + p_0*p_1*p_2}
           = min {0 + 0 + 120} = 120
2. M[2][3] = min (2<=k<3) {M[2, 2] + M[3, 3] + p_1*p_2*p_3}
           = min {0 + 0 + 48} = 48
3. M[3][4] = min (3<=k<4) {M[3, 3] + M[4, 4] + p_2*p_3*p_4}
           = min {0 + 0 + 84} = 84
4. M[1][3] = min (1<=k<3) {{M[1, 1] + M[2, 3] + p_0*p_1*p_3},
                           {M[1, 2] + M[3,3] + p_0*p_2*p_3}}
           = min {{0 + 48 + 40}, {120 + 0 + 60}} = {88, 180} 
           = 88 
5. M[2][4] = min (2<=k<4) {{M[2, 2] + M[3, 4] + p_1*p_2*p_4},
                           {M[2, 3] + M[4,4] + p_1*p_3*p_4}}
          = min {{0 + 84 + 168}, {48 + 0 + 56}} = {252, 104} 
          = 104 
6. M[1][4] = min (1<=k<4) {{M[1, 1] + M[2, 4] + p_0*p_1*p_4},
                           {M[1, 2] + M[3,4] + p_0*p_2*p_4},
                           {M[1, 3] + M[4,4] + p_0*p_3*p_4}}
           = min {{0 + 104 + 140}, {120 + 84 + 210}, 
               {88 + 0 + 70}} = {244, 414, 158} 
           = 158

Matrix-chain multiplication solution

Code Implementation

//
//  main.cpp
//  Matrix Chain Multiplication
//
//  Created by Himanshu on 17/09/21.
//

#include <iostream>

#include <iostream>
using namespace std;
const int N = 5;

void MatrixChainMultiplication (int p[]) {
 
    int M[N][N];
    for (int i=0; i<N; i++) {
        for (int j=0; j<N; j++) {
            if (i == j) {
                M[i][j] = 0;
            } else {
                M[i][j] = INT_MAX;
            }
        }
    }

    // c is the length of chained matrices
    for (int c=2; c<N; c++) {
        for (int i=1; i<(N-c+1); i++) {
            int j = i+c-1;
            for (int k=i; k<j; k++) {
                M[i][j] = min(M[i][j], (M[i][k] + M[k+1][j] + p[i-1]*p[k]*p[j]));
            }
        }
    }
    
    cout<<M[1][N-1]<<endl;
    
}
 
int main() {
    //For matrices M1(5x4), M2(4x6), M3(6x2), M4(2x7)
    int A[] = {5, 4, 6, 2, 7};
    MatrixChainMultiplication(A);
}


Output:

158

Here’s a working example:

https://ideone.com/oi9692

Reference:

[1] Introduction to Algorithms

Leave a Reply

Your email address will not be published.