We are given a sequence (chain) (M1, M2, M3…Mn) of n matrices to be multiplied and we need to find the most efficient way to multiply these matrices together. Now, if the chain of matrices is M1*M2*M3*M4, then it could be evaluated in various distinct ways without affecting the final product:
(M1*(M2*(M3*M4))) (M1*((M2*M3)*M4))
As we know, that the cost to multiply two matrices A[p][q] and B[q][r] is pqr. Also matrix-multiplication could only be performed if number of columns of A is equal to the number of rows of B.
[1] Now, the way we parenthesize a chain of matrices can have dramatic impact on the cost of evaluating the product. Consider the following chain of matrices (M1, M2, M3) with dimensions as M1(10 x 100), M2(100 x 5) and M3(5 x 50):
1. Let’s multiply in this order (M1*(M2*M3)). It leads to:
100*5*50 + 10*100*50 = 25000 + 50000 = 75000 computational operations
2. Let’s multiply in the following order ((M1*M2)*M3). It leads to:
10*100*5 + 10*5*50 = 5000 + 2500 = 7500 computational operations
Now, we have to find the most efficient way to multiply a chain of matrices so that we have minimum number of computational operations. This problem could be solved by Dynamic programming paradigm.
Optimal substructure
Let M[i][j] = min (i<=k<=j) {M[i, k] + M[k+1, j] + pi-1*pk*pj}, where M[i][j] is the minimum number of multiplication operations needed for matrices from i to j Let the matrices be: M1 = 5 x 4 M2 = 4 X 6 M3 = 6 x 2 M4 = 2 x 7
Pseudocode and Iterations
Let p = {5, 4, 6, 2, 7}, for matrices M1(5×4), M2(4×6), M3(6×2), M4(2×7). Also M[i][i] = {0} ie M[1][1] = M[2][2] = M[3][3] = M[4][4] = 0 (Base case, no operation for matrix multiplying with itself).
Referencing the image above, we need to compute M[1][4]:
Using M[i][j] = min (i<=k<j) {M[i, k] + M[k+1, j] + pi-1*pk*pj}
1. M[1][2] = min (1<=k<2) {M[1, 1] + M[2, 2] + p0*p1*p2} = min {0 + 0 + 120} = 120
2. M[2][3] = min (2<=k<3) {M[2, 2] + M[3, 3] + p1*p2*p3} = min {0 + 0 + 48} = 48
3. M[3][4] = min (3<=k<4) {M[3, 3] + M[4, 4] + p2*p3*p4} = min {0 + 0 + 84} = 84
4. M[1][3] = min (1<=k<3) {{M[1, 1] + M[2, 3] + p0*p1*p3}, {M[1, 2] + M[3,3] + p0*p2*p3}} = min {{0 + 48 + 40}, {120 + 0 + 60}} = {88, 180} = 88
5. M[2][4] = min (2<=k<4) {{M[2, 2] + M[3, 4] + p1*p2*p4}, {M[2, 3] + M[4,4] + p1*p3*p4}} = min {{0 + 84 + 168}, {48 + 0 + 56}} = {252, 104} = 104
6. M[1][4] = min (1<=k<4) {{M[1, 1] + M[2, 4] + p0*p1*p4}, {M[1, 2] + M[3,4] + p0*p2*p4}, {M[1, 3] + M[4,4] + p0*p3*p4}} = min {{0 + 104 + 140}, {120 + 84 + 210}, {88 + 0 + 70}} = {244, 414, 158} = 158
Matrix-chain multiplication solution
Code Implementation
//
// main.cpp
// Matrix Chain Multiplication
//
// Created by Himanshu on 17/09/21.
//
#include <iostream>
#include <iostream>
using namespace std;
const int N = 5;
void MatrixChainMultiplication (int p[]) {
int M[N][N];
for (int i=0; i<N; i++) {
for (int j=0; j<N; j++) {
if (i == j) {
M[i][j] = 0;
} else {
M[i][j] = INT_MAX;
}
}
}
// c is the length of chained matrices
for (int c=2; c<N; c++) {
for (int i=1; i<(N-c+1); i++) {
int j = i+c-1;
for (int k=i; k<j; k++) {
M[i][j] = min(M[i][j], (M[i][k] + M[k+1][j] + p[i-1]*p[k]*p[j]));
}
}
}
cout<<M[1][N-1]<<endl;
}
int main() {
//For matrices M1(5x4), M2(4x6), M3(6x2), M4(2x7)
int A[] = {5, 4, 6, 2, 7};
MatrixChainMultiplication(A);
}
Output:
158
Here’s a working example: Ideone
Reference
[1] Introduction to Algorithms
One thought on “Matrix-chain multiplication”