Matrix Multiplication: Algorithm

Theory

In vector matrix multiplication, the number of columns of first matrix should be equal to the number of rows of the second matrix.
If matrix A having m columns and n rows is multiplied with matrix B having n rows and m columns, then the product of the two matrices C is a square matrix of m rows and m columns.

The general rule for product of two matrices is given by

(AB i,j) mxm = ∑(A i,k) mxn(B k,j) nxm

The summation ranges over the interval 1≤ i ≤ m.

In block matrix multiplication, each matrix is divided into blocks of equal sizes. This technique proves to be very efficient for parallel matrix multiplication as each node can compute the product of the individual blocks.
Block matrix multiplication is used in Strassen’s algorithm for fast matrix multiplication.

Flowchart

mmflowchart.jpg

Parallel Matrix Multiplication

Implementation

The initialization of the program variables and MPI environment is done initially. The code snippet given below demonstrates how the matrix row data is exchanged between client and server nodes. The server node generates the matrices, creates the divisions based on the ‘rank’ of each node and sends the block data to the client nodes.

if(rank==0) {       
start = 0;
end = (msize/nproc);
MPI_Bcast(&matrix2[0][0],(msize*msize*3)+40,MPI_INT,0,MPI_COMM_WORLD);
for(i=0;i<nproc;i++){
    start = unit * i;
    end = unit * (i+1);
    if(i!=0){
        MPI_Send(&start,1,MPI_INT,i,1,MPI_COMM_WORLD);
        MPI_Send(&end,1,MPI_INT,i,2,MPI_COMM_WORLD);
        MPI_Send(&matrix1[start][0],
              (msize*unit*3)+8,
                MPI_INT,i,3,MPI_COMM_WORLD );
    }
    }
}
else{
    MPI_Bcast(&matrix2[0][0],(msize*msize*3)+40,MPI_INT,0,MPI_COMM_WORLD);
    MPI_Recv(&start,1,MPI_INT,0,1,MPI_COMM_WORLD,&s);
    MPI_Recv(&end,1,MPI_INT,0,2,MPI_COMM_WORLD,&s);
    MPI_Recv(&matrix1[start][0],(msize*unit*3)+8,MPI_INT,0,3,MPI_COMM_WORLD,&s);
}

The code given below demonstrates the actual matrix multiplication taking place on the client nodes. The results are then sent back to the server node.

start = unit * rank;
    end = unit * (rank+1);

    MPI_Barrier(MPI_COMM_WORLD);

    for(i=start;i<end;i++) {
        for(j=0;j<msize;j++) {
            for(k=0;k<msize;k++) {
            matrix[i][j] = matrix[i][j]+(matrix1[i][k] * matrix2[k][j]);
            }
        }
    }
    if(rank!=0)
        MPI_Send(&matrix[start][0],(msize*unit*2)+6,MPI_INT,0,7,MPI_COMM_WORLD);
The server node assembles the results sent by the client nodes. 
    if(rank==0){
        for(i=1;i<nproc;i++){
            start = unit * i;
            if(i!=0){
                    MPI_Recv(&matrix[start][0],                                         (msize*unit*2)+6,
                    MPI_INT,i,7,MPI_COMM_WORLD,&s);
        }
    }
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-Share Alike 2.5 License.