A general scalable implementation of fast matrix multiplication algorithms on distributed memory computers | IEEE Conference Publication