Model-driven Level 3 BLAS Performance Optimization on Loongson 3A Processor (IEEE Conference Publication)