Parallel sparse LU decomposition using FPGA with an efficient cache architecture | IEEE Conference Publication