Extracting speedup from C-code with poor instruction-level parallelism | IEEE Conference Publication | IEEE Xplore