Abstract
This article addresses energy optimization for real-time streaming applications on multiprocessor System-on-Chip platforms. We combine task-level coarse-grained software pipelining with DVS (Dynamic Voltage Scaling) and DPM (Dynamic Power Management), taking into account transition overhead, inter-core communication, and discrete voltage levels. We propose a two-phase approach. In the first phase, a coarse-grained task parallelization algorithm called RDAG transforms a periodic dependent task graph into a set of independent tasks by exploiting the periodic nature of streaming applications. In the second phase, a scheduling algorithm called GeneS optimizes energy consumption. GeneS is a genetic algorithm that searches for the best schedule within the solution space generated by gene evolution. We conduct experiments with a set of benchmarks from E3S and TGFF. The experimental results show that our approach reduces energy consumption by 24.4% on average compared with previous work.
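To make the second phase more concrete, the following is a minimal sketch of a genetic search over schedules for independent tasks, as they would exist after a transformation such as RDAG. It is not the GeneS algorithm itself: the task cycle counts, the operating points in `LEVELS`, the period, and the names `cost` and `evolve` are all hypothetical, and the sketch ignores transition overhead and inter-core communication, which the article's approach does take into account. Each gene assigns one task to a core and a discrete voltage/frequency level, and a schedule is feasible only if every core finishes within the period.

```python
import random

# Hypothetical discrete operating points: (frequency in MHz, power in mW).
LEVELS = [(200, 80.0), (400, 250.0), (600, 600.0)]
# Hypothetical worst-case cycle counts (millions of cycles), one per independent task.
TASKS = [30, 45, 20, 60, 25, 40]
CORES = 2
PERIOD = 0.4  # seconds available per iteration of the stream (assumed)


def exec_time(cycles, level):
    """Execution time in seconds: Mcycles / MHz."""
    freq_mhz, _ = LEVELS[level]
    return cycles / freq_mhz


def cost(schedule):
    """Total energy in mJ, or infinity if any core overruns the period."""
    busy = [0.0] * CORES
    energy = 0.0
    for cycles, (core, level) in zip(TASKS, schedule):
        t = exec_time(cycles, level)
        busy[core] += t
        energy += LEVELS[level][1] * t  # mW * s = mJ
    return energy if max(busy) <= PERIOD else float("inf")


def evolve(generations=200, pop_size=30, seed=0):
    """Return (best schedule found, baseline schedule)."""
    rng = random.Random(seed)
    n = len(TASKS)
    # Baseline: round-robin placement, every task at the fastest level.
    baseline = [(i % CORES, len(LEVELS) - 1) for i in range(n)]
    pop = [baseline] + [
        [(rng.randrange(CORES), rng.randrange(len(LEVELS))) for _ in range(n)]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        pop.sort(key=cost)
        nxt = pop[:2]  # elitism: the two best schedules survive unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents, then uniform crossover.
            a, b = (min(rng.sample(pop, 3), key=cost) for _ in range(2))
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: reassign one randomly chosen task.
            i = rng.randrange(n)
            child[i] = (rng.randrange(CORES), rng.randrange(len(LEVELS)))
            nxt.append(child)
        pop = nxt
    return min(pop, key=cost), baseline


if __name__ == "__main__":
    best, baseline = evolve()
    print("baseline energy (mJ):", cost(baseline))
    print("best energy (mJ):", cost(best))
    print("best schedule (core, level):", best)
```

Because the baseline is seeded into the initial population and the best individuals are carried over each generation, the returned schedule is never worse than running every task at full speed. A fuller model would add energy and time penalties for voltage transitions and for data moved between cores.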
Index Terms
- Overhead-aware energy optimization for real-time streaming applications on multiprocessor System-on-Chip