Cost Minimization with HPDFG and Data Mining for Heterogeneous DSP

Journal of Signal Processing Systems

Abstract

Cost minimization and execution-time reduction have become two of the most important issues in today’s real-time embedded systems. For DSP (Digital Signal Processing) applications running on embedded systems, loops are the most critical part for performance optimization. To optimize loop iteration patterns, we need to schedule the loop execution order. Because task execution times are uncertain, we model them as random variables and propose a novel data-flow graph model, called HPDFG (Heterogeneous Probabilistic Data-Flow Graph), to represent DSP applications on embedded systems. A novel algorithm, LSHAPE, is proposed to minimize cost while satisfying timing constraints. First, we use data mining methods to estimate the probability distributions of the execution-time variables. Second, we rotate the loops in the application to explore different possible execution patterns. Finally, we combine list scheduling and dynamic programming to generate a near-optimal task allocation and core-mode assignment. Experimental results demonstrate the effectiveness of our algorithm. Our approach can handle loops efficiently.
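
The underlying idea of treating execution times as random variables and picking the cheapest core modes that meet a deadline with a required probability can be illustrated with a small sketch. This is not the LSHAPE algorithm: it uses a plain frequency count in place of the data mining step, assumes independent tasks executed back to back, skips loop rotation, and replaces the list-scheduling and dynamic-programming step with exhaustive search. The mode names, costs, and timing samples are hypothetical.

```python
from collections import Counter
from itertools import product


def empirical_pmf(samples):
    """Estimate a discrete distribution {time: probability} from observed times."""
    counts = Counter(samples)
    return {t: c / len(samples) for t, c in counts.items()}


def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for ta, pa in pmf_a.items():
        for tb, pb in pmf_b.items():
            out[ta + tb] = out.get(ta + tb, 0.0) + pa * pb
    return out


def prob_within(pmf, deadline):
    """Probability that the total time does not exceed the deadline."""
    return sum(p for t, p in pmf.items() if t <= deadline)


def cheapest_assignment(tasks, deadline, confidence):
    """Exhaustively try one mode per task for a chain of tasks.

    tasks: list of dicts mapping mode name -> (cost, pmf).
    Returns (total_cost, modes) for the cheapest assignment whose total
    time meets the deadline with at least the given probability, or None.
    """
    best = None
    for modes in product(*(t.keys() for t in tasks)):
        total_pmf, cost = {0: 1.0}, 0
        for task, mode in zip(tasks, modes):
            mode_cost, pmf = task[mode]
            cost += mode_cost
            total_pmf = convolve(total_pmf, pmf)
        feasible = prob_within(total_pmf, deadline) >= confidence
        if feasible and (best is None or cost < best[0]):
            best = (cost, modes)
    return best


# Hypothetical profiling data: observed execution times per task and core mode.
tasks = [
    {"fast": (10, empirical_pmf([1, 1, 1, 2])), "slow": (4, empirical_pmf([2, 2, 3, 3]))},
    {"fast": (8, empirical_pmf([1, 2, 2, 2])), "slow": (3, empirical_pmf([3, 3, 3, 4]))},
]

result = cheapest_assignment(tasks, deadline=5, confidence=0.9)
print(result)
```

Exhaustive search grows exponentially with the number of tasks, which is the kind of blow-up that motivates a dynamic-programming formulation for larger task graphs.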

Acknowledgements

This work was supported in part by the Research Fund of the State Key Laboratory of Software Development Environment BUAA SKLSDE-2010ZX-13, NSFC 61071061, and NSFC 60873241; the NSFC 61071061 and the University of Kentucky Start Up Fund; the National High-Tech R&D Plan of China 2009AA01Z123; the NSFC 61070001, RFEB Zhejiang Y200803333 and Y200909683, State Key Lab of High-end Server Storage Tech. 2009HSSA10, National Key Lab STASI, SFKPC 2009ZX01039-002-001-04, 2009ZX03001-016, 2009ZX03004-005.

Author information

Corresponding author

Correspondence to Meikang Qiu.

About this article

Cite this article

Niu, JW., Qiu, M., Wang, X. et al. Cost Minimization with HPDFG and Data Mining for Heterogeneous DSP. J Sign Process Syst 67, 213–228 (2012). https://doi.org/10.1007/s11265-010-0546-x

  • DOI: https://doi.org/10.1007/s11265-010-0546-x
