DOI: 10.1145/3316482.3326352
research-article

SPECTRUM: a software defined predictable many-core architecture for LTE baseband processing

Published: 23 June 2019

ABSTRACT

Wireless communication standards such as Long Term Evolution (LTE) are changing rapidly to support the high data rates of wireless devices. Physical layer baseband processing has strict real-time deadlines, especially in the next-generation applications enabled by the 5G standard. Existing base station transceivers use customized Digital Signal Processing (DSP) cores or fixed-function hardware accelerators for physical layer baseband processing. However, these approaches incur significant non-recurring engineering costs and cannot easily adapt to newer standards or updates. Software-programmable processors offer more adaptability, but it is challenging to sustain guaranteed worst-case latency and throughput at reasonably low power on shared-memory many-core architectures that feature inherently unpredictable design choices, such as caches and a network-on-chip. We propose SPECTRUM, a predictable software-defined many-core architecture that exploits the massive parallelism of LTE baseband processing. The focus is on designing scalable, lightweight hardware that can be programmed and defined by sophisticated software mechanisms. SPECTRUM employs hundreds of lightweight in-order cores augmented with custom instructions that provide predictable timing, a purely software-scheduled on-chip network that orchestrates communication to avoid any contention, and a per-core software-controlled scratchpad memory with deterministic access latency. Compared to a many-core architecture like Skylake-SP (average power 215W), which drops 14% of packets at high traffic load, a 256-core SPECTRUM by definition has a zero packet drop rate at a significantly lower average power of 24W. SPECTRUM consumes 2.11x less power than a C66x DSP cores + accelerator platform in baseband processing. SPECTRUM is also well-positioned to support future 5G workloads.

    Published in

      LCTES 2019: Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems
      June 2019
      218 pages
      ISBN: 9781450367240
      DOI: 10.1145/3316482

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 116 of 438 submissions, 26%
