research-article

SPECTRUM: a software defined predictable many-core architecture for LTE baseband processing

Authors:
Vanchinathan Venkataramani, National University of Singapore, Singapore
Aditi Kulkarni, National University of Singapore, Singapore
Tulika Mitra, National University of Singapore, Singapore
Li-Shiuan Peh, National University of Singapore, Singapore

LCTES 2019: Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems, June 2019, Pages 82–96. https://doi.org/10.1145/3316482.3326352

Published: 23 June 2019

ABSTRACT

Wireless communication standards such as Long Term Evolution (LTE) are changing rapidly to support the high data rates of wireless devices. Physical layer baseband processing has strict real-time deadlines, especially in the next-generation applications enabled by the 5G standard. Existing base station transceivers use customized Digital Signal Processing (DSP) cores or fixed-function hardware accelerators for physical layer baseband processing. However, these approaches incur significant non-recurring engineering costs and cannot easily adapt to newer standards or updates. Software-programmable processors offer more adaptability, but it is challenging to sustain guaranteed worst-case latency and throughput at reasonably low power on shared-memory many-core architectures that feature inherently unpredictable design choices, such as caches and networks-on-chip. We propose SPECTRUM, a predictable, software-defined many-core architecture that exploits the massive parallelism of LTE baseband processing. The focus is on designing scalable, lightweight hardware that can be programmed and defined by sophisticated software mechanisms. SPECTRUM employs hundreds of lightweight in-order cores augmented with custom instructions that provide predictable timing, a purely software-scheduled on-chip network that orchestrates communication to avoid any contention, and per-core software-controlled scratchpad memory with deterministic access latency. Compared to a many-core architecture like Skylake-SP (average power 215 W) that drops 14% of packets at high traffic load, a 256-core SPECTRUM by definition has a zero packet drop rate at a significantly lower average power of 24 W. SPECTRUM consumes 2.11x lower power than a C66x DSP cores + accelerator platform in baseband processing. SPECTRUM is also well-positioned to support future 5G workloads.
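To make the idea of a software-scheduled, contention-free on-chip network more concrete, the following is a minimal sketch of how a static schedule could be checked for link conflicts before deployment. It is not SPECTRUM's scheduler: the 2D mesh topology, the X-then-Y routing, the one-hop-per-cycle timing, and the names `xy_route` and `find_conflicts` are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]   # (x, y) coordinate on an assumed 2D mesh
Link = Tuple[Node, Node]  # directed hop between two adjacent nodes
Transfer = Tuple[int, Node, Node]  # (start_cycle, source, destination)


def xy_route(src: Node, dst: Node) -> List[Link]:
    """Return the hops of a dimension-ordered route: X first, then Y."""
    hops: List[Link] = []
    x, y = src
    while x != dst[0]:
        next_x = x + (1 if dst[0] > x else -1)
        hops.append(((x, y), (next_x, y)))
        x = next_x
    while y != dst[1]:
        next_y = y + (1 if dst[1] > y else -1)
        hops.append(((x, y), (x, next_y)))
        y = next_y
    return hops


def find_conflicts(transfers: List[Transfer]) -> List[Tuple[int, Link]]:
    """List every (cycle, link) pair claimed by more than one transfer.

    Assumes each transfer advances exactly one hop per cycle, so the
    i-th hop of a route occupies its link at cycle start_cycle + i.
    """
    usage: Dict[Tuple[int, Link], int] = {}
    for start, src, dst in transfers:
        for i, link in enumerate(xy_route(src, dst)):
            key = (start + i, link)
            usage[key] = usage.get(key, 0) + 1
    return sorted(key for key, count in usage.items() if count > 1)


if __name__ == "__main__":
    # Two transfers that collide on link (1,0)->(2,0) at cycle 1.
    clashing = [(0, (0, 0), (2, 0)), (1, (1, 0), (2, 0))]
    print(find_conflicts(clashing))
    # Delaying the second transfer by one cycle removes the collision.
    staggered = [(0, (0, 0), (2, 0)), (2, (1, 0), (2, 0))]
    print(find_conflicts(staggered))
```

Under these assumptions, a schedule for which `find_conflicts` returns an empty list never has two transfers on the same link in the same cycle, so transfer latency is fixed by the schedule itself rather than by run-time arbitration.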

Index Terms

1. Computer systems organization
  1. Real-time systems
    1. Real-time system architecture

Published in

LCTES 2019: Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems
June 2019, 218 pages
ISBN: 9781450367240
DOI: 10.1145/3316482
General Chair: Jian-Jia Chen, TU Dortmund, Germany
Program Chair: Aviral Shrivastava, Arizona State University, USA
Copyright © 2019 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
5G
LTE
Time-predictable architecture
baseband processing
low-power
many-cores
Acceptance Rates
Overall Acceptance Rate: 116 of 438 submissions, 26%