Research article
DOI: 10.1145/1555754.1555773

AnySP: anytime anywhere anyway signal processing

Published: 20 June 2009

Abstract

In the past decade, mobile devices have proliferated at a spectacular rate. There are now more than 3.3 billion active cell phones in the world, a device that we now all depend on in our daily lives. The current generation of devices employs a combination of general-purpose processors, digital signal processors, and hardwired accelerators to provide giga-operations-per-second performance on milliwatt power budgets. Such heterogeneous organizations are inefficient to build and maintain, and they waste silicon area and power. Looking forward to the next generation of mobile computing, computation requirements will increase by one to three orders of magnitude due to higher data rates, more complex algorithms, and greater computation diversity, but the power requirements will be just as stringent. Scaling existing approaches will not suffice; instead, the inherent computational efficiency, programmability, and adaptability of the hardware must change. To overcome these challenges, this paper proposes an example architecture, referred to as AnySP, for next-generation mobile signal processing. AnySP uses a co-design approach in which next-generation wireless signal processing and high-definition video algorithms are analyzed to create a domain-specific programmable architecture. At the heart of AnySP is a configurable single-instruction multiple-data (SIMD) datapath that is capable of processing wide vectors or multiple narrow vectors simultaneously. In addition, deeper computation subgraphs can be pipelined across the SIMD lanes. These three operating modes provide high throughput across varying application types. Results show that AnySP is capable of sustaining 4G wireless processing and high-definition video throughput rates, and will approach the 1000 Mops/mW efficiency barrier when scaled to 45 nm.
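
The three operating modes named in the abstract can be illustrated with a small functional sketch. This is not the AnySP implementation: the lane count `LANES`, the function names, and the choice of addition as the per-lane operation are all assumptions made here for illustration, and the model ignores timing, memory, and power entirely.

```python
from typing import Callable, List, Sequence

# Hypothetical lane count, chosen only for illustration (not taken from the abstract).
LANES = 8


def wide_vector_add(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Mode 1: one instruction operates on a single vector spanning all lanes."""
    assert len(a) == len(b) == LANES
    return [x + y for x, y in zip(a, b)]


def narrow_vectors_add(
    groups_a: Sequence[Sequence[int]], groups_b: Sequence[Sequence[int]]
) -> List[List[int]]:
    """Mode 2: the lanes are split into equal groups, each handling its own narrow vector."""
    assert len(groups_a) == len(groups_b)
    width = LANES // len(groups_a)  # lanes available to each narrow vector
    result = []
    for ga, gb in zip(groups_a, groups_b):
        assert len(ga) == len(gb) == width
        result.append([x + y for x, y in zip(ga, gb)])
    return result


def pipelined_subgraph(
    stream: Sequence[int], stages: Sequence[Callable[[int], int]]
) -> List[int]:
    """Mode 3: a chain of dependent operations, one stage per lane.

    Each value in the stream passes through the stages in order, as if the
    output of one lane were forwarded to the next. Only the functional
    result is modeled, not the cycle-by-cycle overlap of a real pipeline.
    """
    assert len(stages) <= LANES
    out = []
    for value in stream:
        for stage in stages:
            value = stage(value)
        out.append(value)
    return out
```

For example, `narrow_vectors_add([[1, 2, 3, 4], [5, 6, 7, 8]], [[1, 1, 1, 1], [2, 2, 2, 2]])` treats the eight hypothetical lanes as two independent four-wide vectors, while `pipelined_subgraph([1, 2, 3], [lambda v: v * 2, lambda v: v + 1])` runs a two-operation subgraph over a stream of scalars.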



Published In

ISCA '09: Proceedings of the 36th annual international symposium on Computer architecture
June 2009
510 pages
ISBN: 9781605585260
DOI: 10.1145/1555754
  • ACM SIGARCH Computer Architecture News, Volume 37, Issue 3
    June 2009
    495 pages
    ISSN: 0163-5964
    DOI: 10.1145/1555815
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. fully programmable architecture
  2. high-end signal processing
  3. low-power architecture
  4. simd
  5. single-instruction multiple-data parallelism
  6. software defined radio

Qualifiers

  • Research-article

Conference

ISCA '09
Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Cited By

  • (2024) Canalis: A Throughput-Optimized Framework for Real-Time Stream Processing of Wireless Communication. ACM Transactions on Reconfigurable Technology and Systems 17:4 (1-32). DOI: 10.1145/3695880. Online publication date: 22-Nov-2024
  • (2023) Vector Memory-Access Shuffle Fused Instructions for FFT-Like Algorithms. Chinese Journal of Electronics 32:5 (1077-1088). DOI: 10.23919/cje.2021.00.401. Online publication date: Sep-2023
  • (2019) VIP: A Versatile Inference Processor. 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA) (345-358). DOI: 10.1109/HPCA.2019.00049. Online publication date: Feb-2019
  • (2018) VideoCoreCluster. Wireless Communications & Mobile Computing 2018. DOI: 10.1155/2018/7470234. Online publication date: 1-Jan-2018
  • (2017) An efficient conflict-free memory-addressing unit for SIMD VLIW DSP. 2017 International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS) (1-7). DOI: 10.23919/SPECTS.2017.8046778. Online publication date: Jul-2017
  • (2017) XPro. ACM SIGARCH Computer Architecture News 45:2 (69-80). DOI: 10.1145/3140659.3080219. Online publication date: 24-Jun-2017
  • (2017) XPro. Proceedings of the 44th Annual International Symposium on Computer Architecture (69-80). DOI: 10.1145/3079856.3080219. Online publication date: 24-Jun-2017
  • (2017) Optimizing General-Purpose CPUs for Energy-Efficient Mobile Web Computing. ACM Transactions on Computer Systems 35:1 (1-31). DOI: 10.1145/3041024. Online publication date: 20-Mar-2017
  • (2016) A configurable SIMD architecture with explicit datapath for intelligent learning. 2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) (156-163). DOI: 10.1109/SAMOS.2016.7818343. Online publication date: Jul-2016
  • (2016) A low power software-defined-radio baseband processor for the Internet of Things. 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA) (40-51). DOI: 10.1109/HPCA.2016.7446052. Online publication date: Mar-2016
