DOI: 10.1145/1162618.1162626

The Mercury system: exploiting truly fast hardware for data search

Published: 28 September 2003

Abstract

In many data mining applications, the database is not only extremely large but also growing rapidly. Even for relatively simple searches, the time required to move the data off magnetic media, across the system bus into main memory, and into processor cache, and then to execute code to perform the search, is prohibitive. We are building a system in which a significant portion of the data mining task (i.e., the portion that examines the bulk of the raw data) is implemented in fast hardware, close to the magnetic media on which the data is stored. Furthermore, this hardware can be replicated, allowing mining tasks to be performed in parallel and providing further speedup for the overall mining application. In this paper, we describe a general framework under which this can be accomplished and provide initial performance results for a set of applications.
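
The bottleneck argument in the abstract can be illustrated with a rough throughput model. The sketch below is not taken from the paper: it assumes that each stage of the data path streams concurrently, so the slowest stage sets the sustained rate, and that replicated search hardware scales linearly with the number of devices. All function names and parameters are hypothetical.

```python
def host_search_rate(disk_mb_s, bus_mb_s, memory_mb_s, cpu_mb_s):
    """Sustained rate (MB/s) of a conventional, host-based search.

    Assumption: data flows through disk, bus, memory, and CPU as a
    pipeline, so the slowest stage limits the overall rate.
    """
    return min(disk_mb_s, bus_mb_s, memory_mb_s, cpu_mb_s)


def near_disk_search_rate(disk_mb_s, hardware_mb_s, replicas=1):
    """Sustained rate (MB/s) when search hardware sits next to each disk.

    Assumption: each disk/hardware pair works independently, so the
    aggregate rate grows linearly with the number of replicas.
    """
    per_device = min(disk_mb_s, hardware_mb_s)
    return replicas * per_device


def scan_time_s(dataset_mb, rate_mb_s):
    """Time in seconds to examine the whole dataset at a given rate."""
    return dataset_mb / rate_mb_s


if __name__ == "__main__":
    # Illustrative figures only; none of them come from the paper.
    dataset_mb = 1_000_000
    host = host_search_rate(disk_mb_s=50, bus_mb_s=100,
                            memory_mb_s=400, cpu_mb_s=20)
    near = near_disk_search_rate(disk_mb_s=50, hardware_mb_s=200,
                                 replicas=8)
    print("host scan time (s):", scan_time_s(dataset_mb, host))
    print("near-disk scan time (s):", scan_time_s(dataset_mb, near))
```

Under this model the host-based search is bound by its slowest stage regardless of how fast the disks are, while the near-disk search is bound only by the disk or the hardware, multiplied by the number of replicas.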

Published In

SNAPI '03: Proceedings of the international workshop on Storage network architecture and parallel I/Os
September 2003
83 pages
ISBN: 9781450378215
DOI: 10.1145/1162618
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. application-specific processing
  2. database systems
  3. reconfigurable hardware

Qualifiers

  • Article

Cited By

  • (2017) Reconfigurable acceleration of genetic sequence alignment: A survey of two decades of efforts. 2017 27th International Conference on Field Programmable Logic and Applications (FPL), pp. 1-8. DOI: 10.23919/FPL.2017.8056838. Online publication date: Sep-2017
  • (2017) Design Space Exploration of 2-D Processor Array Architectures for Similarity Distance Computation. IEEE Transactions on Parallel and Distributed Systems, 28:8, pp. 2218-2228. DOI: 10.1109/TPDS.2017.2656900. Online publication date: 1-Aug-2017
  • (2014) Rapid Prototyping and Evaluation of Intelligence Functions of Active Storage Devices. IEEE Transactions on Computers, 63:9, pp. 2356-2368. DOI: 10.1109/TC.2013.101. Online publication date: Sep-2014
  • (2009) Optimal runtime reconfiguration strategies for systolic arrays. 2009 International Conference on Field Programmable Logic and Applications, pp. 162-167. DOI: 10.1109/FPL.2009.5272515. Online publication date: Aug-2009
  • (2009) Similarity Computation Using Reconfigurable Embedded Hardware. Proceedings of the 2009 Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing, pp. 323-329. DOI: 10.1109/DASC.2009.79. Online publication date: 12-Dec-2009
  • (2009) Acceleration of ungapped extension in Mercury BLAST. Microprocessors & Microsystems, 33:4, pp. 281-289. DOI: 10.1016/j.micpro.2009.02.007. Online publication date: 1-Jun-2009
  • (2008) Mercury BLASTP. ACM Transactions on Reconfigurable Technology and Systems, 1:2, pp. 1-44. DOI: 10.1145/1371579.1371581. Online publication date: 1-Jun-2008
  • (2008) A Cost-Effective Guarantee of Security and Scalability on HVEM DataGrid with Active Disk. 2008 32nd Annual IEEE International Computer Software and Applications Conference, pp. 409-416. DOI: 10.1109/COMPSAC.2008.28. Online publication date: Jul-2008
  • (2008) Hardware acceleration for similarity computations of feature vectors. Canadian Journal of Electrical and Computer Engineering, 33:1, pp. 21-30. DOI: 10.1109/CJECE.2008.4621791. Online publication date: Dec-2009
  • (2008) Parallel Computation of Similarity Measures Using an FPGA-Based Processor Array. Proceedings of the 22nd International Conference on Advanced Information Networking and Applications, pp. 955-962. DOI: 10.1109/AINA.2008.97. Online publication date: 25-Mar-2008
