Overview

Authors:

Bin Dong, Kesheng Wu, Suren Byna
Lawrence Berkeley National Laboratory, Berkeley, USA

FasTensor can achieve speedups of multiple orders of magnitude over Spark and other peer systems when executing big data analysis operations
FasTensor makes programming data analysis operations at large scale on supercomputers as productive and efficient as possible
The book is a complete reference for FasTensor, covering its theoretical foundations, C++ implementation, and usage in applications

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

1543 Accesses
1 Citation

About this book

This SpringerBrief introduces FasTensor, a parallel data programming model developed for big data applications, and provides a user's guide for installing and using it. FasTensor lets users express many data analysis operations, including those that arise in neural networks, scientific computing, and queries in traditional database management systems (DBMS). It frees users from the tedious underlying data management tasks, such as data partitioning, communication, and parallel execution.

This SpringerBrief gives a high-level overview of the state of the art in parallel data programming models and the motivation for the design of FasTensor. It illustrates the FasTensor application programming interface (API) with many examples and two real use cases from cutting-edge scientific applications. FasTensor can achieve speedups of multiple orders of magnitude over Spark and other peer systems when executing big data analysis operations. FasTensor makes programming data analysis operations at large scale on supercomputers as productive and efficient as possible. The book is a complete reference for FasTensor, covering its theoretical foundations, C++ implementation, and usage in applications.

Scientists in domains such as the physical sciences and geosciences who analyze large amounts of data will want to purchase this SpringerBrief. Data engineers who design and develop data analysis software, and data scientists who use Spark or TensorFlow to perform data analyses such as training a deep neural network, will also find this SpringerBrief useful as a reference tool.


Table of contents (4 chapters)

- Front Matter (pages i-xii)
- Introduction (pages 1-8)
- FasTensor Programming Model (pages 9-22)
- FasTensor User Interface (pages 23-71)
- FasTensor in Real Scientific Applications (pages 73-84)
- Back Matter (pages 85-101)

All four chapters are by Bin Dong, Kesheng Wu, and Suren Byna.

Authors and Affiliations

Lawrence Berkeley National Laboratory, Berkeley, USA

Bin Dong, Kesheng Wu, Suren Byna

About the authors

Dr. Bin Dong is a Research Scientist at Lawrence Berkeley National Laboratory in Berkeley, California, USA. He holds a Ph.D. in computing science and technology. His research interests include big scientific data analysis, parallel computing, parallel I/O, and machine learning. He has co-authored more than 62 technical publications.

Dr. Kesheng Wu is a Senior Scientist at Lawrence Berkeley National Laboratory. He works extensively on data management, data analysis, and scientific computing. He is the developer of a number of widely used algorithms, including FastBit bitmap indexes for querying large scientific datasets, the Thick-Restart Lanczos (TRLan) algorithm for solving eigenvalue problems, and IDEALEM for statistical data reduction and feature extraction. He has co-authored more than 200 technical publications.

Dr. Suren Byna is a Computer Scientist in the Scientific Data Management (SDM) Group at Lawrence Berkeley National Laboratory in Berkeley, California, USA. His research interests are in scalable scientific data management. More specifically, he works on optimizing parallel I/O and on developing systems for managing scientific data. He leads the ExaIO project in the Exascale Computing Project (ECP) that contributes advanced I/O features to HDF5 and develops a new file system called UnifyFS. He also leads efforts that develop object-centric data management systems (Proactive Data Containers - PDC) and experimental and observational data (EOD) management strategies. He has co-authored more than 150 technical publications.

Bibliographic Information

Book Title: User-Defined Tensor Data Analysis
Authors: Bin Dong, Kesheng Wu, Suren Byna
Series Title: SpringerBriefs in Computer Science
DOI: https://doi.org/10.1007/978-3-030-70750-7
Publisher: Springer Cham
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
Softcover ISBN: 978-3-030-70749-1 (published 30 September 2021)
eBook ISBN: 978-3-030-70750-7 (published 29 September 2021)
Series ISSN: 2191-5768
Series E-ISSN: 2191-5776
Edition Number: 1
Number of Pages: XII, 101
Number of Illustrations: 23 b/w illustrations
Topics: Database Management, Big Data, Data Engineering, Machine Learning
