5th Workshop on Big Data Benchmarking (2014)

Dr. Matthias Uflacker


The objective of the WBDB workshops is to make progress toward developing industry-standard, application-level benchmarks for evaluating hardware and software systems for big data applications.

To make progress towards a big data benchmarking standard, the workshop will explore a range of issues including:

Data features: New feature sets of data, including high-dimensional data, sparse data, event-based data, and enormous data sizes.
System characteristics: System-level issues, including large-scale and evolving system configurations, shifting loads, and heterogeneous technologies for big data and cloud platforms.
Implementation options: Different implementation options such as SQL, NoSQL, the Hadoop software ecosystem, and different implementations of HDFS.
Workload: Representative big data business problems and corresponding benchmark implementations. Specification of benchmark applications that represent the different modalities of big data, including graphs, streams, scientific data, and document collections.
Hardware options: Evaluation of new hardware options, including different types of HDD, SSD, and main memory, as well as large-memory systems, and of new platform options such as dedicated commodity clusters and cloud platforms.
Synthetic data generation: Models and procedures for generating large-scale synthetic data with requisite properties.
Benchmark execution rules: For example, data scale factors, benchmark metrics, and benchmark versioning to account for rapidly evolving workloads and system configurations.
Metrics for efficiency: Measuring the efficiency of the solution, e.g. based on costs of acquisition, ownership, energy and/or other factors, while encouraging innovation and avoiding benchmark escalations that favor large, inefficient configurations over small, efficient ones.
Evaluation frameworks: Tool chains, suites and frameworks for evaluating big data systems.
Early implementations: Early implementations of the Deep Analytics Pipeline or BigBench, and lessons learned in benchmarking big data applications.
Enhancements: Proposals to augment these benchmarks, for example by adding more data genres (such as graphs) or by incorporating a range of machine learning and other algorithms, are welcome and encouraged.

Session 1

Welcome & Introduction to WBDB

Date: August 5, 2014
Language: English
Duration: 00:13:02

An Approach to Benchmarking Industrial Big Data Applications

Date: August 5, 2014
Language: English
Duration: 00:42:07

In-Memory Processing in Healthcare and Life Sciences

Date: August 5, 2014
Language: English
Duration: 00:29:08

Session 2

Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?

Date: August 5, 2014
Language: English
Duration: 00:22:04

Extending The OLTP-Bench Framework for Big Data Systems

Date: August 5, 2014
Language: English
Duration: 00:21:08

Session 3

LDBC: Linked Data Benchmark Council

Date: August 5, 2014
Language: English
Duration: 00:28:07

SQL on Hadoop Benchmark

Date: August 5, 2014
Language: English
Duration: 00:07:01

Benchmarking Virtualized Hadoop Clusters

Date: August 5, 2014
Language: English
Duration: 00:18:59

Towards A Complete BigBench Implementation

Date: August 5, 2014
Language: English
Duration: 00:15:41

Session 5

A TU Delft Perspective on Benchmarking Big Data in the Data Center

Date: August 6, 2014
Language: English
Duration: 00:39:57

BW-EML SAP Standard Application Benchmark

Date: August 6, 2014
Language: English
Duration: 00:22:36

FoodBroker - Generating Synthetic Datasets for Graph-Based Business Analytics

Date: August 6, 2014
Language: English
Duration: 00:19:03

Session 6

Main Memory Is Less Expensive Than Disk

Date: August 6, 2014
Language: English
Duration: 00:15:39

Building Efficient Data Intensive Environments

Date: August 6, 2014
Language: English
Duration: 00:08:48

PopulAid: In-Memory Data Generation for Customized Benchmarks

Date: August 6, 2014
Language: English
Duration: 00:18:06

Session 7

Benchmarking Elastic Query Processing on Big Data

Date: August 6, 2014
Language: English
Duration: 00:27:09