W06 Data-driven applications for industrial and societal challenges: Problems, methods, and computing platforms
W06.1 Big data, HPC and FPGAs
This session gives an overview of the challenges of modern big-data applications, reports on state-of-the-art FPGA-based HPC systems, describes an open-source reconfigurable hardware ecosystem and design flow, and closes with recent research on hardware acceleration for sparse big-data workloads.
W06.1.1 Workshop introduction
A brief introduction by the organizers to the motivation, format, and contents of the DATA-DREAM workshop at DATE 2022.
W06.1.2 Evolution of the Data Market: Highlights and Projections
Making decisions in the context of the data economy requires a deep understanding of the market and its projections. Investment in technologies should consider the value of indicators associated with the potential growth of the market, the competitors, the size of the ecosystem, and the skills gap. This presentation will offer updated figures from the Data Market Study run by IDC for 2021-2023 and will position them within a set of potential scenarios that will define the performance of the EU in a data-driven economy. Attendees will learn about indicators such as data professionals and the skills gap, data companies, data suppliers, the data economy, and the value of the data market, with an international dimension providing insight into markets outside the EU (US, Brazil, Japan, China).
The presentation will also look at major developments and initiatives in Europe relevant to the growth of the data economy and will reflect on the relationship between the different technologies needed to maximize competitiveness and preserve digital sovereignty.
W06.1.3 System and Applications of FPGA Cluster "ESSPER" for Research on Reconfigurable HPC
At the RIKEN Center for Computational Science (R-CCS), we have been developing an experimental FPGA cluster named "ESSPER (Elastic and Scalable System for high-PErformance Reconfigurable computing)" as a research platform for reconfigurable HPC. ESSPER comprises sixteen Intel Stratix 10 SX FPGAs connected to each other by a dedicated 100 Gbps inter-FPGA network. We have developed our own shell (SoC) and its software APIs for the FPGAs to support inter-FPGA communication. The FPGA host servers are connected to a 100 Gbps InfiniBand switch, which allows distant servers to access the FPGAs remotely through a software-bridged version of Intel's OPAE FPGA driver, called R-OPAE. Through this InfiniBand network and R-OPAE, ESSPER is connected to the world's fastest supercomputer, Fugaku, deployed at RIKEN, so that from Fugaku we can remotely program bitstreams onto the FPGAs and offload tasks to them. In this talk, I will introduce the concept of ESSPER, its hardware and software system stack, its programming environment, applications under development, and our future prospects for reconfigurable HPC.
W06.1.4 Open-Source Hardware for Heterogeneous Computing
Information technology has entered the age of heterogeneous computing. Across a variety of application domains, computer systems rely on highly heterogeneous architectures that combine multiple general-purpose processors with many specialized hardware accelerators. The complexity of these systems, however, threatens to widen the gap between the capabilities provided by semiconductor technologies and the productivity of computer engineers. Open-source hardware is a promising avenue to address this challenge by enabling design reuse and collaboration. ESP is an open-source research platform for system-on-chip design that combines a scalable tile-based architecture and a flexible system-level design methodology. Conceived as a heterogeneous system integration platform, ESP is intrinsically suited to foster collaborative engineering across the open-source hardware community.
W06.1.5 Near-Memory Hardware Acceleration of Sparse Workloads
Sparse linear algebra operations are widely used in numerous application domains such as graph processing, machine learning, and scientific computing. These operations are typically more challenging to accelerate than their dense counterparts due to low operational intensity and irregular data access patterns.
This talk presents our recent investigation into near-memory hardware acceleration for sparse processing. Specifically, I will discuss the importance of co-designing the sparse storage format and accelerator architecture to maximize the bandwidth utilization and compute occupancy. As a case study, I will introduce GraphLily, a graph linear algebra overlay for accelerating graph processing on HBM-equipped FPGAs. GraphLily supports a rich set of graph algorithms by adopting the GraphBLAS programming interface, which formulates graph algorithms as sparse linear algebra operations.
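The GraphBLAS formulation mentioned above can be made concrete with a small sketch. The following is a minimal, illustrative example (not GraphLily or any real GraphBLAS API): breadth-first search expressed as repeated sparse matrix-vector products over the Boolean (OR, AND) semiring, using a hand-rolled CSR (compressed sparse row) storage format of the kind whose co-design with the accelerator the talk discusses.

```python
# Hypothetical sketch: BFS as sparse linear algebra over a Boolean
# semiring, with the graph stored in CSR format.

def csr_from_edges(n, edges):
    """Build CSR arrays (indptr, indices) for an n-vertex digraph."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    indptr, indices = [0], []
    for u in range(n):
        indices.extend(sorted(adj[u]))
        indptr.append(len(indices))
    return indptr, indices

def spmv_or_and(indptr, indices, x):
    """y = A^T x over the (OR, AND) semiring:
    y[v] is True iff some frontier vertex u has an edge u -> v."""
    n = len(indptr) - 1
    y = [False] * n
    for u in range(n):
        if x[u]:
            for v in indices[indptr[u]:indptr[u + 1]]:
                y[v] = True
    return y

def bfs_levels(n, edges, source):
    """BFS level of every vertex, via one SpMV per frontier step."""
    indptr, indices = csr_from_edges(n, edges)
    levels = [-1] * n
    frontier = [False] * n
    frontier[source] = True
    level = 0
    while any(frontier):
        for v in range(n):
            if frontier[v]:
                levels[v] = level
        y = spmv_or_and(indptr, indices, frontier)
        # Mask out already-visited vertices (the GraphBLAS "mask").
        frontier = [y[v] and levels[v] == -1 for v in range(n)]
        level += 1
    return levels

# Diamond graph 0 -> {1, 2} -> 3: vertex 3 is two hops from 0.
print(bfs_levels(4, [(0, 1), (0, 2), (1, 3), (2, 3)], 0))  # [0, 1, 1, 2]
```

Note how the irregular access pattern lives entirely in the SpMV inner loop, which is why the storage format and the accelerator datapath must be designed together.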
W06 Break 1 Coffee break
W06.2 Software development, libraries and languages
This session turns to software support in form of programming models, runtime adaptability, modern libraries and programming abstractions for heterogeneous HPC and big data systems.
W06.2.1 Methods and Tools for Accelerating Image Processing Applications on FPGA-based Systems
Field Programmable Gate Arrays (FPGAs) are a promising platform for accelerating image processing as well as machine learning applications due to their parallel architecture, reconfigurability, and energy efficiency. However, programming such platforms can be quite cumbersome and time-consuming compared to CPUs or GPUs. This presentation shows methods and tools for reducing the programming effort for image processing applications on FPGA-based systems. Our design methodology is based on the OpenVX standard and includes an open-source High-Level Synthesis (HLS) library, HiFlipVX, for generating image processing and neural network accelerators. The importance of such an approach is shown with application examples from different research projects.
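To make the kind of function involved concrete, here is a minimal sketch (not the HiFlipVX or OpenVX API) of a 3x3 Sobel-x filter: a sliding-window vision kernel of exactly the shape that an OpenVX graph node describes and that an HLS library can turn into a streaming FPGA accelerator.

```python
# Illustrative 3x3 convolution, the building block of many
# OpenVX-style vision functions (Sobel, Gaussian, erosion, ...).

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def convolve3x3(img, kernel):
    """Apply a 3x3 kernel to a 2D image; borders are skipped
    for brevity (a real library handles them explicitly)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * img[y + ky - 1][x + kx - 1]
            out[y][x] = acc
    return out

# A vertical edge between the 0-columns and the 255-columns:
img = [[0, 0, 255, 255]] * 4
edges = convolve3x3(img, SOBEL_X)
print(edges[1])  # [0, 1020, 1020, 0]
```

The fixed window size and regular access pattern are what make such kernels map so well to line-buffered FPGA pipelines.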
W06.2.2 GridTools: High-level HPC Libraries for Weather and Climate
GridTools is a set of C++ libraries and Python tools to enable weather and climate scientists to express their computations in a high-level hardware-agnostic way, while providing highly efficient execution of the codes.
Born to address the problem of performance portability, the original GridTools offers a lower-level C++ interface, which is in production use at the Swiss national weather service and supports different multicore architectures as well as GPU vendors and generations. However, C++ is not the language of choice in the weather and climate community. Therefore, a new effort has been started to provide declarative-style Python interfaces that analyze and transform the user code into efficient hardware-specific C++ code, utilizing the existing GridTools C++ libraries. In this presentation we will review the different approaches and highlight their pros and cons.
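The separation the abstract describes, a scientist-facing point-wise definition versus a framework-owned execution strategy, can be sketched as follows. This is a conceptual illustration only, not the real GridTools or GT4Py API: the stencil is written once as a function of a point, and a generic driver applies it over the domain, where a real framework would instead generate fused, parallel, hardware-specific code.

```python
# Conceptual sketch of declarative stencil programming: the user
# writes only the point-wise computation; the driver owns iteration.

def laplacian(field, i, j):
    """Five-point Laplacian: depends only on nearest neighbours."""
    return (field[i - 1][j] + field[i + 1][j]
            + field[i][j - 1] + field[i][j + 1]
            - 4.0 * field[i][j])

def apply_stencil(stencil, field):
    """Hardware-agnostic driver over the interior domain.
    A framework could retarget this loop nest to CPU or GPU."""
    n, m = len(field), len(field[0])
    return [[stencil(field, i, j) if 0 < i < n - 1 and 0 < j < m - 1
             else 0.0
             for j in range(m)] for i in range(n)]

# The bilinear field f(i, j) = i * j has zero discrete Laplacian.
field = [[float(i * j) for j in range(4)] for i in range(4)]
print(apply_stencil(laplacian, field)[1][1])  # 0.0
```

Keeping iteration order out of the user code is precisely what lets one Python definition be compiled to different architectures.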
W06.2.3 Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-Accelerated Climate Simulation
Most compilers have a single core intermediate representation (IR) (e.g., LLVM) sometimes complemented with vaguely defined IR-like data structures. This IR is commonly low-level and close to machine instructions. As a result, optimizations relying on domain-specific information are either not possible or require complex analysis to recover the missing information. In contrast, multi-level rewriting instantiates a hierarchy of dialects (IRs), lowers programs level-by-level, and performs code transformations at the most suitable level. We demonstrate the effectiveness of this approach for the weather and climate domain. In particular, we develop a prototype compiler and design stencil- and GPU-specific dialects based on a set of newly introduced design principles. We find that two domain-specific optimizations (500 lines of code) realized on top of LLVM’s extensible MLIR compiler infrastructure suffice to outperform state-of-the-art solutions. In essence, multi-level rewriting promises to herald the age of specialized compilers composed from domain- and target-specific dialects implemented on top of a shared infrastructure.
W06.2.4 climbing EVEREST: dEsign enVironmEnt foR Extreme-Scale big data analyTics on heterogeneous platforms
This talk introduces the consortium-wide effort under way within the EVEREST H2020 project. The EVEREST project aims at developing a holistic design environment that simplifies the programmability of high-performance big-data analytics for heterogeneous, distributed, scalable, and secure systems. Our effort concentrates on a "data-driven" design approach combined with domain-specific language extensions, hardware-accelerated AI, and efficient run-time monitoring, all within a unified hardware/software paradigm. The project targets a wide range of applications, from weather-analysis-based prediction of renewable energy production for market trading, to air-quality monitoring of industrial sites, and real-time traffic modeling for transportation in smart cities.