Nanyang Technological University School of Computer Engineering Parallel Distributed Computing Centre (PDCC) Block N4-B2A-03 50 Nanyang Avenue, Singapore 639798 ©2010 All rights reserved NTU - PDCC


ICPADS 2012 Tutorial Program
16 December 2012

Tutorials are free. There are 85 places available for each tutorial, allocated on a first-come, first-served (FCFS) basis. Please indicate your interest in your online registration form. Successful registrants will be informed via email.

Detailed Program:


16 Dec 2012: 1:30pm to 6pm

1.30pm Tutorial 1: GPGPU for Real-Time Data Analytics (LT 1)
By: Bingsheng He (1), Huynh Phung Huynh (2), Rick Siow Mong Goh (2)
From: (1) Nanyang Technological University and (2) A*STAR Institute of High Performance Computing, Singapore

3.30pm Tea break

4.00pm Tutorial 2: Intel Many Integrated Core (MIC) Architecture (LT 1)
By: Sunil Sherlekar
From: Intel Labs, Bangalore

6.00pm End of tutorial


Tutorial 1: GPGPU for Real-Time Data Analytics

The demand for real-time data analytics (RTDA) has been rising for decades and keeps growing with the proliferation of data collection devices (such as sensors, cameras and mobile devices) and of application requirements (such as monitoring, visualization and interactive exploration). The field has been identified as one of the most exciting and promising areas for both academia and industry. The challenges span all levels, from sophisticated algorithms and procedures that mine the gold from massive data to high-performance computing (HPC) techniques and systems that deliver the useful data in time. The high-performance requirements come from ever-growing data volumes and time-consuming analytics processes. There has been a tremendous amount of research on data mining and processing algorithms; this tutorial focuses instead on research in HPC techniques and systems. With their massive computation power and high memory bandwidth, GPUs have become a sharp weapon for meeting the performance requirements of RTDA. Designed as co-processors, however, GPUs pose a number of technical challenges for RTDA in terms of efficiency and programmability. On the one hand, while new-generation GPUs can have over an order of magnitude higher memory bandwidth and computation power (in terms of GFLOPS) than CPUs, novel GPGPU algorithm design and implementation are a must to unleash the hardware power. On the other hand, writing a correct and efficient GPU program is still challenging in general, and even more difficult for RTDA with streaming updates and real-time multi-tasking. In response, a number of systems and tools that leverage GPGPU for (real-time) data analytics have been developed recently. Some studies have developed a full-fledged system for a particular RTDA application; for example, PacketShader is a high-performance PC-based software router platform that accelerates core packet processing for the Internet.
Other tools ease the programmability of data analytics. For example, the presenters have developed Mars, a MapReduce framework on the GPU, and automatic mapping of stream programs to the GPU. These systems and tools have greatly improved the programmability of GPGPU for data analytics: users can focus on their application logic, while the details of GPGPU implementation and optimization are hidden from them. As RTDA advances, more GPGPU systems and tools are likely to be (re-)invented.
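To illustrate the division of labor that a MapReduce-style framework offers, here is a minimal sketch in Python. It is not the Mars API and it runs sequentially on the CPU; the names `map_reduce`, `count_words` and `sum_counts` are hypothetical and chosen only for this example. The point is that the user writes just the two small functions, while the grouping step (which a GPU framework would parallelize and optimize) stays inside the framework.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Hypothetical framework side: run map, group by key, then reduce.

    A GPU framework would execute these phases in parallel; this
    sequential version only shows the programming model.
    """
    groups = defaultdict(list)
    for record in records:
        # map_fn yields zero or more (key, value) pairs per input record.
        for key, value in map_fn(record):
            groups[key].append(value)
    # reduce_fn folds all values collected for one key into a single result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def count_words(line):
    """User side (map): emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def sum_counts(key, values):
    """User side (reduce): add up the counts emitted for one word."""
    return sum(values)


if __name__ == "__main__":
    lines = ["gpu cpu gpu", "cpu gpu"]
    print(map_reduce(lines, count_words, sum_counts))
```

Swapping in a different pair of user functions changes the analysis without touching `map_reduce`, which is the sense in which implementation details are hidden from users.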

In this tutorial, we will discuss the open problems and challenging issues in RTDA, and urge the design and development of common systems and tools optimized for GPUs. Next, we will review and compare representative GPGPU systems and tools in detail. The major focus of this tutorial, however, is not just to introduce a wide range of systems and techniques to our audience. Rather, we endeavor to offer perspectives, from a variety of angles, on the common patterns for improving the efficiency and programmability of RTDA systems. We will also demonstrate how our homegrown tools can support RTDA applications. The goal of this tutorial is to provide a comprehensive introduction to current GPGPU research for RTDA to an audience that has a GPU computing background and is interested in participating in research on, or applications of, GPGPU for RTDA. We believe that this tutorial will stimulate discussion among the audience and call for further action to address the open problems.

More details about this tutorial can be found here.

Tutorial 2: Intel Many Integrated Core (MIC) Architecture

The Intel MIC series of processors, now officially termed Intel Xeon Phi, is targeted squarely at high-performance computing applications. It provides high performance (one teraflops sustained on the DGEMM benchmark) while consuming low power. The tutorial will explain this architecture, highlighting its important aspects: a large core count combined with a large vector width.
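To give a feel for what one teraflops sustained means, the short Python sketch below works through the arithmetic for a double-precision matrix multiply. It assumes the conventional operation count of 2·m·n·k for multiplying an m×k matrix by a k×n matrix (one multiply and one add per inner-product term); the matrix size of 10,000 and the helper names are hypothetical choices for illustration, not figures from the tutorial.

```python
def dgemm_flops(m, n, k):
    """Conventional flop count for an (m x k) by (k x n) matrix product."""
    return 2 * m * n * k


def seconds_at_rate(flops, flops_per_second):
    """Time needed to execute `flops` operations at a sustained rate."""
    return flops / flops_per_second


ONE_TERAFLOPS = 1e12  # sustained rate quoted for the DGEMM benchmark

if __name__ == "__main__":
    n = 10_000  # hypothetical square matrix dimension
    flops = dgemm_flops(n, n, n)
    print(flops)                                  # 2000000000000
    print(seconds_at_rate(flops, ONE_TERAFLOPS))  # 2.0
```

Under these assumptions, multiplying two 10,000 × 10,000 matrices takes about two seconds at the quoted sustained rate.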

Xeon Phi is architecturally compatible with Xeon in that the two share an instruction set. This simplifies porting Xeon code to Xeon Phi. However, getting the best performance still takes effort. The tutorial will highlight various techniques for achieving good performance.
