ICS 2013 Tutorial

SnuCL: An OpenCL Framework for Heterogeneous Clusters

June, 2013

Eugene, Oregon, USA


  1. OpenCL is a programming model for heterogeneous parallel computing systems. It provides a common abstraction layer across different multicore architectures, such as CPUs, GPUs, DSPs, and Xeon Phi processors. However, current OpenCL is restricted to a single heterogeneous system. To target heterogeneous clusters, programmers must combine OpenCL with a communication library, such as MPI. The same is true for CUDA. This tutorial covers accelerator architectures, such as GPUs and Xeon Phi, and gives an introduction to OpenCL programming. In addition, it introduces an OpenCL framework called SnuCL. SnuCL naturally extends the original OpenCL semantics to the heterogeneous cluster environment. It is freely available, open-source software developed at Seoul National University. SnuCL gives the programmer the illusion of a single heterogeneous system, and it achieves both high performance and ease of programming. Finally, we characterize the performance of an OpenCL implementation (the SNU NPB suite) of the NAS Parallel Benchmark suite.

  2. The source code of SnuCL and the SNU NPB suite is available at http://aces.snu.ac.kr/Center_for_Manycore_Programming/Software.html

Target audience:

  1. This tutorial is intended for graduate students, researchers, and practitioners who are interested in heterogeneous parallel programming models. It is designed both for those with prior OpenCL or CUDA programming experience and for those who are new to OpenCL.


  1. Jaejin Lee, Center for Manycore Programming, Seoul National University, jlee@cse.snu.ac.kr

Outline of the contents:

  1. The first part of this tutorial introduces OpenCL programming and addresses the limitations of the current OpenCL programming model. It covers accelerator architectures, such as GPUs and Intel Xeon Phi, the basic concepts of heterogeneous parallel computing, OpenCL programming constructs, OpenCL program structures, and the limitations of current OpenCL.

  2. Heterogeneous computing and OpenCL

  3. Parallel processing background

  4. GPU and Intel Xeon Phi architectures

  5. Introduction to the OpenCL framework

  6. How to write an OpenCL program

  7. Limitations of the current OpenCL programming model

  8. The second part of the tutorial covers the SnuCL framework. What distinguishes SnuCL from other OpenCL frameworks is its ability to program heterogeneous clusters. We will cover the OpenCL extensions in SnuCL for clusters, the internal structure of the SnuCL runtime and source-to-source translators, and SnuCL program structures. In addition, we evaluate the performance of SnuCL and introduce its benchmark suite.

  9. Why SnuCL?

  10. Achieving a single system image for heterogeneous clusters

  11. Source-to-source translation techniques

  12. SnuCL runtime

  13. Buffer and consistency management

  14. SnuCL collective communication extensions to OpenCL