PACT 2011 Tutorial

SnuCL: An OpenCL Framework and Unified Programming Model

for Heterogeneous CPU/GPU Clusters

13:30 ~ 17:00

Monday, October 10, 2011

Galveston Island, TX, USA


  1. Open Computing Language (OpenCL) is a programming model for heterogeneous parallel computing systems. OpenCL provides a common abstraction layer across different multicore architectures, such as CPUs, GPUs, DSPs, and Cell BE processors. Programmers can write an OpenCL application once and run it on any OpenCL-compliant system. However, OpenCL is currently restricted to a single heterogeneous system. To target heterogeneous CPU/GPU clusters, programmers must combine the OpenCL framework with a communication library, such as MPI. The same is true for CUDA. This tutorial covers the usage and internals of an OpenCL framework called SnuCL. SnuCL naturally extends the original OpenCL semantics to the heterogeneous cluster environment. It is freely available, open-source software developed at Seoul National University. The target cluster consists of a single host node and multiple compute nodes, connected by an interconnection network such as Gigabit Ethernet or InfiniBand switches. The host node contains multiple CPU cores, and each compute node contains multiple CPU cores and multiple GPUs. For such clusters, SnuCL gives the programmer the illusion of a single heterogeneous system. A GPU or a set of CPU cores becomes an OpenCL compute device, and SnuCL allows the application to use compute devices in a compute node as if they were in the host node. SnuCL achieves both high performance and ease of programming.
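The single-system view described above can be pictured with a small toy model. The Python sketch below is only an illustration of the idea, not SnuCL code and not the OpenCL API: the names `Device`, `ComputeNode`, and `single_system_view` are hypothetical, and it assumes that all CPU cores of a compute node are grouped into one device while each GPU is a device of its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Device:
    """A compute device as the application would see it (hypothetical type)."""
    node: str   # cluster node that physically hosts the device
    kind: str   # "CPU" or "GPU"
    index: int  # position of the device within its node


@dataclass
class ComputeNode:
    """A compute node with some CPU cores and some GPUs (hypothetical type)."""
    name: str
    num_gpus: int

    def devices(self) -> List[Device]:
        # Assumption: the node's CPU cores form a single CPU device,
        # and every GPU is exposed as a separate device.
        cpu = [Device(self.name, "CPU", 0)]
        gpus = [Device(self.name, "GPU", i) for i in range(self.num_gpus)]
        return cpu + gpus


def single_system_view(nodes: List[ComputeNode]) -> List[Device]:
    """Flatten per-node devices into one list, as if all were local to the host."""
    return [device for node in nodes for device in node.devices()]


# A toy cluster with two compute nodes, each holding two GPUs.
cluster = [ComputeNode("node0", 2), ComputeNode("node1", 2)]
devices = single_system_view(cluster)

for device_id, device in enumerate(devices):
    print(device_id, device.kind, device.node)
```

In this model the application addresses every device through one flat list and never names a node explicitly. In a real framework, the data movement between the host node and the compute nodes would have to happen behind that list.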

  2. (the source code of SnuCL and its benchmark applications are available at the URL

Target audience:

  1. This tutorial is intended for graduate students, researchers, and practitioners who are interested in heterogeneous parallel computing. It is designed both for those with prior OpenCL or CUDA programming experience and for those who are new to OpenCL.


  1. Jaejin Lee, Center for Manycore Programming, Seoul National University,

Outline of the contents:

  1. The first part of the tutorial consists of an introduction to OpenCL and addresses limitations of the current OpenCL programming model. Topics include:

     1. Heterogeneous computing and OpenCL

     2. OpenCL platform model

     3. OpenCL execution model

     4. OpenCL memory model

     5. How to write an OpenCL program

     6. OpenCL synchronization

     7. OpenCL memory consistency

     8. Limitations of the OpenCL programming model

  2. The second part of the tutorial covers the SnuCL framework. Topics include:

     1. Why SnuCL?

     2. Achieving a single system image for a CPU/GPU cluster

     3. Buffer management

     4. SnuCL extensions to OpenCL

     5. Source-to-source kernel restructuring techniques

     6. How to write a SnuCL application

     7. SnuCL benchmark applications

     8. Performance evaluation

     9. Future directions

© 2013 Center for Manycore Programming

Room 520, Building 301, Seoul National University, Seoul 151-744, Korea