Just Released: ATI Stream SDK v2.2 w/ OpenCL 1.1 Support

Discussion created by michael.chu on Aug 12, 2010

http://developer.amd.com/stream

http://www.amd.com/us/press-releases/Pages/amd-software-dev-2010aug12.aspx

http://developer.amd.com/openclzone

http://blogs.amd.com/developer/2010/08/11/opencl-a-nimble-standard/

What’s New in v2.2

  • Support for OpenCL™ 1.1 specification.
  • Support for Ubuntu® 10.04 and Red Hat® Enterprise Linux® 5.5.
  • Support for x86 CPUs with SSE2.x or later (adds to existing support for x86 CPUs with SSE3.x or later).
  • Support for Microsoft® Visual Studio® 2010 Professional Edition and Minimalist GNU for Windows (MinGW) [GCC 4.4].
  • Support for GNU Compiler Collection (GCC) 4.1 or later on Linux® systems (adds to existing support for GCC 4.3 or later).
  • Support for single-channel OpenCL™ image format.
  • Support for OpenCL™ / DirectX® 10 interoperability.
  • Support for additional double-precision floating-point routines in OpenCL™ C kernels.
  • Support for generating and loading binary OpenCL™ kernels.
  • Support for native OpenCL™ kernels.
  • Preview Feature: Support for accessing additional physical memory on the GPU from OpenCL™ applications.
  • Preview Feature: Support for printf() in OpenCL™ C kernels.
  • Extension: Support for additional event states when registering event callbacks in OpenCL™ 1.1.
  • Additional OpenCL™ samples:
    • ConstantBandwidth (under cl/MicroBenchmarks)
    • GlobalMemoryBandwidth (under cl/MicroBenchmarks)
    • ImageBandwidth (under cl/MicroBenchmarks)
    • LDSBandwidth (under cl/MicroBenchmarks)
    • MemoryOptimizations
    • PCIeBandwidth (under cl/MicroBenchmarks)
    • SimpleDX10
    • SimpleMultiDevice
  • Package Update: ATI Stream Profiler 1.4.
  • Various OpenCL™ compiler and runtime fixes and enhancements (see developer release notes for more details).
  • Expanded OpenCL™ performance optimization guidelines in the ATI Stream SDK OpenCL™ Programming Guide, including:
    • Global memory optimizations
    • LDS optimizations
    • Register and LDS impact on number of active wavefronts
    • Load-balancing across multiple OpenCL™ devices
    • Instruction bandwidths
    • Key cache sizes and bandwidths for "Evergreen" GPUs
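
For readers wondering what "Register and LDS impact on number of active wavefronts" means in practice, here is a rough sketch in plain C of the kind of estimate involved. It is not taken from the SDK or the Programming Guide: the `cu_limits` struct, the `active_wavefronts` function and every hardware limit passed to it are hypothetical names and placeholder values chosen for illustration. Check the guide for the real figures for your GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one compute unit. The caller supplies every
 * limit; none of these values come from the SDK or a hardware spec. */
typedef struct {
    unsigned regs_per_cu;       /* registers available per compute unit   */
    unsigned lds_bytes_per_cu;  /* LDS bytes available per compute unit   */
    unsigned wavefront_size;    /* work-items per wavefront               */
    unsigned max_wavefronts;    /* cap on resident wavefronts per unit    */
} cu_limits;

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* Estimate how many wavefronts can be resident on one compute unit.
 * Registers are consumed per work-item, LDS is consumed per work-group,
 * and whichever resource runs out first sets the limit. */
unsigned active_wavefronts(const cu_limits *hw,
                           unsigned regs_per_item,
                           unsigned lds_bytes_per_group,
                           unsigned group_size)
{
    assert(hw != NULL);
    if (hw->wavefront_size == 0 || group_size == 0)
        return 0;

    /* A work-group occupies a whole number of wavefronts (round up). */
    unsigned waves_per_group =
        (group_size + hw->wavefront_size - 1) / hw->wavefront_size;
    unsigned limit = hw->max_wavefronts;

    if (regs_per_item > 0) {
        /* Each wavefront needs regs_per_item registers for every lane. */
        unsigned by_regs =
            hw->regs_per_cu / (regs_per_item * hw->wavefront_size);
        limit = min_u(limit, by_regs);
    }
    if (lds_bytes_per_group > 0) {
        /* LDS is allocated per work-group, so count whole groups. */
        unsigned groups_by_lds = hw->lds_bytes_per_cu / lds_bytes_per_group;
        limit = min_u(limit, groups_by_lds * waves_per_group);
    }

    /* Only whole work-groups can be scheduled, so round down. */
    return (limit / waves_per_group) * waves_per_group;
}
```

With placeholder limits of 16384 registers, 32768 bytes of LDS, 64 work-items per wavefront and a cap of 32 wavefronts, a kernel using 16 registers per work-item and 8192 bytes of LDS per 128-item work-group comes out at 8 wavefronts: registers alone would allow 16, but LDS only leaves room for 4 work-groups of 2 wavefronts each. Reducing LDS use per group is then the change that raises the count.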