GPU Meeting (20 Sept 2011)

Matt Kinsey: Porting the 2D Wave Equation to the GPU
  • Matt reported that the optimal number of threads per block is 32*n-1, where n is a positive integer (32 being the warp size).  In the example shown, the best performance came from 63 threads per block (n = 2).
  • According to the user’s guide, a grid should contain at least 32 blocks.
  • As presented, every kernel call requires the memory to be pushed from the CPU to the GPU, and these transfers are costly.  It is therefore best to minimize the number of kernel calls.
  • In the 2D wave equation problem, Matt used texture memory to reduce the number of kernels to one.  The memory is indexed along a space-filling curve, which results in better cache locality.
  • With texture memory, one can take advantage of built-in linear interpolation and built-in boundary handling (addressing modes such as clamp and wrap).  Texture memory can be addressed in 1D, 2D, or 3D.
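
To illustrate the space-filling-curve indexing mentioned above, here is a minimal Python sketch of a Z-order (Morton) index for a 2D grid. The notes do not say which curve was used in the talk, so the Morton curve is an assumption chosen for illustration, and the names part1by1 and morton_index are hypothetical.

```python
def part1by1(v: int) -> int:
    """Spread the low 16 bits of v so that a zero bit follows each original bit."""
    v &= 0x0000FFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton_index(x: int, y: int) -> int:
    """Interleave the bits of x (even positions) and y (odd positions).

    Assumption: a Z-order curve stands in for the unspecified curve in the talk.
    """
    return part1by1(x) | (part1by1(y) << 1)


if __name__ == "__main__":
    # Print the linear index of every cell in a 4x4 grid, row by row.
    for y in range(4):
        print([morton_index(x, y) for x in range(4)])
```

Cells that are close together in 2D tend to receive nearby linear indices, which is the property that helps cache locality when a stencil reads a cell and its neighbours.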
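
The built-in interpolation and boundary handling of texture memory can be emulated on the CPU to see what a texture fetch computes. The sketch below is a plain Python approximation, not the hardware implementation: it assumes texel centres at half-integer coordinates and clamp-to-edge addressing, and the names clamp and fetch_bilinear are hypothetical.

```python
import math


def clamp(i: int, n: int) -> int:
    """Clamp index i into the valid range [0, n - 1] (clamp-to-edge addressing)."""
    return max(0, min(i, n - 1))


def fetch_bilinear(grid, x: float, y: float) -> float:
    """Sample a 2D grid (list of rows) at (x, y) with bilinear filtering.

    Assumption: texel (i, j) is centred at (i + 0.5, j + 0.5), and reads
    outside the grid return the nearest edge value.
    """
    h, w = len(grid), len(grid[0])
    xb, yb = x - 0.5, y - 0.5                # shift so texel centres sit on integers
    i, j = math.floor(xb), math.floor(yb)    # lower-left texel of the 2x2 footprint
    a, b = xb - i, yb - j                    # fractional weights in [0, 1)

    def texel(ii: int, jj: int) -> float:
        return grid[clamp(jj, h)][clamp(ii, w)]

    return ((1 - a) * (1 - b) * texel(i, j)
            + a * (1 - b) * texel(i + 1, j)
            + (1 - a) * b * texel(i, j + 1)
            + a * b * texel(i + 1, j + 1))


if __name__ == "__main__":
    grid = [[0.0, 1.0],
            [2.0, 3.0]]
    print(fetch_bilinear(grid, 1.0, 1.0))    # midpoint of the four texels
    print(fetch_bilinear(grid, -3.0, -3.0))  # outside the grid, clamped to the corner
```

On the GPU this work is done by the texture unit as part of the fetch, so a kernel does not need explicit interpolation or edge-case branches.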
