- Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA
- Bhaumik Vaidya
- 2021-08-13 15:48:25
Local memory and registers
Local memory and register files are private to each thread. Register files are the fastest memory available to a thread. When the variables of a kernel do not fit in the register file, the compiler places the excess in local memory. This is called register spilling. Local memory is really a region of global memory that is reserved for each thread, so access to it is slow compared to register files. However, local memory is cached in the L1 and L2 caches, so register spilling might not affect your program adversely.
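Register spilling is not the only way a variable ends up in local memory. A per-thread array that is indexed with a value known only at run time is also usually placed there, because registers cannot be addressed dynamically. The following is a minimal sketch of that case, not code from the book: the names gpu_local_array and run_local_array and the sizes ITEMS and THREADS are hypothetical, and whether the array really lands in local memory depends on the compiler and the target architecture.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define ITEMS 64    // hypothetical per-thread array length
#define THREADS 4   // hypothetical number of threads in the block

// Each thread fills a private array and reads one element back using an
// index that is only known at run time. Such arrays usually cannot be kept
// in registers, so the compiler typically places them in local memory.
__global__ void gpu_local_array(const int *d_idx, int *d_out)
{
    int scratch[ITEMS];                  // private to each thread
    for (int i = 0; i < ITEMS; i++)
        scratch[i] = i * (threadIdx.x + 1);
    int k = d_idx[threadIdx.x] % ITEMS;  // run-time index (assumed non-negative)
    d_out[threadIdx.x] = scratch[k];
}

// Hypothetical host helper: returns 0 on success, -1 on any CUDA error.
int run_local_array(const int *h_idx, int *h_out)
{
    int *d_idx = NULL, *d_out = NULL;
    size_t bytes = THREADS * sizeof(int);
    if (cudaMalloc((void **)&d_idx, bytes) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&d_out, bytes) != cudaSuccess) {
        cudaFree(d_idx);
        return -1;
    }
    cudaError_t err = cudaMemcpy(d_idx, h_idx, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        gpu_local_array<<<1, THREADS>>>(d_idx, d_out);
        err = cudaGetLastError();        // catches launch errors
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_idx);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : -1;
}

int main(void)
{
    int h_idx[THREADS] = {0, 3, 6, 9};
    int h_out[THREADS] = {0};
    if (run_local_array(h_idx, h_out) != 0) {
        printf("CUDA error while running the sketch\n");
        return 1;
    }
    for (int t = 0; t < THREADS; t++)
        printf("Thread %d read scratch[%d] = %d\n", t, h_idx[t], h_out[t]);
    return 0;
}
```

To see what the compiler actually did, the file can be built with nvcc -Xptxas -v, which asks the PTX assembler to report the registers used per thread along with the stack frame size and any spill stores and loads. A non-zero stack frame or spill count suggests that the kernel is using local memory.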
A simple program to understand how to use local memory is shown as follows:
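#include <stdio.h>

#define N 5  // number of threads in the block

// Each thread computes its own private copy of t_local
__global__ void gpu_local_memory(int d_in)
{
    int t_local;  // per-thread variable, normally held in a register
    t_local = d_in * threadIdx.x;
    printf("Value of Local variable in current thread is: %d \n", t_local);
}

int main(int argc, char **argv)
{
    printf("Use of Local Memory on GPU:\n");
    // Launch one block of N threads
    gpu_local_memory<<<1, N>>>(5);
    // Wait for the kernel to finish so that its printf output is flushed
    cudaDeviceSynchronize();
    return 0;
}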
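The t_local variable is private to each thread and, being a simple scalar, is normally stored in a register. Computations in the kernel that use this variable are therefore as fast as they can be. With N = 5 threads and d_in = 5, each thread prints d_in * threadIdx.x, so the program prints the "Use of Local Memory on GPU:" banner followed by the values 0, 5, 10, 15, and 20, one line per thread.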
