
Memory-related properties

Memory on the GPU has a hierarchical architecture. It can be divided into L1 cache, L2 cache, global memory, texture memory, and shared memory. The cudaDeviceProp structure provides many properties that help in identifying the memory available on the device. memoryClockRate and memoryBusWidth provide the clock rate (in kHz) and the bus width (in bits) of the memory, respectively. The speed of the memory is very important because it affects the overall speed of your program. totalGlobalMem returns the size of the global memory available on the device, in bytes. totalConstMem returns the total constant memory available on the device. sharedMemPerBlock returns the total shared memory that can be used by a single block. The total number of registers available per block can be identified by using regsPerBlock. The size of the L2 cache can be identified using the l2CacheSize property. The following code snippet shows how to use memory-related properties from a CUDA program:

// Global memory size is reported in bytes; print it in MBytes as well.
printf(" Total amount of global memory: %.0f MBytes (%llu bytes)\n",
       (float)device_Property.totalGlobalMem / 1048576.0f,
       (unsigned long long)device_Property.totalGlobalMem);
// memoryClockRate is reported in kHz; convert it to MHz.
printf(" Memory Clock rate: %.0f MHz\n", device_Property.memoryClockRate * 1e-3f);
printf(" Memory Bus Width: %d-bit\n", device_Property.memoryBusWidth);
// Print the L2 cache size only when the device reports one.
if (device_Property.l2CacheSize)
{
    printf(" L2 Cache Size: %d bytes\n", device_Property.l2CacheSize);
}
// totalConstMem and sharedMemPerBlock are size_t values, so use %zu.
printf(" Total amount of constant memory: %zu bytes\n", device_Property.totalConstMem);
printf(" Total amount of shared memory per block: %zu bytes\n", device_Property.sharedMemPerBlock);
printf(" Total number of registers available per block: %d\n", device_Property.regsPerBlock);
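
The memory clock rate and bus width printed above can be combined into a rough estimate of the theoretical peak memory bandwidth, which gives a sense of how memory speed bounds a program. The following is a minimal host-side sketch, not part of the original snippet. The helper name peakBandwidthGBps is hypothetical, and the factor of 2 assumes double-data-rate memory, which may not hold for every device:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the CUDA API): estimates the theoretical
// peak memory bandwidth in GB/s, where 1 GB = 1e9 bytes.
//   memoryClockRateKHz : value of cudaDeviceProp::memoryClockRate (kHz)
//   memoryBusWidthBits : value of cudaDeviceProp::memoryBusWidth (bits)
// Assumption: double-data-rate memory, i.e. two transfers per clock cycle.
inline double peakBandwidthGBps(int memoryClockRateKHz, int memoryBusWidthBits)
{
    assert(memoryClockRateKHz > 0 && memoryBusWidthBits > 0);
    // kHz -> Hz, then double it for two transfers per clock.
    const double transfersPerSecond = 2.0 * memoryClockRateKHz * 1e3;
    // Each transfer moves one bus width of data; convert bits to bytes.
    const double bytesPerTransfer = memoryBusWidthBits / 8.0;
    return transfersPerSecond * bytesPerTransfer / 1e9;
}
```

With a populated device_Property, the call would be peakBandwidthGBps(device_Property.memoryClockRate, device_Property.memoryBusWidth). For an illustrative, made-up device with a 1,000,000 kHz memory clock and a 128-bit bus, this evaluates to 32 GB/s. Real kernels typically achieve less than this theoretical figure.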