
Memory-related properties

Memory on the GPU is organized hierarchically: L1 cache, L2 cache, global memory, texture memory, and shared memory. The cudaDeviceProp structure provides several properties that describe the memory available on the device. memoryClockRate and memoryBusWidth give the memory clock rate (in kHz) and the memory bus width (in bits), respectively. Memory speed matters because it affects the overall speed of your program. totalGlobalMem returns the size of the global memory available on the device, and totalConstMem returns the total constant memory, both in bytes. sharedMemPerBlock returns the amount of shared memory, in bytes, that a single block can use. regsPerBlock gives the total number of registers available per block, and l2CacheSize gives the size of the L2 cache in bytes. The following code snippet shows how to use memory-related properties in a CUDA program:

// device_Property is assumed to be a cudaDeviceProp already filled in by cudaGetDeviceProperties()
printf(" Total amount of global memory: %.0f MBytes (%llu bytes)\n",
       (float)device_Property.totalGlobalMem / 1048576.0f,
       (unsigned long long)device_Property.totalGlobalMem);
// memoryClockRate is reported in kHz, so scale by 1e-3 to get MHz
printf(" Memory Clock rate: %.0f MHz\n", device_Property.memoryClockRate * 1e-3f);
printf(" Memory Bus Width: %d-bit\n", device_Property.memoryBusWidth);
if (device_Property.l2CacheSize)
{
    printf(" L2 Cache Size: %d bytes\n", device_Property.l2CacheSize);
}
// totalConstMem and sharedMemPerBlock are size_t, so print them with %zu
printf(" Total amount of constant memory: %zu bytes\n", device_Property.totalConstMem);
printf(" Total amount of shared memory per block: %zu bytes\n", device_Property.sharedMemPerBlock);
printf(" Total number of registers available per block: %d\n", device_Property.regsPerBlock);
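
Since memory speed affects overall program speed, it can help to turn memoryClockRate and memoryBusWidth into a rough theoretical peak bandwidth figure. The following is a minimal host-side sketch, not part of the original program. The helper names peak_bandwidth_gbps and bytes_to_mib are hypothetical, and the factor of 2 assumes double-data-rate memory, which may not match every device:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: estimates theoretical peak memory bandwidth in GB/s.
// memory_clock_khz corresponds to cudaDeviceProp::memoryClockRate (kHz),
// bus_width_bits corresponds to cudaDeviceProp::memoryBusWidth (bits).
// Assumption: double-data-rate memory, hence two transfers per clock cycle.
double peak_bandwidth_gbps(int memory_clock_khz, int bus_width_bits)
{
    assert(memory_clock_khz > 0 && bus_width_bits > 0);
    const double transfers_per_second = 2.0 * memory_clock_khz * 1e3; // kHz -> Hz, times 2 for DDR
    const double bytes_per_transfer = bus_width_bits / 8.0;           // bits -> bytes
    return transfers_per_second * bytes_per_transfer / 1e9;           // bytes/s -> GB/s
}

// Hypothetical helper: converts a byte count (for example totalGlobalMem)
// to units of 1048576 bytes, the same scaling the snippet above uses.
double bytes_to_mib(std::size_t bytes)
{
    return static_cast<double>(bytes) / 1048576.0;
}
```

With a filled-in device_Property, a call such as peak_bandwidth_gbps(device_Property.memoryClockRate, device_Property.memoryBusWidth) would give an upper bound to compare measured bandwidth against. Real kernels typically achieve less than this figure.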