
CUDA program structure

We saw a very simple Hello, CUDA! program earlier, which showcased some important concepts related to CUDA programs. A CUDA program is a combination of functions that are executed either on the host or on the GPU device. Functions that do not exhibit parallelism are executed on the CPU, while functions that exhibit data parallelism are executed on the GPU. The CUDA compiler segregates these functions during compilation. As seen in the previous chapter, functions meant for execution on the device are defined using the __global__ keyword and compiled by the NVCC compiler, while normal C host code is compiled by the host C compiler. CUDA code is essentially ANSI C code with the addition of a few keywords needed for exploiting data parallelism.
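This separation of host and device code can be illustrated with a minimal sketch along the lines of the earlier Hello, CUDA! example (the kernel name `myKernel` is illustrative):

```cuda
#include <stdio.h>

// The __global__ qualifier marks this function as a kernel:
// NVCC compiles it for the GPU, and it is callable from host code.
__global__ void myKernel(void)
{
    // Device-side printf runs on the GPU.
    printf("Hello from the GPU!\n");
}

int main(void)
{
    // Plain C host code, compiled by the host C compiler.
    printf("Hello from the CPU!\n");

    // Launch the kernel with one block of one thread.
    myKernel<<<1, 1>>>();

    // Wait for the kernel (and its printf output) to complete.
    cudaDeviceSynchronize();
    return 0;
}
```

Everything inside `main` is ordinary C; only the `__global__` qualifier and the `<<<1, 1>>>` launch configuration are CUDA additions.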

In this section, a simple two-variable addition program is used to explain important concepts related to CUDA programming: kernel calls, passing parameters to kernel functions from host to device, the configuration of kernel launch parameters, the CUDA APIs needed to exploit data parallelism, and how memory allocation takes place on the host and the device.
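As a preview of the concepts listed above, a minimal sketch of such a two-variable addition might look like the following (the kernel name `gpuAdd`, the operand values, and the `<<<1, 1>>>` launch configuration are illustrative choices, not the only possibilities):

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Kernel: adds two integers passed by value from the host and
// stores the result in device memory pointed to by d_c.
__global__ void gpuAdd(int a, int b, int *d_c)
{
    *d_c = a + b;
}

int main(void)
{
    int h_c;    // result variable on the host
    int *d_c;   // pointer to the result in device memory

    // Allocate space for one int on the device.
    cudaMalloc((void **)&d_c, sizeof(int));

    // Launch the kernel with one block of one thread;
    // scalar arguments travel from host to device by value.
    gpuAdd<<<1, 1>>>(2, 7, d_c);

    // Copy the result back from device memory to host memory.
    cudaMemcpy(&h_c, d_c, sizeof(int), cudaMemcpyDeviceToHost);

    printf("2 + 7 = %d\n", h_c);

    // Release the device allocation.
    cudaFree(d_c);
    return 0;
}
```

Note that the host cannot dereference `d_c` directly; the result must be brought back with `cudaMemcpy` before it can be printed. The sections that follow examine each of these steps in detail.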
