
Enabling PyTorch acceleration using CUDA

One of the main benefits of PyTorch is its ability to enable acceleration through the use of a graphics processing unit (GPU). Deep learning is a computational task that is easily parallelizable, meaning that the calculations can be broken down into smaller tasks and computed across many smaller processing cores. This means that, rather than executing the whole workload sequentially on a single CPU, it is often far more efficient to perform the calculations on a GPU.

GPUs were originally created to render graphics efficiently, but as deep learning has grown in popularity, they have been widely adopted for their ability to perform many calculations simultaneously. While a traditional CPU may consist of around four to eight cores, a GPU consists of hundreds of smaller cores. Because calculations can be executed across all of these cores at the same time, GPUs can dramatically reduce the time taken to perform deep learning tasks.

Consider a single pass within a neural network. We take a small batch of data, pass it through our network to obtain our loss, and then backpropagate, adjusting our parameters according to the gradients. If we have many batches of data to process, a traditional CPU must wait until batch 1 has completed before it can begin batch 2:

Figure 2.7 – One pass in a neural network

However, on a GPU, we can perform all these steps simultaneously, meaning there is no requirement for batch 1 to finish before batch 2 can be started. We can calculate the parameter updates for all batches simultaneously and then apply all the parameter updates in one go (as the results are independent of one another). This parallel approach can vastly speed up the machine learning process:

Figure 2.8 – Parallel approach to perform passes
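
To make the pass shown in Figure 2.7 concrete, the following is a minimal sketch of a single training step in PyTorch. The model, loss function, optimizer, and random data here are placeholder choices for illustration; any network would follow the same forward, loss, backpropagate, update sequence:

    import torch
    from torch import nn, optim

    # A tiny placeholder network, loss, and optimizer for illustration
    model = nn.Linear(10, 2)
    loss_fn = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    # One small batch of random stand-in data
    inputs = torch.randn(8, 10)
    labels = torch.randint(0, 2, (8,))

    # One pass: forward, compute loss, backpropagate, update parameters
    outputs = model(inputs)
    loss = loss_fn(outputs, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()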

Compute Unified Device Architecture (CUDA) is Nvidia's parallel computing platform, and it is what PyTorch uses for hardware acceleration on Nvidia GPUs. In order to enable CUDA, we must first make sure that the graphics card on our system is CUDA-compatible. A list of CUDA-compatible GPUs can be found here: https://developer.nvidia.com/cuda-gpus. If you have a CUDA-compatible GPU, then the CUDA toolkit can be installed from this link: https://developer.nvidia.com/cuda-downloads. We will activate it using the following steps:

  1. Firstly, make sure your installed PyTorch build has CUDA support. Prebuilt CUDA-enabled binaries can be installed using the selector at https://pytorch.org/get-started/locally/; alternatively, PyTorch can be built from source, as described here: https://github.com/pytorch/pytorch#from-source.
  2. Then, to actually use CUDA within our PyTorch code, we must type the following into our Python code:

    cuda = torch.device('cuda')

    This creates a torch.device object that refers to our default CUDA device.
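
    Not every machine has a CUDA-compatible GPU, so a common pattern (a small sketch, not part of the original steps) is to check availability first and fall back to the CPU:

    import torch

    # Use the GPU if one is available; otherwise fall back to the CPU
    cuda = torch.device('cuda' if torch.cuda.is_available() else 'cpu')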

  3. We can then execute operations on this device by manually specifying the device argument in any tensor operations:

    x = torch.tensor([5., 3.], device=cuda)

    Alternatively, we can do this by calling the cuda method:

    y = torch.tensor([4., 2.]).cuda()
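
    A third idiom that is worth knowing (not shown in the steps above, but part of the standard tensor API) is the to method, which moves a tensor to a given device and also works for whole models:

    # Illustrative example tensor, moved to the device created in step 2
    z = torch.tensor([1., 7.]).to(cuda)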

  4. We can then run a simple operation to ensure this is working correctly:

    x*y

    This results in the following output:

Figure 2.9 – Tensor multiplication output using CUDA
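
If you want to confirm where a tensor lives, or copy results back to host memory (for example, before converting them to NumPy), the device attribute and the cpu method cover both cases. This short sketch reuses the x and y tensors from the steps above:

    print(x.device)           # cuda:0
    print(y.device)           # cuda:0

    # Copy the product back to the CPU
    result = (x * y).cpu()
    print(result)             # tensor([20., 6.])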

The changes in speed will not be noticeable at this stage, as we are only creating tensors, but when we begin training models at scale later, we will see the speed benefits of parallelizing our computations using CUDA and be able to reduce training time by a considerable amount.
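
As a preview of that pattern, the following is a minimal sketch (again with a placeholder model and random stand-in data) of a standard device-aware training loop: the model is moved to the GPU once, up front, and each batch is placed on the same device as it is drawn:

    import torch
    from torch import nn, optim

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Placeholder model, moved to the device once, up front
    model = nn.Linear(10, 2).to(device)
    loss_fn = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    for _ in range(100):
        # Each batch must live on the same device as the model
        inputs = torch.randn(8, 10, device=device)
        labels = torch.randint(0, 2, (8,), device=device)

        optimizer.zero_grad()
        loss = loss_fn(model(inputs), labels)
        loss.backward()
        optimizer.step()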
