- Deep Learning with PyTorch
- Vishnu Subramanian
Tensors on GPU
We have learned how to represent different forms of data as tensors. Some of the common operations we perform once we have data in tensor form are addition, subtraction, multiplication, dot product, and matrix multiplication. All of these operations can be performed on either the CPU or the GPU. PyTorch provides a simple function called cuda() that copies a tensor from the CPU to the GPU. We will take a look at some of these operations and compare the performance of matrix multiplication on the CPU and the GPU.
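As a quick sketch of moving tensors to the GPU safely (the availability check and the `device` variable are additions of mine, not part of the text; `.to()` is a device-agnostic alternative to `.cuda()`):

```python
import torch

# Assumption: fall back to the CPU when no CUDA device is present,
# so the same code runs on any machine
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.rand(3, 3)
x = x.to(device)  # equivalent to x.cuda() when device == "cuda"
print(x.device)
```

Guarding on `torch.cuda.is_available()` keeps the code from raising an error on CPU-only machines.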
Tensor addition can be obtained by using the following code:
# Various ways to perform tensor addition
a = torch.rand(2, 2)
b = torch.rand(2, 2)
c = a + b
d = torch.add(a, b)
# In-place addition (modifies a)
a.add_(5)
# Element-wise multiplication
a * b
a.mul(b)
# In-place multiplication (modifies a)
a.mul_(b)
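The trailing-underscore convention above (out-of-place `mul` versus in-place `mul_`) can be verified directly; this small check is an illustration of mine, not from the book:

```python
import torch

a = torch.ones(2, 2)
b = torch.ones(2, 2) * 2

c = a.mul(b)           # out-of-place: returns a new tensor, a is unchanged
print(a[0, 0].item())  # 1.0

a.mul_(b)              # in-place: the trailing underscore mutates a itself
print(a[0, 0].item())  # 2.0
```

In-place operations save memory but overwrite their operand, which matters later when autograd needs the original values.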
For matrix multiplication, let's compare the performance of the same code on the CPU and the GPU. Any tensor can be moved to the GPU by calling its .cuda() function.
First, matrix multiplication on the CPU:
a = torch.rand(10000,10000)
b = torch.rand(10000,10000)
a.matmul(b)
Time taken: 3.23 s
#Move the tensors to GPU
a = a.cuda()
b = b.cuda()
a.matmul(b)
Time taken: 11.2 μs
These fundamental operations of addition, subtraction, and matrix multiplication can be used to build complex architectures, such as a convolutional neural network (CNN) and a recurrent neural network (RNN), which we will learn about in later chapters of the book.