
Dataset class

Any custom dataset class, such as our Dogs and Cats dataset class, has to inherit from the PyTorch Dataset class. The custom class has to implement two main functions, namely __len__(self) and __getitem__(self, idx). Any custom class acting as a Dataset class should look like the following code snippet:

from torch.utils.data import Dataset

class DogsAndCatsDataset(Dataset):

    def __init__(self,):
        pass

    def __len__(self):
        pass

    def __getitem__(self, idx):
        pass

We do any initialization, if required, inside the __init__ method, such as reading the index of the table and reading the filenames of the images in our case. The __len__(self) operation is responsible for returning the maximum number of elements in our dataset. The __getitem__(self, idx) operation returns an element based on idx every time it is called. The following code implements our DogsAndCatsDataset class:

from glob import glob

import numpy as np
from PIL import Image

class DogsAndCatsDataset(Dataset):

    def __init__(self, root_dir, size=(224,224)):
        # root_dir is a glob pattern matching all the image files
        self.files = glob(root_dir)
        self.size = size

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        # Load the image, resize it, and convert it to a NumPy array
        img = np.asarray(Image.open(self.files[idx]).resize(self.size))
        # The name of the parent folder (dog or cat) acts as the label
        label = self.files[idx].split('/')[-2]
        return img, label

Once the DogsAndCatsDataset class is created, we can create an object of it and iterate over it, as shown in the following code:

for image, label in dogsdset:
    # Apply your DL on the dataset.
    pass
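
The dogsdset object above has to be created first. As a minimal sketch, assuming a hypothetical dogsandcats/ folder that contains one subfolder per class, creating and indexing the dataset could look like this:

# The directory layout is an assumption: one subfolder per class,
# for example dogsandcats/dog/*.jpg and dogsandcats/cat/*.jpg
dogsdset = DogsAndCatsDataset('dogsandcats/*/*.jpg')

print(len(dogsdset))        # number of image files matched by glob
img, label = dogsdset[0]    # __getitem__ returns (image array, folder name)
print(img.shape, label)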

Applying a deep learning algorithm to a single instance of data at a time is not optimal. We need a batch of data, as modern GPUs deliver much better performance when they operate on batches. The DataLoader class helps to create batches by abstracting away a lot of this complexity.
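
As a minimal sketch, assuming the dogsdset object created above, wrapping it in a DataLoader and iterating over batches could look like the following code; the batch size and worker count are illustrative values, not tuned settings:

from torch.utils.data import DataLoader

# Illustrative settings; tune batch_size and num_workers for your hardware
dataloader = DataLoader(dogsdset, batch_size=32, shuffle=True, num_workers=2)

for imgs, labels in dataloader:
    # imgs holds a batch of image arrays stacked and converted to tensors,
    # labels holds the corresponding batch of folder-name labels
    pass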
