How to get mini-batches in PyTorch in a clean and efficient way?

Question

I was trying to do a simple thing: train a linear model with stochastic gradient descent (SGD) using torch:

    import numpy as np

    import torch
    from torch.autograd import Variable

    import pdb

    def get_batch2(X,Y,M,dtype):
        X,Y = X.data.numpy(), Y.data.numpy()
        N = len(Y)
        valid_indices = np.array( range(N) )
        batch_indices = np.random.choice(valid_indices,size=M,replace=False)
        batch_xs = torch.FloatTensor(X[batch_indices,:]).type(dtype)
        batch_ys = torch.FloatTensor(Y[batch_indices]).type(dtype)
        return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

    def poly_kernel_matrix( x,D ):
        N = len(x)
        Kern = np.zeros( (N,D+1) )
        for n in range(N):
            for d in range(D+1):
                Kern[n,d] = x[n]**d
        return Kern

    ## data params
    N=5 # data set size
    Degree=4 # number dimensions/features
    D_sgd = Degree+1
    ##
    x_true = np.linspace(0,1,N) # the real data points
    y = np.sin(2*np.pi*x_true)
    y.shape = (N,1)
    ## TORCH
    dtype = torch.FloatTensor
    # dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU
    X_mdl = poly_kernel_matrix( x_true,Degree )
    X_mdl = Variable(torch.FloatTensor(X_mdl).type(dtype), requires_grad=False)
    y = Variable(torch.FloatTensor(y).type(dtype), requires_grad=False)
    ## SGD mdl
    w_init = torch.zeros(D_sgd,1).type(dtype)
    W = Variable(w_init, requires_grad=True)
    M = 5 # mini-batch size
    eta = 0.1 # step size
    for i in range(500):
        batch_xs, batch_ys = get_batch2(X_mdl,y,M,dtype)
        # Forward pass: compute predicted y using operations on Variables
        y_pred = batch_xs.mm(W)
        # Compute and print loss using operations on Variables.
        # Now loss is a Variable of shape (1,) and loss.data is a Tensor of shape (1,);
        # loss.data[0] is a scalar value holding the loss.
        loss = (1/N)*(y_pred - batch_ys).pow(2).sum()
        # Use autograd to compute the backward pass. Now w will have gradients
        loss.backward()
        # Update weights using gradient descent; w1.data are Tensors,
        # w.grad are Variables and w.grad.data are Tensors.
        W.data -= eta * W.grad.data
        # Manually zero the gradients after updating weights
        W.grad.data.zero_()

    #
    c_sgd = W.data.numpy()
    X_mdl = X_mdl.data.numpy()
    y = y.data.numpy()
    #
    Xc_pinv = np.dot(X_mdl,c_sgd)
    print('J(c_sgd) = ', (1/N)*(np.linalg.norm(y-Xc_pinv)**2) )
    print('loss = ',loss.data[0])

The code runs fine, although my get_batch2 method seems really dumb/naive. That is probably because I am new to PyTorch, but I have not found a good place where they discuss how to retrieve batches of data. I went through their tutorials ( http://pytorch.org/tutorials/beginner/pytorch_with_examples.html ) and the data loading tutorial ( http://pytorch.org/tutorials/beginner/data_loading_tutorial.html ) with no luck. The tutorials all seem to assume that one already has the batch and the batch size at the start, and then train on that data without changing it (see specifically http://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-variables-and-autograd ).

So my question is: do I really need to convert my data back into numpy so that I can draw a random sample from it, and then turn it back into PyTorch with Variable to be able to train in memory? Is there no way to get mini-batches with torch?

I looked at a few functions torch provides, but with no luck:

    #pdb.set_trace()
    #valid_indices = torch.arange(0,N).numpy()
    #valid_indices = np.array( range(N) )
    #batch_indices = np.random.choice(valid_indices,size=M,replace=False)
    #indices = torch.LongTensor(batch_indices)
    #batch_xs, batch_ys = torch.index_select(X_mdl, 0, indices), torch.index_select(y, 0, indices)
    #batch_xs,batch_ys = torch.index_select(X_mdl, 0, indices), torch.index_select(y, 0, indices)

Even though the code I provided works fine, I am worried that it is not an efficient implementation AND that if I were to use GPUs there would be a considerable further slowdown (because I guess that putting things in memory and then fetching them back to put them on the GPU like that is silly).

I implemented a new version based on the answer that suggested using torch.index_select():

    def get_batch2(X,Y,M):
        '''
        get batch for pytorch model
        '''
        # TODO fix and make it nicer, there is pytorch forum question
        #X,Y = X.data.numpy(), Y.data.numpy()
        X,Y = X, Y
        N = X.size()[0]
        batch_indices = torch.LongTensor( np.random.randint(0,N+1,size=M) )
        pdb.set_trace()
        batch_xs = torch.index_select(X,0,batch_indices)
        batch_ys = torch.index_select(Y,0,batch_indices)
        return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

However, this seems to have issues, because it does not work if X,Y are NOT Variables... which is really odd. I added this to the PyTorch forum: https://discuss.pytorch.org/t/how-to-get-mini-batches-in-pytorch-in-a-clean-and-efficient-way/10322

Right now, what I am struggling with is making this work on the GPU. My most recent version:

    def get_batch2(X,Y,M,dtype):
        '''
        get batch for pytorch model
        '''
        # TODO fix and make it nicer, there is pytorch forum question
        #X,Y = X.data.numpy(), Y.data.numpy()
        X,Y = X, Y
        N = X.size()[0]
        if dtype == torch.cuda.FloatTensor:
            batch_indices = torch.cuda.LongTensor( np.random.randint(0,N,size=M) )# without replacement
        else:
            batch_indices = torch.LongTensor( np.random.randint(0,N,size=M) ).type(dtype) # without replacement
        pdb.set_trace()
        batch_xs = torch.index_select(X,0,batch_indices)
        batch_ys = torch.index_select(Y,0,batch_indices)
        return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

The error:

RuntimeError: tried to construct a tensor from a int sequence, but found an item of type numpy.int64 at index (0)

I don't get it, do I really have to do:

    ints = [ random.randint(0,N) for i in range(M)]

to get the integers?

It would also be ideal if the data could be a Variable. It seems that torch.index_select does not work for Variable-type data.

This list-of-integers approach still doesn't work:

    TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor), but expected one of:
     * (torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
     * (torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
     * (float beta, torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
     * (torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
     * (float beta, torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
     * (torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
     * (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
          didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)
     * (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
          didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)

saetch_g · Answer

If I'm understanding your code correctly, your get_batch2 function appears to take random mini-batches from your dataset without tracking which indices you have already used in an epoch. The issue with this implementation is that it will likely not make use of all of your data.
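To make the coverage point concrete, here is a small pure-Python sketch (no torch needed) that counts how many distinct indices are touched by independently drawn batches versus one shuffled pass. The function names, the dataset size of 1000 and the batch size of 50 are illustrative assumptions, not part of the question's code.

```python
import random

def indices_seen_random(n, batch_size, n_batches, rng):
    """Draw every batch independently; nothing remembers earlier batches."""
    seen = set()
    for _ in range(n_batches):
        seen.update(rng.sample(range(n), batch_size))
    return seen

def indices_seen_permutation(n, batch_size, rng):
    """Shuffle all indices once, then walk through them in consecutive slices."""
    order = list(range(n))
    rng.shuffle(order)
    seen = set()
    for start in range(0, n, batch_size):
        seen.update(order[start:start + batch_size])
    return seen

rng = random.Random(0)
n, batch_size = 1000, 50
n_batches = n // batch_size  # same number of batches as one full epoch

random_seen = indices_seen_random(n, batch_size, n_batches, rng)
perm_seen = indices_seen_permutation(n, batch_size, rng)
print(len(random_seen), "of", n, "indices seen with independent random batches")
print(len(perm_seen), "of", n, "indices seen with one shuffled pass")
```

The shuffled pass always reaches every index exactly once per epoch, while the independent draws revisit some samples and miss others.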

The way I usually do batching is to create a random permutation of all the possible indices using torch.randperm(N) and loop through them in batches. For example:

    n_epochs = 100 # or whatever
    batch_size = 128 # or whatever

    for epoch in range(n_epochs):

        # X is a torch Variable
        permutation = torch.randperm(X.size()[0])

        for i in range(0,X.size()[0], batch_size):
            optimizer.zero_grad()

            indices = permutation[i:i+batch_size]
            batch_x, batch_y = X[indices], Y[indices]

            # in case you wanted a semi-full example
            outputs = model.forward(batch_x)
            loss = lossfunction(outputs,batch_y)

            loss.backward()
            optimizer.step()

If you like to copy and paste, make sure you define your optimizer, model, and loss function somewhere before the start of the epoch loop.

As for your error, try using torch.from_numpy(np.random.randint(0,N,size=M)).long() instead of torch.LongTensor(np.random.randint(0,N,size=M)). I'm not sure whether this will solve the error you are getting, but it will solve a future error.
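For readers on a recent PyTorch release (0.4 or later, where tensors no longer need the Variable wrapper), the permutation loop can be sketched end to end as follows. This is a minimal sketch under stated assumptions rather than the answerer's exact code: the toy data imitates the question's polynomial features, and the model, optimizer, learning rate and batch size are illustrative choices.

```python
import math

import numpy as np
import torch

torch.manual_seed(0)

# Toy data modelled on the question: polynomial features of sin(2*pi*x).
N, degree = 100, 4
x = torch.linspace(0, 1, N)
X = torch.stack([x ** d for d in range(degree + 1)], dim=1)  # shape (N, degree + 1)
Y = torch.sin(2 * math.pi * x).unsqueeze(1)                   # shape (N, 1)

# Keep data, model and indices on one device to avoid CPU/GPU type mismatches.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
X, Y = X.to(device), Y.to(device)

model = torch.nn.Linear(degree + 1, 1, bias=False).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

n_epochs, batch_size = 200, 16
with torch.no_grad():
    initial_loss = loss_fn(model(X), Y).item()

for epoch in range(n_epochs):
    # A fresh permutation per epoch, created on the same device as the data.
    permutation = torch.randperm(N, device=device)
    for i in range(0, N, batch_size):
        indices = permutation[i:i + batch_size]
        batch_x, batch_y = X[indices], Y[indices]  # indexing stays on the device
        optimizer.zero_grad()
        loss = loss_fn(model(batch_x), batch_y)
        loss.backward()
        optimizer.step()

with torch.no_grad():
    final_loss = loss_fn(model(X), Y).item()
print(f"full-data MSE: {initial_loss:.4f} -> {final_loss:.4f}")

# The index conversion suggested above: NumPy integer array -> LongTensor.
np_indices = np.random.randint(0, N, size=batch_size)
index_tensor = torch.from_numpy(np_indices).long().to(device)
sampled_x = X[index_tensor]
```

Because the permutation and the data live on the same device, no round trip through NumPy is needed inside the training loop.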

Mo Hossny · Answer

Use data loaders.

Data Set

First, you define a dataset. You can use the packaged datasets in torchvision.datasets or use the ImageFolder dataset class, which follows the directory structure of ImageNet.

    trainset=torchvision.datasets.ImageFolder(root='/path/to/your/data/trn', transform=generic_transform)
    testset=torchvision.datasets.ImageFolder(root='/path/to/your/data/val', transform=generic_transform)

Transforms

Transforms are very useful for preprocessing loaded data on the fly. If you are using images, you have to use the ToTensor() transform to convert loaded images from PIL to torch.tensor. Several transforms can be packed into a composite transform as follows.

    generic_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.ToPILImage(),
        #transforms.CenterCrop(size=128),
        transforms.Lambda(lambda x: myimresize(x, (128, 128))),
        transforms.ToTensor(),
        transforms.Normalize((0., 0., 0.), (6, 6, 6))
    ])

Data Loader

Then you define a data loader, which prepares the next batch while training. You can set the number of worker processes used for data loading with num_workers.

    trainloader=torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=8)
    testloader=torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False, num_workers=8)

For training, you just enumerate the data loader.

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
        # continue training...
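The same loader pattern also works for data that already sits in memory as tensors, which is the situation in the question. The sketch below wraps two tensors in torch.utils.data.TensorDataset; the data values and the batch size of 4 are made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical in-memory data: 10 samples, 5 features, 1 target each.
X = torch.arange(50, dtype=torch.float32).reshape(10, 5)
Y = torch.arange(10, dtype=torch.float32).reshape(10, 1)

dataset = TensorDataset(X, Y)  # dataset[i] returns (X[i], Y[i])
loader = DataLoader(dataset, batch_size=4, shuffle=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

batch_sizes = []
seen_targets = []
for inputs, labels in loader:
    # Move each batch to the training device; no Variable wrapper is needed
    # in PyTorch 0.4 and later.
    inputs, labels = inputs.to(device), labels.to(device)
    batch_sizes.append(inputs.shape[0])
    seen_targets.extend(labels.flatten().tolist())

print(batch_sizes)           # the last batch is smaller: 10 = 4 + 4 + 2
print(sorted(seen_targets))  # every sample appears exactly once per epoch
```

With shuffle=True the loader draws a new permutation each epoch, so there is no need to manage indices by hand.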

NumPy Stuff

Yes. You have to convert a torch.tensor to numpy using the .numpy() method to work on it. If you are using CUDA, you have to copy the data from the GPU to the CPU first using the .cpu() method before calling .numpy(). Personally, coming from a MATLAB background, I prefer to do most of the work with torch tensors and convert data to numpy only for visualisation. Also bear in mind that torch stores image data channel-first while numpy and PIL work with channel-last. This means you need to use np.rollaxis to move the channel axis to the last position. A sample code is below.

np.rollaxis(make_grid(mynet.ftrextractor(inputs).data, nrow=8, padding=1).cpu().numpy(), 0, 3)
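As a standalone illustration of the channel-first to channel-last conversion, the sketch below uses a random tensor in place of the network output from the one-liner above (the tensor and its size are assumptions). It shows np.rollaxis next to two equivalent alternatives, np.moveaxis and Tensor.permute.

```python
import numpy as np
import torch

# Hypothetical image tensor in PyTorch's layout: (channels, height, width).
image_chw = torch.rand(3, 32, 48)

# .detach() drops autograd history, .cpu() copies from the GPU if necessary.
array_chw = image_chw.detach().cpu().numpy()

# Three equivalent ways to move the channel axis to the end.
hwc_rollaxis = np.rollaxis(array_chw, 0, 3)
hwc_moveaxis = np.moveaxis(array_chw, 0, -1)
hwc_permute = image_chw.permute(1, 2, 0).detach().cpu().numpy()

print(array_chw.shape, "->", hwc_rollaxis.shape)
```

The resulting (height, width, channels) array is the layout that plotting libraries such as matplotlib expect for colour images.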

Logging

The best method I have found for visualising the feature maps is TensorBoard. Code is available at yunjey/pytorch-tutorial.

Forcetti · Answer

Not sure what you were trying to do. W.r.t. batching, you wouldn't have to convert to numpy. You could just use index_select(), e.g.:

    for epoch in range(500):
        k=0
        loss = 0
        while k < X_mdl.size(0):

            random_batch = [0]*5
            for i in range(k,k+M):
                random_batch[i] = np.random.choice(N-1)
            random_batch = torch.LongTensor(random_batch)
            batch_xs = X_mdl.index_select(0, random_batch)
            batch_ys = y.index_select(0, random_batch)

            # Forward pass: compute predicted y using operations on Variables
            y_pred = batch_xs.mul(W)
            # etc..

The rest of the code will need to be changed as well.

My guess is that you would like to create a get_batch function that concatenates your X tensors and Y tensors. Something like:

    def make_batch(list_of_tensors):
        X, y = list_of_tensors[0]
        # may need to unsqueeze X and y to get right dimensions
        for i, (sample, label) in enumerate(list_of_tensors[1:]):
            X = torch.cat((X, sample), dim=0)
            y = torch.cat((y, label), dim=0)
        return X, y

Then, during training, you select a number of examples, e.g. max_batch_size = 32, through slicing.

    for epoch in range(n_epochs):  # n_epochs: however many passes you want
        X, y = make_batch(list_of_tensors)
        X = Variable(X, requires_grad=False)
        y = Variable(y, requires_grad=False)

        k = 0
        while k < X.size(0):

            inputs = X[k:k+max_batch_size,:]
            labels = y[k:k+max_batch_size,:]
            # some computation
            k += max_batch_size
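The manual while loop over slices can also be written with torch.split, which cuts a tensor into consecutive chunks along the first dimension. The following is a sketch with made-up data (70 random samples) and a shuffle step added so that the consecutive slices are random batches; it is an alternative to the loop above, not the answerer's code.

```python
import torch

# Hypothetical data: 70 samples with 5 features and one target each.
X = torch.randn(70, 5)
y = torch.randn(70, 1)
max_batch_size = 32

# Shuffle once so that consecutive slices are random batches.
perm = torch.randperm(X.size(0))
X_shuffled, y_shuffled = X[perm], y[perm]

# torch.split returns a tuple of chunks along dim 0; the last may be smaller.
x_batches = torch.split(X_shuffled, max_batch_size)
y_batches = torch.split(y_shuffled, max_batch_size)

for inputs, labels in zip(x_batches, y_batches):
    print(inputs.shape, labels.shape)
```

Applying the same permutation to X and y keeps each input aligned with its label.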

gary69 · Answer

Create a class that is a subclass of torch.utils.data.Dataset and pass it to a torch.utils.data.DataLoader. Below is an example from my project.

    class CandidateDataset(Dataset):
        def __init__(self, x, y):
            self.len = x.shape[0]
            if torch.cuda.is_available():
                device = 'cuda'
            else:
                device = 'cpu'
            self.x_data = torch.as_tensor(x, device=device, dtype=torch.float)
            self.y_data = torch.as_tensor(y, device=device, dtype=torch.long)

        def __getitem__(self, index):
            return self.x_data[index], self.y_data[index]

        def __len__(self):
            return self.len

    def fit(self, candidate_count):
        feature_matrix = np.empty(shape=(candidate_count, 600))
        target_matrix = np.empty(shape=(candidate_count, 1))
        fill_matrices(feature_matrix, target_matrix)
        candidate_ds = CandidateDataset(feature_matrix, target_matrix)
        train_loader = DataLoader(dataset = candidate_ds, batch_size = self.BATCH_SIZE, shuffle = True)
        for epoch in range(self.N_EPOCHS):
            print('starting epoch ' + str(epoch))
            for batch_idx, (inputs, labels) in enumerate(train_loader):
                print('starting batch ' + str(batch_idx) + ' epoch ' + str(epoch))
                inputs, labels = Variable(inputs), Variable(labels)
                self.optimizer.zero_grad()
                inputs = inputs.view(1, inputs.size()[0], 600)
                # init hidden with number of rows in input
                y_pred = self.model(inputs, self.model.initHidden(inputs.size()[1]))
                labels.squeeze_()
                # labels should be tensor with batch_size rows. Column the index of the class (0 or 1)
                loss = self.loss_f(y_pred, labels)
                loss.backward()
                self.optimizer.step()
                print('done batch ' + str(batch_idx) + ' epoch ' + str(epoch))

Jibin Mathew · Answer

You can use torch.utils.data.

Assuming you have loaded the data from disk into train and test numpy arrays, you can inherit from the torch.utils.data.Dataset class to create your dataset object:

    class MyDataset(Dataset):
        def __init__(self, x, y):
            super(MyDataset, self).__init__()
            assert x.shape[0] == y.shape[0] # assuming shape[0] = dataset size
            self.x = x
            self.y = y

        def __len__(self):
            return self.y.shape[0]

        def __getitem__(self, index):
            return self.x[index], self.y[index]

Then, create your dataset object:

traindata = MyDataset(train_x, train_y)

Finally, use DataLoader to create your mini-batches:

trainloader = torch.utils.data.DataLoader(traindata, batch_size=64, shuffle=True)
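Putting the three steps together, a runnable sketch might look like this. The random arrays stand in for train_x and train_y, and their shapes and dtypes are assumptions made for illustration. The class body is repeated only so that the example runs on its own.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset

class MyDataset(Dataset):
    """Same structure as the class above, repeated so the sketch runs alone."""
    def __init__(self, x, y):
        super().__init__()
        assert x.shape[0] == y.shape[0]  # first axis is the dataset size
        self.x = x
        self.y = y

    def __len__(self):
        return self.y.shape[0]

    def __getitem__(self, index):
        return self.x[index], self.y[index]

# Hypothetical arrays standing in for train_x and train_y.
train_x = np.random.rand(200, 8).astype(np.float32)
train_y = np.random.randint(0, 2, size=200).astype(np.int64)

trainloader = DataLoader(MyDataset(train_x, train_y), batch_size=64, shuffle=True)

for epoch in range(2):
    n_seen = 0
    for batch_x, batch_y in trainloader:
        # The default collate function has already stacked the NumPy samples
        # into torch tensors, so the batches can go straight to a model.
        n_seen += batch_x.shape[0]
    print("epoch", epoch, "samples seen:", n_seen)

first_x, first_y = next(iter(trainloader))
```

The batches keep the dtype of the NumPy arrays, so casting to float32 and int64 up front avoids dtype mismatches with typical models and loss functions.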