Hi, I'm building a JetBot with the SparkFun Jetson Nano 2GB kit. The V01-00 image worked, but training the model isn't working properly. It seems to have issues with CUDA. These are the errors I have been getting:
1. RuntimeError: CUDA error: device-side assert triggered (on one of the times I tried to run the program)
2. RuntimeError: cuda runtime error (59) : device-side assert triggered at /media/nvidia/WD_BLUE_2.5_1TB/pytorch-v1.1.0/aten/src/THC/generic/THCTensorMath.cu:16
3. RuntimeError: cuda runtime error (59) : device-side assert triggered at /media/nvidia/WD_BLUE_2.5_1TB/pytorch-v1.1.0/aten/src/THC/generic/THCTensorMath.cu:26
This is my code:
NUM_EPOCHS = 30
BEST_MODEL_PATH = 'best_model.pth'
best_accuracy = 0.0
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
for i, data in enumerate(all_dataloader):
    for epoch in range(NUM_EPOCHS):
        for images, labels in iter(train_loader):
            images = images.to(device)
            labels = labels.to(device)
            optimizer.zero_grad()
            outputs = model(images)
            loss = F.cross_entropy(outputs, labels)
            loss.backward()
            optimizer.step()
        test_error_count = 0.0
        for images, labels in iter(test_loader):
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            test_error_count += float(torch.sum(torch.abs(labels - outputs.argmax(1))))
        test_accuracy = 1.0 - float(test_error_count) / float(len(test_dataset))
        print('%d: %f' % (epoch, test_accuracy))
        if test_accuracy > best_accuracy:
            torch.save(model.state_dict(), BEST_MODEL_PATH)
            best_accuracy = test_accuracy
This is the error:
RuntimeError Traceback (most recent call last)
in
13 outputs = model(images)
14 loss = F.cross_entropy(outputs, labels)
---> 15 loss.backward()
16 optimizer.step()
/usr/local/lib/python3.6/dist-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
105 products. Defaults to False.
--> 107 torch.autograd.backward(self, gradient, retain_graph, create_graph)
109 def register_hook(self, hook):
/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
91 Variable._execution_engine.run_backward(
92 tensors, grad_tensors, retain_graph, create_graph,
---> 93 allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at /media/nvidia/WD_BLUE_2.5_1TB/pytorch-v1.1.0/aten/src/THC/generic/THCTensorMath.cu:26
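One thing that may be worth checking: a device-side assert around F.cross_entropy is often (though not always) caused by a label value outside the range 0 to num_classes - 1, for example when the dataset has more classes than the model's final layer has outputs. Because CUDA errors are reported asynchronously, the line shown in the traceback is not necessarily the one that failed. Below is a minimal sketch of a label range check in plain Python; the helper name find_out_of_range_labels and the example values are hypothetical and not part of the JetBot notebook.

```python
def find_out_of_range_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside [0, num_classes)."""
    bad = {int(label) for label in labels if not 0 <= int(label) < num_classes}
    return sorted(bad)


# Hypothetical batch: the model has 2 outputs, but the dataset produced a label 2.
example_labels = [0, 1, 1, 2, 0]
print(find_out_of_range_labels(example_labels, num_classes=2))  # prints [2]
```

In the training loop this could be called with labels.tolist() (before the labels are moved to the GPU) and the number of outputs of the model's final layer. A non-empty result would point to a mismatch between the dataset classes and the model. Running with the environment variable CUDA_LAUNCH_BLOCKING=1, or on the CPU, may also produce a more precise error location.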