I have recently been augmenting images for a semantic segmentation dataset in VOC2008 format. Because the label files are 8-bit grayscale images in 'P' (palette) mode, the augmentation has to be done by transforming them as numpy arrays; the details are below. But first, a problem I ran into while running the training code.
The error in detail
using 78 images for training, 24 images for validation.
0%| | 0/78 [00:00, ?it/s]
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106:cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [1,0,0], thread: [869,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [1,0,0], thread: [870,0,0] Assertion `t >= 0 && t < n_classes` failed.
.......
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [511,0,0] Assertion `t >= 0 && t < n_classes` failed.
THCudaCheck FAIL file=/pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu line=134 error=710 : device-side assert triggered
0%| | 0/78 [00:00, ?it/s]
Traceback (most recent call last):
File "train.py", line 23, in <module>
t_kernal(base_de_dir, 'ce', num_classes, num_workers, model, 'class.json', lr, epoidx, num_epochs, ga, al)
File "/root/BDD/train_kernal.py", line 142, in t_kernal
loss = criterion(outputs, masks.long())
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 961, in forward
return F.cross_entropy(input, target, weight=self.weight,
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2468, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2266, in nll_loss
ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:134
Problem analysis:
I searched CSDN and found many posts with the error message `block: [0,0,0], thread: [511,0,0] Assertion `t >= 0 && t < n_classes` failed`. The consensus: the class values in the dataset labels do not match the preset number of classes. For example, the code sets num_class=4 (valid indices 0..3), but the label masks contain values of 4 or more. So at this point you need to check carefully whether any label pixel values exceed the preset num_class. The checking code:
import os
import numpy as np
import PIL.Image as Image

if __name__ == "__main__":
    # Label directory
    outmask_path = r'D:\py\python_projections\...\Brake_Dataset\SegmentationClass'
    for img_name in os.listdir(outmask_path):
        img_path = os.path.join(outmask_path, img_name)
        img = np.array(Image.open(img_path))
        H, W = img.shape
        i = 0
        for h in range(H):
            for w in range(W):
                if img[h, w] > 1:
                    i += 1
        print(30 * '=')
        print(img_name, 'has', i, 'pixel values greater than 1')
        print(30 * '=')
The output tells you exactly how many pixels in each label image exceed the preset value.
The fix
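The nested loops above can also be written as a single vectorized check; a minimal sketch with numpy (the 3×3 demo array and the helper name `count_out_of_range` are made up for illustration):

```python
import numpy as np

def count_out_of_range(mask, num_class=1):
    """Count label pixels greater than num_class and list the values present."""
    arr = np.asarray(mask)
    return int((arr > num_class).sum()), np.unique(arr).tolist()

# Hypothetical 3x3 mask that contains a stray class value 2
demo = np.array([[0, 1, 1],
                 [1, 2, 0],
                 [0, 0, 1]], dtype=np.uint8)
count, values = count_out_of_range(demo, num_class=1)
print(count, values)  # 1 [0, 1, 2]
```

`np.unique` is particularly handy here: it shows at a glance every class index that actually occurs in a mask, so a stray value stands out immediately.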
The analysis showed that the transformations applied during augmentation had turned some pixel values into 2, exceeding my preset num_class=1. The obvious fix was to change every pixel with value 2 back to 1. But after doing that, the modified images still contained the value 2 and the code still crashed, until I found this article:
https://blog.csdn.net/a2824256/article/details/117924612
A 'P' mode image cannot be saved directly as .jpg, but it can be saved directly as .png. And yes, my dataset images were in .jpg format. That sentence was the hint I needed: I regenerated the original data in .png format, re-ran the augmentation, and the code ran without errors. Problem solved.
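A quick way to see the PNG behaviour for yourself: write a tiny 'P' mode mask to an in-memory PNG and check that the palette indices survive the round trip unchanged (a sketch with Pillow; the 2×2 array and the two-colour palette are made up):

```python
import io
import numpy as np
from PIL import Image

# A tiny label mask: the palette index is the class id
arr = np.array([[0, 1],
                [1, 0]], dtype=np.uint8)
label = Image.fromarray(arr, mode='P')
# VOC-style palette: class 0 black, class 1 dark red
label.putpalette([0, 0, 0, 128, 0, 0])

# PNG stores palette indices losslessly, so the class ids round-trip exactly
buf = io.BytesIO()
label.save(buf, format='PNG')
buf.seek(0)
reloaded = Image.open(buf)
assert reloaded.mode == 'P'
assert (np.array(reloaded) == arr).all()
```

JPEG, by contrast, is a lossy format: even where a palette image has first been converted to RGB to make it saveable, compression can nudge pixel values, which is fatal when the pixel value *is* the class label.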
Image augmentation code for a VOC2008 semantic segmentation dataset
import cv2
import os
import PIL.Image as Image
import numpy as np

def rotate(image, angle, center=None, scale=1.0):
    # Rotate the image by `angle` degrees
    w = image.shape[1]
    h = image.shape[0]
    # If no rotation center is specified, the center of the image is used
    if center is None:
        center = (w / 2, h / 2)
    m = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(image, m, (w, h))
    return rotated
def shifting(image, distance=50, i=0, j=0):
    """
    Translate the image by `distance` pixels.
    :param image: original image array
    :param i: horizontal direction, 0 - right, 1 - left
    :param j: vertical direction, 0 - down, 1 - up
    An affine row [1, 0, 100] shifts the image 100 pixels right; [0, 1, 100] shifts it 100 pixels down.
    """
    paras_wide = [[1, 0, distance], [1, 0, -distance]]
    paras_height = [[0, 1, distance], [0, 1, -distance]]
    rows, cols = image.shape[:2]
    img_shift = np.float32([paras_wide[i], paras_height[j]])
    img_new = cv2.warpAffine(image, img_shift, (cols, rows))
    return img_new
def Image_flip(img, paras):
    """
    :param img: original image array
    :param paras: 0 - vertical flip; 1 - horizontal flip; -1 - both
    """
    if img is None:
        return None
    return cv2.flip(img, paras)
def aug(file_dir, outfile_dir):
    for img_name in os.listdir(file_dir):
        img_path = os.path.join(file_dir, img_name)
        out_dir = os.path.join(outfile_dir, img_name[0:-4])
        img = np.array(Image.open(img_path))
        # 1: rotate 90 degrees
        rotated_90 = rotate(img, 90)
        Image.fromarray(rotated_90).save(out_dir + '_r90.png')
        # 2: rotate 180 degrees
        rotated_180 = rotate(img, 180)
        Image.fromarray(rotated_180).save(out_dir + '_r180.png')
        # 3: rotate 270 degrees
        rotated_270 = rotate(img, 270)
        Image.fromarray(rotated_270).save(out_dir + '_r270.png')
        # 4: rotate 315 degrees
        rotated_315 = rotate(img, 315)
        Image.fromarray(rotated_315).save(out_dir + '_r315.png')
        # 5: shift right-down
        rd = shifting(img)
        Image.fromarray(rd).save(out_dir + '_rd.png')
        # 6: shift right-up
        ru = shifting(img, j=1)
        Image.fromarray(ru).save(out_dir + '_ru.png')
        # 7: shift left-down
        ld = shifting(img, i=1)
        Image.fromarray(ld).save(out_dir + '_ld.png')
        # 8: shift left-up
        lu = shifting(img, i=1, j=1)
        Image.fromarray(lu).save(out_dir + '_lu.png')
        # 9: vertical flip
        lf = Image_flip(img, 0)
        Image.fromarray(lf).save(out_dir + '_lf.png')
        # 10: horizontal flip
        hf = Image_flip(img, 1)
        Image.fromarray(hf).save(out_dir + '_hf.png')
if __name__ == "__main__":
    # Original image paths
    pngs_path = r'D:\py\python_projections\...\brake_data\JPEGImages'
    mask_path = r'D:\py\python_projections\...\brake_data\SegmentationClass'
    # Output paths for the augmented images
    outjpgs_path = r'D:\py\python_projections\...\Brake_Dataset\JPEGImages'
    outmask_path = r'D:\py\python_projections\...\Brake_Dataset\SegmentationClass'
    # Augment the original images
    aug(pngs_path, outjpgs_path)
    print("pngs_aug done")
    # Augment the label masks
    aug(mask_path, outmask_path)
    print("mask_aug done")
    print('done!')
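As a side note, the affine rows [1, 0, tx] and [0, 1, ty] passed to warpAffine simply move every pixel by (tx, ty) and fill the exposed border with 0, the background class. A numpy-only sketch of the same behaviour (`shift_mask` is a hypothetical helper, not part of the script above):

```python
import numpy as np

def shift_mask(mask, dx, dy):
    """Translate a 2D mask by (dx, dy); borders shifted in are filled with class 0."""
    out = np.zeros_like(mask)
    h, w = mask.shape
    # Destination and source windows, clipped to the image bounds
    ys, xs = slice(max(dy, 0), min(h + dy, h)), slice(max(dx, 0), min(w + dx, w))
    src_ys, src_xs = slice(max(-dy, 0), min(h - dy, h)), slice(max(-dx, 0), min(w - dx, w))
    out[ys, xs] = mask[src_ys, src_xs]
    return out

m = np.arange(9, dtype=np.uint8).reshape(3, 3)
print(shift_mask(m, 1, 1))  # [[0 0 0] [0 0 1] [0 3 4]]
```

Because the helper only copies existing values and fills with 0, it can never introduce a class index that was not already present.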
The augmentation code above generates ten transformed images from each input (four rotations, four shifts, two flips), and it can be used to augment datasets whose labels are 'P' mode images.
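One detail worth knowing when augmenting label masks: cv2.warpAffine interpolates bilinearly by default, which can blend neighbouring class indices into values that never appeared in the original mask, a plausible source of the stray '2' encountered above. Passing flags=cv2.INTER_NEAREST to warpAffine avoids this, and for multiples of 90 degrees np.rot90 is exact, with no interpolation at all. A numpy-only sketch of the rot90 route (the 3×3 mask is made up):

```python
import numpy as np

mask = np.array([[0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 0]], dtype=np.uint8)

# np.rot90 only reorders pixels, so no new class values can ever appear
r90 = np.rot90(mask, k=1)    # 90 degrees counter-clockwise
r180 = np.rot90(mask, k=2)
assert set(np.unique(r90)) == set(np.unique(mask))
assert set(np.unique(r180)) == set(np.unique(mask))
```

For arbitrary angles such as the 315-degree rotation above, nearest-neighbour interpolation is the safe choice for masks, while the RGB originals can keep the smoother bilinear default.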