I've recently been augmenting a semantic-segmentation dataset in VOC2008 format. Because the label files are 8-bit palette ('P' mode) grayscale images, the augmentation has to be applied to them directly as NumPy arrays; the details are below. But first, a problem I hit while running the training code.

The error in detail

```
using 78 images for training, 24 images for validation.
  0%|          | 0/78 [00:00<?, ?it/s]
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [1,0,0], thread: [869,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [1,0,0], thread: [870,0,0] Assertion `t >= 0 && t < n_classes` failed.
......
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [511,0,0] Assertion `t >= 0 && t < n_classes` failed.
THCudaCheck FAIL file=/pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu line=134 error=710 device-side assert triggered
  0%|          | 0/78 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 23, in <module>
    t_kernal(base_de_dir, 'ce', num_classes, num_workers, model, 'class.json', lr, epoidx, num_epochs, ga, al)
  File "/root/BDD/train_kernal.py", line 142, in t_kernal
    loss = criterion(outputs, masks.long())
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 961, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2468, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2266, in nll_loss
    ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:134
```
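The same bounds check runs on CPU too, where PyTorch raises a readable Python error instead of a device-side assert. A minimal sketch (toy shapes, not the training code above) that triggers the same condition:

```python
import torch
import torch.nn.functional as F

num_classes = 2
logits = torch.randn(1, num_classes, 4, 4)       # model output: (N, C, H, W)
target = torch.zeros(1, 4, 4, dtype=torch.long)  # valid ids: 0 .. num_classes-1

F.cross_entropy(logits, target)                  # works fine

target[0, 0, 0] = 2                              # id == num_classes: out of range
try:
    F.cross_entropy(logits, target)
except (IndexError, RuntimeError) as err:
    print('out-of-range label rejected:', err)   # CPU gives a clear message
```

Running the failing forward pass once on CPU is a quick way to confirm that bad label values, not the model, are the culprit.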

Problem analysis

I searched CSDN for articles mentioning the block: [0,0,0], thread: [511,0,0] Assertion `t >= 0 && t < n_classes` failed message. The consensus: the class ids in the dataset labels do not match the configured number of classes. For example, the code sets num_class=4, but a label contains ids of 4 or more, while valid ids only run from 0 to num_class-1. So the first step is to check carefully whether any label pixel value exceeds the preset num_class. The checking code I used:

```python
import os

import numpy as np
import PIL.Image as Image

if __name__ == "__main__":
    # Path to the label masks
    outmask_path = r'D:\py\python_projections\...\Brake_Dataset\SegmentationClass'
    for img_name in os.listdir(outmask_path):
        img_path = os.path.join(outmask_path, img_name)
        img = np.array(Image.open(img_path))  # P-mode mask -> 2-D index array
        H, W = img.shape
        i = 0
        for h in range(H):
            for w in range(W):
                if img[h, w] > 1:
                    i += 1
        print(30 * '=')
        print(img_name, 'has', i, 'pixels with a value greater than 1')
        print(30 * '=')
```

The output shows, for each mask, exactly how many pixel values exceed the preset limit.
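For larger datasets the per-pixel Python loop gets slow; np.unique reports every label value and its count in one vectorized call. A sketch, where the small array stands in for a mask loaded with np.array(Image.open(img_path)):

```python
import numpy as np

# Stand-in for a loaded P-mode mask.
mask = np.array([[0, 0, 1],
                 [1, 2, 1],
                 [0, 1, 2]], dtype=np.uint8)

# Histogram of label values in the mask.
values, counts = np.unique(mask, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))  # {0: 3, 1: 4, 2: 2}

num_classes = 2  # valid ids are 0 .. num_classes - 1
bad = int(counts[values >= num_classes].sum())
print(bad, 'pixels out of range')  # 2 pixels out of range
```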

Fixing the problem

The analysis showed that the transforms applied during augmentation had turned some pixel values into 2, beyond my preset num_class=1. The obvious fix is to set every pixel with value 2 back to 1. But after doing exactly that, the saved images still contained the value 2 and the code still crashed, until I came across this article:

https://blog.csdn.net/a2824256/article/details/117924612

A P-mode image cannot be saved directly as .jpg, but it can be saved directly as .png. And indeed, this dataset's masks were .jpg files, whose lossy compression does not preserve label values exactly. That sentence was the hint I needed: I regenerated the original masks in .png format, re-ran the augmentation, and training completed without errors. Problem solved.
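The two halves of the fix can be demonstrated in a few lines: clamp the stray class id back into range, then save as PNG, which is lossless and supports palette images, so the indices round-trip exactly (JPEG would either refuse mode 'P' or perturb the values). A sketch with a synthetic mask, not the dataset itself:

```python
import io

import numpy as np
from PIL import Image

# A fake P-mode mask where augmentation produced a stray class id 2.
arr = np.zeros((32, 32), dtype=np.uint8)
arr[8:24, 8:24] = 1
arr[0, 0] = 2

# Clamp the out-of-range value back to a valid class id.
arr[arr > 1] = 1

# Save as PNG and reload: the palette indices survive unchanged.
img = Image.fromarray(arr, mode='P')
buf = io.BytesIO()
img.save(buf, format='PNG')
buf.seek(0)
restored = np.array(Image.open(buf))
print(np.array_equal(restored, arr))  # True
print(int(restored.max()))            # 1
```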

Augmentation code for the VOC2008 semantic-segmentation dataset

```python
import os.path

import cv2
import numpy as np
import PIL.Image as Image


def rotate(image, angle, center=None, scale=1.0):
    """Rotate the image by `angle` degrees around `center`."""
    h, w = image.shape[:2]
    # If no rotation center is specified, use the center of the image.
    if center is None:
        center = (w / 2, h / 2)
    m = cv2.getRotationMatrix2D(center, angle, scale)
    # Note: warpAffine interpolates bilinearly by default; when rotating label
    # masks by angles that are not multiples of 90 degrees, consider
    # flags=cv2.INTER_NEAREST so that class ids are not blended.
    rotated = cv2.warpAffine(image, m, (w, h))
    return rotated


def shifting(image, distance=50, i=0, j=0):
    """
    Translate the image by `distance` pixels.
    :param i: horizontal direction, 0 - right, 1 - left
    :param j: vertical direction, 0 - down, 1 - up
    (e.g. [1, 0, 100] shifts 100 px right; [0, 1, 100] shifts 100 px down)
    """
    paras_wide = [[1, 0, distance], [1, 0, -distance]]
    paras_height = [[0, 1, distance], [0, 1, -distance]]
    rows, cols = image.shape[:2]
    img_shift = np.float32([paras_wide[i], paras_height[j]])
    img_new = cv2.warpAffine(image, img_shift, (cols, rows))
    return img_new


def Image_flip(img, paras):
    """
    Flip the image.
    :param paras: 0 - vertical; 1 - horizontal; -1 - both
    """
    if img is None:
        return None
    return cv2.flip(img, paras)


def aug(file_dir, outfile_dir):
    for img_name in os.listdir(file_dir):
        img_path = os.path.join(file_dir, img_name)
        out_dir = os.path.join(outfile_dir, img_name[0:-4])
        img = np.array(Image.open(img_path))

        # 1: rotate 90 degrees
        Image.fromarray(rotate(img, 90)).save(out_dir + '_r90.png')
        # 2: rotate 180 degrees
        Image.fromarray(rotate(img, 180)).save(out_dir + '_r180.png')
        # 3: rotate 270 degrees
        Image.fromarray(rotate(img, 270)).save(out_dir + '_r270.png')
        # 4: rotate 315 degrees
        Image.fromarray(rotate(img, 315)).save(out_dir + '_r315.png')
        # 5: shift right-down
        Image.fromarray(shifting(img)).save(out_dir + '_rd.png')
        # 6: shift right-up
        Image.fromarray(shifting(img, j=1)).save(out_dir + '_ru.png')
        # 7: shift left-down
        Image.fromarray(shifting(img, i=1)).save(out_dir + '_ld.png')
        # 8: shift left-up
        Image.fromarray(shifting(img, i=1, j=1)).save(out_dir + '_lu.png')
        # 9: vertical flip
        Image.fromarray(Image_flip(img, 0)).save(out_dir + '_lf.png')
        # 10: horizontal flip
        Image.fromarray(Image_flip(img, 1)).save(out_dir + '_hf.png')


if __name__ == "__main__":
    # Paths to the original images and masks
    pngs_path = r'D:\py\python_projections\...\brake_data\JPEGImages'
    mask_path = r'D:\py\python_projections\...\brake_data\SegmentationClass'
    # Output paths for the augmented images and masks
    outjpgs_path = r'D:\py\python_projections\...\Brake_Dataset\JPEGImages'
    outmask_path = r'D:\py\python_projections\...\Brake_Dataset\SegmentationClass'

    # Augment the original images
    aug(pngs_path, outjpgs_path)
    print("pngs_aug done")
    # Augment the label masks
    aug(mask_path, outmask_path)
    print("mask_aug done")
    print('done!')
```

The augmentation code above turns each image into ten augmented copies (four rotations, four shifts, two flips), roughly a tenfold expansion of the dataset, and it works for datasets with P-mode labels.
