
Deep Learning (5): Running, Testing, and Predicting with FastFCN


Table of Contents

0 Preface

1 Environment Setup

1.1 Installing Python packages

1.2 Downloading detail-api

1.3 Running prepare_pcontext.py

1.4 Running prepare_ade20k.py

2 Training the Model

3 Testing the Model

3.1 Downloading the models

3.2 Testing encnet_jpu_res50_pcontext.pth.tar

3.2.1 test [single-scale] (pixAcc=0.7898, mIoU=0.5105)

3.2.2 test [multi-scale] (pixAcc=0.7964, mIoU=0.5210)

3.2.3 predict [single-scale]

4 Errors and Fixes

4.1 detail-api compilation error

4.2 Missing model files

4.3 AttributeError: 'NoneType' object has no attribute 'run_slave'

References


0 Preface

        Full title: FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation, from a team at the Shenyang Institute of Automation.

        Paper: arXiv:1903.11816

        GitHub: https://github.com/wuhuikai/FastFCN

        My machine: RTX 3070, CUDA 11.0, torch 1.7.1+cu110, Python 3.7

        Next post in this FastFCN series: Deep Learning (8): Running, Testing, and Predicting with FastFCN, Part 2 (biter0088's blog on CSDN)

1 Environment Setup

        The environment the authors tested against:

PyTorch >= 1.1.0 (Note: the code was tested with python=3.6, cuda=9.0)

# master branch; I cloned the March 2022 version, and the author may have changed it since
git clone https://github.com/wuhuikai/FastFCN.git
cd FastFCN

1.1 Installing Python packages

        Create a requirements.txt file with the entries below and install them. (Note that python3-dev and libevent-dev are system libraries normally installed with apt rather than pip, so pip may fail on those entries.)

        Note: activate the Python environment first: source activate yolov5py37

nose
tqdm
scipy
cython
requests
scikit-image
python3-dev
libevent-dev
cPython

pip install -r requirements.txt

1.2 Downloading detail-api

        Clone the detail-api repository into the FastFCN directory:

git clone https://github.com/zhanghang1989/detail-api

        Then edit /xx/FastFCN/scripts/prepare_pcontext.py to comment out the clone step, like this:

def install_pcontext_api():
    #repo_url = ""
    #os.system("git clone " + repo_url)
    os.system("cd detail-api/PythonAPI/ && python setup.py install")
    shutil.rmtree('detail-api')
    try:
        import detail
    except Exception:
        print("Installing PASCAL Context API failed, please install it manually %s"%(repo_url))

        Note: when prepare_pcontext.py runs, detail-api is installed and the detail-api folder cloned above is then deleted (by the shutil.rmtree call).

1.3 Running prepare_pcontext.py

        The script lives at /xx/FastFCN/scripts/prepare_pcontext.py and prepares the VOC2010 dataset:

python -m scripts.prepare_pcontext

        It downloads the VOC2010 data into the following layout:

#VOC2010 dataset
#official site: .html
└── VOCdevkit                  # root directory
    └── VOC2010                # datasets are split by year; only 2010 is downloaded here (2007 and other years also exist)
        ├── Annotations        # XML files, one per image in JPEGImages, describing the image contents
        ├── ImageSets          # txt files; each line holds an image name, suffixed with ±1 to mark positive/negative samples
        │   ├── Action
        │   ├── Layout
        │   ├── Main
        │   └── Segmentation
        ├── JPEGImages         # the source images
        ├── SegmentationClass  # semantic segmentation masks: the class of every pixel
        └── SegmentationObject # instance segmentation masks: which object each pixel belongs to

        Once the download finishes, detail-api is compiled and installed, and the folder cloned in 1.2 is deleted. So once the terminal reports that detail-api installed successfully, the following two lines no longer do anything and can be commented out:

    os.system("cd detail-api/PythonAPI/ && python setup.py install")
    shutil.rmtree('detail-api')

        Note: every run of prepare_pcontext.py re-downloads the entire VOC2010 dataset, which is effectively a bug (normally you should not run it again once the packages are installed and the data is in place). If the first download succeeded but some later error forces you to re-run prepare_pcontext.py, comment out the download lines as below:

if __name__ == '__main__':
    args = parse_args()
    #mkdir(os.path.expanduser('~/.encoding/data'))
    #if args.download_dir is not None:
    #    if os.path.isdir(_TARGET_DIR):
    #        os.remove(_TARGET_DIR)
    #    # make symlink
    #    os.symlink(args.download_dir, _TARGET_DIR)
    #else:
    #    download_ade(_TARGET_DIR, overwrite=False)
    install_pcontext_api()

1.4 Running prepare_ade20k.py

        The script lives at /xxx/FastFCN/scripts/prepare_ade20k.py and prepares the ADEChallengeData2016 dataset.

python -m scripts.prepare_ade20k
(yolov5py37) meng@meng:~/deeplearning/FastFCN$ python -m scripts.prepare_ade20k
Downloading /home/meng/.encoding/data/downloads/ADEChallengeData2016.zip from .zip...
944710KB [05:23, 2923.61KB/s]                                                                                                                                                                                                                                                      
Downloading /home/meng/.encoding/data/downloads/release_test.zip from .zip...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 206856/206856 [04:29<00:00, 766.68KB/s]
(yolov5py37) meng@meng:~/deeplearning/FastFCN$ 

2 Training the Model

        Before training, apply the fixes described in sections 4.2 and 4.3.

        Reference: FastFCN/encnet_res50_pcontext.sh at master · wuhuikai/FastFCN · GitHub

        The reference command for training the encnet_res50 model is:

#train
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.train --dataset pcontext \
    --model encnet --jpu [JPU|JPU_X] --aux --se-loss \
    --backbone resnet50 --checkname encnet_res50_pcontext

        What I ran:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.train --dataset pcontext \
    --model encnet --jpu JPU --aux --se-loss \
    --backbone resnet50 --checkname encnet_res50_pcontext

        Training starts, but fails with RuntimeError: CUDA out of memory, so I set training aside for now. A possible workaround is sketched below.
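        Since only a single RTX 3070 is available here, one plausible (untested) mitigation is to shrink the batch: the Namespace dump in section 3.2.3 shows a batch_size field defaulting to 16, so train.py presumably accepts a --batch-size flag (the flag spelling is my assumption from that field name). Combined with the single-GPU fix from section 4.3, that would look like:

CUDA_VISIBLE_DEVICES=0 python -m experiments.segmentation.train --dataset pcontext \
    --model encnet --jpu JPU --aux --se-loss \
    --backbone resnet50 --checkname encnet_res50_pcontext --batch-size 4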

3 Testing the Model

3.1 Downloading the models

        Download the author's pretrained model files. (The accompanying bash files, shown on the right of the original screenshot, contain the commands for training, prediction, and FPS measurement.)

        Store the downloaded files where the --resume paths in section 3.2 expect them:
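        The commands in section 3.2 resume from /home/meng/.encoding/models/, so placing the checkpoint there looks like this (the download location ~/Downloads is just an example):

mkdir -p ~/.encoding/models
mv ~/Downloads/encnet_jpu_res50_pcontext.pth.tar ~/.encoding/models/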

3.2 Testing encnet_jpu_res50_pcontext.pth.tar

3.2.1 test [single-scale] (pixAcc=0.7898, mIoU=0.5105)

#github reference command
#test [single-scale]
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test --dataset pcontext \
    --model encnet --jpu [JPU|JPU_X] --aux --se-loss \
    --backbone resnet50 --resume {MODEL} --split val --mode testval

        What I ran:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test --dataset pcontext \
    --model encnet --jpu JPU --aux --se-loss \
    --backbone resnet50 --resume /home/meng/.encoding/models/encnet_jpu_res50_pcontext.pth.tar --split val --mode testval

        Pixel accuracy pixAcc=0.7898, mean intersection-over-union mIoU=0.5105; the test took about 10 minutes.
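        For reference, the two reported metrics have the standard definitions sketched below; this is a minimal illustration of what the numbers mean, not the repository's own metric code:

import numpy as np

def pixacc_miou(conf):
    """Standard segmentation metrics from a confusion matrix.

    conf[i, j] = number of pixels with ground-truth class i predicted as class j.
    """
    conf = conf.astype(np.float64)
    tp = np.diag(conf)                          # correctly classified pixels, per class
    pixacc = tp.sum() / conf.sum()              # fraction of all pixels labelled correctly
    union = conf.sum(1) + conf.sum(0) - tp      # GT pixels + predicted pixels - intersection
    with np.errstate(invalid='ignore'):         # classes absent from GT and prediction give 0/0
        iou = tp / union
    return pixacc, np.nanmean(iou)              # mIoU averages IoU over the valid classes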

3.2.2 test [multi-scale] (pixAcc=0.7964, mIoU=0.5210)

#github reference command
#test [multi-scale]
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test --dataset pcontext \
    --model encnet --jpu [JPU|JPU_X] --aux --se-loss \
    --backbone resnet50 --resume {MODEL} --split val --mode testval --ms

        What I ran:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test --dataset pcontext \
    --model encnet --jpu JPU --aux --se-loss \
    --backbone resnet50 --resume /home/meng/.encoding/models/encnet_jpu_res50_pcontext.pth.tar --split val --mode testval --ms

        This test took 1 hour 19 minutes: pixel accuracy pixAcc=0.7964, mean IoU=0.5210.

        test [multi-scale] differs from test [single-scale] only by the --ms flag. --ms first changes scales in the test script; the scales value is then passed into base.py, where the following computation runs for each scale:

for scale in self.scales:
    long_size = int(math.ceil(self.base_size * scale))  # math.ceil(): smallest integer >= the float
    if h > w:
        height = long_size
        width = int(1.0 * w * long_size / h + 0.5)  # new height/width keep the original h:w aspect ratio
        short_size = width
    else:
        width = long_size
        height = int(1.0 * h * long_size / w + 0.5)
        short_size = height
    # resize image to current size
    cur_img = resize_image(image, height, width, **self.module._up_kwargs)
    if long_size <= crop_size:  # the if/else guarantees pad_img is at least crop_size in both dimensions
        pad_img = pad_image(cur_img, self.module.mean,
                            self.module.std, crop_size)
        outputs = module_inference(self.module, pad_img, self.flip)
        outputs = crop_image(outputs, 0, height, 0, width)
    else:
        if short_size < crop_size:
            # pad if needed
            pad_img = pad_image(cur_img, self.module.mean,
                                self.module.std, crop_size)
        else:
            pad_img = cur_img
        _, _, ph, pw = pad_img.size()
        assert(ph >= height and pw >= width)
        # grid forward and normalize
        h_grids = int(math.ceil(1.0 * (ph-crop_size)/stride)) + 1
        w_grids = int(math.ceil(1.0 * (pw-crop_size)/stride)) + 1
        with torch.cuda.device_of(image):
            outputs = image.new().resize_(batch,self.nclass,ph,pw).zero_().cuda()
            count_norm = image.new().resize_(batch,1,ph,pw).zero_().cuda()
        # grid evaluation
        for idh in range(h_grids):
            for idw in range(w_grids):
                h0 = idh * stride
                w0 = idw * stride
                h1 = min(h0 + crop_size, ph)
                w1 = min(w0 + crop_size, pw)
                crop_img = crop_image(pad_img, h0, h1, w0, w1)
                # pad if needed
                pad_crop_img = pad_image(crop_img, self.module.mean,
                                         self.module.std, crop_size)
                output = module_inference(self.module, pad_crop_img, self.flip)
                outputs[:,:,h0:h1,w0:w1] += crop_image(output,
                                                       0, h1-h0, 0, w1-w0)
                count_norm[:,:,h0:h1,w0:w1] += 1
        assert((count_norm==0).sum()==0)
        outputs = outputs / count_norm
        outputs = outputs[:,:,:height,:width]
    score = resize_image(outputs, h, w, **self.module._up_kwargs)
    scores += score
return scores
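        To make the sliding-window arithmetic concrete, here is one scale worked through with the defaults printed in section 3.2.3 (base_size=520, crop_size=480); the stride value is my assumption for illustration, not read from the repository:

import math

base_size, crop_size = 520, 480     # defaults from the Namespace dump in section 3.2.3
stride = 320                        # assumed stride, for illustration only
scale = 1.5
long_size = int(math.ceil(base_size * scale))                 # 780: longer side after rescaling
ph = pw = 780                                                 # suppose the padded image is 780x780
h_grids = int(math.ceil(1.0 * (ph - crop_size) / stride)) + 1 # ceil(300/320) + 1 = 2
w_grids = int(math.ceil(1.0 * (pw - crop_size) / stride)) + 1 # 2
# The 780x780 image is thus covered by a 2x2 grid of overlapping 480x480 crops;
# count_norm records how often each pixel was covered, so the summed logits can
# be averaged back to one prediction per pixel.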

        Note that base.py also contains the definition of scores:

with torch.cuda.device_of(image):
    scores = image.new().resize_(batch,self.nclass,h,w).zero_().cuda()

        So the MultiEvalModule wrapper invoked in test.py evaluates each image at multiple scales and accumulates the resized score maps into scores, fusing them into one prediction.
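        As a conceptual sketch (paraphrasing, not the repository's exact code): multi-scale evaluation runs the network once per scale, resizes every score map back to the original resolution, and sums them, so the final argmax fuses evidence from all scales. The scale list below is an example, not the one hard-coded in test.py:

import torch
import torch.nn.functional as F

def multi_scale_scores(model, image, scales=(0.75, 1.0, 1.25, 1.5)):
    """Sum class-score maps over several input scales (illustrative only)."""
    _, _, h, w = image.shape
    scores = 0.0
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s, mode='bilinear',
                               align_corners=False)
        out = model(scaled)                       # (batch, nclass, s*h, s*w)
        scores = scores + F.interpolate(out, size=(h, w), mode='bilinear',
                                        align_corners=False)
    return scores                                 # scores.argmax(dim=1) is the fused prediction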

        For more intuition on multi-scale versus single-scale inference, see the Zhihu answer linked in the references.

3.2.3 predict [single-scale]

#github reference command
#predict [single-scale]
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test --dataset pcontext \
    --model encnet --jpu [JPU|JPU_X] --aux --se-loss \
    --backbone resnet50 --resume {MODEL} --split val --mode test

        What I ran:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test --dataset pcontext \
    --model encnet --jpu JPU --aux --se-loss \
    --backbone resnet50 --resume /home/meng/.encoding/models/encnet_jpu_res50_pcontext.pth.tar --split val --mode test

        The output:

(yolov5py37) meng@meng:~/deeplearning/FastFCN$ CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test --dataset pcontext \
>     --model encnet --jpu JPU --aux --se-loss \
>     --backbone resnet50 --resume /home/meng/.encoding/models/encnet_jpu_res50_pcontext.pth.tar --split val --mode test
Namespace(aux=True, aux_weight=0.2, backbone='resnet50', base_size=520, batch_size=16, checkname='default', crop_size=480, cuda=True, dataset='pcontext', dilated=False, epochs=80, ft=False, jpu='JPU', lateral=False, lr=0.001, lr_scheduler='poly', mode='test', model='encnet', model_zoo=None, momentum=0.9, ms=False, no_cuda=False, no_val=False, resume='/home/meng/.encoding/models/encnet_jpu_res50_pcontext.pth.tar', save_folder='experiments/segmentation/results', se_loss=True, se_weight=0.2, seed=1, split='val', start_epoch=0, test_batch_size=16, train_split='train', weight_decay=0.0001, workers=16)
loading annotations into memory...
JSON root keys:dict_keys(['info', 'images', 'annos_segmentation', 'annos_occlusion', 'annos_boundary', 'categories', 'parts'])
Done (t=3.22s)
creating index...
index created! (t=2.42s)
mask_file: /home/meng/.encoding/data/VOCdevkit/VOC2010/val.pth
=> loaded checkpoint '/home/meng/.encoding/models/encnet_jpu_res50_pcontext.pth.tar' (epoch 79)

        Check the save_folder value printed above to locate the predicted results and compare them with the originals (the source images are in /home/meng/.encoding/data/VOCdevkit/VOC2010/JPEGImages).
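        The Namespace dump shows save_folder='experiments/segmentation/results', so the outputs and the source images can be listed directly:

ls experiments/segmentation/results                         # predicted masks
ls /home/meng/.encoding/data/VOCdevkit/VOC2010/JPEGImages   # source images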

        Take image 2008_000064 as an example.

        Its annotation file, 2008_000064.xml:

<annotation>
    <folder>VOC2010</folder>
    <filename>2008_000064.jpg</filename>
    <source>
        <database>The VOC2008 Database</database>
        <annotation>PASCAL VOC2008</annotation>
        <image>flickr</image>
    </source>
    <size>
        <width>375</width>
        <height>500</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>aeroplane</name>
        <pose>Frontal</pose>
        <truncated>1</truncated>
        <occluded>0</occluded>
        <bndbox>
            <xmin>1</xmin>
            <ymin>152</ymin>
            <xmax>375</xmax>
            <ymax>461</ymax>
        </bndbox>
        <difficult>0</difficult>
    </object>
</annotation>

4 Errors and Fixes

4.1 detail-api compilation error

        error: command 'gcc' failed with exit status 1
Installing PASCAL Context API failed, please install it manually

        The first time I ran prepare_pcontext.py, compiling detail-api failed with the errors below; following the steps in 1.1 and 1.2 resolved the problem.

gcc -pthread -B /home/meng/anaconda3/envs/yolov5py37/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/meng/anaconda3/envs/yolov5py37/lib/python3.7/site-packages/numpy/core/include -I../common -I/home/meng/anaconda3/envs/yolov5py37/include/python3.7m -c detail/_mask.c -o build/temp.linux-x86_64-3.7/detail/_mask.o
In file included from /home/meng/anaconda3/envs/yolov5py37/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1969:0,
                 from /home/meng/anaconda3/envs/yolov5py37/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                 from /home/meng/anaconda3/envs/yolov5py37/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from detail/_mask.c:461:
/home/meng/anaconda3/envs/yolov5py37/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
detail/_mask.c: In function ‘__Pyx_PyCFunction_FastCall’:
detail/_mask.c:12772:13: error: too many arguments to function ‘(PyObject * (*)(PyObject *, PyObject * const*, Py_ssize_t))meth’
    return (*((__Pyx_PyCFunctionFast)meth)) (self, args, nargs, NULL);
detail/_mask.c: In function ‘__Pyx__ExceptionSave’:
detail/_mask.c:14254:21: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
    *type = tstate->exc_type;
detail/_mask.c:14255:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
    *value = tstate->exc_value;
detail/_mask.c:14256:19: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
    *tb = tstate->exc_traceback;
detail/_mask.c: In function ‘__Pyx__ExceptionReset’:
detail/_mask.c:14263:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
    tmp_type = tstate->exc_type;
detail/_mask.c:14264:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
    tmp_value = tstate->exc_value;
detail/_mask.c:14265:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
    tmp_tb = tstate->exc_traceback;
detail/_mask.c:14266:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
    tstate->exc_type = type;
detail/_mask.c:14267:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
    tstate->exc_value = value;
detail/_mask.c:14268:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
    tstate->exc_traceback = tb;
detail/_mask.c: In function ‘__Pyx__GetException’:
detail/_mask.c:14323:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
    tmp_type = tstate->exc_type;
detail/_mask.c:14324:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
    tmp_value = tstate->exc_value;
detail/_mask.c:14325:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
    tmp_tb = tstate->exc_traceback;
detail/_mask.c:14326:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
    tstate->exc_type = local_type;
detail/_mask.c:14327:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
    tstate->exc_value = local_value;
detail/_mask.c:14328:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
    tstate->exc_traceback = local_tb;
error: command 'gcc' failed with exit status 1
Installing PASCAL Context API failed, please install it manually

4.2 Missing model files

        Error: RuntimeError: Failed downloading url .zip

        Opening the link from the error message leads to https://hangzh.s3.amazonaws.com/encoding/models/resnet50-ebb6acbb.zip

        I tried several ways of connecting and could not reach it; most likely the author removed the model file.

        I asked on GitHub (issue #108, linked in the references), and the author provided download links for the three models.

        Put the downloaded files in the models folder described in section 3.1 (~/.encoding/models/).

4.3 AttributeError: 'NoneType' object has no attribute 'run_slave'

Cause (the author's reply on GitHub, quoted verbatim):

The reason is that you're not using multiple GPUs. Change SynBN to regular BN if you want to train on one GPU.

That is, this error appears when training does not use multiple GPUs; to train on a single GPU, replace SyncBN with regular BN:

        (1) Modify line 54 of /FastFCN/experiments/segmentation/train.py

        (2) Modify line 111 of /FastFCN/experiments/segmentation/train.py

        (3) Remove line 132 of /FastFCN/experiments/segmentation/train.py

        (The exact edits were shown as screenshots in the original post; see the sketch below.)
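        As a rough sketch of what those three edits amount to, assuming train.py passes a norm_layer into the model and wraps it in DataParallelModel (the surrounding names here are illustrative, so verify them against your copy of train.py):

import torch.nn as nn

# (1), (2): wherever train.py passes SyncBatchNorm as the normalization layer
# (e.g. norm_layer=SyncBatchNorm), hand it regular batch norm instead:
norm_layer = nn.BatchNorm2d   # was: SyncBatchNorm

# (3): drop the multi-GPU wrapper so the model runs on a single device:
model = model.cuda()          # was: model = DataParallelModel(model).cuda()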

References:

A blogger's roundup of some of the official PyTorch-pretrained ResNets (link lost from the original)

Questions I asked on GitHub:

RuntimeError: Failed downloading url .zip · Issue #108 · wuhuikai/FastFCN · GitHub

Multi-GPU to single-GPU: how to Change SynBN to regular BN? · Issue #12 · wuhuikai/FastFCN · GitHub

Pascal VOC dataset analysis:

A detailed analysis of the Pascal VOC dataset (持久决心's blog on CSDN)

Zhihu, on understanding multi-scale versus single-scale:

How to understand multi scale and single scale in deep learning? (Zhihu)

