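While trying to speed up image loading by giving the PyTorch DataLoader several worker subprocesses (num_workers > 0), I ran into the following process-related error: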
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

    To fix this issue, refer to the "Safe importing of main module"
    section in https://docs.python.org/3/library/multiprocessing.html
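Cause: I was running on Windows, which creates child processes differently from Linux. Linux has traditionally forked, so a child starts out as a copy of the parent. Windows has no fork; Python launches a fresh interpreter there (the "spawn" start method), and that interpreter re-imports the main script, which executes everything at the script's top level again. If the code that creates child processes sits at the top level, each child tries to start children of its own while it is still being set up, and that is what the error reports. So all code that creates child processes must go inside an if __name__ == '__main__' block.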
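Putting only the process-creating code inside if __name__ == '__main__' is not enough, though. I found that any code left outside the guard that is not a class or function definition is executed again once per worker, so the number of repeats equals num_workers. The cause is the same: Windows creates each worker by re-running the main script.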
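The re-execution can be observed without starting any workers. The sketch below is only a simulation under stated assumptions: it writes a tiny hypothetical script (train_demo.py, held in SCRIPT) to a temporary directory and runs it twice with runpy.run_path, once under the name "__main__" as the original process would, and once under "__mp_main__", the name a spawned child uses when it re-imports the main script. The helper run_as is made up for illustration.

```python
import os
import runpy
import tempfile
import textwrap

# Hypothetical script: one unguarded top-level statement and one
# statement protected by the __main__ guard.
SCRIPT = textwrap.dedent("""
    executed = ["top-level"]            # unguarded: runs on every import

    if __name__ == "__main__":
        executed.append("guarded")      # runs only in the original process
""")


def run_as(run_name):
    """Execute SCRIPT under the given module name and report what ran."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train_demo.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(SCRIPT)
        namespace = runpy.run_path(path, run_name=run_name)
    return namespace["executed"]


if __name__ == "__main__":
    # The original process sees both statements run.
    print("parent:", run_as("__main__"))
    # A simulated spawned child still runs the unguarded statement.
    print("child: ", run_as("__mp_main__"))
```

With several workers, the unguarded part would run once in every one of them, which matches the repeated output I saw.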
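Summary: to use multiple processes on Windows, the code that creates them must sit under the protection of if __name__ == '__main__'. To keep the rest of the main script from being executed repeatedly, everything in it other than class and function definitions should go inside that block as well.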
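A minimal sketch of that layout follows. It uses the standard library's multiprocessing.Pool as a stand-in for the DataLoader so that it runs without PyTorch installed; the structure should carry over, since DataLoader workers are started through the same mechanism. The names load_item and main are placeholders, not part of any library.

```python
import multiprocessing as mp


def load_item(index):
    """Stand-in for per-sample work such as decoding one image."""
    return index * index


def main():
    # Everything that is not a definition lives here: configuration,
    # dataset setup, and the code that starts worker processes.
    # With PyTorch, DataLoader(dataset, num_workers=...) would be
    # created and iterated at this point instead of the Pool.
    num_workers = 2
    with mp.Pool(processes=num_workers) as pool:
        results = pool.map(load_item, range(5))
    print("loaded:", results)
    return results


if __name__ == "__main__":
    # Only the original process enters this block; a re-imported
    # copy of the script in a worker sees the definitions and stops.
    main()
```

Keeping the body in a main() function rather than directly under the guard also avoids accidental module-level globals, but either form satisfies the rule above.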