Caveats for multi-process image loading with PyTorch on Linux — notes on using DataLoader with multiple worker processes
Original source:
PEP 703 – Making the Global Interpreter Lock Optional in CPython
Relevant excerpt:
The GIL Affects Python Library Usability
The GIL is a CPython implementation detail that limits multithreaded parallelism, so it might seem unintuitive to think of it as a usability issue. However, library authors frequently care a great deal about performance and will design APIs that support working around the GIL. These workarounds frequently lead to APIs that are more difficult to use. Consequently, users of these APIs may experience the GIL as a usability issue and not just a performance issue.
For example, PyTorch exposes a multiprocessing-based API called DataLoader for building data input pipelines. It uses fork() on Linux because it is generally faster and uses less memory than spawn(), but this leads to additional challenges for users: creating a DataLoader after accessing a GPU can lead to confusing CUDA errors. Accessing GPUs within a DataLoader worker quickly leads to out-of-memory errors because processes do not share CUDA contexts (unlike threads within a process).
===========================================
PyTorch's API for reading images with multiple processes is DataLoader, which is built on Python's multiprocessing module. On Linux it starts workers with fork() rather than spawn(), because fork() is faster and uses less memory. But fork() copies the parent process's address space, so the multi-process DataLoader must be set up as early as possible in the code: if it is created after any CUDA call, every forked worker inherits a copy of the parent's CUDA context into its own memory. This wastes a large amount of memory, behaves like a memory leak, and can even make the program fail with out-of-memory errors.
Key point:
In PyTorch, set up the DataLoader's multi-process workers as early in the code as possible, before any CUDA calls.
===========================================
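The inheritance behavior behind this pitfall can be sketched with the standard library alone. In the snippet below, the STATE dict is a hypothetical stand-in for a CUDA context (a real context is created implicitly by a process's first CUDA call); it shows that any state the parent creates before fork() is duplicated into every forked worker. This is a minimal sketch of the mechanism, not PyTorch's actual implementation.

```python
import multiprocessing as mp

# Hypothetical stand-in for a per-process resource such as a CUDA context.
# (A real CUDA context is created implicitly by a process's first CUDA call.)
STATE = {"cuda_context": None}

def _report(queue):
    # Runs inside the worker process: report which state the worker sees.
    queue.put(STATE["cuda_context"])

def worker_sees(start_method):
    """Start one worker with the given start method; return the state it saw."""
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=_report, args=(queue,))
    proc.start()
    seen = queue.get()
    proc.join()
    return seen

if __name__ == "__main__":
    # The parent "initializes CUDA" before creating its workers.
    STATE["cuda_context"] = "context-created-in-parent"
    # fork() clones the parent's address space, so the worker inherits the
    # context -- exactly what makes a late-created DataLoader problematic.
    print("fork worker sees:", worker_sees("fork"))  # -> context-created-in-parent
```

With the "spawn" start method the worker would instead start from a fresh interpreter and see None, at the cost of slower worker startup and higher memory use; that trade-off is why fork() is the default on Linux, and why the creation order of the DataLoader relative to CUDA calls matters.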