Here is a piece of code for parallel computing in python:
from concurrent.futures import ProcessPoolExecutor as PE
max_Worker = 10
def data_processing(index, file_path):
# Loading the data from file path
# Do some processing
# Export results in pickle, then there is no return
with PE(max_workers = max_Worker) as pe:
for Index, File_Path in enumerate(file_List):
pe.submit(data_processing, Index, File_Path)
The parallel solution is working well in my local PC, however, it freezes on a google virtual machine (vm) instance. Here is the information for the vm instance:
Machine type: n1-standard-8 (8 vCPUs, 30 GB memory)
CPU platform: Intel Haswell
OS: ubuntu-18.04
My guess is that the problem is related to the virtual CPUs, but not sure.
Any comments about your experiences is appreciated.