I have created a pool with Python's multiprocessing module and would like to change the number of processes the pool is running, or add to them. Is this possible? I have tried something like this (a simplified version of my code):
from multiprocessing import Pool

class foo:
    def __init__(self):
        self.pool = Pool()

    def bar(self, x):
        self.pool.processes = x
        return self.pool.map(somefunction, list_of_args)
It seems to work and achieves the result I wanted in the end (splitting the work between multiple processes), but I am not sure whether this is the best way to do it, or why it works.
I don't think this actually works:
import multiprocessing
import time

def fn(x):
    print("running for", x)
    time.sleep(5)

if __name__ == "__main__":
    pool = multiprocessing.Pool()
    pool.processes = 2
    # runs with number of cores available (8 on my machine), not 2
    pool.map(fn, range(10))
    pool.processes = 10
    # still runs with number of cores available, not 10
    pool.map(fn, range(10))
multiprocessing.Pool stores the number of processes in a private variable (i.e. Pool._processes), which is set when the Pool is instantiated. See the source code.
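The reason this appears to work is that the number of processes defaults to the number of cores on your machine unless you specify a different number when creating the pool. Assigning to pool.processes only creates a new attribute that the pool never reads, so the work is still split across the default set of workers.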
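One way to check this for yourself is the small sketch below. It creates a pool with two workers, assigns to pool.processes, and then compares that value with the private Pool._processes attribute. The function name inspect_pool_size is made up for this example, and _processes is read purely for inspection; it is a CPython implementation detail, not a public API.

```python
import multiprocessing


def inspect_pool_size():
    """Return (real worker count, value of the ad-hoc attribute)."""
    pool = multiprocessing.Pool(processes=2)
    try:
        # This creates a brand-new attribute; the pool never looks at it.
        pool.processes = 10
        # _processes is private and is read here for inspection only.
        return pool._processes, pool.processes
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(inspect_pool_size())
```

The first value stays at the size given to the constructor, while the second is just whatever was assigned.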
I'm not sure why you'd want to change the number of processes available -- maybe you can explain this in more detail. It's pretty easy to create a new pool though whenever you want (presumably after other pools have finished running).
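As a rough sketch of the "just create a new pool" approach, assuming it is acceptable to pay the worker start-up cost on every call: the helper below builds a fresh pool of the requested size each time. The name map_with_workers and its parameters are invented for this example, and math.sqrt merely stands in for the real work function.

```python
import math
import multiprocessing


def map_with_workers(func, args, workers):
    """Run func over args in a fresh pool with `workers` processes."""
    # The with-block terminates the pool on exit; map() has already
    # returned every result by then, so nothing is lost.
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(func, args)


if __name__ == "__main__":
    # Same work, two different pool sizes.
    print(map_with_workers(math.sqrt, [1, 4, 9, 16], workers=2))
    print(map_with_workers(math.sqrt, [1, 4, 9, 16], workers=4))
```

This stays entirely within the public API, at the cost of not reusing workers between calls.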
You can, by using the private variable _processes and the private method _repopulate_pool. But I wouldn't recommend relying on private variables and methods, since they can change between Python versions.
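import multiprocessing

def start_process():
    # Assumed initializer (not shown in the original): prints the worker's name.
    print("Starting", multiprocessing.current_process().name)

pool = multiprocessing.Pool(processes=1, initializer=start_process)
# > Starting ForkPoolWorker-35
pool._processes = 3
pool._repopulate_pool()
# > Starting ForkPoolWorker-36
# > Starting ForkPoolWorker-37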