I want to add a list of dicts together with the Python multiprocessing module.
Here is a simplified version of my code:
#!/usr/bin/python2.7
# -*- coding: utf-8 -*-
import multiprocessing
import functools
import time

def merge(lock, d1, d2):
    time.sleep(5) # some time-consuming stuff
    with lock:
        for key in d2.keys():
            if d1.has_key(key):
                d1[key] += d2[key]
            else:
                d1[key] = d2[key]

l = [{ x % 10 : x } for x in range(10000)]
lock = multiprocessing.Lock()
d = multiprocessing.Manager().dict()
partial_merge = functools.partial(merge, d1 = d, lock = lock)

pool_size = multiprocessing.cpu_count()
pool = multiprocessing.Pool(processes = pool_size)
pool.map(partial_merge, l)
pool.close()
pool.join()

print d
I get this error when running this script. How shall I resolve this?
RuntimeError: Lock objects should only be shared between processes through inheritance
Is the lock in the merge function needed in this case, or will Python take care of it?
I think what map is supposed to do is map items from one list onto another list, not dump everything in one list into a single object. So is there a more elegant way to do this?
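For context, the RuntimeError is raised because a multiprocessing.Lock cannot be pickled and shipped inside the task arguments (which is what happens when it is bound via functools.partial and passed through pool.map); it has to reach the worker processes by inheritance, for example through the pool initializer. Here is a minimal sketch of that pattern, with illustrative names (init_worker, shared_lock) that are not from the code above:

import multiprocessing

def init_worker(l):
    # Store the inherited lock as a module-level global in each worker.
    global shared_lock
    shared_lock = l

def work(item):
    with shared_lock:
        print(item)  # any critical-section work goes here

if __name__ == '__main__':
    lock = multiprocessing.Lock()
    pool = multiprocessing.Pool(initializer=init_worker, initargs=(lock,))
    pool.map(work, range(4))
    pool.close()
    pool.join()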
The following should run cross-platform (i.e. on Windows, too) in both Python 2 and 3. It uses a process pool initializer to set the manager dict as a global in each child process.
FYI: the number of processes in a Pool defaults to the CPU count, and apply_async is used here instead of map.

import multiprocessing
import time

def merge(d2):
    time.sleep(1) # some time-consuming stuff
    for key in d2.keys():
        if key in d1:
            d1[key] += d2[key]
        else:
            d1[key] = d2[key]

def init(d):
    global d1
    d1 = d

if __name__ == '__main__':
    d1 = multiprocessing.Manager().dict()
    pool = multiprocessing.Pool(initializer=init, initargs=(d1, ))
    l = [{ x % 5 : x } for x in range(10)]
    for item in l:
        pool.apply_async(merge, (item,))
    pool.close()
    pool.join()
    print(l)
    print(d1)
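Running this prints the original list of ten partial dicts followed by the merged manager dict; with the inputs above the merged result works out to {0: 5, 1: 7, 2: 9, 3: 11, 4: 13}.

As for the question about map's semantics: if you would rather keep map's one-input-one-output model instead of mutating shared state, a minimal sketch of a common alternative (my suggestion, not part of the answer above) is to have each worker return a plain dict and merge the results in the parent, e.g. with collections.Counter:

import multiprocessing
from collections import Counter

def make_dict(x):
    # Illustrative worker: builds one partial dict per input.
    return {x % 5: x}

if __name__ == '__main__':
    pool = multiprocessing.Pool()
    partials = pool.map(make_dict, range(10))  # no shared state, no lock needed
    pool.close()
    pool.join()
    # Counter addition sums the values of matching keys across all partial dicts.
    merged = sum((Counter(d) for d in partials), Counter())
    print(dict(merged))  # {0: 5, 1: 7, 2: 9, 3: 11, 4: 13}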