I'm using urllib2 to load files from FTP and HTTP servers.
Some of the servers allow only one connection per IP. The problem is that urllib2 does not close the connection immediately. Consider this example program:
    from urllib2 import urlopen
    from time import sleep

    url = 'ftp://user:pass@host/big_file.ext'

    def load_file(url):
        f = urlopen(url)
        loaded = 0
        while True:
            data = f.read(1024)
            if data == '':
                break
            loaded += len(data)
        f.close()
        #sleep(1)
        print('loaded {0}'.format(loaded))

    load_file(url)
    load_file(url)
The code loads two files (here, the same file twice) from an FTP server that allows only one connection. It prints the following log:
    loaded 463675266
    Traceback (most recent call last):
      File "conection_test.py", line 20, in <module>
        load_file(url)
      File "conection_test.py", line 7, in load_file
        f = urlopen(url)
      File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.6/urllib2.py", line 391, in open
        response = self._open(req, data)
      File "/usr/lib/python2.6/urllib2.py", line 409, in _open
        '_open', req)
      File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.6/urllib2.py", line 1331, in ftp_open
        fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
      File "/usr/lib/python2.6/urllib2.py", line 1352, in connect_ftp
        fw = ftpwrapper(user, passwd, host, port, dirs, timeout)
      File "/usr/lib/python2.6/urllib.py", line 854, in __init__
        self.init()
      File "/usr/lib/python2.6/urllib.py", line 860, in init
        self.ftp.connect(self.host, self.port, self.timeout)
      File "/usr/lib/python2.6/ftplib.py", line 134, in connect
        self.welcome = self.getresp()
      File "/usr/lib/python2.6/ftplib.py", line 216, in getresp
        raise error_temp, resp
    urllib2.URLError: <urlopen error ftp error: 421 There are too many connections from your internet address.>
So the first file is loaded, and the second fails because the first connection has not been closed yet.
But when I call sleep(1) after f.close(), the error does not occur:
    loaded 463675266
    loaded 463675266
Is there any way to force close the connection so that the second download would not fail?
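Since a one-second pause after f.close() is enough to avoid the error, one stopgap is to retry a failed download after a short delay. This does not force the connection closed. The following is only a minimal sketch: retry_on_error is a hypothetical helper (not part of urllib2), and it assumes that catching IOError is acceptable, which covers urllib2.URLError because that class derives from IOError.

```python
import time


def retry_on_error(func, attempts=3, delay=1.0, exceptions=(IOError,)):
    """Call func(); on a listed exception, wait `delay` seconds and retry.

    Hypothetical helper for illustration. If every attempt fails, the
    last exception is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            # Give the server time to notice the old connection is gone.
            time.sleep(delay)
```

With the example program, the second download could then be written as retry_on_error(lambda: load_file(url)). The attempt count and delay are guesses and would need tuning for the server in question.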
The cause is indeed a file descriptor leak. We also found that the problem is much more obvious with Jython than with CPython. A colleague proposed this solution:
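    # Fragment from a method: url, header and self.timeout come from the surrounding code.
    req = urllib2.Request(url, header)
    try:
        fdurl = urllib2.urlopen(req, timeout=self.timeout)
    except urllib2.URLError, e:
        print "urlopen exception", e
    else:
        # Keep a reference to the "real" socket so we can close it explicitly.
        # This relies on private attributes, which may differ between versions.
        realsock = fdurl.fp._sock.fp._sock
        # ... read from fdurl here ...
        realsock.close()
        fdurl.close()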
The fix is ugly, but it does the job: no more "too many open connections" errors.
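For the FTP case, an alternative that avoids private attributes is to skip urllib2 and drive ftplib directly, so that the session is ended explicitly with QUIT. The following is a minimal sketch, not tested against the server from the question: load_file_ftp and its ftp_factory parameter are hypothetical names introduced here, and whether this avoids the 421 error depends on how quickly the server releases the connection slot.

```python
from ftplib import FTP


def load_file_ftp(host, user, passwd, path, ftp_factory=FTP):
    """Download `path` over FTP and return the number of bytes received.

    ftp_factory is a hypothetical hook (defaults to ftplib.FTP) so the
    function can be exercised without a real server.
    """
    loaded = [0]  # a list, so the nested callback can update the counter

    def count(chunk):
        loaded[0] += len(chunk)

    ftp = ftp_factory(host)  # ftplib.FTP(host) connects immediately
    try:
        ftp.login(user, passwd)
        ftp.retrbinary('RETR ' + path, count, 1024)
    finally:
        try:
            ftp.quit()   # send QUIT so the server ends the session
        except Exception:
            ftp.close()  # fall back to closing the socket unilaterally
    return loaded[0]
```

With the URL from the question, the call would look like load_file_ftp('host', 'user', 'pass', 'big_file.ext').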