
        Chrome crashes after several hours when multiprocessing with Selenium through Python

        Date: 2023-05-26
                  This article covers how to handle Chrome crashing after several hours when multiprocessing with Selenium through Python.

                  Problem description

                  This is the error traceback after several hours of scraping:

                  The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.
                  

                  This is my Selenium Python setup:

                  # scrape.py
                  from selenium import webdriver
                  from selenium.common.exceptions import *
                  from selenium.webdriver.common.by import By
                  from selenium.webdriver.support import expected_conditions as EC
                  from selenium.webdriver.support.ui import WebDriverWait
                  from selenium.webdriver.chrome.options import Options

                  def run_scrape(link):
                      chrome_options = Options()
                      chrome_options.add_argument('--no-sandbox')
                      chrome_options.add_argument("--headless")
                      chrome_options.add_argument('--disable-dev-shm-usage')
                      chrome_options.add_argument("--lang=en")
                      chrome_options.add_argument("--start-maximized")
                      chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
                      chrome_options.add_experimental_option('useAutomationExtension', False)
                      chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36")
                      chrome_options.binary_location = "/usr/bin/google-chrome"
                      browser = webdriver.Chrome(executable_path=r'/usr/local/bin/chromedriver', options=chrome_options)
                      browser.get(link)  # the link passed to this worker
                      try:
                          pass  # scrape process
                      except Exception:
                          pass  # other stuff
                      browser.quit()
                  

                  # multiprocess.py
                  import time
                  from multiprocessing import Pool
                  from scrape import run_scrape

                  if __name__ == '__main__':
                      start_time = time.time()
                      links = []  # list of links to be scraped
                      pool = Pool(20)  # 20 worker processes
                      results = pool.map(run_scrape, links)
                      pool.close()
                      pool.join()  # wait for the workers to exit
                      print("Total Time Processed: "+"--- %s seconds ---" % (time.time() - start_time))
                  

                  Chrome, ChromeDriver, and Selenium versions:

                  ChromeDriver 79.0.3945.36 (3582db32b33893869b8c1339e8f4d9ed1816f143-refs/branch-heads/3945@{#614})
                  Google Chrome 79.0.3945.79
                  Selenium Version: 4.0.0a3
                  

                  I'm wondering why Chrome is closing while the other processes keep working?

                  Recommended answer

                  I took your code, modified it a bit to suit my test environment, and here are the execution results:

                  • Code Block:

                  • multiprocess.py:

                  import time
                  from multiprocessing import Pool
                  from multiprocessingPool.scrape import run_scrape
                  
                  if __name__ == '__main__':
                      start_time = time.time()
                      links = ["https://selenium.dev/downloads/", "https://selenium.dev/documentation/en/"] 
                      pool = Pool(2)
                      results = pool.map(run_scrape, links)
                      pool.close()
                      print("Total Time Processed: "+"--- %s seconds ---" % (time.time() - start_time)) 
                  

                  • scrape.py:

                  from selenium import webdriver
                  from selenium.common.exceptions import NoSuchElementException, TimeoutException
                  from selenium.webdriver.common.by import By
                  from selenium.webdriver.chrome.options import Options
                  
                  def run_scrape(link):
                      chrome_options = Options()
                      chrome_options.add_argument('--no-sandbox')
                      chrome_options.add_argument("--headless")
                      chrome_options.add_argument('--disable-dev-shm-usage')
                      chrome_options.add_argument("--lang=en")
                      chrome_options.add_argument("--start-maximized")
                      chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
                      chrome_options.add_experimental_option('useAutomationExtension', False)
                      chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36")
                      chrome_options.binary_location = r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
                      browser = webdriver.Chrome(executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe', options=chrome_options)
                      browser.get(link)
                      try:
                          print(browser.title)
                      except (NoSuchElementException, TimeoutException):
                          print("Error")
                      browser.quit()
                  

                  • Console Output:

                  Downloads
                  The Selenium Browser Automation Project :: Documentation for Selenium
                  Total Time Processed: --- 10.248600006103516 seconds ---
                  

                  It is pretty evident that your program is logically flawless.

                  Since the error only surfaces after several hours of scraping, I suspect it is related to the fact that WebDriver is not thread-safe. If you serialize access to the underlying driver instance, you can share one reference across several threads, but this is not advisable. You can always instantiate one WebDriver instance per thread instead.
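The "one WebDriver instance per thread" idea can be sketched with `threading.local`. This is only an illustration, not code from the question or the answer: `get_driver`, `close_driver`, and `default_driver_factory` are hypothetical helper names, and the default factory assumes Selenium and a ChromeDriver on the PATH.

```python
import threading

# Per-thread storage: each thread sees its own `driver` attribute.
_thread_state = threading.local()


def default_driver_factory():
    """Create a headless Chrome driver (assumes Selenium and ChromeDriver are installed)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def get_driver(factory=default_driver_factory):
    """Return the calling thread's driver, creating it on first use."""
    driver = getattr(_thread_state, "driver", None)
    if driver is None:
        driver = factory()
        _thread_state.driver = driver
    return driver


def close_driver():
    """Quit and forget the calling thread's driver, if it has one."""
    driver = getattr(_thread_state, "driver", None)
    if driver is not None:
        driver.quit()
        _thread_state.driver = None
```

Each worker thread would call `get_driver()` when it needs the browser and `close_driver()` before it exits, so no driver object is ever touched by two threads.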

                  Ideally, the thread-safety issue isn't in your code but in the actual browser bindings. They all assume there will only be one command at a time (e.g. like a real user). On the other hand, you can always instantiate one WebDriver instance for each thread, each of which launches its own browsing tabs/windows. Up to this point, your program seems fine.

                  Different threads can run on the same WebDriver, but then the test results will not be what you expect. When you use multi-threading to run different tests on different tabs/windows, a little thread-safety coding is required; otherwise actions such as click() or send_keys() go to whichever tab/window currently has focus, regardless of the thread that issued them. This essentially means all the tests run simultaneously on the tab/window that has focus rather than on the intended one.
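As a rough sketch of what that "thread-safety coding" could look like when a driver really is shared, the wrapper below holds a lock while it switches to a window and runs an action, so no other thread can move the focus in between. `SharedDriver` and `run_in_window` are hypothetical names introduced for illustration; only `driver.switch_to.window(handle)` is Selenium's own API.

```python
import threading


class SharedDriver:
    """Wrap one driver so that only one thread at a time can use it."""

    def __init__(self, driver):
        self._driver = driver
        self._lock = threading.Lock()

    def run_in_window(self, handle, action):
        """Focus the window `handle`, then run `action(driver)` without interruption."""
        with self._lock:
            # switch_to.window() changes which tab/window receives commands,
            # so the switch and the action must happen under the same lock.
            self._driver.switch_to.window(handle)
            return action(self._driver)
```

A thread would then call something like `shared.run_in_window(handle, lambda d: d.title)` instead of using the driver directly. This serializes all browser work, so it trades away parallelism; one driver per thread or process remains the simpler option.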
