      Using Chrome in normal/headless mode through Selenium Python

      Date: 2023-10-08
                This article looks at the difference between accessing a Cloudflare-protected website with ChromeDriver/Chrome in normal mode and in headless mode through Selenium Python.

                Question

                I have a question about --headless mode in Python Selenium for Chrome.

                Code

                 from selenium import webdriver
                 from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
                
                 CHROME_DRIVER_DIR = "selenium/chromedriver"
                
                 chrome_options = webdriver.ChromeOptions()
                 caps = DesiredCapabilities().CHROME
                 chrome_options.add_argument("--disable-dev-shm-usage")
                 chrome_options.add_argument("--remote-debugging-port=9222")
                 chrome_options.add_argument("--headless")  # Runs Chrome in headless mode.
                 chrome_options.add_argument("--no-sandbox")  # Bypass OS security model
                 chrome_options.add_argument("--disable-extensions")
                 chrome_options.add_argument("--disable-gpu")
                
                 browser = webdriver.Chrome(desired_capabilities=caps, executable_path=CHROME_DRIVER_DIR, options=chrome_options)
                
                 browser.get("https://www.manta.com/c/mm2956g/mashuda-contractors")
                 print(browser.page_source)
                 browser.quit()
                

                When I remove chrome_options.add_argument("--headless") everything works fine, but with --headless I get the following error:

                Please enable cookies.
                
                Error 1020 Ray ID: 53fd62b4087d8116 • 2019-12-04 11:19:28 UTC
                
                Access denied
                
                What happened?
                This website is using a security service to protect itself from online attacks.
                
                Cloudflare Ray ID: 53fd62b4087d8116 • Your IP: 168.81.117.111 • Performance & security by Cloudflare
                

                What is the difference between normal mode and --headless?

                Recommended answer

                I took your code, removed the optional arguments, and added a few others to run the following test:

                • Code Block:

                from selenium import webdriver
                from selenium.webdriver.common.by import By
                from selenium.webdriver.support.ui import WebDriverWait
                from selenium.webdriver.support import expected_conditions as EC
                
                options = webdriver.ChromeOptions() 
                options.add_argument("start-maximized")
                options.add_argument("--headless")
                options.add_experimental_option("excludeSwitches", ["enable-automation"])
                options.add_experimental_option('useAutomationExtension', False)
                driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
                driver.get("https://www.manta.com/c/mm2956g/mashuda-contractors")
                print(driver.page_source)
                driver.quit()
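
The test above passes the driver path through executable_path, which is the Selenium 3 style. In Selenium 4 the path is supplied through a Service object instead. The following is a minimal sketch of the same setup under that assumption; chrome_arguments and make_driver are hypothetical helper names, not part of the original answer, and the driver path is whatever applies on your machine.

```python
from typing import List


def chrome_arguments(headless: bool) -> List[str]:
    """Command-line switches from the answer; --headless is added only on request."""
    args = ["start-maximized"]
    if headless:
        args.append("--headless")
    return args


def make_driver(driver_path: str, headless: bool):
    """Build a Chrome driver the Selenium 4 way (needs selenium>=4 and a chromedriver binary)."""
    # Imported lazily so chrome_arguments() stays usable without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    # Same experimental options as in the answer's code block.
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option("useAutomationExtension", False)
    return webdriver.Chrome(service=Service(driver_path), options=options)
```

Calling make_driver(path, headless=True) and then make_driver(path, headless=False) gives two sessions that differ only in the --headless switch, which makes the comparison in this thread easy to repeat.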
                

                • Console Output:

                <html class="js" lang="en-US" style="opacity: 1; visibility: visible;"><!--<![endif]--><head>
                <title>Access denied | www.manta.com used Cloudflare to restrict access</title>
                <meta charset="UTF-8">
                <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
                <meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1">
                <meta name="robots" content="noindex, nofollow">
                <meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1">
                <link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/cf.errors.css" type="text/css" media="screen,projection">
                <!--[if lt IE 9]><link rel="stylesheet" id='cf_styles-ie-css' href="/cdn-cgi/styles/cf.errors.ie.css" type="text/css" media="screen,projection" /><![endif]-->
                <style type="text/css">body{margin:0;padding:0}</style>
                
                
                <!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/zepto.min.js"></script><!--<![endif]-->
                <!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/cf.common.js"></script><!--<![endif]-->
                
                
                
                </head>
                <body>
                  <div id="cf-wrapper">
                    <div class="cf-alert cf-alert-error cf-cookie-error" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>
                    <div id="cf-error-details" class="cf-error-details-wrapper">
                      <div class="cf-wrapper cf-header cf-error-overview">
                    <h1>
                      <span class="cf-error-type" data-translate="error">Error</span>
                      <span class="cf-error-code">1020</span>
                      <small class="heading-ray-id">Ray ID: 53fd7c2fca12d5fc • 2019-12-04 11:36:52 UTC</small>
                    </h1>
                    <h2 class="cf-subheadline">Access denied</h2>
                      </div><!-- /.header -->
                
                      <section></section><!-- spacer -->
                
                      <div class="cf-section cf-wrapper">
                    <div class="cf-columns two">
                      <div class="cf-column">
                        <h2 data-translate="what_happened">What happened?</h2>
                        <p>This website is using a security service to protect itself from online attacks.</p>
                      </div>
                
                
                    </div>
                      </div><!-- /.section -->
                
                      <div class="cf-error-footer cf-wrapper">
                  <p>
                    <span class="cf-footer-item">Cloudflare Ray ID: <strong>53fd7c2fca12d5fc</strong></span>
                    <span class="cf-footer-separator">•</span>
                    <span class="cf-footer-item"><span>Your IP</span>: 123.201.54.43</span>
                    <span class="cf-footer-separator">•</span>
                    <span class="cf-footer-item"><span>Performance &amp; security by</span> <a href="https://www.cloudflare.com/5xx-error-landing?utm_source=error_footer" id="brand_link" target="_blank">Cloudflare</a></span>
                
                  </p>
                </div><!-- /.error-footer -->
                
                
                    </div><!-- /#cf-error-details -->
                  </div><!-- /#cf-wrapper -->
                
                  <script type="text/javascript">
                  window._cf_translation = {};
                
                
                </script>
                
                
                
                </body></html>
                

                From the extracted page source, it is clear that with the --headless argument you land on a page with:

                • The title: Access denied | www.manta.com used Cloudflare to restrict access
                • The message: What happened? This website is using a security service to protect itself from online attacks.
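
The two markers listed above can also be checked in code rather than by reading the dump. This is a rough sketch that assumes the block page keeps the markup shown in the console output (a title mentioning Cloudflare and a cf-error-code span); cloudflare_error_code is a hypothetical helper name.

```python
import re
from typing import Optional

_TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)
_ERROR_CODE_RE = re.compile(r'class="cf-error-code">\s*(\d+)\s*<')


def cloudflare_error_code(page_source: str) -> Optional[str]:
    """Return the Cloudflare error code if page_source looks like a block page, else None."""
    title = _TITLE_RE.search(page_source)
    code = _ERROR_CODE_RE.search(page_source)
    # Require both markers so an ordinary page that merely mentions Cloudflare is not flagged.
    if title and "Cloudflare" in title.group(1) and code:
        return code.group(1)
    return None
```

In a script, cloudflare_error_code(driver.page_source) could replace the print call, so a blocked run fails loudly instead of silently saving the error page.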

                The browsing context, i.e. the Chrome browser session, is detected as a bot, and the navigation is blocked.
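
The answer does not say which signal gave the session away. One well-known difference between the two modes is that headless Chrome's default user-agent string contains the token HeadlessChrome. Whether that was the deciding signal for this site is an assumption. The sketch below, with hypothetical helper names, reads the user agent in either mode so the difference can be inspected.

```python
def looks_headless(user_agent: str) -> bool:
    """Return True if the user-agent string carries the HeadlessChrome token."""
    return "HeadlessChrome" in user_agent


def read_user_agent(headless: bool) -> str:
    """Start Chrome, read navigator.userAgent, then quit (needs selenium and a local Chrome)."""
    # Imported lazily so looks_headless() works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    if headless:
        options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        return driver.execute_script("return navigator.userAgent")
    finally:
        driver.quit()
```

Comparing looks_headless(read_user_agent(True)) with looks_headless(read_user_agent(False)) shows whether the token is present on a given Chrome build. Bot detection services may rely on other signals as well, so this check is illustrative rather than exhaustive.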

                You can find a couple of relevant discussions in:

                • Is there a version of Selenium that cannot be detected? Is Selenium truly undetectable?
                • Chrome browser initiated through ChromeDriver gets detected
                • Webpage Is Detecting Selenium Webdriver with Chromedriver as a bot
