
        How to use XMLHttpRequest to download an HTML page in the background and extract a text element from it?

        Date: 2023-10-14



                  Question

                  I want to write a Greasemonkey script that, while you are on URL_1, fetches and parses the whole HTML page at URL_2 in the background in order to extract a text element from it.

                  Specifically, I want to download the whole page's HTML (a Rotten Tomatoes page) in the background, store it in a variable, and then use getElementsByClassName("critic_consensus")[0] to extract the text I want from the element with the class name "critic_consensus".


                  I found this on MDN: HTML in XMLHttpRequest. Based on it, I ended up with the following code, which unfortunately does not work:

                  var xhr = new XMLHttpRequest();
                  xhr.onload = function() {
                    alert(this.responseXML.getElementsByClassName("critic_consensus")[0].innerHTML);
                  }
                  xhr.open("GET", "http://www.rottentomatoes.com/m/godfather/",true);
                  xhr.responseType = "document";
                  xhr.send();
                  

                  When I run it in Firefox Scratchpad, it shows this error message:

                  Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at http://www.rottentomatoes.com/m/godfather/. This can be fixed by moving the resource to the same domain or enabling CORS.


                  P.S. The reason I am not using the Rotten Tomatoes API is that they have removed the critics consensus from it.

                  Recommended answer

                  For cross-origin requests where the fetched site has not set a permissive CORS policy, Greasemonkey provides the GM_xmlhttpRequest() function. (Most other userscript engines also provide this function.)

                  GM_xmlhttpRequest is expressly designed to allow cross-origin requests.

                  To get your target information, create a DOMParser and parse the response text with it. Do not use jQuery methods for this, because they cause extraneous images, scripts, and objects to load, which slows things down or can crash the page.
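
The parsing step on its own might look like the sketch below. The helper names `parseHtml` and `firstTextByClass` are made up for illustration and are not part of any library. The sketch assumes a browser context where `DOMParser` is available, and it returns null instead of throwing when the class is not found, which is a design choice of this sketch rather than something the answer requires.

```javascript
// Hypothetical helper: parse an HTML string into a detached Document.
// Assumes a browser context where DOMParser exists.
function parseHtml(html) {
    return new DOMParser().parseFromString(html, "text/html");
}

// Hypothetical helper: return the trimmed text of the first element with
// the given class name, or null when no such element exists.
function firstTextByClass(doc, className) {
    var node = doc.getElementsByClassName(className)[0];
    return node ? node.textContent.trim() : null;
}
```

Inside an `onload` handler this could be called as `firstTextByClass(parseHtml(response.responseText), "critic_consensus")`. The null result covers the case where the fetched page no longer contains that class.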

                  Here's a complete script that illustrates the process:

                  // ==UserScript==
                  // @name        _Parse Ajax Response for specific nodes
                  // @include     http://stackoverflow.com/questions/*
                  // @require     http://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js
                  // @grant       GM_xmlhttpRequest
                  // ==/UserScript==
                  
                  GM_xmlhttpRequest ( {
                      method: "GET",
                      url:    "http://www.rottentomatoes.com/m/godfather/",
                      onload: function (response) {
                          var parser  = new DOMParser ();
                          /* IMPORTANT!
                              1) For Chrome, see
                              https://developer.mozilla.org/en-US/docs/Web/API/DOMParser#DOMParser_HTML_extension_for_other_browsers
                              for a work-around.
                  
                              2) jQuery.parseHTML() and similar are bad because it causes images, etc., to be loaded.
                          */
                          var doc         = parser.parseFromString (response.responseText, "text/html");
                          var criticTxt   = doc.getElementsByClassName ("critic_consensus")[0].textContent;
                  
                          $("body").prepend ('<h1>' + criticTxt + '</h1>');
                      },
                      onerror: function (e) {
                          console.error ('**** error ', e);
                      },
                      onabort: function (e) {
                          console.error ('**** abort ', e);
                      },
                      ontimeout: function (e) {
                          console.error ('**** timeout ', e);
                      }
                  } );
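
If the callback style becomes unwieldy, the request can be wrapped in a Promise. This is only a sketch: `gmGet` is a hypothetical name, the second parameter exists so that a stand-in can be passed where `GM_xmlhttpRequest` is not available, and treating any non-2xx status as a failure is an assumption of this sketch, not something the answer prescribes.

```javascript
// Hypothetical wrapper: resolves with the response text, and rejects on a
// network error, abort, timeout, or a non-2xx HTTP status.
// `request` defaults to GM_xmlhttpRequest (needs @grant GM_xmlhttpRequest).
function gmGet(url, request) {
    request = request || GM_xmlhttpRequest;
    return new Promise(function (resolve, reject) {
        request({
            method: "GET",
            url: url,
            onload: function (response) {
                if (response.status >= 200 && response.status < 300) {
                    resolve(response.responseText);
                } else {
                    reject(new Error("HTTP status " + response.status));
                }
            },
            onerror: function () { reject(new Error("request failed")); },
            onabort: function () { reject(new Error("request aborted")); },
            ontimeout: function () { reject(new Error("request timed out")); }
        });
    });
}
```

In a userscript it would be called as `gmGet("http://www.rottentomatoes.com/m/godfather/")`, with the DOMParser step in the `.then()` handler and the error logging in `.catch()`.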
                  
