
        PHP regex word boundary matching in UTF-8

        Date: 2023-10-03

                  This article covers how to match only at word boundaries with PHP regular expressions on UTF-8 text.

                  Question

                  I have the following PHP code in a UTF-8 encoded PHP file:

                  var_dump(setlocale(LC_CTYPE, 'de_DE.utf8', 'German_Germany.utf-8', 'de_DE', 'german'));
                  var_dump(mb_internal_encoding());
                  var_dump(mb_internal_encoding('utf-8'));
                  var_dump(mb_internal_encoding());
                  var_dump(mb_regex_encoding());
                  var_dump(mb_regex_encoding('utf-8'));
                  var_dump(mb_regex_encoding());
                  var_dump(preg_replace('/\bweiß\b/iu', 'weiss', 'weißbier'));
                  

                  I would like the last regex to replace only whole words, not parts of words.

                  On my Windows computer, it returns:

                  string 'German_Germany.1252' (length=19)
                  string 'ISO-8859-1' (length=10)
                  boolean true
                  string 'UTF-8' (length=5)
                  string 'EUC-JP' (length=6)
                  boolean true
                  string 'UTF-8' (length=5)
                  string 'weißbier' (length=9)
                  

                  On the web server (Linux), I get:

                  string(10) "de_DE.utf8"
                  string(10) "ISO-8859-1"
                  bool(true)
                  string(5) "UTF-8"
                  string(10) "ISO-8859-1"
                  bool(true)
                  string(5) "UTF-8"
                  string(9) "weissbier"
                  

                  Thus, the regex works as I expected on Windows (weißbier is left unchanged) but not on Linux (the weiß inside weißbier is replaced).

                  So the main question is: how should I write my regex so that it only matches at word boundaries?

                  A secondary question is how I can let Windows know that I want to use UTF-8 in my PHP application.

                  Recommended answer

                  Even in UTF-8 mode, standard shorthands like \w and \b are not Unicode-aware. You just have to use the Unicode shorthands, as you worked out, but you can make it a little less ugly by using lookarounds instead of alternations:

                  /(?<!\pL)weiß(?!\pL)/u
                  

                  Notice also how I left the curly braces out of the Unicode class shorthands; you can do that when the class name consists of a single letter, so \pL is equivalent to \p{L}.
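
As a rough illustration of the lookaround approach above, here is a minimal sketch that wraps the pattern in a helper. The function name replace_whole_word is hypothetical and not part of the original answer; the /i flag is carried over from the question's pattern, and the sketch assumes the source file and the input strings are UTF-8 encoded.

```php
<?php
// Hypothetical helper (not from the original answer): replaces $word only
// where it is neither preceded nor followed by a Unicode letter (\pL).
function replace_whole_word(string $word, string $replacement, string $subject): string
{
    // preg_quote() escapes regex metacharacters in $word; '/' is our delimiter.
    $pattern = '/(?<!\pL)' . preg_quote($word, '/') . '(?!\pL)/iu';

    // preg_replace() returns null on failure (for example, invalid UTF-8 input);
    // in that case this sketch simply returns the subject unchanged.
    return preg_replace($pattern, $replacement, $subject) ?? $subject;
}

// "weiß" is followed by the letter "b", so nothing is replaced.
echo replace_whole_word('weiß', 'weiss', 'weißbier'), "\n";      // weißbier

// "weiß" stands alone between spaces, so it is replaced.
echo replace_whole_word('weiß', 'weiss', 'ein weiß bier'), "\n"; // ein weiss bier
```

Note that the replacement string is passed to preg_replace() as-is, so sequences such as $1 or \1 in it would be treated as backreferences; for plain text like "weiss" this makes no difference.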
