
        Is a Java char array always a valid UTF-16 (Big Endian) encoding?

        Date: 2023-07-27


                  Question

                  Suppose I encode a Java character array (char[]) instance as bytes:

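                  • using two bytes for each character
                  • using big-endian encoding (storing the most significant 8 bits in the leftmost byte and the least significant 8 bits in the rightmost byte)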

                  Would this always create a valid UTF-16BE encoding? If not, which code points will result in an invalid encoding?
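For concreteness, the encoding described in the question can be sketched roughly as follows. The class and method names (NaiveUtf16Be, toBigEndianBytes) are hypothetical and exist only for illustration. The sketch copies every char as two bytes, high byte first, and performs no validation at all.

```java
import java.util.Arrays;

public class NaiveUtf16Be {
    /** Writes each char as two bytes, most significant byte first. No validation is done. */
    public static byte[] toBigEndianBytes(char[] chars) {
        byte[] out = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            out[2 * i] = (byte) (chars[i] >>> 8); // most significant 8 bits
            out[2 * i + 1] = (byte) chars[i];     // least significant 8 bits
        }
        return out;
    }

    public static void main(String[] args) {
        // An unpaired surrogate is copied through like any other 16-bit value.
        byte[] bytes = toBigEndianBytes(new char[] {'a', '\uD800'});
        System.out.println(Arrays.toString(bytes));
    }
}
```

Because the loop never inspects the values it copies, whether the result is valid UTF-16BE depends entirely on the contents of the char array.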

                  This question is closely related to this question about the Java char type and this question about the internal representation of Java strings.

                  Recommended answer

                  No. You can create char instances that contain any 16-bit value you like; nothing constrains them to be valid UTF-16 code units, and nothing constrains an array of them to be a valid UTF-16 sequence. Even String does not require that its data be valid UTF-16:

                  char[] data = {'\uD800', 'b', 'c'};  // Unpaired lead surrogate
                  String str = new String(data);
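As a small check of that claim, the sketch below builds a String from the same array and confirms that the unpaired surrogate survives unchanged. The class name UnpairedSurrogateDemo and the helper roundTrips are hypothetical names chosen for this example.

```java
import java.util.Arrays;

public class UnpairedSurrogateDemo {
    /** Returns true if a String built from data holds exactly the same chars. */
    public static boolean roundTrips(char[] data) {
        return Arrays.equals(new String(data).toCharArray(), data);
    }

    public static void main(String[] args) {
        char[] data = {'\uD800', 'b', 'c'}; // unpaired lead surrogate
        String str = new String(data);

        // The constructor copies the chars as-is; nothing is rejected or replaced.
        System.out.println(str.length());                              // 3
        System.out.println(Character.isHighSurrogate(str.charAt(0)));  // true
        System.out.println(roundTrips(data));                          // true
    }
}
```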
                  

                  The requirements for valid UTF-16 data are set out in Chapter 3 of the Unicode Standard (basically, everything must be a Unicode scalar value, and all surrogates must be correctly paired). You can test whether a char array is a valid UTF-16 sequence, and convert it into a sequence of UTF-16BE (or LE) bytes, by using a CharsetEncoder:

                  CharsetEncoder encoder = Charset.forName("UTF-16BE").newEncoder();
                  // encode() throws MalformedInputException if data contains an unpaired surrogate
                  ByteBuffer bytes = encoder.encode(CharBuffer.wrap(data));
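A self-contained version of this check might look like the sketch below. The class Utf16Validation and its two methods are hypothetical names. encodeStrict wraps the encoder call and returns null instead of propagating the exception, while firstUnpairedSurrogate is a hand-written scan, added here as an assumption about how one might locate the offending char, that applies the pairing rule directly.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Utf16Validation {
    /** Encodes to UTF-16BE, or returns null if the chars are not well-formed UTF-16. */
    public static byte[] encodeStrict(char[] data) {
        // A fresh encoder reports malformed input rather than replacing it.
        CharsetEncoder encoder = StandardCharsets.UTF_16BE.newEncoder();
        try {
            ByteBuffer buf = encoder.encode(CharBuffer.wrap(data));
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        } catch (CharacterCodingException e) {
            return null; // e.g. MalformedInputException for an unpaired surrogate
        }
    }

    /** Returns the index of the first unpaired surrogate, or -1 if there is none. */
    public static int firstUnpairedSurrogate(char[] data) {
        for (int i = 0; i < data.length; i++) {
            char c = data[i];
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < data.length && Character.isLowSurrogate(data[i + 1])) {
                    i++; // skip the trail surrogate of a valid pair
                } else {
                    return i; // lead surrogate without a following trail surrogate
                }
            } else if (Character.isLowSurrogate(c)) {
                return i; // trail surrogate without a preceding lead surrogate
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] bad = {'\uD800', 'b', 'c'};
        char[] good = {'a', '\uD83D', '\uDE00'}; // 'a' followed by a valid surrogate pair
        System.out.println(encodeStrict(bad) == null);
        System.out.println(firstUnpairedSurrogate(bad));
        System.out.println(encodeStrict(good).length);
    }
}
```

The manual scan is useful when you need to know where the problem is; the encoder alone only tells you that encoding failed.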
                  

                  (If you start from bytes instead, use a CharsetDecoder in the same way.)
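For the byte direction, a minimal sketch under the same assumptions could look like this. The class name Utf16BeDecoding and the method decodeStrict are hypothetical. A freshly created decoder reports malformed input by throwing, which the helper turns into a null return.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

public class Utf16BeDecoding {
    /** Decodes UTF-16BE bytes strictly; returns null if the bytes are ill-formed. */
    public static String decodeStrict(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_16BE.newDecoder();
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return null; // e.g. a lead surrogate not followed by a trail surrogate
        }
    }

    public static void main(String[] args) {
        byte[] valid = {0x00, 0x61, 0x00, 0x62};          // "ab"
        byte[] invalid = {(byte) 0xD8, 0x00, 0x00, 0x62}; // D800 followed by 'b'
        System.out.println(decodeStrict(valid));
        System.out.println(decodeStrict(invalid) == null);
    }
}
```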
