I have this Unicode string: Ааа́Ббб́Ввв́Г㥴Дд
I want to split it into characters.
Right now, if I loop through all the chars, I get something like this:
A a a ' Б ...
Is there a way to properly split this string into characters, keeping each accent attached to its letter: А а а́ ?
To do this properly, what you want is the algorithm for working out the grapheme cluster boundaries, as defined in UAX 29. Unfortunately this requires knowledge of which characters are members of which classes, from the Unicode Character Database, and JavaScript doesn't make that information available(*). So you'd have to include a copy of the UCD with your script, which would make it pretty bulky.
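As a hedged aside: newer JavaScript runtimes ship `Intl.Segmenter`, which exposes grapheme segmentation without bundling the UCD yourself. The following is a minimal sketch that assumes your target environment provides it (older browsers and older Node versions do not); the helper name `splitGraphemes` is made up for illustration.

```javascript
// Hypothetical helper; assumes the runtime provides Intl.Segmenter.
function splitGraphemes(s) {
  // 'grapheme' granularity asks for user-perceived characters.
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  // Each item yielded by segment() has a .segment property holding the cluster text.
  return Array.from(segmenter.segment(s), (part) => part.segment);
}

// Cyrillic A, a, a + combining acute accent (U+0301), Be
const clusters = splitGraphemes('\u0410\u0430\u0430\u0301\u0411');
console.log(clusters.length); // 4: the accent stays attached to its base letter
```

If the environment might lack `Intl.Segmenter`, check `typeof Intl.Segmenter === 'function'` first and fall back to a simpler approach.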
An alternative if you only need to worry about the basic accents used by Latin or Cyrillic would be to take only the Combining Diacritical Marks block (U+0300-U+036F). This would fail for other languages and symbols, but might be enough for what you want to do.
function findGraphemesNotVeryWell(s) {
    // One character followed by any number of combining diacritical marks
    var re = /.[\u0300-\u036F]*/g;
    var match, matches = [];
    while ((match = re.exec(s)) !== null)
        matches.push(match[0]);
    return matches;
}
findGraphemesNotVeryWell('Ааа́Ббб́Ввв́Г㥴Дд');
["А", "а", "а́", "Б", "б", "б́", "В", "в", "в́", "Г", "г", "Ґ", "ґ", "Д", "д"]
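A slightly broader variant of the same idea, offered as a sketch rather than a replacement: in engines that support Unicode property escapes (the `u` flag with `\p{...}`), the regex can match any character in the Mark general category instead of only U+0300-U+036F. The function name `splitOnCombiningMarks` is invented here. This is still not the UAX 29 algorithm, so things like emoji joined with zero-width joiners or Hangul jamo sequences will still be split incorrectly.

```javascript
// Hypothetical helper; assumes the engine supports \p{...} with the u flag.
function splitOnCombiningMarks(s) {
  // A non-mark code point followed by any marks, or a run of stray leading marks.
  // With the u flag, code points outside the BMP are matched as single units.
  const re = /\P{M}\p{M}*|\p{M}+/gu;
  return s.match(re) || []; // match() returns null when nothing matches
}

// e + combining acute, an emoji outside the BMP, then x
const parts = splitOnCombiningMarks('e\u0301\u{1F600}x');
console.log(parts.length); // 3
```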
(*: there might be a way to extract the information by letting the browser render the string, and measuring the positions of selections in it... but it would surely be very messy and difficult to get working cross-browser.)