Sax - ExpatParser$ParseException

Date: 2023-03-18

Problem description

I'm making an Android application that reads an XML file from the Internet. The application uses SAX to parse the XML. This is my parsing code:

public LectorSAX(String url){
    try{
        SAXParserFactory spf=SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        DefaultHandler lxmlr=new LibraryXMLReader() ;
        sp.parse(url, lxmlr);

        nodo=((LibraryXMLReader)lxmlr).getNodoActual();

    }catch(ParserConfigurationException e){ 
        System.err.println("Error de parseo en LectorSAX.java: "+e);
    }catch(SAXException e){
        System.err.println("Error de sax LectorSAX.java: " + e);
    } catch (IOException e){
        System.err.println("Error de  io LectorSAX.java: " + e);
    }
}

The problem is that a SAXException occurs. The exception message is as follows:

org.apache.harmony.xml.ExpatParser$ParseException: At line 4, column 42: not well-formed (invalid token)

However, if I put the same code in a normal Java SE application, this exception does not occur and everything works fine.

Why does the same code work fine in a Java SE application but not on Android? And how can I solve the problem?

Thanks for your help.

Regards.

Recommended answer

This could be a character encoding problem.
As you can see, the invalid token error points to line 4.
That line contains an acute accent (Meteorología) and a tilde (España). The XML header declares ISO-8859-15 as the encoding. Because it is less common than the UTF encodings or ISO-8859-1, this could cause an error when the SAXParser connects and tries to convert the byte content into characters using the system default charset.
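To illustrate why such a mismatch produces an "invalid token", here is a minimal sketch in plain Java SE. It assumes the parser ends up reading the Latin-encoded bytes as UTF-8; the class and method names are made up for the illustration. The letter ñ is the single byte 0xF1 in both ISO-8859-1 and ISO-8859-15, so ISO-8859-1 is used here because every Java platform is guaranteed to provide it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question or answer.
final class EncodingMismatchSketch {

    /** Returns true if the given bytes form a valid UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // 0xF1 followed by an ASCII letter is not a legal UTF-8 sequence.
            return false;
        }
    }

    public static void main(String[] args) {
        // "España", with the ñ written as a Unicode escape.
        String word = "Espa\u00f1a";
        byte[] latinBytes = word.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8Bytes = word.getBytes(StandardCharsets.UTF_8);

        System.out.println("Latin bytes valid as UTF-8: " + isValidUtf8(latinBytes));
        System.out.println("UTF-8 bytes valid as UTF-8: " + isValidUtf8(utf8Bytes));
    }
}
```

A strict UTF-8 decoder rejects the Latin-encoded bytes, which is consistent with the parser stopping at the first accented character on line 4.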

So you'll need to tell the SAXParser which charset to use. One way to do that is to pass an InputSource, instead of the URL, to the parse method. For example:

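import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();

// Wrap the URL in an InputSource so the encoding can be set explicitly.
InputSource is = new InputSource(url);
is.setEncoding("ISO-8859-15");

DefaultHandler lxmlr = new LibraryXMLReader();
parser.parse(is, lxmlr);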

It seems that the Android VM does not support this encoding: it throws org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 0: unknown encoding.
Since ISO-8859-15 is mostly compatible with ISO-8859-1, except for a few specific characters (the euro sign, for example), a workaround is to change the value from ISO-8859-15 to ISO-8859-1 in the setEncoding call, forcing the parser to use a different but compatible charset:

is.setEncoding("ISO-8859-1");

It seems that, because Android doesn't support the declared charset, it falls back to its default (UTF-8), and so the parser can't use the XML declaration to choose the appropriate encoding.
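The workaround can be tried end to end with a small self-contained sketch. It is an illustration under assumptions rather than the asker's real code: the XML document is an in-memory sample, and TextCollector and parseText are hypothetical names that stand in for LibraryXMLReader and the network fetch. It uses only standard SAX calls; whether Android's Expat-based parser behaves identically is assumed from the answer above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original question or answer.
final class EncodingOverrideSketch {

    /** Hypothetical handler that simply concatenates all character data. */
    static final class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the bytes with an explicitly chosen encoding and returns the text content. */
    static String parseText(byte[] xmlBytes, String encoding) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        source.setEncoding(encoding); // tell the parser how to decode the bytes

        TextCollector handler = new TextCollector();
        parser.parse(source, handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document that declares ISO-8859-15, like the feed in the question.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-15\"?>"
                + "<pais>Espa\u00f1a</pais>";
        // ñ is the same single byte in ISO-8859-1 and ISO-8859-15.
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(parseText(bytes, "ISO-8859-1"));
    }
}
```

Keep in mind that the substitution is only safe while the document avoids the handful of characters that differ between the two charsets, such as the euro sign.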
