
--  Author: oceansky
--  Posted: 3/20/2007 7:12:00 PM

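--  Reading XML with DOM: the garbled characters differ between OS versions.
The XML data I read comes back garbled.
On a Chinese-language system the garbled text I get looks like "M cubed", but on the English version it is the character u with two dots above it (ü) that comes out garbled.
The network file to read is: http://xml2.tip-ex.com/feed/aliases.php?pid=1000
I have tried several approaches, for example new String(str.getBytes("iso8859-1"),"utf-8"), but none of them fixes it.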
package Xml;

import javax.xml.parsers.*;
import org.w3c.dom.*;
import java.io.*;
import java.util.Collection;
import java.net.URL;
import java.net.URLConnection;
import java.util.TreeSet;

public class TeamXml {
    public TeamXml() {
    }
    // Parse the feed and collect the alias data for the league team names
    public Collection getTeam(String file) {
        TreeSet teamList = new TreeSet();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Parse the XML file
            URL sohu = new URL(file);
            URLConnection sh = sohu.openConnection();
            InputStream in = sh.getInputStream();
            Document document = builder.parse(in);
            in.close();
            // Normalize the document (merges adjacent text nodes, drops empty ones)
            document.normalize();
            // Get the root element
            Element root = document.getDocumentElement();
            NodeList teams = root.getElementsByTagName("aliases");
            // Iterate over the NodeList
            for (int i = 0; i < teams.getLength(); i++) {
                Element team = (Element) teams.item(i);
                NodeList nleas = team.getElementsByTagName("alias");
                for (int j = 0; j < nleas.getLength(); j++) {
                    try {
                        String str = nleas.item(j).getFirstChild().getNodeValue();
                        teamList.add(str);
                        System.out.println(str); // the output here is garbled
                    } catch (Exception e) {
                        // Skip alias elements that have no text child
                        continue;
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return teamList;
    }
}
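
One way to narrow down where the garbling happens, as a rough sketch under assumptions: the class name EncodingSketch, its helper methods, and the sample XML are all hypothetical, because the real feed content is not shown here. If the feed declares its encoding correctly, the DOM parser most likely already returns correct Unicode strings. In that case re-decoding with getBytes("iso8859-1") would only damage them, and the garbling may come from the console's platform default encoding at the println step instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EncodingSketch {

    // Hypothetical sample standing in for the real feed (assumption: the feed
    // declares its encoding in the XML declaration, here ISO-8859-1).
    public static byte[] sampleXml() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<aliases><alias>M\u00fcnchen</alias></aliases>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Parse raw bytes; the parser decodes them using the declared encoding,
    // so the returned String is already proper Unicode text.
    public static String firstAlias(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getElementsByTagName("alias").item(0).getTextContent();
    }

    // Encode text as UTF-8 explicitly instead of using the platform default.
    public static byte[] printAsUtf8(String text) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true, "UTF-8");
        out.print(text);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String alias = firstAlias(sampleXml());

        // The parsed value already contains the real character U+00FC.
        System.out.println(alias.equals("M\u00fcnchen")); // true

        // The round trip from the post corrupts an already-correct string.
        String roundTrip = new String(alias.getBytes("ISO-8859-1"), "UTF-8");
        System.out.println(roundTrip.equals(alias)); // false

        // Write through a PrintStream with an explicit encoding; whether this
        // displays correctly still depends on the console expecting UTF-8.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(alias);
    }
}
```

If the first check prints true for the real feed as well, the parsing step is probably fine and the difference between the Chinese and English systems would be explained by their different default console encodings. If it does not, the feed's declared encoding may not match its actual bytes, which this sketch does not cover.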
