Getting an XML Node's text value with Java DOM

Question

I can't retrieve the text value with Node.getNodeValue(), Node.getFirstChild().getNodeValue(), or Node.getTextContent().

My XML looks like

<add job="351">
    <tag>foobar</tag>
    <tag>foobar2</tag>
</add>

And I'm trying to get the tag value (retrieving non-text elements works fine). My Java looks like

Document doc = db.parse(new File(args[0]));
Node n = doc.getFirstChild();
NodeList nl = n.getChildNodes();
Node an, an2;

for (int i = 0; i < nl.getLength(); i++) {
    an = nl.item(i);
    if (an.getNodeType() == Node.ELEMENT_NODE) {
        NodeList nl2 = an.getChildNodes();
        for (int i2 = 0; i2 < nl2.getLength(); i2++) {
            an2 = nl2.item(i2);
            // DEBUG PRINTS
            System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");
            if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getTextContent());
            if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getNodeValue());
            System.out.println(an2.getTextContent());
            System.out.println(an2.getNodeValue());
        }
    }
}

It prints

tag type (1):
tag1
tag1
tag1
null
#text type (3):
_blank line_
_blank line_
...

Thanks for the help.

jsight · Accepted Answer

I'd also print the result of an2.getNodeName() for debugging purposes. My guess is that your tree-crawling code isn't reaching the nodes you think it is. That suspicion is reinforced by the lack of any check on node names in your code.

Other than that, the javadoc for Node defines getNodeValue() to return null for nodes of type Element. Therefore, you really should be using getTextContent(). I'm not sure why that wouldn't give you the text you want.
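
To make that difference concrete, here is a minimal sketch. The class name NodeValueDemo and the helper firstTag are made up for illustration and are not from the question; only standard JAXP/DOM calls are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: compares the three accessors on a <tag> element.
public class NodeValueDemo {
    // Parses the XML and returns the first <tag> element.
    // Checked exceptions are wrapped only to keep the sketch short.
    public static Element firstTag(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Element) doc.getElementsByTagName("tag").item(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element tag = firstTag("<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>");
        System.out.println(tag.getNodeValue());                 // null: element nodes have no value
        System.out.println(tag.getTextContent());               // foobar: text of all descendants
        System.out.println(tag.getFirstChild().getNodeValue()); // foobar: value of the text child
    }
}
```

The text lives in a child node of type TEXT_NODE, which is why getNodeValue() only returns it when called on that child rather than on the element.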

Perhaps iterate over the children of your tag node and see what types are there?

I tried this code and it works for me:
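
A rough sketch of that kind of inspection, assuming the XML is indented as in the question. ChildTypes and childTypes are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: shows which node types sit directly under the root element.
public class ChildTypes {
    // Returns each direct child of the root as "nodeName:nodeType".
    public static List<String> childTypes(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> out = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node c = children.item(i);
                out.add(c.getNodeName() + ":" + c.getNodeType());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<add job=\"351\">\n <tag>foobar</tag>\n <tag>foobar2</tag>\n</add>";
        // prints [#text:3, tag:1, #text:3, tag:1, #text:3]
        System.out.println(childTypes(xml));
    }
}
```

The #text entries of type 3 are the whitespace between elements, so a loop over getChildNodes() needs a node-type or node-name check before it treats a child as a tag.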

String xml = "<add job=\"351\">\n" +
             "    <tag>foobar</tag>\n" +
             "    <tag>foobar2</tag>\n" +
             "</add>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
ByteArrayInputStream bis = new ByteArrayInputStream(xml.getBytes());
Document doc = db.parse(bis);
Node n = doc.getFirstChild();
NodeList nl = n.getChildNodes();
Node an, an2;

for (int i = 0; i < nl.getLength(); i++) {
    an = nl.item(i);
    if (an.getNodeType() == Node.ELEMENT_NODE) {
        NodeList nl2 = an.getChildNodes();
        for (int i2 = 0; i2 < nl2.getLength(); i2++) {
            an2 = nl2.item(i2);
            // DEBUG PRINTS
            System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");
            if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getTextContent());
            if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getNodeValue());
            System.out.println(an2.getTextContent());
            System.out.println(an2.getNodeValue());
        }
    }
}

The output was:

#text: type (3):
foobar
foobar
#text: type (3):
foobar2
foobar2

toolkit · Answer

If your XML goes quite deep, you might want to consider using XPath, which ships with your JRE, so you can reach the content far more easily with something like:

String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()", document.getDocumentElement());

Full example:

import static org.junit.Assert.assertEquals;

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.junit.Before;
import org.junit.Test;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathTest {
    private Document document;

    @Before
    public void setup() throws Exception {
        String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        document = db.parse(new InputSource(new StringReader(xml)));
    }

    @Test
    public void testXPath() throws Exception {
        XPathFactory xpf = XPathFactory.newInstance();
        XPath xp = xpf.newXPath();
        // Select the text of the first <tag> under the <add> whose job attribute is 351.
        String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
                document.getDocumentElement());
        assertEquals("foobar", text);
    }
}
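
If you want every tag value rather than just the first, one possible variation is to ask XPath for a node set. This is only a sketch: the class XPathAllTags and the method tagTexts are invented names, and the expression /add/tag assumes the document shape from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects the text of every <tag> directly under <add>.
public class XPathAllTags {
    public static List<String> tagTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            // NODESET makes evaluate() return a NodeList instead of a single String.
            NodeList tags = (NodeList) xp.evaluate("/add/tag", doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < tags.getLength(); i++) {
                out.add(tags.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
        System.out.println(tagTexts(xml)); // prints [foobar, foobar2]
    }
}
```

Selecting the elements directly means the whitespace text nodes between them never show up in the result.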

Zeus · Answer

I'm using a very old Java, JDK 1.4.08, and I had the same problem. The Node class I had did not have the getTextContent() method. I had to use Node.getFirstChild().getNodeValue() instead of Node.getNodeValue() to get the value of the node. That fixed it for me.
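
One caveat with getFirstChild().getNodeValue(): getFirstChild() returns null for an empty element, so the call throws a NullPointerException there, and it only sees the first text chunk. Below is a small sketch of a more defensive fallback that relies only on DOM Level 2 methods. TextChildren, textOf and firstTagText are hypothetical names, and the sketch itself is written for a current JDK rather than 1.4.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical fallback for DOM implementations without getTextContent().
public class TextChildren {
    // Concatenates the values of the direct text and CDATA children of a node.
    public static String textOf(Node node) {
        StringBuffer sb = new StringBuffer();
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(c.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Convenience wrapper for the demo: parses the XML and reads the first <tag>.
    public static String firstTagText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return textOf(doc.getElementsByTagName("tag").item(0));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstTagText("<add><tag>foobar</tag></add>")); // prints foobar
        System.out.println(firstTagText("<add><tag/></add>").length());   // prints 0
    }
}
```

Unlike getTextContent(), this deliberately ignores text nested inside child elements, which is enough for the flat tag elements in the question.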

vtd-xml-author · Answer

If you are open to vtd-xml, which excels at both performance and memory efficiency, below is the code to do what you are looking for, using both XPath and manual navigation. The overall code is much more concise and easier to understand.

import com.ximpleware.*;

public class queryText {
    public static void main(String[] s) throws VTDException {
        VTDGen vg = new VTDGen();
        if (!vg.parseFile("input.xml", true))
            return;
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);

        // first manually navigate
        if (vn.toElement(VTDNav.FC, "tag")) {
            int i = vn.getText();
            if (i != -1) {
                System.out.println("text ===>" + vn.toString(i));
            }
            if (vn.toElement(VTDNav.NS, "tag")) {
                i = vn.getText();
                System.out.println("text ===>" + vn.toString(i));
            }
        }

        // second version use XPath
        ap.selectXPath("/add/tag/text()");
        int i = 0;
        while ((i = ap.evalXPath()) != -1) {
            System.out.println("text node ====>" + vn.toString(i));
        }
    }
}