
Getting over that wall

I wish I’d known about O’Reilly’s onJava.com some time ago when I had to code a Java agent to parse XML into Lotus Notes. If you’re starting out with XML parsing in Java, you could do a lot worse than reading this article, even though it’s a couple of years old now:

Simple XML Parsing with SAX and DOM.

When I came to do this stuff, I hit a huge wall. I just wanted a simple “how-to” for taking some basic XML and parsing each <whatever> node into a Document object in Notes. Now, thanks to Domino Designer Help and the new classes available, this is pretty straightforward in Java or LotusScript in Notes 6.x.

But we use R5. What a palaver! There are so many ways to slice and dice XML-parsing that I went down a number of (X)paths before opting for org.w3c.dom.Document and org.xml.sax.Parser objects. One source I found particularly useful was the companion site to The Java Developers Almanac book. The almanac site is great: loads of wee snippets just to get you started.
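As a rough illustration of the kind of “how-to” described above, here is a minimal DOM sketch. It assumes a current JVM with the standard JAXP classes (javax.xml.parsers) rather than the R5 runtime, and the class name, the whateverTexts helper and the sample XML are all made up for the example. It only collects the text of each <whatever> node; in an agent, that is the point where you would create a Notes document instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhateverDomSketch {

    // Parse an XML string into a DOM tree and return the trimmed text
    // of every <whatever> element, in document order.
    public static List<String> whateverTexts(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        List<String> texts = new ArrayList<String>();
        NodeList nodes = doc.getElementsByTagName("whatever");
        for (int i = 0; i < nodes.getLength(); i++) {
            // Hypothetical hook: an agent would build a Notes document here.
            texts.add(nodes.item(i).getTextContent().trim());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><whatever>first</whatever><whatever>second</whatever></items>";
        for (String text : whateverTexts(xml)) {
            System.out.println(text);
        }
    }
}
```

The whole tree is held in memory here, which is fine for small files and is the trade-off to watch with big ones.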

Incidentally, it looks like the files I’ll be parsing soon are going to be huge, so I guess I’d better keep an eye out for memory leaks. If you’re in the same boat, check out Colin’s recent post in addition to Julian’s excellent notes on recycle() and the like (More Thoughts About recycle()); both could be life-savers!
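On the huge-files point, a streaming SAX handler is one way to avoid holding the whole document in memory. What follows is a minimal sketch, not a Notes-specific recipe: it assumes a current JVM and uses the SAX2 DefaultHandler via JAXP rather than the older org.xml.sax.Parser interface, and the class, handler and method names are invented for illustration. It does nothing about recycling Notes objects, which is a separate concern.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class WhateverSaxSketch {

    // Counts <whatever> elements, buffering only the text of the one
    // currently being read.
    static class CountingHandler extends DefaultHandler {
        int count = 0;
        private StringBuilder current; // non-null only inside <whatever>

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("whatever".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("whatever".equals(qName) && current != null) {
                count++;
                // Hypothetical hook: process current.toString() here,
                // then drop the buffer so it can be garbage collected.
                current = null;
            }
        }
    }

    public static int countWhatevers(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        CountingHandler handler = new CountingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><whatever>first</whatever><whatever>second</whatever></items>";
        System.out.println(countWhatevers(xml));
    }
}
```

For a real file you would hand the parser an InputStream or File instead of a string, so the input itself is never fully loaded.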

Comments

  1. Had loads of memory leeks with the saxparser in r6 .. and this was with Lotusscript. There again it could have been my ropey coding ;-)spug-no-parser#
  2. You do yourself a disservice. I’d say Notes handles memory leaks all by itself very well :-) Ben Poole#
  3. i've found that leeks have no effect on my memory. at least, that i can remember! they do taste good in soup tho! ;-)

    i did some xml parsing in R5 a while back. boy that was painful. one of those things where, it worked, but i'm still not sure why. i've been trying to get the boss to pony up for a copy of xmlspy, was hoping the debugger would show me exactly what was going on with that particular xml engine. if i do that again, i'll be sure to check out that article, thanks for the link. :-)jonvon#
  4. There was a good article in e-Pro mag a little while back with a SAX-parsing example in Notes. Let's see… Ah, here it is:

    http://www.e-promag.com/eparchive/index.cfm?fuseaction=viewarticle&ContentID=2813&websiteid=

    I have to admit, I have a much easier time working with DOM parsers, but I also realize that SAX parsing is much more efficient (especially with large files). Depends on what you have to do.

    Also, if you end up having to do any DOM parsing in Java, I have a wrapper class on my Java tips page that you can drop into an agent (I played with it in R5, but it might work in ND6 too). Might save you a little time.

    - Julian
    Julian Robichaux#
  5. Ben, you're very right about onjava.com, it rocks! For me it is exactly what I'm looking for, nuts-n-bolts tutorials on specific java tasks. Once you have that working, the rest you can figure out yourself.

    In my environment we also had to parse XML in R5, yet Java was not allowed in the production system. Therefore, I wrote a very handy XML parser class in Lotusscript R5. Only problem is the limited XML message size; it is passed as a string for now. If anyone's interested let me know and I'll share.

    Regards,

    Ferdy
    Ferdy Christant#
  6. Try this: http://tersesystems.com/post/600002.jhtml


    It's just the methods for recursing down a DOM tree and getting out text, but it works for me. :-)Will Sargent#

  7. I have come to the conclusion that it's always a good idea to code recursively when parsing XML.
    Here's a method with a very stupid name. You pass in a Node and it writes all the significant text into a java.util.Vector, whether or not the DXL wraps that text in (multi-value) elements.

    // Assumption: werte is the instance variable that collects the text
    // values, e.g. the java.util.Vector mentioned above.
    private void fillWerteArrayForItem(org.w3c.dom.Node node) {
        int type = node.getNodeType();

        switch (type) {
            case org.w3c.dom.Node.ELEMENT_NODE:
                // get the child nodes
                org.w3c.dom.NodeList cNodes = node.getChildNodes();
                if (cNodes != null) {
                    int len = cNodes.getLength();
                    for (int i = 0; i < len; i++) {
                        // recursive call
                        fillWerteArrayForItem(cNodes.item(i));
                    }
                }
                break;

            case org.w3c.dom.Node.TEXT_NODE:
                // this is the text we are looking for
                String value = node.getNodeValue();
                if (value != null && value.trim().length() > 0) {
                    // store it as an element in the instance variable
                    werte.add(value);
                }
                break;

            default:
                // ignore all other node types
                break;
        }
    }

    Much better, in my opinion, are JDOM and, for a lot of tasks, the Jakarta Commons Digester (though I haven't tried either with Domino). Axel#
  8. Interesting approach, thanks! Will try that out. Now, if we’re talking about avoiding iteration, ultimately I think XPath has to be one of the main ways forward, but I haven’t done any in Domino.Ben Poole#

Comments on this post are now closed.
