lekkimworld.com – Page 112 – Blog by Mikkel Flindt Heisterberg about everything and nothing

Luke – Lucene Index Toolbox

If you’re using the Jakarta Lucene full-text search library, you need Luke as well. It is available via Java Web Start for easy installation.

Building XPath expression from XML node

When programmatically dealing with large XML (or DXL) documents, it is often useful to be able to indicate, for logging or similar, which node processing stopped at or where the “thing” you are logging was found. The simplest way to do this for XML is with XPath. The code below is from a library I wrote and constructs an XPath expression for the org.w3c.dom.Node supplied to the method.

Consider the XML document further down and the table below. The left column shows the node we supply to the method and the right column the returned XPath. Notice how the method tries to use “known” attributes (id or name) to address the specific node, which makes the XPath more readable. If no “known” attribute is found, we fall back to the sibling index.

Supplied node	XPath
Title node of “Harry Potter and the Chamber of Secrets”	bookstore/book[@id='2']/title[1]
Second tag node of “Harry Potter and the Prisoner of Azkaban”	bookstore/book[@id='3']/tags[1]/tag[2]

If you combine this with a nice logging engine like log4j you have a robust solution for reproducing parsing issues.

Use to your heart’s content…

<?xml version="1.0" encoding="iso-8859-1" ?>
<bookstore>
  <book id="1">
    <title>Harry Potter and the Philosopher's Stone</title>
    <isbn>0747532745</isbn>
    <tags>
      <tag>children</tag>
      <tag>stone</tag>
    </tags>
  </book>
  <book id="2">
    <title>Harry Potter and the Chamber of Secrets</title>
    <isbn>0747538484</isbn>
    <tags>
      <tag>children</tag>
      <tag>secrets</tag>
    </tags>
  </book>
  <book id="3">
    <title>Harry Potter and the Prisoner of Azkaban</title>
    <isbn>0747546290</isbn>
    <tags>
      <tag>children</tag>
      <tag>prisoner</tag>
    </tags>
  </book>
</bookstore>

/* *********************************************************************
 *                    *** DISCLAIMER ***
 * This code is covered by the Creative Commons Attribution 2.5 License
 * (http://creativecommons.org/licenses/by/2.5/).
 *
 * You may use this code in any way you see fit as long as you realize
 * that the code is provided AS IS without any warranties and confers
 * no rights whatsoever! The author cannot be held accountable for
 * any loss, direct or indirect, caused by using the code.
 *
 * *********************************************************************
 */

import java.util.Stack;

import org.w3c.dom.Element;
import org.w3c.dom.Node;

/**
 * Utility class for dealing with XML DOM elements.
 *
 *
 * @author Mikkel Heisterberg, lekkim@lsdoc.org
 */
public class ElementUtil {

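   /**
    * Constructs an XPath expression to the supplied node. Elements with an
    * id or name attribute are addressed by that attribute; all others by
    * their position among same-named siblings.
    * <p>
    * Note: element names are read with getLocalName(), which returns null
    * for documents parsed without namespace awareness.
    *
    * @param n the node to build the expression for (may be null)
    * @return the XPath expression, or null if n is null
    */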
   public static String getXPath(Node n) {
      // abort early
      if (null == n) return null;

      // declarations
      Node parent = null;
      Stack<Node> hierarchy = new Stack<Node>();
      StringBuilder buffer = new StringBuilder();

      // push element on stack
      hierarchy.push(n);

      parent = n.getParentNode();
      while (null != parent && parent.getNodeType() != Node.DOCUMENT_NODE) {
         // push on stack
         hierarchy.push(parent);

         // get parent of parent
         parent = parent.getParentNode();
      }

      // construct xpath
      Object obj = null;
      while (!hierarchy.isEmpty() && null != (obj = hierarchy.pop())) {
         Node node = (Node) obj;
         boolean handled = false;

         // only consider elements
         if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element e = (Element) node;

            // is this the root element?
            if (buffer.length() == 0) {
               // root element - simply append element name
               buffer.append(node.getLocalName());
            } else {
               // child element - append slash and element name
               buffer.append("/");
               buffer.append(node.getLocalName());

               if (node.hasAttributes()) {
                  // see if the element has a name or id attribute
                  if (e.hasAttribute("id")) {
                     // id attribute found - use that
                     buffer.append("[@id='" + e.getAttribute("id") + "']");
                     handled = true;
                  } else if (e.hasAttribute("name")) {
                     // name attribute found - use that
                     buffer.append("[@name='" + e.getAttribute("name") + "']");
                     handled = true;
                  }
               }

               if (!handled) {
                  // no known attribute we could use - get sibling index
                  int prev_siblings = 1;
                  Node prev_sibling = node.getPreviousSibling();
                  while (null != prev_sibling) {
                     if (prev_sibling.getNodeType() == node.getNodeType()) {
                        // XML names are case-sensitive, so compare exactly
                        if (prev_sibling.getLocalName().equals(node.getLocalName())) {
                           prev_siblings++;
                        }
                     }
                     prev_sibling = prev_sibling.getPreviousSibling();
                  }
                  buffer.append("[" + prev_siblings + "]");
               }
            }
         }
      }

      // return buffer
      return buffer.toString();
   }
}
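
To check that a logged expression really points back at the intended node, it can be evaluated with the standard javax.xml.xpath API. The sketch below is not part of the library: the class name XPathCheck, the resolve method and the shortened inline copy of the bookstore document are assumptions made for illustration only.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of ElementUtil
class XPathCheck {

   // Shortened copy of the bookstore document (books 2 and 3 only)
   static final String XML =
      "<bookstore>"
      + "<book id=\"2\"><title>Harry Potter and the Chamber of Secrets</title>"
      + "<tags><tag>children</tag><tag>secrets</tag></tags></book>"
      + "<book id=\"3\"><title>Harry Potter and the Prisoner of Azkaban</title>"
      + "<tags><tag>children</tag><tag>prisoner</tag></tags></book>"
      + "</bookstore>";

   /**
    * Evaluates the XPath against the XML and returns the text content
    * of the first matching node, or null if nothing matches.
    */
   static String resolve(String xml, String xpath) {
      try {
         DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
         factory.setNamespaceAware(true);
         Document doc = factory.newDocumentBuilder()
               .parse(new InputSource(new StringReader(xml)));

         // the expressions are relative, so evaluate from the document node
         Node hit = (Node) XPathFactory.newInstance().newXPath()
               .evaluate(xpath, doc, XPathConstants.NODE);
         return hit == null ? null : hit.getTextContent();
      } catch (Exception e) {
         throw new IllegalArgumentException("Could not evaluate " + xpath, e);
      }
   }

   public static void main(String[] args) {
      System.out.println(resolve(XML, "bookstore/book[@id='2']/title[1]"));
      System.out.println(resolve(XML, "bookstore/book[@id='3']/tags[1]/tag[2]"));
   }
}
```

The expressions are evaluated with the document node as context because the generated paths start at the root element name without a leading slash.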

Free on-line XPath tool

If you occasionally need to run an XPath query against an XML document and don’t want to shell out the money for a professional tool, take a look at the BIT-101 XPath Query Tool.

Lotus Sametime vs. Microsoft Communicator

Does anyone know of a positioning paper from IBM for Lotus Sametime vs. Microsoft Communicator? We have a customer looking into moving platform… 🙁 I’m looking for information that can help me talk intelligently to the customer about the technical capabilities of Microsoft Communicator, in terms that I, as someone knowledgeable about Sametime, would understand.

If there is a feature comparison as well, that would be great.

Regex cheat sheet

Regex cheat sheet via Johan.

Folder is larger than supported; cannot perform operation

Sometimes it’s the small issues in Notes/Domino that are the most bothersome. I have a customer that receives a lot of e-mail, and I mean a lot: there are only 6 employees but they still receive thousands of e-mails a day. All e-mail is channeled into a single database, as this is how they read e-mail and every employee needs to read every e-mail. The e-mails are preprocessed using a few mail rules and put into one of two views depending on these rules.

Recently they have started receiving the above error message when roaming around the office or to their home computer (the screenshot is from a user roaming to his home computer). I did a search on developerWorks and it appears to be a known issue. Unfortunately none of the workarounds will work for us. The situation is made worse by the fact that the Notes mail client just isn’t fast enough for this broker-type user who needs to scan a LOT of e-mail. I know that this probably isn’t a core use case, so that’s a problem we have to deal with.

Since the workarounds will not work we are planning to do the following:

Instead of using mail rules to maintain an inbox with all messages and two views with the e-mail as processed by the mail rules, we will maintain three mail databases (one for the inbox and one for each of the views).
This means we can remove the mail rules, which should help the server.
Create a mail-only mail database and remove all calendar code from the template to improve client performance.
Do away with the ($Inbox) folder and only have a single view in the database showing all documents. The view will be styled as a simplified ($Inbox).
Implement a very aggressive archiving schedule to keep the mail databases as small as possible.

It’s going to be interesting to see how it pans out.

RegexBuddy upgraded to v.3.0

I just received a complimentary update from my previous v. 2.3.2 of RegexBuddy to the new v. 3, since I bought it only a short while back. How I love the internet and the level of service provided by all the small ISVs. RegexBuddy has quickly become an invaluable tool for me as I use regular expressions almost every day. The new version sports a lot of improvements but keeps the low startup time, which is crucial for me. Very nice new features are the history view and the updated look and feel. The possibility of setting the regex flavor you are using (e.g. java.util.regex) is very nice too.

MP3 Trimmer

MP3 Trimmer – unfortunately only for Mac.

%Include in actions

I must admit that I have never been a big fan of the built-in include files supplied with Notes, e.g. lsconst.lss, but recently I have started using them to make my code more readable. Using the include file you can write

Dim rc As Integer
rc = Msgbox("Should we continue deleting the contents of your harddrive?", MB_YESNO+MB_ICONQUESTION+MB_DEFBUTTON2, "Continue?")
If rc=IDNO Then
  Exit Sub
End If

which is much more readable than the same code written with magic numbers instead of the constants:

Dim rc As Integer
rc = Msgbox("Should we continue deleting the contents of your harddrive?", 4+32+256, "Continue?")
If rc=7 Then
  Exit Sub
End If

Today, however, I realized that you cannot use the include files in (form) actions since the compiler does not allow the use of the Public keyword there. All constants in lsconst.lss are defined as Public, so that’s a no-go. Copying lsconst.lss to something like friendly_lsconst.lss, removing the use of Public and the reference at the bottom to lsprcval.lss, and including the copy instead solves the problem. This restriction isn’t documented in Domino Designer help and, in my mind, severely limits the usability of the include files. Since the constants are inserted at compile time, allowing them in actions shouldn’t be a problem.

Below is a screenshot of the compile error from Domino Designer. I’m running Notes 7.0.2 on Windows XP Prof. SP2.

Avoiding use of ExceptionUtils

For a long time I have been using Apache Commons Lang for error reporting, to convert a stack trace to a string (org.apache.commons.lang.exception.ExceptionUtils.getStackTrace(Throwable)), but for a project I was fixing up today the dependency was overkill since that was the only thing I used from it. A simple replacement is:

StringWriter sw = new StringWriter();
PrintWriter pw = new PrintWriter(sw);
t.printStackTrace(pw);
String stack_trace = sw.getBuffer().toString();
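
Wrapped up as a reusable helper, the same idea could look like the sketch below. The class name StackTraces and the method name toString are made up for this example and do not come from Commons Lang.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper class; the names are assumptions for illustration
class StackTraces {

   /**
    * Returns the stack trace of the supplied Throwable as a String,
    * or null if no Throwable is supplied.
    */
   static String toString(Throwable t) {
      if (null == t) return null;

      StringWriter sw = new StringWriter();
      PrintWriter pw = new PrintWriter(sw);
      t.printStackTrace(pw);
      // make sure everything has reached the StringWriter before reading it
      pw.flush();
      return sw.toString();
   }

   public static void main(String[] args) {
      System.out.println(toString(new IllegalStateException("boom")));
   }
}
```

The first line of the returned string is the result of the Throwable's own toString(), followed by one line per stack frame.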

Commons Lang is still very much in my arsenal since it has utility methods for so many nice-to-have functions…