
How to extend Notes 8: using LiveText with capture groups

I get so many questions on how to extend Notes 8 that I finally decided to create a series of blog posts on how to do it. All the posts in the series may be found under the extending_notes8 tag. For this post you'll need to be familiar with how MyWidgets and LiveText work. A basic understanding of regular expressions would be beneficial as well.

This post will set the stage for the next post on how to add Java actions that use LiveText recognizers that contain capture groups. No Java code or Eclipse extension point stuff in this post.

Let me start by explaining what capture groups are. As you probably know you use regular expressions to search for text occurrences when using LiveText. In short a regular expression is a text pattern to search for in a block of text. Regular expressions can be hard to grasp at first but once you "have it" they will become an invaluable tool in your arsenal.

A regular expression can range from the simple to the extremely complex. A simple regular expression to find a product number like OTGC-5431 could look like this one:

[A-Z]{4}-\d{4}
This regular expression tells the regular expression engine to search for 4 consecutive uppercase letters ([A-Z]{4}), followed by a hyphen (-), followed by 4 digits (\d{4}).
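
If you want to try the pattern outside Notes, here is a minimal sketch in Python. Python's `re` module is just my choice for illustration (it is not what Notes uses internally), and the sample sentence is made up.

```python
import re

# The product number pattern from above:
# 4 uppercase letters, a hyphen, then 4 digits.
PRODUCT_NUMBER = re.compile(r"[A-Z]{4}-\d{4}")

# Made-up sample text, for illustration only.
text = "Please order OTGC-5431 and OTMM-6615 before Friday."

# With no capture groups in the pattern, findall returns each full match.
matches = PRODUCT_NUMBER.findall(text)
print(matches)  # ['OTGC-5431', 'OTMM-6615']
```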

That's great, but what if this isn't just a product number but a compound format? Say the product number is made up of a product family (OTGC) and a part number (5431), and I need the two pieces of information separately. That's where capture groups become important.

Regular expressions have a syntax to signal that the text you find actually consists of multiple discrete pieces of information. Take the product number OTGC-5431 again. It consists of two parts: 1) the product family (OTGC) and 2) the part number (5431). If you need access to the product family and part number separately, you can use capture groups to split up the product number in the regular expression itself instead of parsing it after recognition.

So instead of simply getting a match of "OTGC-5431" you also get information that the product family is "OTGC" and the part number is "5431".

So how does one use capture groups?

Well you start with the regular expression above and then you add capture groups by simply changing it to be

([A-Z]{4})-(\d{4})
Notice how I only added two sets of parentheses. That's all. That tells the regular expression engine that the result is made up of two parts. So instead of getting just one result (OTGC-5431) I get three: OTGC-5431, OTGC and 5431 (the match in its entirety and the two capture groups).
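
To see the three results for yourself, here is a small sketch, again in Python purely for illustration. Group 0 is the match in its entirety, and groups 1 and 2 are the two capture groups. The sample sentence is made up.

```python
import re

# Same pattern as before, now with two capture groups.
pattern = re.compile(r"([A-Z]{4})-(\d{4})")

# Made-up sample text, for illustration only.
match = pattern.search("Part OTGC-5431 is in stock.")

whole = match.group(0)   # the entire match
family = match.group(1)  # capture group 1: the product family
part = match.group(2)    # capture group 2: the part number

print(whole, family, part)  # OTGC-5431 OTGC 5431
```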

So how do I use these results in MyWidgets / LiveText? Well let me walk you through an example.

Let's imagine that you have a web service that allows you to search for product numbers but needs the product family and part number separately. The syntax is something like http://www.example.com/prodquery?pf=<product family>&pn=<part number> (for example http://www.example.com/prodquery?pf=OTGC&pn=5431). Let me show you, end to end, how to do this using MyWidgets and LiveText.

  1. Start by creating a new widget. Choose to create a web widget and click Next.
  2. Now specify the URL as being "http://www.example.com/prodquery?pf=OTGC&pn=5431" (Please note: The address doesn't point to anything but it proves the point we need). Now click Next.
  3. A GET request is fine - just click Next.
  4. Now the web page is fetched. Since we know the URL we specified doesn't work, just click Next.
  5. In the "Configure a Widget" dialog name the widget "Demo Product Search" and choose "Wire as an action" at the bottom. Then click the "Advanced" tab at the top.
  6. Put a checkmark in both boxes in the "Configure" column as we need to map both URL parameters to our recognized LiveText. Then click Next.
  7. We need a new recognizer to recognize our product number. To do this click the "New Recognizer..." button.
  8. Name the recognizer "Demo Product Number" in the top text box. Now since our recognizer uses two capture groups we need to tell Notes how to map these to our widget (as widget properties) so we need a new Content Type. To do this click the "New Type..." button.
  9. In the "Configure a Content Type" dialog box you name the parts of the text you recognize. We have two parts so we click the "Add"-button twice and fill in the text fields as specified below. We do this to indicate we have two properties called "pf" and "pn". Then click OK.
  10. Back in the "Configure a Recognizer" dialog our new Content Type ("Demo Product Number") has been chosen for us. Now we specify our regular expression with the two capture groups. Then we click the "Add"-button twice and map capture group 1 (the "product family") to content property "pf" and capture group 2 (the "part number") to content property "pn" as shown below. Then click OK.
  11. Back in the "Wire an action to configure a widget" dialog our newly created recognizer has been chosen for us. Now we need to map the widget properties (the parts of the recognizer) to the URL parameters. We do this on the "Advanced" tab so click on that near the top.
  12. On the "Advanced" tab add a second parameter box by clicking the "Add"-button and map the URL parameters to our widget properties as shown below. Then click Finish.
Now take a deeeeeeeeeeeeeeeeep breath... :-)

That's what's required to create a new widget with a new recognizer and new content type. It may seem like a lot, and I think it is, but remember that you may now add a second web widget that uses the same recognizer by following the same steps but skipping steps 7-10. Also, when you've done it a few times it becomes second nature and you use it all the time. I do.
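
If it helps to see the whole chain in one place, the mapping configured in the steps above (capture group to content property to URL parameter) can be sketched roughly as follows. This is only my illustration in Python of what the configuration amounts to, not how Notes implements it, and `build_query_url` is a hypothetical helper name I made up.

```python
import re
from urllib.parse import urlencode

# The recognizer's regular expression with its two capture groups.
PATTERN = re.compile(r"([A-Z]{4})-(\d{4})")

# Capture group number -> content property name (what step 10 sets up).
GROUP_TO_PROPERTY = {1: "pf", 2: "pn"}

# The example web service address from this post (it doesn't point to anything).
BASE_URL = "http://www.example.com/prodquery"


def build_query_url(recognized_text):
    """Hypothetical helper: turn a recognized product number into the query URL."""
    match = PATTERN.fullmatch(recognized_text)
    if match is None:
        return None  # the text is not a product number
    # Fill each content property from its capture group.
    properties = {
        name: match.group(number) for number, name in GROUP_TO_PROPERTY.items()
    }
    # Here the URL parameters have the same names as the properties.
    return BASE_URL + "?" + urlencode(properties)


url = build_query_url("OTGC-5431")
print(url)  # http://www.example.com/prodquery?pf=OTGC&pn=5431
```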

Now we need some text to test on. Create a new e-mail message, add a subject and some body text that includes a couple of product numbers, and e-mail the message to yourself. To make it easier you may copy/paste the text below:

Ullamcorper veniam aliquip duis, vel vero dolore in dolor 
aliquam dolore lobortis delenit vel duis, magna. Eros 
iusto, consequat iriure eu enim nulla exerci minim nulla 
facilisis, ex te ut nulla volutpat qui: OTGC-5431, 
OTMM-6615. Odio nulla amet ea quis volutpat suscipit exerci 
eros et dolore feugiat, ea dolor ad, vulputate, delenit 
enim sed autem tation enim zzril blandit iusto. Dolor 
facilisi vero feugait iriure, et consequat ut, et euismod 
ipsum praesent quis duis zzril in hendrerit, at et, dolor 
hendrerit dignissim. Ut commodo odio consequat, onsectetuer 
augue dignissim nulla dolore velit. 
Now when you open the message from your inbox you should see something like this:

As you can see, the Notes client recognized two text strings, as shown by the blue dotted lines. If you hover over the text and click the down arrow (it appears to the right of the text) you'll see a small menu as shown below. From that menu select "Display Demo Product Number Properties". That will show a dialog box explaining exactly what Notes found, what text goes into which capture groups and hence into which widget properties.

In the next post in the series I'll show how to use these widget properties from a Java action. Stay tuned...



Darren

Re: How to extend Notes 8: using LiveText with capture groups

Great post, I was puzzling how to divide up a flight number into airline and the number... I knew the concept of groups but this explained it really well.
