Regular Expressions (RegEx)

Using Regex in Woopra segmentation

Woopra's segmentation allows you to use regular expressions to match very specific patterns in Woopra user and event property values. The regex is compiled and run by Java's regular expression engine, which is documented in Oracle's Java tutorial: https://docs.oracle.com/javase/tutorial/essential/regex/

This document is not designed to be a full introduction to regex, but rather to serve as a reference for common patterns you might want to use in Woopra segmentation.

Using OR

One of the most common reasons a Woopra user who is not already familiar with regex might want to use it is to match any of several values. For example, imagine we are building a segment in Woopra of people whose company field is "woopra" OR "oracle". Conceptually, this is a segment of people whose company contains "woopra" or "oracle". While you can do this in Woopra without regular expressions, it can become quite tedious to build a separate filter for each possible value and then combine all of these OR'ed conditions with other AND'ed ones. The way to express OR in a regular expression is with the bar: |. So to find people whose company field is either "woopra" or "oracle", use this regular expression:

woopra|oracle
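
To try the OR pattern outside of Woopra, here is a minimal Java sketch. The class and method names are hypothetical, and it assumes the filter requires the regex to match the whole value (as described under the common pitfalls in this document), which corresponds to Java's Matcher.matches():

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Woopra: checks a company value against the OR pattern.
final class OrExample {
    private static final Pattern COMPANY = Pattern.compile("woopra|oracle");

    static boolean matchesCompany(String value) {
        // matches() succeeds only if the whole value matches one of the alternatives.
        return COMPANY.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesCompany("woopra")); // true
        System.out.println(matchesCompany("oracle")); // true
        System.out.println(matchesCompany("google")); // false: neither alternative
        System.out.println(matchesCompany("Woopra")); // false: the pattern is case sensitive
    }
}
```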

📘

Trouble getting your regex to work? Try putting '(?i)' or '.*' before the expression (without the quotes).

Case-insensitive matching

In Java regex, you can turn on case-insensitive matching with the (?i) flag. The flag applies from the point where it appears to the end of the pattern (or to the end of the enclosing group, if it appears inside parentheses), so (?i)woopra matches "woopra", "Woopra", and "WOOPRA". To make only part of a pattern case insensitive, use the scoped form (?i:...): for example, (?i:w)oopra lets only the "w" vary in case. Wrapping the rest of the pattern in parentheses after (?i) is optional, but it is harmless and can make the pattern easier to read. So to put it all together:

(?i)(.*woopra.*)

That pattern will match all of these: "woopra", "woopra, inc.", and "Woopra, Inc."

Going one step further, if you want to find multiple companies, as in the OR example above, you can combine both techniques like this:

(?i)(.*woopra.*|.*oracle.*)
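
The following Java sketch shows how the combined pattern behaves, and contrasts it with the scoped form (?i:...), which limits case insensitivity to one part of the pattern. The class and method names are hypothetical, and whole-value matching is assumed:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Woopra: demonstrates the scope of the (?i) flag.
final class CaseInsensitiveExample {
    // (?i) at the start switches on case-insensitive matching for the rest of the pattern.
    private static final Pattern COMPANIES =
            Pattern.compile("(?i)(.*woopra.*|.*oracle.*)");

    // (?i:w) makes only the first letter case insensitive; "oopra" stays case sensitive.
    private static final Pattern FIRST_LETTER_ONLY = Pattern.compile("(?i:w)oopra");

    static boolean isTargetCompany(String value) {
        return COMPANIES.matcher(value).matches();
    }

    static boolean firstLetterOnly(String value) {
        return FIRST_LETTER_ONLY.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTargetCompany("Woopra, Inc."));       // true
        System.out.println(isTargetCompany("ORACLE Corporation")); // true
        System.out.println(isTargetCompany("Google"));             // false
        System.out.println(firstLetterOnly("Woopra"));             // true
        System.out.println(firstLetterOnly("WOOPRA"));             // false
    }
}
```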

Common pitfalls and some examples

Why doesn't the regex woopra match "Woopra, INC."?

In Woopra segmentation, a regex must match the entire value being checked. So the regex woopra will only match values that are exactly "woopra", with no other spaces or characters in the value and no capital letters. To match "Woopra, INC.", you need to do two things: make your regex case insensitive, and alter it to allow other characters to be present, as long as "woopra" appears somewhere in the value.
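
In Java terms, the whole-value behavior described above corresponds to Matcher.matches(), as opposed to Matcher.find(), which looks for the pattern anywhere in the value. That Woopra behaves like matches() is an assumption based on the description above; the class and method names in this sketch are hypothetical:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Woopra: contrasts whole-value and partial matching.
final class FullMatchExample {
    // True only if the regex matches the entire value.
    static boolean wholeValueMatches(String regex, String value) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    // True if the regex matches any substring of the value.
    static boolean foundAnywhere(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeValueMatches("woopra", "woopra"));               // true
        System.out.println(wholeValueMatches("woopra", "woopra, inc."));         // false: extra characters
        System.out.println(foundAnywhere("woopra", "woopra, inc."));             // true: substring search
        System.out.println(wholeValueMatches("woopra", "Woopra, INC."));         // false: case and extra characters
        System.out.println(wholeValueMatches("(?i).*woopra.*", "Woopra, INC.")); // true: both issues addressed
    }
}
```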

Matching other characters

One of the most useful patterns in regex is .* (dot star). The dot, in regex, means "match any character", including spaces and non-alphanumeric characters (by default, line breaks are the one exception). The star is what's known as a quantifier: it means "match 0 or more of the preceding character or group". (A group could be a character class in square brackets, or it could be anything in parentheses.)

So to match "Woopra, Inc." as well as "Woopra Incorporated", along with any other potential names for the same company, what you essentially want to say is "match any company that includes the word 'woopra'". You would do this by surrounding your woopra regex with .* on both sides (add (?i) in front, as described under case-insensitive matching, if the capitalization may vary), like this:

.*woopra.*
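
As a rough illustration of how .* behaves, this Java sketch filters a made-up list of company names with the pattern above, with and without (?i). The class name, method name, and sample values are hypothetical, and whole-value matching is assumed:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Woopra: keeps only the values that match a regex.
final class DotStarExample {
    static List<String> keepMatching(String regex, List<String> values) {
        Pattern pattern = Pattern.compile(regex);
        return values.stream()
                .filter(value -> pattern.matcher(value).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> companies = List.of(
                "woopra", "woopra, inc.", "the woopra team", "Woopra Incorporated", "oracle");

        // Keeps the three lowercase names; .* may match zero characters, so "woopra" alone qualifies.
        System.out.println(keepMatching(".*woopra.*", companies));

        // Adding (?i) also keeps "Woopra Incorporated".
        System.out.println(keepMatching("(?i).*woopra.*", companies));
    }
}
```

Note that .*woopra.* on its own is still case sensitive, which is why "Woopra Incorporated" only passes once (?i) is added.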