Woopra has an advanced ID mapping system that allows it to use many different fields to uniquely refer to a person profile. The Basic Concepts section will introduce some conventional database concepts and Woopra's challenges. Then we will dive into how Woopra's ID system works and how to make the best use of it.
In the realm of database systems, a unique identifier is a field on database entities, the particular value of which is guaranteed to uniquely refer to a single entity in that database. If you think of a database as a table of rows and columns, each column is a field, and each row is a particular entity in the table. The unique ID is a column that holds a value that is different for each row, and can be used to unequivocally refer to that single row. In Woopra's profile database system, you can think of a row as a single profile.
A unique identifier is special in that only one database entity may have a given id value, so it can be used to find one specific entity.
In SQL databases, the concept of a Primary Key refers to this idea of a unique identifier. Each primary key value refers to exactly one SQL row, no two rows share a value, and every row is guaranteed to have one.
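As a minimal sketch of that guarantee, the following uses Python's built-in `sqlite3` module with an in-memory database. The `profiles` table, its columns, and the `insert_profile` helper are illustrative assumptions and have nothing to do with Woopra's actual storage.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO profiles (id, name) VALUES (1, 'Ada')")

def insert_profile(conn, profile_id, name):
    """Return True if the row was inserted, False if the id is already taken."""
    try:
        conn.execute(
            "INSERT INTO profiles (id, name) VALUES (?, ?)", (profile_id, name)
        )
        return True
    except sqlite3.IntegrityError:
        # The primary key constraint rejects a second row with the same id.
        return False

duplicate_ok = insert_profile(conn, 1, "Grace")  # same id as the existing row
fresh_ok = insert_profile(conn, 2, "Grace")      # unused id
row = conn.execute("SELECT name FROM profiles WHERE id = ?", (1,)).fetchone()
```

Because the key is unique, looking up `id = 1` can only ever return the one row that was inserted first.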
There is some necessary but unfortunate equivocation in the terminology we use here:
ID. There is the concept of an identifier, which could be a database ID field, or could be some other identifying field like an email address, or even a cookie. Then there is the value of one of these identifiers which might be referred to as a person's or profile's ID, email ID, device ID (i.e. cookie), etc.
Then there is the actual ID field per se, which is the highest order identifier in the Woopra ID hierarchy. Due to how this ID field is sent in the tracking SDK code, we sometimes refer to this highest order Woopra identifier as
If you find any places in the text where the context cannot help you overcome the equivocation, please let us know by clicking "Suggest an Edit" above, and letting us know what needs clarification.
The Woopra system faces a number of challenges that traditional SQL database concepts do not help solve, and often even actively stand in the way of resolving. We will limit this discussion to the problem of uniquely identifying tracked person profiles in Woopra.
The Woopra system needs to be able to tell which person profile performed an incoming tracked action (or property update). The problem is that people in Woopra can exist at a number of different levels of identification. They could be a first-time anonymous visitor to a website, or a long-time paying customer.
Sometimes a person will make a few visits to your site anonymously over a year leading up to the time they decide to sign up for your newsletter, giving you their email. Sometimes this can even mean that what was previously considered to be two different people in Woopra is now known to be a single person--perhaps originally from two devices--requiring a merge of the two profiles.
In the traditional database world, merging two rows with different identifiers is a messy business. (Which primary unique identifier is kept? What if a database user asks for the row with an ID that was removed?) Additionally, because every row must have an ID, you cannot wait to assign one until you know that all of a single person's actions and traits are in the one database row that represents them.
Another issue is that if you want to track anonymous behavior, and even attribute it to known people in the future as they identify themselves to you--a key value proposition in the Woopra system--then using a single ID value per person becomes more complex.
Similarly, if you want to track behavior across channels--another key value proposition of Woopra--then it is basically impossible to maintain the database ID for the profile between your website, and, say, your email marketing automation service.
These and other more nuanced issues make this problem of identifiers significant in the Woopra System. Woopra solves this problem by dedicating an entire sub-system to managing identifiers.
Woopra needs to be able to take whatever information is available about the person performing the action in an incoming track request, and use it to determine, with the highest accuracy possible, which other actions this person has performed, and thus to which profile the incoming actions belong.
If a user is coming anonymously to your website, all you have is a cookie, which is, conceptually, a device ID pointing to that browser on that machine. It will be the same next time the person visits your site from that machine and that browser, but if they visit from a different browser, or on their phone for instance, you will have a new cookie. So Woopra needs to be able to use multiple cookies to eventually refer to one person, assuming that one day you find out who that person is and can associate all their devices with them.
Similarly, you may have an incoming "Email Sent" event from your email marketing tool that is not from a browser, and has no cookie. This event has an email address--another major identifier. Woopra needs to eventually (when the person signs in with that email address on their browser with that cookie they had in the past) be able to consider the actions performed by cookie 1, cookie 2, and email 1 all to belong to the same profile.
Woopra's profile ID system adds a dimension to traditional database identifiers by allowing multiple different identifier fields. The ID fields exist in a hierarchy, and are stored in their own database that associates or "maps" person profiles to the values of various identifiers in the ID hierarchy that have been given to a person.
In computer science, a map or mapping (or hash map) is a data structure that associates keys with values. Again, you can think of a map as a two-column table, with each row containing values considered to be related. Just as a geographical map can help you get from point A to point B, Woopra's data structure map of identifiers can help you get from one identifier, say a person's email, to another, like a browser cookie.
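To make the two-column picture concrete, here is a small sketch using a Python dictionary. The identifier type labels, the profile keys, and both helper functions are hypothetical; this illustrates the idea of a map, not how Woopra stores its mappings.

```python
# Hypothetical identifier map: each (identifier type, value) pair
# points to the key of the profile it belongs to.
id_map = {
    ("email", "email@example.com"): "profile-1",
    ("cookie", "abcd123"): "profile-1",
    ("cookie", "zzzz999"): "profile-2",
}

def identifiers_for(profile_key):
    """All identifiers currently mapped to the given profile, sorted."""
    return sorted(ident for ident, profile in id_map.items() if profile == profile_key)

def related_identifiers(id_type, value):
    """Get from one identifier to the others mapped to the same profile."""
    profile = id_map.get((id_type, value))
    if profile is None:
        return []  # unknown identifier: nothing to navigate to
    return [ident for ident in identifiers_for(profile) if ident != (id_type, value)]

# Starting from the email, find the browser cookie on the same profile.
from_email = related_identifiers("email", "email@example.com")
```

Starting from the email address, the lookup passes through the shared profile key and arrives at the cookie, which is the "point A to point B" navigation described above.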
When it comes down to it, an identifier is a visitor property with some special behavior. There are a number of different identifiers that it makes sense to use in the context of customer data. Woopra has a few of these pre-defined in all Woopra instances. Here they are in increasing position in the ID hierarchy:
- Browser cookie (the lowest-order ID)
- Any Custom Identifiers ...
- Email Address
- ID (cv_id, external database id)
While the above values are built-in to the Woopra system, Woopra is not limited to a predetermined set of Identifiers. In fact, enterprise users can define custom unique identifiers as well. This allows you to use identifiers from external contact management systems, for instance.
Also, some integrations, for example those that send phone events to your Woopra instance (Sonar, Bellgram, Ringostat, Routee), will create custom identifiers for you, such as a phone number. This way, when a track request comes in saying that a person with phone number 239-4567 sent a text to your support team, the Woopra system can determine to which profile this event belongs.
As mentioned above, Woopra maintains a hierarchy of multiple identifiers and has an entire sub-system dedicated to maintaining the mappings between them. The hierarchy itself determines which identifier takes precedence in determining which profile is referenced.
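A rough sketch of what "takes precedence" could look like in code follows. The type labels `cookie`, `email`, and `id` stand in for the built-in identifiers listed earlier, `phone` is a hypothetical custom identifier, and `highest_order` is an illustrative helper, not a Woopra API.

```python
# Identifier types from lowest to highest order. "phone" is a hypothetical
# custom identifier; custom identifiers sit between the cookie and the email.
HIERARCHY = ["cookie", "phone", "email", "id"]

def highest_order(identifiers):
    """Return the (type, value) pair highest in HIERARCHY, or None if none is known."""
    known = [(id_type, value) for id_type, value in identifiers.items()
             if id_type in HIERARCHY]
    if not known:
        return None
    return max(known, key=lambda pair: HIERARCHY.index(pair[0]))

# A request carrying both a cookie and an email is looked up by its email.
chosen = highest_order({"cookie": "abcd123", "email": "email@example.com"})
```

Keeping the order in a single list means a new custom identifier only needs to be inserted at the right position for precedence to follow.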
The first and most obvious thing that the ID system needs to do is to determine the profile on which a newly tracked event should go. A tracking event must come in with at least one visitor identifier, or else it will be dropped on the grounds that Woopra cannot know who did the event. This identifier is submitted to the ID system and the ID system returns the profile to which that event logically belongs.
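The single-identifier lookup just described might be sketched as follows. Everything here is hypothetical: the `id_map` contents, the profile key format, and in particular the assumption that an identifier seen for the first time starts a new profile, which the text implies for anonymous visitors but does not spell out.

```python
from itertools import count

# Hypothetical existing mapping from (identifier type, value) to a profile key.
id_map = {("cookie", "abcd123"): "profile-1"}
_profile_numbers = count(2)  # numbers for profiles created by this sketch

def resolve_profile(identifiers):
    """Return the profile key for an event, or None if the event must be dropped."""
    if not identifiers:
        # No visitor identifier at all: there is no way to know who did the event.
        return None
    # Single-identifier case: use the one identifier that was sent.
    key = next(iter(identifiers.items()))
    if key not in id_map:
        # Assumption: an unseen identifier starts a new profile.
        id_map[key] = f"profile-{next(_profile_numbers)}"
    return id_map[key]

dropped = resolve_profile({})                      # event without identifiers
known = resolve_profile({"cookie": "abcd123"})     # identifier already mapped
created = resolve_profile({"email": "new@example.com"})  # first sighting
```

Calling `resolve_profile` again with the same email returns the same profile key, so later events from that identifier land on the profile created here.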
If there are multiple identifiers on a track request, two things happen. First, Woopra selects the highest-order identifier from the ID hierarchy, and submits it to the ID system to find the appropriate profile as normal. Second, Woopra tells the ID system that these identifiers should point to the same profile, which can set off a cascade of updates in the system.
First, the two IDs are mapped in the ID system. So, for instance, the system notes that cookie abcd123 and email email@example.com belong to the same user and should point to the same profile. From now on, a tracked event with either one of these IDs--whether or not the other ID is present--will go to this profile.
Then, if this mapping is new, the system looks to see if each ID previously pointed to its own separate profile. If so, a profile merge occurs. This can be a very complex process and it is not really reversible.
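The mapping and merge steps can be illustrated with a toy in-memory model. The `IdSystem` class, its method names, and the rule that the first argument's profile survives a merge are all assumptions made for this sketch; a real merge also has to reconcile visitor properties and much more, which is part of why it is hard to undo.

```python
class IdSystem:
    """Toy model: identifiers map to profiles, and profiles hold event names."""

    def __init__(self):
        self.id_to_profile = {}  # (type, value) -> profile key
        self.events = {}         # profile key -> list of event names
        self._next = 1

    def profile_for(self, identifier):
        """Return the profile for an identifier, creating one on first sight."""
        if identifier not in self.id_to_profile:
            key = f"profile-{self._next}"
            self._next += 1
            self.events[key] = []
            self.id_to_profile[identifier] = key
        return self.id_to_profile[identifier]

    def track(self, identifier, event):
        self.events[self.profile_for(identifier)].append(event)

    def link(self, a, b):
        """Point identifiers a and b at one profile, merging if they had two."""
        keep, drop = self.profile_for(a), self.profile_for(b)
        if keep == drop:
            return keep  # already mapped together: nothing to merge
        # Merge: move the events over, then repoint every identifier that
        # referred to the dropped profile. The old profile key is gone.
        self.events[keep].extend(self.events.pop(drop))
        for ident, profile in list(self.id_to_profile.items()):
            if profile == drop:
                self.id_to_profile[ident] = keep
        return keep

ids = IdSystem()
cookie = ("cookie", "abcd123")
email = ("email", "email@example.com")
ids.track(cookie, "pageview")      # anonymous browser visit
ids.track(email, "Email Sent")     # event from an email tool, no cookie
merged = ids.link(email, cookie)   # the person signs in: two profiles become one
ids.track(cookie, "signup")        # cookie-only events now reach the merged profile
```

After the merge, the dropped profile no longer exists and nothing records which events originally belonged to it, which shows in miniature why a merge is not really reversible.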