Integration
Guide for integrating any type of data with your MemberJunction CDP
Overview
MemberJunction is a solution built to let organizations capitalize on the strengths of industry leading tools to create a robust, scalable, and user-friendly Common Data Platform (CDP) environment. We ascribe to an integration strategy focused primarily on maintaining data integrity, reducing complexity, and ensuring cost-effectiveness, all while providing a robust and scalable solution. The core of this strategy relies on
- Practical Data Storage: Utilizing well-supported database management systems and a pure relational database model for ease of use, lower cost, and broad compatibility.
- Sustainable and Targeted Scope: Executing frequent, but not instantaneous updates to maximize the benefits of unified data while avoiding the pitfalls that frequently affect real-time system delivery
To deliver this strategy, our methodology follows a two-phase ELT (Extract, Load, Transform) process consisting of the following phases:- Phase 1 – Data Ingestion
• Ingest complete datasets from various systems – for example CRM, LMS, or FMS – into standalone schemas in SQL Server.
• Synchronize data regularly, balancing frequency with system performance. - Phase 2 – Data Unification
• Identify common elements across different systems.
• Use connector tables in a shared schema for linking related records from different source systems.
• Using well-established tools, such as Azure Data Factory, to handle the intricate process of data matching and unification.
- Phase 1 – Data Ingestion
By following a flexible unification model, our approach to integration enables data to be tailored to the unique matching rules and logic desired by each organization and enables easy access to data, regardless of schema complexity. Additional benefits include:
• Maintained Data Integrity: Each source system's raw data is preserved in its original form.
• Scalability and Flexibility: Can handle simple to complex unification needs.
• Enhanced Accessibility: AI-driven tools ensure easy access to unified data.
• Reduced Complexity: Eliminates the need for real-time updates and complex financial reporting from the CDP.
Purpose
Most organizations have a wide array of different business applications. Some are heavily used across an enterprise like a CRM system. Others are more specialized and are used only by a fraction of the employees in the organization. The central idea behind MemberJunction's CDP is that you can choose to bring in any/all of these data sources into the platform so that you can unify the data in a single location, thereby enabling a wide array of use cases such as analytics, new application development, and of course all sorts of AI.
The goal of this guide is to help you think through the approach to integration of your various data sources and discuss the general approach that most people will find effective.
iPaaS/ETL Tools
There are a wide array of tools that can help you migrate data from one system to another, bring data together and perform transformations of that data as part of the process. MemberJunction is tool-agnostic, meaning that there is no "preferred" tooling. That being said, many people are using Microsoft's Azure Data Factory (ADF) as it provides a robust, easy to use, cost-effective way of handling these needs. You are free to use any tool(s) you prefer for migrating your data into the MemberJunction database.
Schemas
Generally speaking, we suggest an approach where you create a new schema in the database for each of the source systems. For example you might have systems such as these:
System | Description | schema |
---|---|---|
Customer/Member Management | Whatever your CRM/AMS system is, you can map the raw data from that source system into a set of tables in a schema in your MemberJunction CDP database. | crm |
ERP/FMS | Your financial system handles accounting, financial statements, payments, etc. You typically will want to pull this data in to get a more complete view of your enterprise. | fms |
Online Community | Most organizations have some sort of online community platform used for engaging their audience in discussion, peer-to-peer knowledge sharing, etc. Pulling this data into a MemberJunction CDP can provide significant value to enhance your insights on overall engagement and also fuel AI models. | community |
Marketing Automation | If you use a system like HubSpot, Marketing Cloud, ActiveCampaign, etc, you can pull in rich amounts of data via their APIs into your MemberJunction environment. | marketing |
The idea in the above table is that you have these different systems out there and you map the raw data from those systems into a schema in your MemberJunction CDP.
Think of a schema as a "container" for data. Setting up a separate schema for each source system separates the systems cleanly to minimize integration problems.
Unification
While integrating your data into separate schemas aids in simplifying the integration strategy, you need to have a method of unifying/linking the data across schemas. The reason this is important is to make it possible to see a full view of your data across systems. Take for example a common scenario:
- You have a member record in a CRM or AMS type of system.
- You have that same person attend a meeting and the meeting registration is in a different system
- That same person also logs into your online community
- etc.
These various bits of information exist across multiple disparate systems, and you need a way of unifying the data once it is in MemberJunction in order to get a complete view.
We suggest a "Simpler is Better" mindset when it comes to unification. Rather than trying to unify a large portion of the data, start out small and grow over time. For example, the most basic thing you need to do is unify information for:
- People
- Companies/Organizations
If you start with just these two areas, you can later add unification tables for things like Activities, Transactions, and more. MemberJunction does not have a prescribed naming convention or other "standards" suggesting you include or exclude particular types of data from the unification strategy, rather the idea described here is more of a practical suggestion to make it easier to get started and maintain the system. MemberJunction is a flexible framework that will support whatever you choose to do now and in the future for your data model.
Once you define the elements of data you want to unify, you can create a small set of tables in a shared schema. You could call this schema "shared", "common", or whatever else you want. You could also choose to keep these unification tables in the default "dbo" schema as well. Here's an example of a shared Person and Company table setup:
Example Person Table
Column | Type | Description |
---|---|---|
ID | int, not null, identity | The primary key of the record, an auto-incrementing int value |
FirstName | nvarchar(50) | First/Given Name |
LastName | nvarchar(100) | Last/Family Name |
nvarchar(100) | Email Address | |
CompanyID | int, null, foreign key to Company table | An optional link to the Company table, shown below |
Example Company Table
Column | Type | Description |
---|---|---|
ID | int, not null, identity | The primary key of the record, an auto-incrementing int value |
Name | nvarchar(100) | Name of the company |
Website | nvarchar(100) | Website for the company |
These tables are very simple and of course in a real-world scenario you might have quite a bit more information here. Now, the question is, if you have raw data in a table for similar information in each of your application-specific schemas, how do you get data from those tables into the shared tables that are similar. Moving the data in and having rules around which data sources might take priority are things you do as you are implementing and maintaining your CDP. The key beyond that is to build "linking" tables that will link records from the various sources that feed a shared record.
Here's an example of how that might work for Person data
System/Schema | Table | Description |
---|---|---|
crm | Member | In your AMS or CRM system, perhaps there is a table called Member or similar. This table, to the AMS/CRM is a central part of the database and many other database elements tie to it. You bring the raw data from the source system into MemberJunction and into a table that matches the source in the system-specific schema. This table would typically have FirstName, LastName, Email, etc. |
community | Users | In your online community software, perhaps there is a table called Users and that has information about all of the community users including their name, email, etc. |
In the above example you have two separate sources of information for the same record. The question then is, how do you link them together. Initially, you have to have a strategy to find matches using a value that is reasonably reliable across systems. For individuals, email is often a good field to start with because it tends to be pretty consistent. It isn't perfect, but a good start. For organizations, their website domain is similarly a good starting point as a relatively unique value to find matches.
The idea is that you have a process, using Azure Data Factory, or another tool of your choice, to iterate through newly added/updated rows in the source schema and either create records in the shared Person table, or link to existing ones. We need to build an intermediate "linking" table that will give us a connection between the tables. To do this, we propose using a linking table as shown here:
Person_crm_Member (Linking Table)
Column | Type | Description |
---|---|---|
ID | int, not null, identity | Primary Key |
PersonID | int, not null | Foreign Key to the shared.Person table |
MemberID | int, not null | Foreign Key to the crm.Member table |
CreatedAt | datetime, not null | Auto populated field tracking date this record was created |
UpdatedAt | datetime, not null | Auto populated field tracking date this record was updated |
Person_community_Users (Linking Table)
Column | Type | Description |
---|---|---|
ID | int, not null, identity | Primary Key |
PersonID | int, not null | Foreign Key to the shared.Person table |
UserID | int, not null | Foreign Key to the community.Users table |
CreatedAt | datetime, not null | Auto populated field tracking date this record was created |
UpdatedAt | datetime, not null | Auto populated field tracking date this record was updated |
In the above example, we have two separate linking tables, one that is used for linking records in the Person table to Member records in the crm schema, and the other doing the same thing for Users table records in the community schema. The point here is that you would have these linking tables for every connection between the raw tables in the system-specific schemas, and with your shared/common tables that represent data that is combined from multiple systems.
By approaching it this way, you are able to simply ingest raw data into the various schemas that "containerize" the data from each system, while also pulling it together in a logical way and clearly maintaining links back to the raw data.
Updated 10 months ago