/
Multiple Extraction Identifiers

Multiple Extraction Identifiers

Background

In RDMP you indicate which column in your dataset contains the linkage identifier (e.g. CHI) that can be used to link the dataset with others that share the same identifier type.

When a single column is marked IsExtractionIdentifier RDMP will use this column for building cohorts. When two or more columns are marked IsExtractionIdentifier (e.g. Mother CHI, baby CHI for a Maternity dataset) then you must choose which column to use.

When building a cohort

When adding a dataset with multiple IsExtractionIdentifier columns to a CohortIdentificationConfiguration you will be prompted to pick which column to pull identifiers from. Since cohort building in RDMP is based on set operations (UNION , INTERSECT etc) it is not possible to select two columns at once for a given set.

After adding a dataset you can change the identifier column by right clicking and selecting Edit.

If you want your cohort to contain identifiers in multiple columns (e.g. Mothers and Children) you will have to import the dataset twice selecting a different column in each case. If you have complicated filters that you do not want to replicate you can define them in one and create a link in the other:

When extracting a dataset

When extracting a dataset in an ExtractionConfiguration RDMP performs a JOIN on a cohort table. The cohort table stores the Release identifier for all Private identifiers in the cohort. For example:

Example Cohort Table:

PrivateIdentifier

ReleaseIdentifier

PrivateIdentifier

ReleaseIdentifier

101

anon1

202

anon2

303

anon3

When extracting a dataset with multiple IsExtractionIdentifier columns (e.g. see below) all private identifiers will be removed. Where the private identifier exists in the cohort it will be substituted for the Release Identifier. If it does not exist then it will be substituted for null. This allows you to build a cohort based on Mothers or Children.

Maternity Dataset:

MotherIdentifier

ChildIdentifier

Date

MotherIdentifier

ChildIdentifier

Date

101

707

2001-01-01

202

909

2001-02-02

202

808

2002-01-01

Extracting the above dataset with the above cohort (mothers only) would produce a table with nulls in the ChildIdentifier column

Related content

Project specific catalogue
Project specific catalogue
More like this
Project Extraction (Data Export)
Project Extraction (Data Export)
More like this
Catalogue
Catalogue
More like this
Datasets
Datasets
More like this
Breaking RDMP into its component parts
Breaking RDMP into its component parts
More like this
RDMP-113 Dataset Linkage
RDMP-113 Dataset Linkage
More like this