
Instructions


What is a Catalogue?


A Catalogue is RDMP's representation of one of your datasets e.g. 'Hospital Admissions'.  A Catalogue consists of:

  • Human readable names/descriptions of what the dataset is and what it contains

  • A collection of items mapped to underlying columns in your database. 

  • Validation rules for each of the extractable items in the dataset

  • Graph definitions for viewing the contents of the dataset (and testing filters / cohorts built)

  • Attachments which help understand the dataset (e.g. a PDF file)

Each of the items in a Catalogue:

  • Can be extractable or not, or extractable only with Special Approval

  • Can involve a transform on the underlying column (E.g. hash on extraction, UPPER etc)

  • Has a human readable name/description of the column/transform

  • Can have curated WHERE filters defined on them which can be reused for project extraction/cohort generation etc
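The properties listed above can be pictured as a small data model. The following is only an illustrative Python sketch and not RDMP's own API: the names `Item`, `Extractability`, `select_sql` and `extractable_select`, and the example column names, are all hypothetical and chosen purely to mirror the bullet points.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Extractability(Enum):
    # Hypothetical categories mirroring the three options listed above.
    EXTRACTABLE = "extractable"
    NOT_EXTRACTABLE = "not extractable"
    SPECIAL_APPROVAL = "special approval"


@dataclass
class Item:
    name: str                        # human readable name
    description: str                 # human readable description
    column: str                      # underlying database column
    extractability: Extractability
    # Optional transform as a template with a {col} placeholder, e.g. "UPPER({col})"
    transform: Optional[str] = None
    # Curated, reusable WHERE clauses defined on this item
    filters: List[str] = field(default_factory=list)

    def select_sql(self) -> str:
        """Return the SELECT expression for this item, applying any transform."""
        expr = self.transform.format(col=self.column) if self.transform else self.column
        return f"{expr} AS {self.name}"


def extractable_select(items: List[Item], allow_special: bool = False) -> List[str]:
    """Keep only the items that may be extracted at the given approval level."""
    allowed = {Extractability.EXTRACTABLE}
    if allow_special:
        allowed.add(Extractability.SPECIAL_APPROVAL)
    return [i.select_sql() for i in items if i.extractability in allowed]


items = [
    Item("Sex", "Patient sex", "[Admissions].[sex]",
         Extractability.EXTRACTABLE, transform="UPPER({col})"),
    Item("Postcode", "Home postcode", "[Admissions].[postcode]",
         Extractability.SPECIAL_APPROVAL),
    Item("LoadId", "Internal load identifier", "[Admissions].[load_id]",
         Extractability.NOT_EXTRACTABLE),
]

print(extractable_select(items))
print(extractable_select(items, allow_special=True))
```

In this sketch the item that is not extractable never reaches the SELECT list, and the Special Approval item appears only when the caller explicitly opts in.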

A Catalogue can be part of project extraction configurations and can be used in cohort identification configurations.  Catalogues can also be marked as Deprecated, Internal etc.

The separation of dataset and underlying table allows you to have multiple datasets that draw data from the same table.  It also makes it easier to handle moving a table/database (e.g. to a new server or database), renaming it, etc.
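To illustrate why this separation helps, here is a minimal Python sketch under stated assumptions. It is not RDMP code: `TableRef` and `Dataset` are hypothetical stand-ins for the table reference and the Catalogue, and the 'Length of Stay' dataset, the database names and the column names are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableRef:
    """Hypothetical reference to a physical table (server, database, table)."""
    server: str
    database: str
    table: str

    def qualified_name(self) -> str:
        return f"{self.database}.{self.table}"


@dataclass
class Dataset:
    """Hypothetical dataset that points at a table rather than copying its location."""
    name: str
    table: TableRef          # shared reference, not a copy
    columns: List[str]

    def select_sql(self) -> str:
        cols = ", ".join(self.columns)
        return f"SELECT {cols} FROM {self.table.qualified_name()}"


# Two datasets drawing data from the same underlying table.
admissions_table = TableRef("server1", "Hospital", "Admissions")
admissions = Dataset("Hospital Admissions", admissions_table,
                     ["patient_id", "admission_date"])
stays = Dataset("Length of Stay", admissions_table,
                ["patient_id", "discharge_date"])

# The table moves to a new database: only the single reference is updated,
# and both datasets pick up the new location.
admissions_table.database = "HospitalArchive"

print(admissions.select_sql())
print(stays.select_sql())
```

Because both datasets hold the same table reference, the move is recorded once rather than once per dataset.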

If you expand a Catalogue (e.g. biochemistry) you can see the ‘Catalogue Items’ node.   These are the extractable columns in the dataset ‘biochemistry’.  If you expand a Catalogue Item, you can see two nodes:

  • the first is the Extraction Information logic for the column

  • the second is the underlying database column reference (ColumnInfo). 

In Summary
