...

Pipeline components can include user-written plugins (e.g. for imaging operations).

...

Project

All extractions through RDMP must be done through Projects. A Project has a name, an extraction directory and, optionally, Tickets (if you have a ticketing system configured). A Project should never be deleted, even after all of its ExtractionConfigurations have been executed, because it serves as an audit record and as a cloning point if you ever need to clone any of the ExtractionConfigurations (e.g. to update the project data 5 years on).

The ProjectNumber must match the project number of the ExtractableCohort in your ExternalCohortTable.

...

ProcessTask

Describes a specific operation carried out during a LoadMetadata execution (DLE run). This could be 'unzip all files called *.zip in the ForLoading directory', 'after loading the data to live, call sp_clean_table1' or 'connect to webservice X and download 1,000,000 records which will be serialized into XML'.

A ProcessTask has a ProcessTaskType which defines how it is run by RDMP. These include C# classes (which can include plugin components) such as Attachers and DataProviders or traditional ETL steps such as SQL scripts or launching standalone processes.

...

SupportingDocument

Describes a document (e.g. a PDF or Excel file) which is useful for understanding a given dataset (Catalogue). A SupportingDocument can be marked as Extractable, in which case the file is bundled with the dataset every time it is extracted (so that researchers can also benefit from the file). You can also mark SupportingDocuments as Global, in which case they will be provided (if Extractable) to researchers regardless of which datasets they have selected, e.g. a PDF on data governance or a copy of an empty 'data use contract document'.
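
The interaction between the Extractable and Global flags can be sketched in a few lines of Python. This is an illustration only and not RDMP code: the SupportingDocument class, its field names and the documents_to_bundle function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class SupportingDocument:
    # Hypothetical model for illustration; not the RDMP class definition.
    name: str
    catalogue: str            # dataset the document helps to explain
    extractable: bool = False
    is_global: bool = False


def documents_to_bundle(documents, selected_catalogues):
    """Return the names of documents released with an extraction."""
    bundled = []
    for doc in documents:
        if not doc.extractable:
            # Non-extractable documents are never released, even if Global.
            continue
        if doc.is_global or doc.catalogue in selected_catalogues:
            bundled.append(doc.name)
    return bundled


documents = [
    SupportingDocument("data_dictionary.pdf", "Biochemistry", extractable=True),
    SupportingDocument("internal_notes.docx", "Biochemistry"),
    SupportingDocument("governance.pdf", "Demography", extractable=True, is_global=True),
    SupportingDocument("draft_policy.pdf", "Demography", is_global=True),
    SupportingDocument("prescribing_guide.pdf", "Prescribing", extractable=True),
]

bundle = documents_to_bundle(documents, {"Biochemistry"})
print(bundle)
```

In this sketch a researcher who selected only the (made-up) Biochemistry dataset receives its extractable document plus the extractable Global document, and nothing else.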

...

SupportingSQLTable

Describes an SQL query that can be run to generate useful information for the understanding of a given Catalogue.

...

If the Global flag is set then the SQL will be run and the result provided to every researcher regardless of which datasets they have asked for in an extraction. This is useful for large lookups such as ICD / SNOMED CT, which are likely to be used by many datasets.

...

TableInfo

Describes an SQL table (or table-valued function) on a given DBMS server from which you intend to extract data and/or into which you intend to load and curate data. A TableInfo represents a cached state of the live database table schema. You can synchronize a TableInfo at any time to handle schema changes (e.g. dropped columns).
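
The idea of a cached schema drifting away from the live table can be illustrated with a short Python sketch. It is not how RDMP implements synchronization: the diff_schema function and the column names are assumptions made up for this example, and a real synchronization would also need to consider data types and other column properties.

```python
def diff_schema(cached_columns, live_columns):
    """Compare a cached column list with the columns currently in the live table."""
    cached = set(cached_columns)
    live = set(live_columns)
    return {
        # Columns the cached definition still lists but the live table no longer has.
        "missing_from_live": sorted(cached - live),
        # Columns added to the live table since the definition was cached.
        "new_in_live": sorted(live - cached),
    }


# Hypothetical example: 'Comment' was dropped and 'SampleType' was added.
cached = ["PatientId", "TestCode", "Result", "Comment"]
live = ["PatientId", "TestCode", "Result", "SampleType"]

changes = diff_schema(cached, live)
print(changes)
```

A synchronization step would then update the cached definition so that it matches the live table again.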

...

UNION

Mathematical set operation which matches unique (distinct) identifiers found in any of the datasets being combined (e.g. SetA UNION SetB returns any patient in either SetA or SetB).
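
As a small illustration, the same operation can be shown with Python sets. The patient identifiers below are made up for the example.

```python
# Hypothetical patient identifiers in two datasets.
set_a = {"P001", "P002", "P003"}
set_b = {"P003", "P004"}

# UNION keeps every identifier found in either set, each one only once.
combined = set_a | set_b

print(sorted(combined))  # ['P001', 'P002', 'P003', 'P004']
```

P003 appears in both sets but only once in the result, because the union returns distinct identifiers.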

...