Data Security

PURPOSE

This SOP describes the measures that HIC (Health Informatics Centre) takes to provide security, confidentiality and privacy. It covers:

  • An overview of HIC’s data security 

  • Access to HIC secure rooms, networks and data 

  • Transferring and managing data 

  • Providing a project dataset to data users  

  • Project Level Anonymisation 

  • The HIC Trusted Research Environment

RESPONSIBILITIES

ROLE

RESPONSIBILITY

HIC Data Analyst

  • Receives, stores, quality assures and processes data within HIC's secure computing environment.

HIC Technical Staff (and supervisors)

  • Responsible for requesting additions to, and removals from, datasets' access control lists.

Process Manager

  • Senior staff member or delegated process manager who is responsible for managing the process.

  • Handles access requests from HIC Technical Staff and ensures access rights are immediately revoked when Technical Staff leave HIC.

Data Controller

  • Is required to authorise datasets when patient identifiable data is required. Is also responsible for access to data not supplied by HIC and for the release of certain datasets.

DEFINITIONS

  • Approved Project: An approved project is a project that is logged into the Project Management System and has Ethics, Caldicott and NHS R&D governance approval, as required. 

  • Caldicott Guardian: A Caldicott Guardian is a senior person responsible for protecting the confidentiality of patient and service-user information and enabling appropriate information-sharing.  

    • Each NHS organisation is required to have a Caldicott Guardian; this was mandated for the NHS by Health Service Circular: HSC 1999/012. The mandate covers all organisations that have access to patient records, so it includes acute trusts, ambulance trusts, mental health trusts, primary care trusts, strategic health authorities, and special health authorities such as NHS Direct. 

    • Caldicott Guardians were subsequently introduced into social care in 2002, mandated by Local Authority Circular: LAC 2002/2. 

    • The Guardian plays a key role in ensuring that the NHS, Councils with Social Services Responsibilities and partner organisations satisfy the highest practical standards for handling patient identifiable information.

    • Acting as the 'conscience' of an organisation, the Guardian actively supports work to enable information sharing where it is appropriate to share and advises on options for lawful and ethical processing of information. 

  • CHI: Community Health Index number. Unique 10-digit NHS (Scotland) patient identifier consisting of the patient's date of birth (as DDMMYY) followed by four digits: two randomly generated digits, a third digit identifying gender at birth (odd for men, even for women), and a check digit. HIC uses CHI to link cohort records across datasets when creating Project Datasets.

  • Control: A means of managing risk by providing safeguards. This includes policies, procedures, guidelines, other administrative controls, technical controls, or management controls. 

  • Data: Information held in electronic or paper form.

  • Data Controller: A group or individual responsible for determining the purposes for which and the manner in which any personal data are, or are to be, processed. For example, NHS Tayside and Fife are Data Controllers for regional NHS data processed on their behalf by HIC Services. 

  • HIC Data Analyst: An employee of the University of Dundee authorised to develop software and process data on behalf of HIC. HIC Data Analysts will be located within the HIC secure offices and will work on NHS networks.

  • HIC Client: Refers to an individual or organisation that receives services from Health Informatics Centre (HIC) and agrees to follow HIC's contractual obligations, policies, and procedures, ensuring compliance with legal, ethical, and professional standards.

  • Information: Any communication or representation of knowledge such as facts, data, or opinions in any medium or form including textual, numerical, graphic, cartographic, narrative, and audio-visual. 

  • Personal Data: Information relating to an identified or identifiable living person. The 8 Data Protection Principles in relation to protecting personal data are listed in the Policy document.  

  • Pro-CHI: Project-specific identifier used by HIC to uniquely anonymise a typical NHS dataset CHI number across all datasets within the overall Project Dataset.

  • Project: One or more services that cover a client's needs.

  • Project Dataset: A dataset that has been anonymised uniquely and specifically for use within an Approved Project. The dataset must relate to the cohort and purpose defined in the Project Description.

  • Project Description: A Project Description will specify the study cohort, aims, and methods. It will also carry a date and a version number. This document is used to help decide what data is required to fulfil the study objectives.

  • Policy: Overall intention and direction as formally expressed by management. 

  • RDMP: Research Data Management Platform. An open source application for the loading, linking, anonymisation and extraction of datasets stored in relational databases.

  • Risk: The potential for an unwanted event to have a negative impact as a result of exploiting a weakness. It can be seen as a function of the value of the asset, threats, and vulnerabilities.

  • Risk Assessment: Overall process of identifying and evaluating risk.

  • System Administrator: An employee of the University of Dundee authorised to provide and support the technical infrastructure, including maintaining secure IT environments, backup and off-site mirroring of data.

  • TRE: Trusted Research Environment. A secure computing environment specifically designed for handling sensitive data in a way that protects privacy and ensures security.
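The CHI layout given in the definitions above can be illustrated with a short sketch. It assumes only the positions stated there (six date-of-birth digits, two random digits, one sex digit, one check digit). The names `ChiParts` and `split_chi` are hypothetical, the sample value is synthetic, and the check digit is extracted but deliberately not verified because this SOP does not describe how it is calculated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChiParts:
    """Hypothetical container for the parts of a CHI number."""
    day: int
    month: int
    year_2digit: int
    sex_at_birth: str  # "M" (odd ninth digit) or "F" (even ninth digit)
    check_digit: int   # extracted only; not verified in this sketch


def split_chi(chi: str) -> ChiParts:
    """Split a 10-digit CHI string into the parts named in the definition."""
    if len(chi) != 10 or not (chi.isascii() and chi.isdigit()):
        raise ValueError("CHI must be exactly 10 digits")
    day, month, year = int(chi[0:2]), int(chi[2:4]), int(chi[4:6])
    if not (1 <= day <= 31 and 1 <= month <= 12):
        raise ValueError("CHI does not start with a plausible DDMMYY date")
    sex_digit = int(chi[8])  # digits 7-8 are random; digit 9 encodes sex
    sex = "M" if sex_digit % 2 == 1 else "F"
    return ChiParts(day, month, year, sex, int(chi[9]))


# Synthetic example value, not a real patient identifier.
parts = split_chi("0101701234")
```

Keeping the CHI as a string rather than a number preserves the leading zero that appears for birth dates in the first nine days of a month.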

OVERVIEW

[Figure: Infrastructure overview, v7-2]

WORKING ENVIRONMENT

1. Access Control

  • HIC enforces strict access control measures for physical premises, networks, and datasets to ensure security and compliance. These apply to anyone, internal or external to HIC, who requires access.

  • Physical Access

    • Controlled access applies to all HIC secure rooms.

    • Visitors will not be granted access to HIC’s secure areas unless they have signed in and are accompanied by authorised personnel.

    • Physical premises containing sensitive data are secured when no HIC staff are present.

    • Server rooms are access-controlled and accessible only to authorised staff. HIC uses both University of Dundee and externally contracted ISO 27001-accredited hosting providers. While their IT administrators may access the rooms, only authorised personnel can access the equipment. The hosting facilities contain multiple separate networks: NHS, University and Cloud.

  • Network & Data Access

    • HIC staff will not be given access to any data or network until their Disclosure Scotland check, or a suitable alternative security check based on the staff member’s location, has been completed and a certificate received.

    • Access to NHS data is made solely through the NHS Tayside remote service.

    • Role-Based Access Control (RBAC) governs access to specific networks and datasets.

  • Access Management

    • Any request for access change, including removal, must follow HIC’s Change Management Process and be authorised accordingly.

    • Access requests will be actioned by the relevant HIC Process Manager or their designated deputy.

    • When access to HIC networks or data is no longer needed, a request is logged with the HIC System Administrator to revoke access.

    • When a staff member leaves HIC, all physical, network, and dataset access will be revoked.

    • When HIC Technical Staff no longer need remote access to the NHS Tayside environment, a request is logged with NHS Tayside IT to revoke access.

    • Access lists will be audited to ensure access is correctly granted, and the outcomes will be stored in the PM System.

2. Remote Working

  • HIC policy is to allow employees to work remotely provided that:

    • Access to HIC data and networks is in keeping with the University Hybrid Working Policy.

    • Data remains on those networks unless otherwise authorised.

    • Any offline files are copied back to the correct location at the next opportunity, where applicable.

  • Only authorised devices, as defined in the University of Dundee Remote Access Policy, can be used.

3. Clear Screen and Clear Desk

  • Care must be taken to prevent unauthorised access to computing devices. To avoid this:

    • Users must lock or log off their computers when unattended. 

    • As a failsafe, devices must automatically lock after a maximum of 15 minutes of inactivity where the device is capable of doing so.

    • Staff working with confidential information will not work in an area that is overlooked, or that allows unauthorised persons to view the information. 

    • All information held in a physical format that is marked as, or falls into, the HIC Data Classification of Confidential must be appropriately secured when staff are absent from their workplace and at the end of each working day, to reduce its potential exposure to unauthorised access.

    • No hardcopy of data falling into the HIC Data Classification of Confidential should be taken off-site unless approved in advance.

4. Portable and Personal Devices

  • Strict login access controls are in place across all HIC networks. 

  • All devices connecting to HIC networks must be protected by a password or PIN in line with the University’s password best practice guide.

  • Personal devices must be used in accordance with the University’s Acceptable Use Policy, including obtaining University "Permission to Connect" and enabling encryption before connecting to the University network.

  • No personal devices will be used to connect to the NHS network.

  • Data on portable and personal devices must be deleted as soon as it is no longer required.

  • In the event of loss or theft of a device, the user must follow HIC’s incident management standard operating procedure and inform a Team Lead.

DATA HANDLING

1. Transfer of Data to and from HIC

  • Transfer of datasets will be managed and processed by HIC Data Analysts, or by an appointed deputy or process.

  • HIC will require written approval from Data Controllers (e.g. Caldicott Guardian for NHS data) prior to releasing data.

  • Identifiable data will be transferred by a HIC-approved method of transfer.

  • Datasets must not be transferred via portable media such as CD/DVD, memory stick or external HDD except when no other practicable method exists, for example:

    • The network infrastructure is not capable of transferring the required volume of data.

    • Limited bandwidth availability prevents data being transferred in an acceptable amount of time.

    • The transfer is likely to cause disruption to NHS clinical and business network traffic.

  • Where portable media must be used, it must be encrypted. In the case of NHS identifiable data, the device must be NHS approved. 

  • Datasets should not be transferred via unsecured email unless the dataset itself is secured using an approved method. 

  • When HIC receives data via an unapproved method:

    • The data will be secured within the HIC IT network.

    • Any unsecured copies of the data, such as emails (including drafts, responses and deleted items), will be permanently deleted.

    • Any portable media used will be archived, cleared, or securely destroyed.

    • The sender will be instructed on how to securely transfer data for any subsequent transfers.  

  • If data is to be routinely loaded, HIC will use automation where possible to reduce error and improve consistency. Using RDMP for this is advised as it creates audit records including: 

    • The number of INSERT and UPDATE operations performed. 

    • The username, start and end time of the data load. 

    • Load progress messages including the source filenames. 

    • The original file, archived after the successful data load.  

    • Details of any errors.

2. HIC Approved Methods of Transfer

  1. Data transferred to and from HIC will utilise one of the following methods: 

    1. Internal NHS email 

    2. NHS Secure File Transfer system. 

    3. Transfer via secure web method (HTTPS).

    4. Secure File Transfer Protocol (e.g. SFTP).

  2. If these methods are not available, data can be transferred by any means provided it is encrypted using one of the following:

    1. Advanced Encryption Standard (AES) encryption (e.g. an AES-256 encrypted Zip archive).

    2. Asymmetric encryption (e.g. PGP public/private key cryptography).

  3. NHS data will remain in the NHS environment unless it is supplied by an external NHS host or Data Controller approval is granted.
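The selection rule in steps 1 and 2 above can be sketched as a simple check. This is illustrative only: the enum and function names are hypothetical, and the sketch does not model step 3 or any Data Controller approval.

```python
from enum import Enum
from typing import Optional


class TransferMethod(Enum):
    """Transfer routes; all except OTHER are the methods listed in step 1."""
    NHS_INTERNAL_EMAIL = "internal NHS email"
    NHS_SECURE_FILE_TRANSFER = "NHS Secure File Transfer system"
    HTTPS = "secure web method (HTTPS)"
    SFTP = "secure file transfer protocol (SFTP)"
    OTHER = "any other means"


class Encryption(Enum):
    """Encryption options named in step 2."""
    AES_256_ZIP = "AES-256 encrypted Zip"
    PGP = "PGP public/private key"


def transfer_permitted(method: TransferMethod,
                       encryption: Optional[Encryption] = None) -> bool:
    """Return True when the transfer satisfies step 1 or step 2."""
    if method is not TransferMethod.OTHER:
        return True  # step 1: one of the listed approved methods
    return encryption is not None  # step 2: other means only if encrypted
```

For example, `transfer_permitted(TransferMethod.OTHER)` is false, while the same call with `Encryption.PGP` is true.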

3. Releasing a Project Dataset

  • Controlled Delegation of Responsibility - Dataset releases are executed only by authorised HIC Data Analysts or through an approved and documented delegation process.

  • Compliance with Defined Requirements - Releases are limited to the cohort and dataset described in the Data Requirements Specification, which aligns with the project's aims and methods as specified in the Project Description.

  • Verified User Associations - Only HIC Clients linked to the project and identified in the PM System are eligible to access the dataset.

  • Governance and Approvals - Information governance requirements must be completed and verified by appointed HIC Governance personnel before release. All approvals must be documented and uploaded or linked within the PM System.

  • Documentation  - All data releases must be documented on the PM System.

  • Traceability and Preventing Cross Project Linkage - To enable traceability, project-specific de-identification will be carried out on the Project Dataset unless patient identifiable data is approved. This process reduces the risks to the rights and freedoms of natural persons that may result from personal data processing.

  • Secure and Controlled Access - Project datasets are released into the TRE, except where:

    • Project datasets are of consented participants and the project approvals do not require use of the HIC TRE.

    • Data is being exported to other Safe Havens.

    • Data is being sent to a data supplier or controller with appropriate approvals already in place.

  • Contingency Measures - In the event of PM System unavailability, governance checks will be recorded manually and updated in the system as soon as feasible.

4. Importing and Hosting Datasets from External Sources

  • When HIC are requested to host new data:

    • This will require an approval from the appropriate Data Controller. 

    • The Data Controller can request the removal of the data. 

    • Synthetic or open data must also have Data Controller approval or evidence that its use is acceptable. This evidence can be in its documentation or licensing agreement, which will be stored in the PM System.

TRUSTED RESEARCH ENVIRONMENT

1. Overview

  • HIC TRE is a restricted, secure IT environment, where the HIC Client is given remote access to analyse data. 

  • Access to the HIC TRE is restricted and controlled using role-based access control (RBAC).

  • The HIC TRE utilises a secure remote-access environment to enable a data access model. In this model data is not released directly to the HIC Client, but is instead retained under HIC’s control.

  • Commonly used tools for data analysis are provided for use within this environment.

  • HIC Clients are not able to print, copy and paste out of the environment, or access the internet.

  • HIC Clients are not permitted to copy individual-level data outside the environment by any means including, but not limited to, photography, recording, screen grabbing and note taking.

  • User-specific files such as look-up tables and stats scripts can be imported into HIC TRE via the TRE files input process. All files are:

    • Checked for vulnerability.

    • Checked for individual level data.

    • Checked for information governance approvals.

  • To aid HIC Clients in reusing cleaning techniques applied to data in other projects, files can be transferred from other projects with prior agreement. This work must be carried out by a HIC Data Analyst and entails:

    • The pseudonymisation process being reversed, then reapplied for the new project.

    • The file being restricted to the destination project cohort.

    • The file being restricted to datasets and fields approved for the project.

  • The System Administrator will provide details about currently available software and versions on request.

  • New software can be added on request, provided it is licensed appropriately and poses no security risk to the environment. HIC reserve the right to refuse to add tools to the environment, or to remove them when they are seen to affect the security of the environment.

2. Account Creation

  • New TRE User account requests must be confirmed by a project’s Principal Investigator and be reviewed by a HIC Data Analyst.

  • Accounts will only be granted to allow HIC Clients to access their Approved Project.

  • Before being given access to the HIC TRE, TRE Users must:

    • Read and sign the latest HIC Data User Declaration.

    • Return evidence that they have completed appropriate training.

OUTPUT DISCLOSURE CONTROL

The following rules are applicable to all output requests unless agreement with the Data Controller states otherwise. The egress (output) mechanism must be followed to enable files to be removed from the HIC TRE.

1. Standard Human-Readable File Types

  • Individual-level data must not be removed from the HIC TRE. Only analysis outputs and user-created documents or code (e.g. reports, summaries, aggregates and graphs) may be removed.

  • HIC reserve the right to temporarily withhold any file prior to output to allow completion of a detailed risk assessment.

  • HIC reserve the right to reject any file that cannot be practicably read by HIC staff.

2. Complex Files Including AI / ML Models

  • These files and models are often binary files, or similar, where it’s not possible to directly inspect the contents for the presence of individual-level data.   

  • If HIC are unable to ascertain whether individual-level data is included in requested output files, HIC will assess whether the request presents minimal risk, considering:

    • The risk of the files including individual-level data.

    • The risk of disclosure of personal data.

    • The risk to the rights and freedoms of the individuals the data is about.

  • HIC will carry out a detailed risk assessment seeking satisfaction that the request presents minimal risk. Analysis and outcomes will be recorded and stored in the PM System.

  • Approved files will be made available to the HIC Client via the standard HIC Release mechanism. 

3. Consented Bio-Resource Data Studies 

  • For these studies, individual-level data are permitted to be removed from the HIC TRE, subject to the following rules:

    • No dates will be included in the release of derived data.

  • The following data items may be released:

    • Study ID (to enable linkage to biological samples or data).

    • Sex (M/F).

    • Age (not DOB).

    • Diagnostic or event status (aggregated from multiple hospital entries and multiple ICD10 codes and other sources).

    • Aggregated biochemical values (e.g. mean untreated cholesterol).

    • Drug response (e.g. model-derived beta or odds ratio; or absolute or percentage change in biochemical parameter).

    • Drug adherence (% prescription encashment).

    • Duration of treatment (elapsed time, not calendar period).

  • The output files will be reviewed by a HIC Data Analyst or through a documented delegated process.

  • Once verified as meeting the above criteria the file will be transferred to the HIC Client. 

DEIDENTIFICATION

1. Deidentification Methods

  • HIC’s deidentification process selects the method deemed most appropriate for the data format and its intended use. The methods available are:

    • Mapping – including pseudonymisation.  A random ID is generated to replace the identifiable data item.  The mapping is stored securely by HIC to allow reidentification by reversal of this process if required and approved. 

    • Dilution – The identifiable data item is diluted to a point where it covers a much larger range of people, and thus is deidentified. For example, reducing postcode (average of 15 households) to postcode district (average of 20,000 people). 

    • Obfuscation – Removing the identifiable portion whilst retaining other usable aspects of the data.  For example, this could be free-text data where names have been replaced with a placeholder, or image data where pixels containing identifiers are over-written and replaced by a coloured box. 

    • Removal – Removing the data item completely by not offering it for extraction or release.

  • Where practicable, every project dataset will be uniquely pseudonymised to enable traceability to the project of origin.
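The mapping method can be sketched as follows. This is a minimal illustration, not HIC's implementation: the class name `PseudonymMap`, the 16-character hexadecimal pseudonym format and the in-memory dictionaries are all assumptions standing in for a securely stored mapping table.

```python
import secrets


class PseudonymMap:
    """Hypothetical per-project mapping between identifiers and random IDs."""

    def __init__(self) -> None:
        self._forward = {}  # identifier -> pseudonym
        self._reverse = {}  # pseudonym -> identifier

    def pseudonym_for(self, identifier: str) -> str:
        """Return the stable random pseudonym for an identifier."""
        if identifier not in self._forward:
            while True:
                candidate = secrets.token_hex(8)  # 16 random hex characters
                if candidate not in self._reverse:
                    break  # candidate is not already in use
            self._forward[identifier] = candidate
            self._reverse[candidate] = identifier
        return self._forward[identifier]

    def reidentify(self, pseudonym: str) -> str:
        """Reverse the mapping; only possible with access to this table."""
        return self._reverse[pseudonym]


# Synthetic identifier used purely for illustration.
project_a = PseudonymMap()
pid = project_a.pseudonym_for("synthetic-id-001")
original = project_a.reidentify(pid)
```

Because the pseudonym is random rather than derived from the identifier, it cannot be reversed without the stored mapping. Creating one map per project gives each project its own pseudonyms for the same individual, which supports traceability to the project of origin.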

2. Project Level Deidentification

  • The following steps will help ensure that identification of individuals is not possible, while retaining the ability to link data across multiple datasets for a particular project.

    • The CHI number is pseudonymised into a project-specific Pro-CHI, which allows traceability to the project of origin. Pro-CHI mappings are generated at random and stored securely, only within HIC, on the NHS network. The pseudonymisation cannot be reversed without this mapping table.

    • Where there is no CHI, a unique pseudonymised identifier will be allocated to each individual.

    • All names and addresses are removed from the dataset.

    • Dates such as date of birth are diluted to a 3-month period: the day becomes ‘01’ and the month becomes the middle month of the appropriate quarter, e.g. 24/01/2005 becomes 01/02/2005.

    • Postcode is diluted to outward postcode / postcode district by removing the last 3 characters, e.g. DD1 9SY becomes DD1.

    • The General Practice (GP) code, General Medical Council registration (GMC) number, the GP Practice code and the Pharmacy code are all replaced by pseudonymous versions.  

    • Identifiable portions of images are over-written with new pixels, usually of a single colour, thus removing the identifiers completely from the image. 

    • DICOM (Digital Imaging and Communications in Medicine) or other file types containing metatags are deidentified according to the rules set out above.

  • Any request for exemption from these steps will be treated as a request for identifiable data.

  • If data provided to HIC has already been effectively deidentified, no additional deidentification is required.

  • Aggregated data provided for study feasibility will not show values <5. 
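The date and postcode dilution rules above, together with the small-number rule for feasibility aggregates, can be sketched as follows. The function names are hypothetical, and the "<5" marker used to replace small values is an assumption, since this SOP states only that such values are not shown.

```python
from datetime import date
from typing import Dict


def dilute_date(d: date) -> date:
    """Set the day to 1 and the month to the middle month of its quarter."""
    quarter_index = (d.month - 1) // 3    # 0 for Jan-Mar ... 3 for Oct-Dec
    middle_month = quarter_index * 3 + 2  # 2, 5, 8 or 11
    return date(d.year, middle_month, 1)


def dilute_postcode(postcode: str) -> str:
    """Drop the last three characters, leaving the postcode district."""
    compact = postcode.replace(" ", "").upper()
    if len(compact) < 5:
        raise ValueError("postcode is too short to dilute")
    return compact[:-3]


def suppress_small_counts(counts: Dict[str, int],
                          threshold: int = 5) -> Dict[str, str]:
    """Replace any count below the threshold with a marker (assumed format)."""
    return {
        key: (f"<{threshold}" if value < threshold else str(value))
        for key, value in counts.items()
    }
```

With these definitions, `dilute_date(date(2005, 1, 24))` gives 1 February 2005 and `dilute_postcode("DD1 9SY")` gives `DD1`, matching the examples in the steps above.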

3. Reversing the Deidentification

  • There are occasions when it is necessary to reverse the deidentification process:

    • Case note review: To enable gathering data from a data source such as a patient administration system or medical notes, the identity of individuals is needed.

    • Validate findings: To confirm details by linking to patient files.

    • Patient safety: To identify individuals who may, for their own benefit, need further tests or treatment. This action would only be initiated on the opinion of a qualified clinician collaborating with the study.

  • Any request for reversal of deidentification, or for identifiable data, will require appropriate Data Controller (or deputy) approval.

  • This permission must specify who can access the required identifiable data, what data is to be made available, and how HIC can identify the relevant cohort.

  • Once this approval has been obtained, it will be recorded by HIC.

APPLICABLE REFERENCES

  • Data Access Approvals

  • Incident Management

  • Archiving a Project Dataset

  • Staff Confidentiality Agreement 

  • HIC TRE User Agreement

  • Asset Management

  • University of Dundee Acceptable Use Policy 

  • University of Dundee Remote Access Policy

  • University Password Guide

DOCUMENT CONTROLS

Process Manager

Point of Contact

Chris Hall

hic-ops@dundee.ac.uk

Revision Number

Revision Date

Revision Made

Revision By

Revision Category

Approved By

Effective Date

1.0

01/01/24

  • Moved SOP to Confluence from SharePoint and updated into new template.

Bruce Miller and Symone Sheane

Superficial

Governance Co-Ordinator: Symone Sheane

10/01/24

1.1

04/04/24

  • Updated Roles and Responsibilities.

Bruce Miller

Superficial

Governance Co-Ordinator: Symone Sheane

05/04/24

1.2

10/04/24

  • Formatted document control table and added in revision category.

Symone Sheane

Superficial

Governance Co-Ordinator: Symone Sheane

10/04/24

1.3

16/04/24

  • Deleted Appendix C from applicable references. No longer an applicable reference used across ISMS.

Symone Sheane

Superficial

Governance Co-Ordinator: Symone Sheane

16/04/24

1.4

19/04/24

  • Updated Approved by title.

Symone Sheane

Superficial

Governance Co-Ordinator: Symone Sheane

19/04/24

1.5

30/04/24

  • Updated Header to conform with BSI guidelines.

Bruce Miller

Superficial

Governance Co-Ordinator: Symone Sheane

30/04/24

1.6

02/05/24

  • Updated links to Definitions in ISMS Glossary.

Bruce Miller

Superficial

Governance Co-Ordinator: Symone Sheane

02/05/24

1.7

09/10/24

  • Incorporated and followed up on comments. Added labels to comply with 2022 standard.

Bruce Miller

Superficial

Governance Co-Ordinator: Symone Sheane

05/11/24

1.8

18/11/24

  • Updated Approved Data User terminology to HIC Client.

  • Removed Policy section as it was a duplication. The Policy can be found in the Applicable References section. Added links to applicable references.

Symone Sheane

  • Material

  • Superficial

  • Leadership Team