TRE Glossary
Introduction
Welcome to the HIC Knowledge Base (HKB) Trusted Research Environment (TRE) Glossary. The HKB aims to provide you with the training and tools you need to work successfully in our TRE. This glossary provides clear, concise definitions of key terms and concepts that come up frequently in our HKB How-to articles. We hope these resources help you navigate the field of secure data management and carry out your research in our TRE.
Each term relates to relevant sections within the HIC Knowledge Base, so you can explore concepts in-depth and access additional resources.
Glossary
Term | Definition |
Amazon Web Services (AWS) | Amazon Web Services, Inc. (AWS) is a subsidiary of Amazon that provides on-demand cloud computing platforms. HIC’s TRE is built on AWS, in a virtual private cloud that is secure and isolated (no exposure to the public internet), using a platform known as Service Workbench. |
AppStream | Amazon AppStream 2.0 is a fully managed, secure service for streaming desktop applications to users without the need to rewrite or modify them. We depend on AppStream and Service Workbench to provide the TRE. |
Artificial Intelligence/Machine Learning (AI/ML) | A program or algorithm that can find patterns or make decisions from a dataset. Artificial Intelligence (AI) systems can be built in various ways, with the most common current method being Machine Learning (ML). AI/ML models are ‘trained’ on a large dataset to recognise patterns and generate outputs. |
Disclosure control | The process by which outputs requested from the TRE by Users are reviewed by HIC staff to ensure there are no risks of identifying individuals in the released data. |
Five Safes Framework | A widely used set of principles developed to guide researchers and organisations in the creation of TREs for handling sensitive data. The Five Safes framework consists of five key principles that should be considered when handling sensitive data: Safe Projects, Safe People, Safe Settings, Safe Data, Safe Output. |
Information Governance | Information Governance (IG) is how an organisation takes care of its information or data. It involves strategies and processes for collecting, storing, securing, using, protecting and disposing of data safely, whilst also respecting privacy. IG ensures that data is managed well throughout its life cycle, following guidelines and laws. It helps organisations handle data responsibly, protect it from risks, and use it in a way that follows rules and keeps people's information safe. |
Multi-Factor Authentication (MFA) | A way to increase security by requiring a secondary method of verification before a User can sign in, e.g. a code from a mobile authenticator app in addition to a password. When exactly two methods are required, this is also known as two-factor authentication. See also: One Time Password |
Object storage (S3) | Object storage is a computer data storage approach that manages data as "blobs" or "objects", as opposed to other storage architectures like file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks. Amazon Simple Storage Service (S3) is a service offered by Amazon Web Services (AWS) that provides object storage through a web service interface. |
One Time Password | A security feature that provides a temporary, unique code for authentication. We rely on time-based one time passwords generated by an authenticator app for users to access the TRE. |
Package | A reusable collection of files which can be added to a program to add some functionality. Think of a package as a toolbox, and each file within it as a different tool. |
Principal Investigator (PI) | The researcher in charge of a study at a particular site (e.g. hospital or university). They are responsible for overseeing the study's progress, coordinating with the team members involved, and ensuring that the research is conducted according to the planned protocols. The PI plays a crucial role in managing the study. |
Pseudonymised data | Data where the direct identifiers, e.g. names, have been removed and replaced by a unique identifier (ID) or “pseudonym”, typically a random code with no inherent meaning. Some details, like the exact date of birth, might also be changed to be less specific. |
Python | An open-source programming language which can be used for various purposes including data analysis and data visualisation. Python is very popular due to its relative simplicity and readability. |
R/RStudio | R is an open-source programming language for statistical computing and data visualisation. RStudio is a development environment in which you can write and edit R programs. |
Remount (D drive) | Resetting the connection to the D drive - a typical ‘turn it off and on again’ - which can fix several common bugs in the TRE. ‘Remount D drive’ should appear as a desktop shortcut for all Windows TRE workspaces. |
Repository | A centralised storage location for versioned files, code, documentation, or other digital assets. |
Sensitive data | Information that must be protected due to its confidential nature. This includes personally identifiable information such as names, addresses and health information. Data can still be sensitive once it has been de-identified (has had all personally identifiable information removed) if there is potential for re-identification, particularly when used with other data. Sensitive data also includes information that is protected by laws or regulations, e.g. GDPR, common law, Data Protection Act 2018. |
Service Workbench (SWB) | A cloud-based platform running on Amazon Web Services (AWS) infrastructure, on which the TRE is built. |
Slurm | An open-source cluster management and job scheduling system. Useful if you have a lot of jobs to run in parallel. |
SPSS | A statistical software suite developed by IBM for data management, advanced analytics, multivariate analysis, business intelligence, and criminal investigation. |
Structured Query Language (SQL) | A language that helps organise and work with information stored in databases. It allows people to easily find and use data from databases, like looking up specific information or making changes to the data. |
Project (TRE) | A project is an organised investigation conducted to answer research questions or meet research objectives, and is often also known as a Study. Projects conducted within our TRE depend on the secure safe setting, and are likely to involve analyses of sensitive data. |
Trusted Research Environment (TRE) | A secure computing environment that is specifically designed for handling sensitive data in a way that protects privacy and ensures security. Our TRE enables users to conduct research in a flexible, scalable environment. |
TRE User Agreement | A legal document that sets forth the terms and conditions governing the use of the TRE service. All new TRE users are required to sign a TRE User Agreement before they are granted access to the safe setting. |
Ubuntu Linux | Linux is a family of open-source operating systems. Ubuntu is a popular Linux distribution which is available for use in the HIC TRE as an alternative to Windows. |
Workspace | This is your virtual desktop where you will analyse your project data and complete your work. |
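The "Pseudonymised data" entry above can be illustrated with a minimal Python sketch. This is not HIC's actual pseudonymisation process; it is an illustration only, and the function name `pseudonymise`, the field names (`name`, `date_of_birth`, `result`) and the example records are all hypothetical.

```python
import secrets


def pseudonymise(records, id_field="name", dob_field="date_of_birth"):
    """Replace a direct identifier with a random code and keep only the year of birth.

    Illustrative sketch only: the field names are assumptions, and dates are
    assumed to be ISO strings of the form "YYYY-MM-DD".
    """
    pseudonyms = {}  # maps each original identifier to its random pseudonym
    output = []
    for record in records:
        identifier = record[id_field]
        # Reuse the same pseudonym for the same person so records stay linkable.
        if identifier not in pseudonyms:
            pseudonyms[identifier] = secrets.token_hex(8)
        # Copy every field except the direct identifier and the exact date of birth.
        cleaned = {k: v for k, v in record.items() if k not in (id_field, dob_field)}
        cleaned["pseudonym"] = pseudonyms[identifier]
        cleaned["year_of_birth"] = record[dob_field][:4]  # less specific than the full date
        output.append(cleaned)
    return output, pseudonyms


# Made-up example records (not real data).
records = [
    {"name": "Alice Example", "date_of_birth": "1980-03-14", "result": 5.2},
    {"name": "Bob Example", "date_of_birth": "1975-11-02", "result": 6.1},
    {"name": "Alice Example", "date_of_birth": "1980-03-14", "result": 5.9},
]
pseudonymised, lookup = pseudonymise(records)
```

In this sketch the `lookup` table is what would allow re-identification, so in practice it would be held separately from the research data and never be available inside a workspace.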
For queries or comments regarding HIC How-to articles, contact HICSupport@dundee.ac.uk