RDMP-168 RDMP Export to AWS Automation
Problem
Currently, when the DLS team want to release data into the TRE, they have to perform several manual steps.
See https://hicservices.atlassian.net/wiki/spaces/DATA/pages/1545551 and https://hicservices.atlassian.net/wiki/spaces/DATA/pages/1542414.
The processes for releasing flat files, such as CSVs, and for releasing databases differ, and each has its own pitfalls.
This adds unnecessary work for the DLS team that can be automated, allowing the process to be quicker, more auditable and less prone to human error.
Proposed Solution
Flat Files
When a member of the DLS team wishes to release a flat file to AWS, they currently have to copy the files to an S3 gateway mounted on their machine.
As part of the improved release process, the user can select which S3 bucket they wish to write to and RDMP will automatically write the flat files generated to that destination.
SQL Databases
When a member of the DLS team wishes to release a database to AWS, they are required to copy the .bak files to AWS, then manually run a database restore command using SQL Server Management Studio.
As part of the improved release process, the user can add the appropriate TRE database as an external server and upload the database to it automatically.
Technical Underpinnings
User Authentication
There are a number of options for authenticating RDMP users when the code is not running on AWS:
IAM Identity Centre Authentication
IAM Roles Anywhere
Assume a role
AWS Access Keys
As RDMP is not web-based, we are unable to use web-based identity federation through an external identity provider (IdP), such as OIDC.
IAM Identity Centre Authentication - https://docs.aws.amazon.com/sdkref/latest/guide/access-sso.html
This is the method AWS recommends.
Users sign in to the AWS access portal.
They use the “Get Credentials” functionality for a specific permission set.
This permission set should allow listing the objects in the specific bucket and writing objects to it.
The user gives these credentials to RDMP, which creates an “RDMP” profile in the user's local AWS configuration.
RDMP can now use these credentials to perform list and write commands against the specified bucket.
From Testing:
We can create users with specific permissions
They can log in to a custom webpage to get their credentials
These credentials can be used to authenticate into AWS programmatically
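The profile-creation and credential-loading steps above could be sketched roughly as follows. This is a minimal sketch rather than the RDMP implementation: it assumes the user pastes in the temporary access key, secret key and session token from the access portal, and the class name `RdmpAwsProfile`, its method names and the optional `credentialsFilePath` parameter are hypothetical. It uses the credential profile classes from the AWS SDK for .NET (`Amazon.Runtime.CredentialManagement`).

```csharp
using System;
using Amazon;
using Amazon.Runtime;
using Amazon.Runtime.CredentialManagement;
using Amazon.S3;

public static class RdmpAwsProfile
{
    // Profile name taken from the "RDMP" profile described above.
    public const string ProfileName = "RDMP";

    // Stores the temporary credentials copied from the AWS access portal
    // in the shared credentials file (the SDK default location when no path is given).
    public static void SaveProfile(
        string accessKeyId,
        string secretAccessKey,
        string sessionToken,
        string credentialsFilePath = null)
    {
        var options = new CredentialProfileOptions
        {
            AccessKey = accessKeyId,
            SecretKey = secretAccessKey,
            Token = sessionToken,
        };
        var profile = new CredentialProfile(ProfileName, options);

        var file = credentialsFilePath == null
            ? new SharedCredentialsFile()
            : new SharedCredentialsFile(credentialsFilePath);
        file.RegisterProfile(profile);
    }

    // Builds an S3 client from the stored profile, failing loudly if it is missing.
    public static IAmazonS3 CreateS3Client(
        RegionEndpoint region,
        string credentialsFilePath = null)
    {
        var chain = credentialsFilePath == null
            ? new CredentialProfileStoreChain()
            : new CredentialProfileStoreChain(credentialsFilePath);

        if (!chain.TryGetAWSCredentials(ProfileName, out AWSCredentials credentials))
        {
            throw new InvalidOperationException(
                $"No AWS profile named '{ProfileName}' was found.");
        }

        return new AmazonS3Client(credentials, region);
    }
}
```

Credentials obtained this way are temporary, so RDMP would need to prompt the user for a fresh set once they expire.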
IAM Roles Anywhere - https://docs.aws.amazon.com/sdkref/latest/guide/access-rolesanywhere.html
Create a trust anchor and a profile within AWS.
The profile should have the correct read/write access for the S3 bucket.
The user runs the “credential helper tool” to obtain temporary credentials - https://docs.aws.amazon.com/rolesanywhere/latest/userguide/credential-helper.html
The user passes these credentials to RDMP to perform the upload.
Assume a Role - https://docs.aws.amazon.com/sdkref/latest/guide/access-assume-role.html
Set up an IAM role that can read from and write to the S3 bucket.
Assuming the user has their local system configured with their AWS credentials, RDMP can assume this role to perform the read/write actions.
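As an illustration of the assume-role option, the sketch below asks AWS STS for temporary credentials for the role and builds an S3 client from them. It assumes the AWSSDK.SecurityToken and AWSSDK.S3 packages are referenced and that the user's local credentials are permitted to assume the role; the class name `RdmpAssumeRole` and the session name `rdmp-release` are hypothetical.

```csharp
using System.Threading.Tasks;
using Amazon;
using Amazon.Runtime;
using Amazon.S3;
using Amazon.SecurityToken;
using Amazon.SecurityToken.Model;

public static class RdmpAssumeRole
{
    public static async Task<IAmazonS3> CreateS3ClientAsync(
        string roleArn,
        RegionEndpoint region)
    {
        // No credentials are passed here, so the SDK resolves them from the
        // user's local AWS configuration via its default credential chain.
        using var sts = new AmazonSecurityTokenServiceClient(region);

        var response = await sts.AssumeRoleAsync(new AssumeRoleRequest
        {
            RoleArn = roleArn,
            RoleSessionName = "rdmp-release", // hypothetical session name
        });

        // Wrap the temporary credentials returned by STS for use with S3.
        var temporary = response.Credentials;
        var sessionCredentials = new SessionAWSCredentials(
            temporary.AccessKeyId,
            temporary.SecretAccessKey,
            temporary.SessionToken);

        return new AmazonS3Client(sessionCredentials, region);
    }
}
```

The returned client is only valid for the lifetime of the assumed-role session, so a long-running release may need to request a new one.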
AWS Access Key
It is strongly recommended that we do not use this method.
Create an access key that can read from and write to the bucket.
Give this key to the users.
Anyone with this key can read from and write to the S3 bucket.
Connecting to S3 Bucket
Writing to an S3 bucket is straightforward using the AWS S3 API.
Given the user is authenticated (see above), we can use the AWS SDK for .NET from C# to put objects into an S3 bucket. Below is an example code snippet:
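    using System;
    using System.Net;
    using System.Threading.Tasks;
    using Amazon.S3;
    using Amazon.S3.Model;

    public static class S3Uploader
    {
        // Uploads the file at filePath to bucketName under the key objectName.
        // Returns true on success and false if the upload fails.
        public static async Task<bool> UploadFileAsync(
            IAmazonS3 client,
            string bucketName,
            string objectName,
            string filePath)
        {
            var request = new PutObjectRequest
            {
                BucketName = bucketName,
                Key = objectName,
                FilePath = filePath,
            };

            try
            {
                var response = await client.PutObjectAsync(request);
                if (response.HttpStatusCode == HttpStatusCode.OK)
                {
                    Console.WriteLine($"Successfully uploaded {objectName} to {bucketName}.");
                    return true;
                }
            }
            catch (AmazonS3Exception ex)
            {
                // The SDK reports most failures (for example, access denied) by throwing.
                Console.WriteLine($"S3 error while uploading {objectName}: {ex.Message}");
            }

            Console.WriteLine($"Could not upload {objectName} to {bucketName}.");
            return false;
        }
    }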
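The authentication options above also call for list access to the bucket. A possible sketch of listing the keys under a prefix, which could be used to confirm that a release has arrived, is shown here. The class and method names are hypothetical; `ListObjectsV2Async` is the paginated listing call in the AWS SDK for .NET.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class S3Listing
{
    // Returns every object key in bucketName that starts with prefix.
    public static async Task<List<string>> ListKeysAsync(
        IAmazonS3 client,
        string bucketName,
        string prefix)
    {
        var keys = new List<string>();
        var request = new ListObjectsV2Request
        {
            BucketName = bucketName,
            Prefix = prefix,
        };

        ListObjectsV2Response response;
        do
        {
            response = await client.ListObjectsV2Async(request);

            // Some SDK versions return null rather than an empty list.
            if (response.S3Objects != null)
            {
                foreach (var s3Object in response.S3Objects)
                {
                    keys.Add(s3Object.Key);
                }
            }

            // Results are paged; carry the token forward to fetch the next page.
            request.ContinuationToken = response.NextContinuationToken;
        } while (response.IsTruncated == true);

        return keys;
    }
}
```

After an upload, checking that the returned list contains the expected object key gives a simple audit step for the release.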
Connecting to AWS Database
Further research on the database solution is required.