This guide explains how to connect Peak to either a Peak managed or Customer managed data lake when onboarding to the platform.
Contents
- Process overview
- Entering the data lake details
- Peak managed configuration
- Customer managed configuration
- Reviewing your connection
Process overview
You must configure a data lake before you can start using Peak.
Currently, Peak supports Amazon S3 data lakes.
There are three steps to complete during this process:
- Details: This step lets you name your data lake and choose between a Peak managed or Customer managed configuration.
- Configuration: This step lets you specify the region where your data lake is located. If you choose Customer managed, this step also guides you through creating an IAM role in your AWS account so that Peak can connect to your S3 bucket.
- Review: Once you have entered all of your configuration details, this step lets you review everything before saving.
Entering the data lake details
When signing into Peak for the first time, you will be prompted to connect to a data lake.
To get started, click ADD DATA LAKE and follow the prompts.
- Name your data lake connection. The name must be unique to your Peak organization. Only alphanumeric characters and underscores are allowed. The name cannot be changed after the connection has been set up.
- Choose either Peak managed or Customer managed.
- Click NEXT to move to the Configuration stage.
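The naming rules above can be checked before you enter a name in the form. The following is a minimal Python sketch and is not part of Peak: the function name `is_valid_data_lake_name` and the `existing_names` argument are hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits. Peak itself remains the authority on whether a name is accepted.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_data_lake_name(name, existing_names=()):
    """Return True if `name` uses only letters, digits and underscores
    and does not clash with a name already in `existing_names`."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    # Assumption: uniqueness is compared exactly (case-sensitive).
    return name not in existing_names
```

For example, `is_valid_data_lake_name("sales_lake_01")` passes this check, while `"sales-lake"` fails because of the hyphen.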
See below for details of each type of configuration.
Peak managed configuration
This is the quickest process as Peak holds all of the security credentials that are required to make a connection.
Choose the data lake region where your data will be physically stored.
Peak then creates and manages the data lake for your organization.
Make sure that your chosen region complies with your local storage regulations.
Once you have chosen the region, either save it as a draft or click NEXT to move to the Review stage.
For details, see Reviewing your connection (below).
Customer managed configuration
During this process, you configure your own Amazon S3 data lake to work with Peak.
You will need to create an IAM role in your AWS account so that Peak can connect to your S3 bucket. The Peak platform generates the IAM policy that you will need to use while creating the IAM role.
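In AWS, who may use a role is controlled by the role's trust policy. As a rough illustration of what such a document generally looks like, here is a minimal Python sketch. The helper name `build_trust_policy` is hypothetical, and the principal ARN is a placeholder, not Peak's real value; always use the values shown in the Peak platform and in AWS IAM Configuration.

```python
import json


def build_trust_policy(trusted_principal_arn):
    """Build a standard IAM trust policy that lets one principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder account ID for illustration only; it is NOT Peak's account.
policy = build_trust_policy("arn:aws:iam::111122223333:root")
print(json.dumps(policy, indent=2))
```

The trust policy only controls who may assume the role. The IAM policy that Peak generates is presumably the separate permissions document that defines what the role can do with your S3 bucket.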
To configure your Amazon S3 data lake to work with Peak:
- Choose the data lake region where your data will be physically stored. Make sure that your chosen region complies with your local storage regulations.
- Enter the Bucket name. This is the name of the Amazon S3 bucket where your data is stored.
- Enter the Root Path. This is the path from root to your S3 bucket.
- After entering the bucket and root path, click GENERATE POLICY. This generates an AWS Identity and Access Management (IAM) policy so that Peak can access your S3 bucket.
- After the policy has been generated, go to your Amazon IAM web service. Create an IAM role in your AWS account and add Peak as a trusted entity. For more information, see AWS IAM Configuration.
  The IAM role created is linked to the data lake region, bucket name and path. This means that if there is a change in user, you will need to generate a new IAM policy and update the role in your AWS account.
  The data lake region, bucket name and path can be edited while the data lake configuration is in draft state. Once setup is complete, it is not possible to make further edits.
- Once the IAM role is created in your account, copy the IAM role ARN and paste it into the IAM Role ARN field. Peak will use this to connect to your Amazon S3 bucket.
- Click TEST to test the connection. If successful, proceed to the Review stage. If it is unsuccessful, check your connection details and try again.
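A common cause of a failed test is a mistyped or incorrectly copied role ARN. The sketch below is one way to sanity-check the value before pasting it into the IAM Role ARN field. It is not part of Peak: the function name `looks_like_role_arn` and the example role names are hypothetical, and the check only verifies the general shape of an IAM role ARN, not that the role exists or that Peak can assume it.

```python
import re

# General IAM role ARN shape:
# arn:<partition>:iam::<12-digit account ID>:role/<optional path/><role name>
_ROLE_ARN = re.compile(
    r"arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+"
)


def looks_like_role_arn(value):
    """Return True if `value` has the general shape of an IAM role ARN."""
    # Strip whitespace that is often picked up when copying from a console.
    return _ROLE_ARN.fullmatch(value.strip()) is not None
```

A value such as `arn:aws:iam::123456789012:role/peak-data-lake` passes this check, while a user ARN or an S3 bucket ARN does not.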
Reviewing your connection
Before you complete the configuration process, you can review the details you have given at each stage of the process.
- To make changes, find the option you want to change and click Edit.
- Once the details are correct, click FINISH.
You will be taken back to the Data Bridge listing screen and your newly configured data lake will be shown as ‘Active’.