With AWS Lake Formation, you can build data lakes that span multiple AWS accounts in a variety of ways. For example, you could build a data mesh, implementing a centralized data governance model and decoupling data producers from the central governance. Such data lakes enable the data-as-an-asset paradigm and open up new possibilities for data discovery and exploration by users across the organization. While enabling the power of data in decision-making across your organization, it's also crucial to secure the data. With Lake Formation, sharing datasets across accounts requires only a few simple steps, and you can control what you share.
Lake Formation has launched Version 3 capabilities for sharing AWS Glue Data Catalog resources across accounts. Moving to Lake Formation cross-account sharing V3 brings several benefits. When moving from V1, you get more optimized usage of AWS Resource Access Manager (AWS RAM) to scale sharing of resources. When moving from V2, you get a few enhancements. First, you don't have to maintain AWS Glue resource policies to share using LF-tags, because Version 3 uses AWS RAM. Second, you can share with AWS Organizations using LF-tags. Third, you can share to individual AWS Identity and Access Management (IAM) users and roles in other accounts, giving data owners control over which individuals can access their data.
Lake Formation tag-based access control (LF-TBAC) is an authorization strategy that defines permissions based on attributes called LF-tags. LF-tags are different from IAM resource tags and are associated only with Lake Formation databases, tables, and columns. LF-TBAC lets you define grant and revoke permission policies by grouping Data Catalog resources, and therefore helps in scaling permissions across a large number of databases and tables. LF-tags are inherited from a database by all its tables and by all the columns of each table.
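To make the inheritance rule concrete, here is a minimal sketch in plain Python. It is a simplified model written for this post, not Lake Formation's implementation: the function names are hypothetical, and it assumes that a tag set explicitly on a table or column replaces the inherited value for the same key, and that a resource matches a policy when every key in the policy is present with one of the allowed values.

```python
from typing import Dict, Optional, Set

Tags = Dict[str, str]


def effective_tags(database: Tags,
                   table: Optional[Tags] = None,
                   column: Optional[Tags] = None) -> Tags:
    """Resolve the LF-tags of a resource: inherit from the database,
    then let table-level and column-level values override by key."""
    resolved = dict(database)
    for level in (table, column):
        if level:
            resolved.update(level)
    return resolved


def matches(tags: Tags, expression: Dict[str, Set[str]]) -> bool:
    """True when every key in the expression is tagged with an allowed value."""
    return all(tags.get(key) in allowed for key, allowed in expression.items())


db_tags = {"Sensitivity": "Public"}
policy = {"Sensitivity": {"Public"}}

inherited_column = effective_tags(db_tags)
overridden_column = effective_tags(db_tags, column={"Sensitivity": "Confidential"})

print(matches(inherited_column, policy))   # True
print(matches(overridden_column, policy))  # False
```

A column that only inherits `Sensitivity=Public` matches the policy, while a column whose tag is overridden to `Confidential` does not, which is the mechanism used later in this post to hide PII columns.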
Version 3 offers the following benefits:
- True central governance with cross-account sharing to specific IAM principals in the target account
- Ease of use in not having to maintain an AWS Glue resource policy for LF-TBAC
- Efficient reuse of AWS RAM shares
- Ease of use in scaling to hundreds of accounts with LF-TBAC
In this post, we illustrate the new features of cross-account sharing Version 3 in a producer-consumer scenario using TPC datasets. We walk through the setup of using LF-TBAC to share Data Catalog resources from the data producer account directly to IAM users in the consumer account. We also go through the steps in the receiving account to accept the shares and query the data.
Solution overview
To demonstrate the Lake Formation cross-account Version 3 features, we use the TPC datasets available at s3://aws-data-analytics-workshops/shared_datasets/tpcparquet/. The solution consists of steps in both accounts.
In account A, complete the following steps:
- As a data producer, register the dataset with Lake Formation and create AWS Glue Data Catalog tables.
- Create LF-tags and associate them with the database and tables.
- Grant LF-tag based permissions on resources directly to personas in consumer account B.
The following steps take place in account B:
- The consumer account data lake admin reviews and accepts the AWS RAM invitations.
- The data lake admin gives CREATE DATABASE access to the IAM user lf-business-analysts.
- The data lake admin creates a database for the marketing team and grants CREATE TABLE access to lf-campaign-manager.
- The IAM users create resource links on the shared database and tables and query them in Amazon Athena.
The producer account A has the following personas:
- Data lake admin – Manages the data lake in the producer account
- lf-producersteward – Manages the data and user access
The consumer account B has the following personas:
- Data lake admin – Manages the data lake in the consumer account
- lf-business-analysts – The business analysts in the sales team need access to non-PII data
- lf-campaign-manager – The manager in the marketing team needs access to data related to products and promotions
Prerequisites
You need the following prerequisites:
- Two AWS accounts. For this demonstration of how AWS RAM invites are created and accepted, you should use two accounts that aren't part of the same organization.
- An admin IAM user in both accounts to launch the AWS CloudFormation stacks.
- Lake Formation mode enabled in both the producer and consumer account with cross-account Version 3. For instructions, refer to Change the default permission model.
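If you prefer to set the cross-account version programmatically rather than on the console, the following sketch shows one possible approach with boto3. It assumes the version is controlled through the `CROSS_ACCOUNT_VERSION` key in the `Parameters` map of the data lake settings; the helper names are hypothetical. Because `put_data_lake_settings` replaces the whole settings object, the sketch reads the current settings first and changes only that key. It does not cover the separate step of changing the default permission model.

```python
import copy


def with_cross_account_version(settings: dict, version: int = 3) -> dict:
    """Return a copy of the data lake settings with the cross-account
    sharing version set, leaving every other setting untouched."""
    updated = copy.deepcopy(settings)
    updated.setdefault("Parameters", {})["CROSS_ACCOUNT_VERSION"] = str(version)
    return updated


def apply_cross_account_version(version: int = 3, region: str = "us-east-1") -> None:
    """Read, modify, and write back the settings. Needs admin credentials."""
    import boto3  # lazy import: the helper above works without boto3 installed

    client = boto3.client("lakeformation", region_name=region)
    current = client.get_data_lake_settings()["DataLakeSettings"]
    client.put_data_lake_settings(
        DataLakeSettings=with_cross_account_version(current, version)
    )
```

You would run `apply_cross_account_version()` once in each account, signed in as a data lake administrator.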
Lake Formation and AWS CloudFormation setup in account A
To keep the setup simple, we have an IAM admin registered as the data lake admin.
- Sign in to the AWS Management Console in the us-east-1 Region.
- On the Lake Formation console, under Permissions in the navigation pane, choose Administrative roles and tasks.
- Select Choose Administrators under Data lake administrators.
- In the pop-up window Manage data lake administrators, under IAM users and roles, choose the IAM admin user and choose Save.
- Choose Launch Stack to deploy the CloudFormation template:

- Choose Next.
- Provide a name for the stack and choose Next.
- On the next page, choose Next.
- Review the details on the final page and select I acknowledge that AWS CloudFormation might create IAM resources.
- Choose Create.
Stack creation should take about 2–3 minutes. The stack establishes the producer setup as follows:
- Creates an Amazon Simple Storage Service (Amazon S3) data lake bucket
- Registers the data lake bucket with Lake Formation
- Creates an AWS Glue database and tables
- Creates an IAM user (lf-producersteward) who will act as producer steward
- Creates LF-tags and assigns them to the created catalog resources as specified in the following table
| Database | Table | LF-Tag Key | LF-Tag Value | Resource Tagged |
| --- | --- | --- | --- | --- |
| lftpcdb | . | Sensitivity | Public | DATABASE |
| lftpcdb | items | HasCampaign | true | TABLE |
| lftpcdb | promotions | HasCampaign | true | TABLE |
| lftpcdb | customers table columns = "c_last_name","c_first_name","c_email_address" | Sensitivity | Confidential | TABLECOLUMNS |
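The CloudFormation stack performs this tagging for you. For readers who want to see what the equivalent API calls might look like, here is a hedged boto3 sketch. The helper `lf_tag_assignment`, the `LF_TAGS` and `ASSIGNMENTS` constants, and `apply_tags` are hypothetical names introduced for illustration; the tag keys, values, and resource names are taken from the table above, and the stack may define additional tag values that are not shown there.

```python
from typing import List, Optional


def lf_tag_assignment(database: str, tag_key: str, tag_value: str,
                      table: Optional[str] = None,
                      columns: Optional[List[str]] = None) -> dict:
    """Build keyword arguments for one add_lf_tags_to_resource call."""
    if columns and not table:
        raise ValueError("column-level tags need a table name")
    if columns:
        resource = {"TableWithColumns": {"DatabaseName": database,
                                         "Name": table,
                                         "ColumnNames": list(columns)}}
    elif table:
        resource = {"Table": {"DatabaseName": database, "Name": table}}
    else:
        resource = {"Database": {"Name": database}}
    return {"Resource": resource,
            "LFTags": [{"TagKey": tag_key, "TagValues": [tag_value]}]}


# Tag values taken from the table above (assumption: no other values needed).
LF_TAGS = {"Sensitivity": ["Public", "Confidential"], "HasCampaign": ["true"]}

ASSIGNMENTS = [
    lf_tag_assignment("lftpcdb", "Sensitivity", "Public"),
    lf_tag_assignment("lftpcdb", "HasCampaign", "true", table="items"),
    lf_tag_assignment("lftpcdb", "HasCampaign", "true", table="promotions"),
    lf_tag_assignment("lftpcdb", "Sensitivity", "Confidential", table="customers",
                      columns=["c_last_name", "c_first_name", "c_email_address"]),
]


def apply_tags(region: str = "us-east-1") -> None:
    """Create the LF-tags, then attach them. Needs credentials for account A."""
    import boto3  # lazy import: the builders above work without boto3 installed

    client = boto3.client("lakeformation", region_name=region)
    for key, values in LF_TAGS.items():
        client.create_lf_tag(TagKey=key, TagValues=values)
    for request in ASSIGNMENTS:
        client.add_lf_tags_to_resource(**request)
```

Keeping the request construction separate from the API calls makes it easy to review exactly which resources receive which tags before anything is changed.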
Verify permissions in account A
After the CloudFormation stack launches, complete the following steps in account A:
- On the AWS CloudFormation console, navigate to the Outputs tab of the stack.

- Choose the LFProducerStewardCredentials value to navigate to the AWS Secrets Manager console.
- In the Secret value section, choose Retrieve secret value.
- Note down the secret value for the password for IAM user lf-producersteward.
You need this to log in to the console later as the user lf-producersteward.

- On the Lake Formation console, choose Databases in the navigation pane.
- Open the database lftpcdb.
- Verify the LF-tags on the database are created.

- Choose View tables and choose the items table to verify the LF-tags.

- Repeat the steps for the promotions and customers tables to verify the LF-tags assigned.


- On the Lake Formation console, under Data catalog in the navigation pane, choose Databases.
- Select the database lftpcdb and on the Actions menu, choose View Permissions.
- Verify that there are no default permissions granted on the database lftpcdb for IAMAllowedPrincipals.
- If you find any, select the permission and choose Revoke to revoke the permission.
- On the AWS Management Console, choose the AWS CloudShell icon on the top menu.

This opens AWS CloudShell in another tab of the browser. Allow a few minutes for the CloudShell environment to set up.
- Run the following AWS Command Line Interface (AWS CLI) command after replacing {BUCKET_NAME} with DataLakeBucket from the stack output.
If CloudShell isn't available in your chosen Region, run the following AWS CLI command to copy the required dataset from your preferred AWS CLI environment as the IAM admin user.
- Verify that your S3 bucket has the dataset copied in it.
- Log out as the IAM admin user.
Grant permissions in account A
Next, we proceed with granting Lake Formation permissions on the dataset as a data steward within the producer account. The data steward grants the following LF-tag-based permissions to the consumer personas.
| Consumer Persona | LF-tag Policy |
| --- | --- |
| lf-business-analysts | Sensitivity=Public |
| lf-campaign-manager | HasCampaign=true |
- Log in to account A as user lf-producersteward, using the password you noted from Secrets Manager earlier.
- On the Lake Formation console, under Permissions in the navigation pane, choose Data Lake permissions.
- Choose Grant.
- Under Principals, select External accounts.
- Provide the ARN of the IAM user in the consumer account (arn:aws:iam::<accountB_id>:user/lf-business-analysts) and press Enter.

- Under LF-Tags or catalog resources, select Resources matched by LF-Tags.
- Choose Add LF-Tag to add a new key-value pair.
- For the key, choose Sensitivity and for the value, choose Public.
- Under Database permissions, select Describe, and under Table permissions, select Select and Describe.

- Choose Grant to apply the permissions.
- On the Lake Formation console, under Permissions in the navigation pane, choose Data Lake permissions.
- Choose Grant.
- Under Principals, select External accounts.
- Provide the ARN of the IAM user in the consumer account (arn:aws:iam::<accountB_id>:user/lf-campaign-manager) and press Enter.
- Under LF-Tags or catalog resources, select Resources matched by LF-Tags.
- Choose Add LF-Tag to add a new key-value pair.
- For the key, choose HasCampaign and for the value, choose true.

- Under Database permissions, select Describe, and under Table permissions, select Select and Describe.
- Choose Grant to apply the permissions.
- Verify on the Data lake permissions tab that the permissions you have granted show up correctly.
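The console grants above can also be expressed through the Lake Formation `grant_permissions` API. The following is a sketch under stated assumptions, not the procedure used in this post: `lf_tag_grants` and `grant_all` are hypothetical helpers, and the account ID `111122223333` is a placeholder for consumer account B. Each persona receives two grants against the same LF-tag expression, one for databases (Describe) and one for tables (Select and Describe).

```python
from typing import List


def lf_tag_grants(principal_arn: str, tag_key: str, tag_values: List[str]) -> List[dict]:
    """Build grant_permissions keyword arguments for an LF-tag policy:
    DESCRIBE on matching databases, SELECT and DESCRIBE on matching tables."""
    def grant(resource_type: str, permissions: List[str]) -> dict:
        return {
            "Principal": {"DataLakePrincipalIdentifier": principal_arn},
            "Resource": {"LFTagPolicy": {
                "ResourceType": resource_type,
                "Expression": [{"TagKey": tag_key, "TagValues": list(tag_values)}],
            }},
            "Permissions": permissions,
        }

    return [grant("DATABASE", ["DESCRIBE"]), grant("TABLE", ["SELECT", "DESCRIBE"])]


CONSUMER_ACCOUNT = "111122223333"  # placeholder for account B's ID

GRANTS = (
    lf_tag_grants(f"arn:aws:iam::{CONSUMER_ACCOUNT}:user/lf-business-analysts",
                  "Sensitivity", ["Public"])
    + lf_tag_grants(f"arn:aws:iam::{CONSUMER_ACCOUNT}:user/lf-campaign-manager",
                    "HasCampaign", ["true"])
)


def grant_all(region: str = "us-east-1") -> None:
    """Issue the grants. Needs credentials for the producer steward in account A."""
    import boto3  # lazy import: the builders above work without boto3 installed

    client = boto3.client("lakeformation", region_name=region)
    for request in GRANTS:
        client.grant_permissions(**request)
```

Because the principal is an IAM user ARN in another account rather than an account ID, the share targets that individual user, which is the Version 3 behavior this post demonstrates.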

AWS CloudFormation setup in account B
Complete the following steps in the consumer account:
- Log in as an IAM admin user in account B and launch the CloudFormation stack:

- Choose Next.
- Provide a name for the stack, then choose Next.
- On the next page, choose Next.
- Review the details on the final page and select I acknowledge that AWS CloudFormation might create IAM resources.
- Choose Create.
Stack creation should take about 2–3 minutes. The stack sets up the following resources in account B:
- IAM users datalakeadmin1, lf-business-analysts, and lf-campaign-manager, with relevant IAM and Lake Formation permissions
- A database called db_for_shared_tables with Create_Table permissions to the lf-campaign-manager user
- An S3 bucket named lfblog-athenaresults-<your-accountB-id>-us-east-1 with ListBucket and write permissions to lf-business-analysts and lf-campaign-manager
Note down the stack output details.

Accept resource shares in account B
After you launch the CloudFormation stack, complete the following steps in account B:
- On the CloudFormation stack Outputs tab, choose the link for DataLakeAdminCredentials.
This takes you to the Secrets Manager console.
- On the Secrets Manager console, choose Retrieve secret value and copy the password for the DataLakeAdmin user.
- Use the ConsoleIAMLoginURL value from the CloudFormation template output to log in to account B with the data lake admin user name datalakeadmin1 and the password you copied from Secrets Manager.
- Open the AWS RAM console in another browser tab.
- In the navigation pane, under Shared with me, choose Resource shares to view the pending invitations.
You should see two resource share invitations from the producer account A: one for the database-level share and one for the table-level share.

- Choose each resource share link, review the details, and choose Accept.
After you accept the invitations, the status of the resource shares changes from Pending to Active.
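Accepting the invitations can also be scripted. The sketch below uses the AWS RAM operations `get_resource_share_invitations` and `accept_resource_share_invitation`; the helper names are hypothetical, and the client is passed in as a parameter so the logic can be exercised without AWS access. For brevity it reads only the first page of results, which should be enough for the two invitations expected here.

```python
from typing import List


def pending_invitation_arns(invitations: List[dict], sender_account_id: str) -> List[str]:
    """Pick the ARNs of invitations that are still pending and that
    come from the expected producer account."""
    return [
        inv["resourceShareInvitationArn"]
        for inv in invitations
        if inv.get("status") == "PENDING"
        and inv.get("senderAccountId") == sender_account_id
    ]


def accept_pending(ram_client, sender_account_id: str) -> List[str]:
    """Accept every pending invitation from the producer account and
    return the ARNs that were accepted (first page of results only)."""
    response = ram_client.get_resource_share_invitations()
    arns = pending_invitation_arns(response["resourceShareInvitations"], sender_account_id)
    for arn in arns:
        ram_client.accept_resource_share_invitation(resourceShareInvitationArn=arn)
    return arns
```

In account B you might call `accept_pending(boto3.client("ram", region_name="us-east-1"), "<accountA_id>")` while signed in as the data lake admin. Filtering on the sender account is a deliberate choice: it avoids accepting unrelated shares that happen to be pending.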
Grant permissions in account B
To grant permissions in account B, complete the following steps:
- On the Lake Formation console, under Permissions in the navigation pane, choose Administrative roles and tasks.

- Under Database creators, choose Grant.

- Under IAM users and roles, choose lf-business-analysts.
- For Catalog permissions, select Create database.
- Choose Grant.
- Log out of the console as the data lake admin user.
Query the shared datasets as consumer users
To validate the lf-business-analysts user's data access, perform the following steps:
- Log in to the console as lf-business-analysts, using the credentials noted from the CloudFormation stack output.
- On the Lake Formation console, under Data catalog in the navigation pane, choose Databases.

- Select the database lftpcdb and on the Actions menu, choose Create resource link.

- Under Resource link name, enter rl_lftpcdb.
- Choose Create.
- After the resource link is created, select the resource link and choose View tables.
You can now see the four tables in the shared database.

- Open the Athena console in another browser tab and choose the lfblog-athenaresults-<your-accountB-id>-us-east-1 bucket as the query results location.
- Verify data access using the following query (for more information, refer to Running SQL queries using Amazon Athena):
The following screenshot shows the query output.

Notice that account A shared the database lftpcdb to account B using the LF-tag expression Sensitivity=Public. Columns c_first_name, c_last_name, and c_email_address in table customers were overwritten with Sensitivity=Confidential. Therefore, these three columns aren't visible to user lf-business-analysts.
You can preview the other tables from the database similarly to see the available columns and data.
- Log out of the console as lf-business-analysts.
Now we can validate the lf-campaign-manager user's data access.
- Log in to the console as lf-campaign-manager using the credentials noted from the CloudFormation stack output.
- On the Lake Formation console, under Data catalog in the navigation pane, choose Databases.
- Verify that you can see the database db_for_shared_tables shared by the data lake admin.

- Under Data catalog in the navigation pane, choose Tables.
You should be able to see the two tables shared from account A using the LF-tag expression HasCampaign=true. The two tables show the Owner account ID as account A.

Because lf-campaign-manager received table-level shares, this user will create table-level resource links for querying in Athena.
- Select the promotions table, and on the Actions menu, choose Create resource link.

- For Resource link name, enter rl_promotions.

- Under Database, choose db_for_shared_tables for the database to contain the resource link.
- Choose Create.
- Repeat the table resource link creation for the other table, items.
Notice that the resource links show account B as owner, whereas the actual tables show account A as the owner.

- Open the Athena console in another browser tab and choose the lfblog-athenaresults-<your-accountB-id>-us-east-1 bucket as the query results location.
- Query the tables using the resource links.
As shown in the following screenshot, all columns of both tables are accessible to lf-campaign-manager.
In summary, you have seen how LF-tags are used to share a database and select tables from one account to another account's IAM users.
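The resource links and the Athena preview from the console steps above could be scripted roughly as follows. This is a sketch under stated assumptions rather than the method used in this post: the builder functions are hypothetical, the owner account ID `444455556666` is a placeholder for account A, and the preview query is a generic `SELECT *` with a row limit, not the exact query shown in the screenshots. The builders produce keyword arguments for the AWS Glue `create_database` and `create_table` operations and for Athena `start_query_execution`.

```python
def database_resource_link(link_name: str, owner_account_id: str,
                           target_database: str) -> dict:
    """Keyword arguments for glue create_database that create a resource
    link pointing at a database owned by another account."""
    return {"DatabaseInput": {
        "Name": link_name,
        "TargetDatabase": {"CatalogId": owner_account_id,
                           "DatabaseName": target_database},
    }}


def table_resource_link(link_name: str, local_database: str, owner_account_id: str,
                        target_database: str, target_table: str) -> dict:
    """Keyword arguments for glue create_table that create a resource
    link, stored in a local database, to a table in another account."""
    return {"DatabaseName": local_database,
            "TableInput": {
                "Name": link_name,
                "TargetTable": {"CatalogId": owner_account_id,
                                "DatabaseName": target_database,
                                "Name": target_table},
            }}


def athena_preview_request(database: str, table: str, results_bucket: str,
                           limit: int = 10) -> dict:
    """Keyword arguments for athena start_query_execution that preview a table."""
    return {"QueryString": f'SELECT * FROM "{database}"."{table}" LIMIT {limit}',
            "QueryExecutionContext": {"Database": database},
            "ResultConfiguration": {"OutputLocation": f"s3://{results_bucket}/"}}


PRODUCER_ACCOUNT = "444455556666"  # placeholder for account A's ID

db_link = database_resource_link("rl_lftpcdb", PRODUCER_ACCOUNT, "lftpcdb")
table_link = table_resource_link("rl_promotions", "db_for_shared_tables",
                                 PRODUCER_ACCOUNT, "lftpcdb", "promotions")
```

With credentials for the relevant consumer user, these dictionaries would be passed as `glue.create_database(**db_link)` and `glue.create_table(**table_link)`, and the Athena request as `athena.start_query_execution(**athena_preview_request(...))`.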
Clean up
To avoid incurring costs on the AWS resources created in this post, you can perform the following steps.
First, clean up resources in account A:
- Empty the S3 bucket created for this post by deleting the downloaded objects from your S3 bucket.
- Delete the CloudFormation stack.
This deletes the S3 bucket, custom IAM roles, policies, and the LF database, tables, and permissions.
- You may also choose to undo the Lake Formation settings and add IAM access back from the Lake Formation console Settings page.
Now complete the following steps in account B:
- Empty the S3 bucket lfblog-athenaresults-<your-accountB-id>-us-east-1 used as the Athena query results location.
- Revoke permission to lf-business-analysts as database creator.
- Delete the CloudFormation stack.
This deletes the IAM users, S3 bucket, Lake Formation database db_for_shared_tables, resource links, and all the permissions from Lake Formation.
If there are any resource links and permissions left, delete them manually in Lake Formation from both accounts.
Conclusion
In this post, we illustrated the benefits of using Lake Formation cross-account sharing Version 3 with LF-tags to share directly to IAM principals, and how to receive the shared tables in the consumer account. We used a two-account scenario in which a data producer account shares a database and specific tables to individual IAM users in another account using LF-tags. In the receiving account, we showed the role played by a data lake admin vs. the receiving IAM users. We also illustrated how to overwrite column tags to mask and share PII data.
With Version 3 of cross-account sharing features, Lake Formation makes possible more modern data mesh models, where a producer can directly share to an IAM principal in another account, instead of the entire account. Data mesh implementation becomes easier for data administrators and data platform owners because they can easily scale to hundreds of consumer accounts using the LF-tags based sharing to organizational units or IDs.
We encourage you to upgrade your Lake Formation cross-account sharing to Version 3 and benefit from the enhancements. For more details, see Updating cross-account data sharing version settings.
About the authors
Aarthi Srinivasan is a Senior Big Data Architect with AWS Lake Formation. She likes building data lake solutions for AWS customers and partners. When not on the keyboard, she explores the latest science and technology trends and spends time with her family.
Srividya Parthasarathy is a Senior Big Data Architect on the AWS Lake Formation team. She enjoys building analytics and data mesh solutions on AWS and sharing them with the community.