AWS Auto Discovery

1. Introduction

AWS Auto Discovery streamlines the process of integrating your AWS resources with LightBeam. This feature automatically detects and registers supported AWS services, eliminating the need for manual data source registration. It also provides visibility into unsupported resources, helping you maintain an up-to-date inventory of your AWS environment.

2. Supported Services

2.1 Fully Supported Services

  • Amazon S3 (Simple Storage Service)

  • Amazon RDS (Relational Database Service)

    • MySQL

    • PostgreSQL

    • Microsoft SQL Server

    • Oracle

  • Amazon Redshift

  • AWS Glue

  • Amazon DynamoDB

  • Amazon FSx (File Systems)

2.2 Discoverable Services (Limited Support)

  • Amazon EC2 (Elastic Compute Cloud)

  • Amazon EFS (Elastic File System)

  • Amazon S3 Glacier

  • AWS Backup

  • Amazon CloudWatch

3. Onboarding Process

3.1 Create an IAM Role in AWS

To enable LightBeam to access your AWS resources, you need to create an IAM role with the necessary permissions. For detailed steps, see the Appendix (Required IAM Permissions) at the end of this document.

3.2 Accessing the Onboarding Screen

  1. Navigate to the "Datasources" header in the main navigation.

  2. Click on the "Cloud Platforms" tab.

  3. Select AWS from the left sidebar menu.

  4. Click on the Onboard Now button.

Alternative Method:

  • Scroll to the bottom of the Datasources page to find a list of supported cloud platforms.

  • Click on the AWS icon to start the onboarding process.

3.3 Entering AWS Account Details

  1. After selecting "AWS" as the cloud platform, enter the following AWS account details:

  • Name

  • Description (Optional)

  • Primary Owner

  • Co-owner (Optional)

  • AWS Account ID

  • Access Key ID

  • Secret Access Key

  • Session Key (Optional)

  2. After you enter the Secret Access Key, the system automatically runs an internal test connection API call to validate the credentials.

  3. Once you see the message "Connection Verified", proceed to the next step.
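
The internal test connection is not documented here, but you can run a comparable check yourself before onboarding. The sketch below is a minimal illustration, assuming that an sts:GetCallerIdentity call (a permission listed in the required policy) is a reasonable stand-in for that check. The function name check_account_details is hypothetical and is not part of LightBeam.

```python
import re


def check_account_details(account_id, sts_client):
    """Return (ok, message) for an entered account ID and set of credentials.

    `sts_client` is any object exposing get_caller_identity(), for example
    boto3.client("sts", aws_access_key_id=..., aws_secret_access_key=...).
    This is an illustrative local check, not LightBeam's internal API.
    """
    # AWS account IDs are exactly 12 digits.
    if not re.fullmatch(r"[0-9]{12}", account_id):
        return False, "AWS Account ID must be exactly 12 digits"
    try:
        identity = sts_client.get_caller_identity()
    except Exception as exc:  # boto3 raises a botocore ClientError for bad keys
        return False, f"Credentials rejected: {exc}"
    # The keys must belong to the account that was entered in the form.
    if identity.get("Account") != account_id:
        return False, "Credentials belong to a different AWS account"
    return True, "Connection Verified"
```

Passing the client in as an argument keeps the check easy to exercise with a stub object, so the logic can be tested without real credentials.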

3.4 Configuring Discovery Settings

  1. Set the frequency for scanning resources:

    • Options include daily, weekly, or monthly scans.

  2. Select data sources for automatic registration:

    • Choose from the dropdown list of supported data sources.

    • Example: S3 buckets can be set for auto-registration upon discovery.

  3. Click "Save" to confirm your settings and initiate the discovery process.
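
The settings above can be pictured as a small validated record. This is only a sketch of the idea: the class name DiscoverySettings, the field names, and the short service keys are assumptions made for illustration and do not reflect LightBeam's actual configuration format.

```python
from dataclasses import dataclass, field

# Scan frequencies offered on the settings screen.
SCAN_FREQUENCIES = ("daily", "weekly", "monthly")
# Fully supported services from section 2.1; the short keys are illustrative.
SUPPORTED_SOURCES = ("s3", "rds", "redshift", "glue", "dynamodb", "fsx")


@dataclass
class DiscoverySettings:
    """Hypothetical model of the discovery settings form."""

    frequency: str
    auto_register: list = field(default_factory=list)

    def __post_init__(self):
        # Reject values that the form would not offer.
        if self.frequency not in SCAN_FREQUENCIES:
            raise ValueError(f"frequency must be one of {SCAN_FREQUENCIES}")
        unknown = [s for s in self.auto_register if s not in SUPPORTED_SOURCES]
        if unknown:
            raise ValueError(f"unsupported data sources: {unknown}")

    def should_auto_register(self, source_type):
        """True if a newly discovered source of this type is registered automatically."""
        return source_type in self.auto_register
```

For example, DiscoverySettings(frequency="daily", auto_register=["s3"]) describes a daily scan in which only S3 is registered on discovery; every other discovered source stays unregistered until you register it manually.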

3.5 Resource Discovery Process

  1. After saving, an in-progress message appears: "Resource discovery is in progress, and this process may take some time."

  2. During this process:

    • A Terraform workflow is triggered.

    • This workflow creates necessary IAM roles and policies.

    • The created roles and policies are assigned to the appropriate service account.

  3. The process typically takes about 5 minutes for a single AWS account.

3.6 Reviewing Discovered Resources

  1. Refresh the page to view the results.

  2. You'll see your AWS account listed.

  3. Discovered resources are displayed in a table with the following details:

    • Data source name and type

    • Account name

    • Region

    • Owner

    • Status (Registered or Unregistered)

    • Date added

Naming Convention for Discovered Resources:

LightBeam uses a specific naming convention for automatically discovered AWS resources. This convention makes it easy to identify and categorize different types of resources across your AWS account.

1. Data Sources with One Instance per Account

For resources that typically have one instance per AWS account, such as S3 and DynamoDB:

Format: <Data source type>_<Account Name>

Examples:

  • S3_Lightbeam

  • DynamoDB_Lightbeam

2. Data Sources with One Instance per Region

For resources that are limited to one instance per region, such as AWS Glue:

Format: <Data source type>_<region>_<Account Name>

Example:

  • Glue_us-east-1_Lightbeam

3. Data Sources with Multiple Instances per Region

For resources that can have multiple instances within a single region:

Format: The name will be the same as the cluster or instance name in AWS.

Example:

  • If you have an RDS instance named "production-db" in AWS, it will appear as "production-db" in LightBeam.

4. Handling Duplicate Names

In cases where duplicate names are detected:

  • The first instance will use the standard naming convention.

  • Subsequent instances with the same name will have suffixes appended: *1, *2, and so on.

Example:

  • First instance: MyDatabase

  • Second instance with the same name: MyDatabase*1

  • Third instance with the same name: MyDatabase*2


This naming convention keeps discovered AWS resources consistent and clear in LightBeam. It makes it easy to identify the resource type, the associated AWS account, and, where applicable, the AWS region.
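
The naming rules above can be sketched in a few lines of Python. This is an illustration of the convention as described in this section, not LightBeam's implementation: the function names, the grouping of service types, and the suffix separator parameter are assumptions. The separator defaults to the character shown in the duplicate-name examples above.

```python
# Service types grouped by the rule that applies to them (from the examples above).
PER_ACCOUNT = {"S3", "DynamoDB"}
PER_REGION = {"Glue"}


def base_name(source_type, account_name, region=None, instance_name=None):
    """Build the display name for one discovered resource."""
    if source_type in PER_ACCOUNT:
        # Format: <Data source type>_<Account Name>
        return f"{source_type}_{account_name}"
    if source_type in PER_REGION:
        if not region:
            raise ValueError("region is required for per-region data sources")
        # Format: <Data source type>_<region>_<Account Name>
        return f"{source_type}_{region}_{account_name}"
    if not instance_name:
        raise ValueError("instance_name is required for this data source type")
    # Multiple instances per region: reuse the cluster or instance name from AWS.
    return instance_name


def dedupe(names, sep="*"):
    """Append numeric suffixes to repeated names, keeping the first one unchanged."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        result.append(name if count == 0 else f"{name}{sep}{count}")
        seen[name] = count + 1
    return result
```

With these helpers, base_name("Glue", "Lightbeam", region="us-east-1") yields Glue_us-east-1_Lightbeam, and dedupe applied to three resources all named MyDatabase reproduces the three names in the duplicate-name example. The sketch does not handle the edge case where a resource is already named with such a suffix.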


  1. Initial status:

    • All discovered resources are marked as "Unregistered" with an orange status icon.

    • Each resource has a "Register" button for manual registration.

  2. Discovery scope:

    • Regional services: RDS, Redshift, Glue, FSx

    • Organization-level services: S3, DynamoDB

  3. Search Bar: Enables quick resource lookup using keywords.

  4. Filter Options: Allows refinement of results based on specific criteria.

  5. Status Tabs: Toggles between "Supported" and "Unsupported" resources.
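
If you export or mirror the discovered-resources table for your own reporting, the search and filter behavior described above can be approximated as follows. This is a hedged sketch: the DiscoveredResource record simply mirrors the table columns listed earlier, and filter_resources is a hypothetical helper, not a LightBeam API.

```python
from dataclasses import dataclass


@dataclass
class DiscoveredResource:
    """One row of the discovered-resources table (illustrative field names)."""

    name: str
    source_type: str
    account_name: str
    region: str
    owner: str
    status: str  # "Registered" or "Unregistered"
    date_added: str


def filter_resources(resources, keyword="", status=None, region=None):
    """Return the rows that match a keyword and the optional status/region filters."""
    keyword = keyword.lower()
    matches = []
    for res in resources:
        # Keyword search is case-insensitive over the name and the type.
        if keyword and keyword not in res.name.lower() and keyword not in res.source_type.lower():
            continue
        if status and res.status != status:
            continue
        if region and res.region != region:
            continue
        matches.append(res)
    return matches
```

For instance, filter_resources(rows, status="Unregistered") lists everything that still needs manual registration.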


4. Registration Process

4.1 Automatic Registration

  • Pre-selected services are registered automatically upon discovery.

4.2 Manual Registration (Example: Redshift)

  1. Click on the "Register" button next to the Redshift data source.

  2. Add basic details for the Redshift data source and click on Next.

    Since IAM roles are already set up, the Host, Port and Frequency of Scanning fields are prefilled.

  3. Enter the Username and Password.

  4. Click "Test Connection" to verify access.

  5. The rest of the workflow is the same as described in the AWS Redshift document.

4.3 Post-Registration

  • The registered data source will appear in your list of data sources.

  • In the Cloud Platforms view, the status for AWS S3 will update to "Sync On".

  • You can click on the data source name to access its dashboard.

5. Operational Procedures

5.1 Manual Sync

Users can trigger a manual sync to update the status of discovered resources:

  1. Navigate to the Cloud Platforms dashboard

  2. Click the "Manual Sync" button

  3. Wait for the sync process to complete

5.2 Error Handling

  • Resources with errors during discovery or registration are flagged with appropriate status indicators

  • Users can attempt to resolve issues and re-run the registration process

5.3 Additional Views

An "All Datasources" section is available with two tabs:

  • Scanning: Shows data sources currently being scanned.

  • Unregistered: Displays discovered but not yet registered data sources.

6. Security Considerations

  • High-privilege credentials used during onboarding are discarded after initial setup

  • IAM roles with least-privilege permissions are created for ongoing operations

  • All communications with AWS APIs use secure, encrypted connections

7. Limitations and Future Enhancements

  • Some discoverable services (e.g., EC2, CloudWatch) have limited metadata available

  • Future versions may include enhanced support for currently limited services

  • Additional AWS services may be added to the list of fully supported services in upcoming releases


Appendix: Required IAM Permissions

For LightBeam AWS Auto Discovery to function properly, the AWS IAM user needs the following permissions. Create a policy with this JSON:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sts:GetCallerIdentity",
                "eks:DescribeCluster",
                "ec2:DescribeTags",
                "iam:CreateRole",
                "iam:CreatePolicy",
                "iam:AttachRolePolicy",
                "iam:SimulatePrincipalPolicy",
                "organizations:DescribeAccount",
                "iam:GetRole",
                "iam:GetPolicy",
                "iam:GetPolicyVersion",
                "iam:ListRolePolicies", 
                "iam:ListAttachedRolePolicies",
                "iam:ListInstanceProfilesForRole",
                "iam:ListPolicyVersions"
            ],
            "Resource": "*"
        }
    ]
}

1. Create IAM Policy

  1. Navigate to the AWS Console Home page.

  2. Click on IAM.

  3. Click on "Policies" in the left sidebar.

  4. Click the "Create policy" button.

  5. Select the "JSON" editor tab.

  6. Erase the existing code in the editor.

  7. Paste the required JSON policy shown above.

  8. Click on "Next".

  9. Give the policy a name, for example: "lightbeam-aws-discovery-policy-test1".

  10. Click on "Create policy".
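
If you prefer scripting to the console, the same policy can be created with the AWS SDK for Python (boto3). The following is a minimal sketch, assuming boto3 is installed and the caller is allowed to create IAM policies. The helper names build_discovery_policy and create_discovery_policy are illustrative; only create_policy is a real IAM client call.

```python
import json

# Actions copied from the JSON policy above.
DISCOVERY_ACTIONS = [
    "sts:GetCallerIdentity",
    "eks:DescribeCluster",
    "ec2:DescribeTags",
    "iam:CreateRole",
    "iam:CreatePolicy",
    "iam:AttachRolePolicy",
    "iam:SimulatePrincipalPolicy",
    "organizations:DescribeAccount",
    "iam:GetRole",
    "iam:GetPolicy",
    "iam:GetPolicyVersion",
    "iam:ListRolePolicies",
    "iam:ListAttachedRolePolicies",
    "iam:ListInstanceProfilesForRole",
    "iam:ListPolicyVersions",
]


def build_discovery_policy():
    """Return the policy document as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(DISCOVERY_ACTIONS), "Resource": "*"}
        ],
    }


def create_discovery_policy(iam_client, name="lightbeam-aws-discovery-policy-test1"):
    """Create the managed policy and return its ARN.

    `iam_client` is expected to behave like boto3.client("iam").
    """
    response = iam_client.create_policy(
        PolicyName=name,
        PolicyDocument=json.dumps(build_discovery_policy()),
    )
    return response["Policy"]["Arn"]
```

Usage would look like create_discovery_policy(boto3.client("iam")). Keep the returned ARN, because it is needed when attaching the policy to the user.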

2. User Creation

  1. Navigate to IAM → Users.

  2. Click the "Create user" button.

  3. Set the user details:

    • Username: "lightbeam-aws-discovery-user-test1"

  4. Click "Next".

  5. Set permissions:

    • Select "Attach policies directly".

  6. Search for "lightbeam-aws-discovery-policy-test1".

  7. Check the box next to the policy.

  8. Click "Next".

  9. Click "Create user".

3. Access Key Generation

  1. Access the user's credentials:

    • Search for the created user "lightbeam-aws-discovery-user-test1".

    • Click on the created user "lightbeam-aws-discovery-user-test1".

  2. Click on the "Security credentials" tab.

  3. Start access key creation:

    • Locate the "Access keys" section.

    • Click the "Create access key" button.

  4. Configure the access key purpose:

    • Select the "Command Line Interface (CLI)" use case.

    • Check the box "I understand the above recommendation and want to proceed to create an access key."

  5. Create and save the credentials:

    • Click on "Create access key".
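
The user creation and access key steps above can also be scripted with boto3. This is a sketch under the same assumptions as before (boto3 installed, caller permitted to manage IAM users). The function name create_discovery_user is hypothetical; create_user, attach_user_policy, and create_access_key are standard IAM client calls.

```python
def create_discovery_user(iam_client, policy_arn,
                          user_name="lightbeam-aws-discovery-user-test1"):
    """Create the user, attach the discovery policy, and issue one access key.

    `iam_client` is expected to behave like boto3.client("iam"), and
    `policy_arn` is the ARN of the policy created in the previous section.
    """
    iam_client.create_user(UserName=user_name)
    iam_client.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
    key = iam_client.create_access_key(UserName=user_name)["AccessKey"]
    # AWS returns the secret only in this response, so store it securely right away.
    return {
        "access_key_id": key["AccessKeyId"],
        "secret_access_key": key["SecretAccessKey"],
    }
```

The returned values correspond to the Access Key ID and Secret Access Key fields on the LightBeam onboarding screen. Avoid printing or logging them.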

4. Securely Save AWS Access Keys

After creating the access key, you have two options to save the credentials:

  1. Download .csv File Method

    • Click "Download .csv file" button

    • File will contain:

      • Access Key ID

      • Secret Access Key

    • Store this file securely

  2. Manual Copy Method

    • Access Key ID:

      • Copy directly from the interface by clicking the 📑 icon next to the access key

    • Secret Access Key:

      • Click "Show" to reveal

      • Copy directly from the interface by clicking the 📑 icon

    • Record both values securely.

Important Security Notes:

  • Never store access keys in code repositories

  • Keep credentials secure and private

  • Use short-term credentials where possible

  • A maximum of two access keys (active or inactive) is allowed per IAM user
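
One common way to keep keys out of code repositories is to read them from environment variables at run time. The sketch below assumes the standard AWS variable names (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and the optional AWS_SESSION_TOKEN). The function name load_aws_credentials is illustrative.

```python
import os


def load_aws_credentials(env=None):
    """Read AWS credentials from the environment instead of hard-coding them."""
    env = os.environ if env is None else env
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    creds = {
        "access_key_id": env["AWS_ACCESS_KEY_ID"],
        "secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
    # A session token is present only when short-term credentials are in use.
    token = env.get("AWS_SESSION_TOKEN")
    if token:
        creds["session_token"] = token
    return creds
```

Because the function fails loudly when a variable is missing, a misconfigured environment is caught before any AWS call is attempted.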


