LightBeam Documentation

PostgreSQL

Connecting PostgreSQL to LightBeam


Last updated 3 months ago


Overview

LightBeam Spectra users can connect various data sources to the LightBeam application. Once connected, these data sources are continuously monitored for PII and PHI data.

Examples: PostgreSQL, Snowflake, and SMB.


Onboarding PostgreSQL Data Source

  1. Log in to your LightBeam instance.

  2. Click on DATASOURCES on the Top Navigation Bar.

  3. Click on “Add a data source”.

Figure 1. Add Data Source

  4. Click on PostgreSQL.

  5. Choose the database scanning type.

  • A pop-up window appears, prompting you to select one of the following (see Figure 3):

    A. Live Database Scanning – Use this option to connect to a real-time PostgreSQL instance.

    B. Snapshot Scanning – Choose this option to onboard a static snapshot of your database. Snapshot Scanning is a broad method that can be used for offline backups.

    CSV Scanning Note: If you are onboarding a CSV file stored in cloud storage (often referred to as “Postgres Offline”), LightBeam treats the CSV file as an offline snapshot. For detailed CSV onboarding steps, refer to the CSV Files as a Datasource document.

  6. Click Proceed to continue.

A. Configure Live Database Scanning

If you selected Live Database Scanning, the next screen asks for the following:

  1. Basic Details

    • Instance Name – A unique identifier for this PostgreSQL connection.

    • Primary Owner – The email of the user responsible for this data source.

    • Description (Optional) – A brief description of the data source.

    • Mark as Source of Truth (Optional) – Toggle this setting if the database serves as a definitive reference for entity resolution.

  2. Connection Details:

    • Username & Password – Credentials for database authentication.

    • Host – The FQDN or IP address of your PostgreSQL server.

    • Port – Default is 5432, or specify a custom port.

  3. Authentication Mechanisms

    You can choose one of the following:

    1. Basic Authentication (Username/Password)

    2. AWS Access Keys

    3. AWS IAM Role (newly added)

    Option 1: Basic Authentication

    • Best for: Standard PostgreSQL instances that use direct credential-based authentication.

    • Required Inputs:

      • Username

      • Password

      • Host (FQDN or IP Address)

      • Port (Default: 5432 or Custom Port)

    • Optional Security Features: Upload SSL Certificate, SSL Key, and SSL CA Certificate for encrypted communication.

Option 2: AWS Access Keys

  • Best for: Connecting to PostgreSQL instances hosted on AWS RDS using programmatic credentials.

  • Required Inputs:

    • AWS Access Key ID

    • AWS Secret Key

    • Host (FQDN or IP Address)

    • Port (Default: 5432 or Custom Port)

    • Region Selection (e.g., US East - us-east-1)

  • Optional Security Features: Upload SSL Certificate, SSL Key, and SSL CA Certificate for enhanced security.

Option 3: AWS IAM Role

  • Best for: Secure authentication within AWS environments where IAM roles are preferred over static credentials.

  • Required Inputs:

    • Host (FQDN or IP Address)

    • Port (Default: 5432 or Custom Port)

    • Region Selection (e.g., US East - us-east-1)

  • Optional Security Features: Upload SSL Certificate, SSL Key, and SSL CA Certificate to enable encrypted connections.
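The required inputs for the three authentication mechanisms can be summarized as a small lookup table. The following is a minimal sketch for checking a set of values before entering them in the form. The mechanism keys and field names (`basic`, `aws_access_key_id`, and so on) are hypothetical labels chosen for illustration; they are not LightBeam identifiers or API fields.

```python
from typing import Dict, List

# Default PostgreSQL port, as stated in the guide.
DEFAULT_PORT = 5432

# Required inputs per authentication mechanism, taken from the lists above.
# Keys and field names are illustrative only (assumption).
REQUIRED_INPUTS = {
    "basic": ("username", "password", "host"),
    "aws_access_keys": ("aws_access_key_id", "aws_secret_key", "host", "region"),
    "aws_iam_role": ("host", "region"),
}


def missing_inputs(auth: str, values: Dict[str, str]) -> List[str]:
    """Return the required inputs that are absent or empty for a mechanism."""
    if auth not in REQUIRED_INPUTS:
        raise ValueError(f"unknown authentication mechanism: {auth}")
    return [name for name in REQUIRED_INPUTS[auth] if not values.get(name)]


def effective_port(values: Dict[str, str]) -> int:
    """Use the custom port if one is given, otherwise the default 5432."""
    return int(values.get("port") or DEFAULT_PORT)


if __name__ == "__main__":
    details = {"host": "db.example.com"}
    print(missing_inputs("aws_iam_role", details))
    print(effective_port(details))
```

The port is treated as optional in this sketch because all three mechanisms fall back to 5432 when no custom port is supplied.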

4. Additional Details (Optional)

In this section, you can specify metadata attributes related to the PostgreSQL data source:

  • Location: Select the geographic region where the database is hosted.

  • Purpose: Define the purpose of data collection (e.g., Analytics, Compliance, Security).

  • Stage: Indicate the stage of data processing (e.g., Collection, Processing, Storage).

B. Configure Snapshot (Offline) Scanning

When Snapshot Scanning is chosen, you can onboard:

  1. Offline Database Backups or

  2. CSV Files stored in a connected cloud storage service.

  1. Basic Details:

  • Instance Name – A unique identifier for the database connection.

  • Primary Owner – The email address of the responsible user.

  • Description (Optional) – A short description of the data source.

  • Mark as Source of Truth (Optional) – Toggle this setting if the database serves as a definitive reference for entity resolution.

  2. Connection Details:

  • Select Data Source Where the File is Present – Choose from connected Google Drive, OneDrive, or SharePoint.

  • Select a Drive – Drive owner’s email (for example, owner@mycompany.com).

  • Folder Link or Folder Name

    • For Google Drive: paste the folder link (e.g., https://drive.google.com/drive/folders/<some_id>).

    • For OneDrive or SharePoint: enter the folder path (e.g., folder1, or folder1/nested for subfolders).
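The two folder formats above differ: Google Drive expects a full folder link, while OneDrive and SharePoint expect a relative path. The sketch below shows one way to sanity-check a value before pasting it into the form. The function name and the source labels (`google_drive`, `onedrive`, `sharepoint`) are hypothetical, and the check reflects only the example formats shown in this guide, not LightBeam's own validation.

```python
from urllib.parse import urlparse


def folder_reference(source: str, value: str) -> str:
    """Return the folder ID (Google Drive) or a cleaned folder path."""
    value = value.strip()
    if source == "google_drive":
        parsed = urlparse(value)
        parts = [p for p in parsed.path.split("/") if p]
        # Expected shape: https://drive.google.com/drive/folders/<some_id>
        if parsed.netloc == "drive.google.com" and "folders" in parts:
            index = parts.index("folders")
            if index + 1 < len(parts):
                return parts[index + 1]
        raise ValueError("expected a Google Drive folder link")
    if source in ("onedrive", "sharepoint"):
        # Paths look like "folder1" or "folder1/nested"; drop stray slashes.
        segments = [s for s in value.split("/") if s]
        if not segments:
            raise ValueError("expected a non-empty folder path")
        return "/".join(segments)
    raise ValueError(f"unsupported source: {source}")


if __name__ == "__main__":
    print(folder_reference("sharepoint", "/folder1/nested/"))
```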

  1. Click Test Connection.

  2. If the connection is successful, a Connection Success! message appears.

  3. Click Next and select the list of databases (or CSV snapshots) to scan.

  4. Click Start Sampling.

The PostgreSQL data source is now ready for scanning.


APPENDIX

Troubleshooting

If no data is being scanned and no error is reported, the cause may be a permissions issue. Connect to the database as the user you registered with LightBeam and run a SELECT * query on a table to check whether the data is visible. If the query returns a permission denied message, grant the required permissions to that user.
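PostgreSQL also provides the built-in functions `has_database_privilege` and `has_table_privilege`, which report privileges without reading any rows. The sketch below builds the corresponding queries for a user and a list of tables. It assumes a DB-API driver that uses `%s` placeholders (psycopg2 does); the function name and the example table name are hypothetical.

```python
from typing import List, Tuple

Query = Tuple[str, Tuple[str, ...]]


def privilege_checks(user: str, database: str, tables: List[str]) -> List[Query]:
    """Build (sql, params) pairs that each return a single boolean."""
    checks: List[Query] = [
        # Can the user connect to the database at all?
        ("SELECT has_database_privilege(%s, %s, 'CONNECT')", (user, database)),
    ]
    for table in tables:
        # Can the user read this table? Schema-qualify names, e.g. "public.customers".
        checks.append(("SELECT has_table_privilege(%s, %s, 'SELECT')", (user, table)))
    return checks


if __name__ == "__main__":
    for sql, params in privilege_checks("test1", "lightbeam", ["public.customers"]):
        # With an open cursor: cursor.execute(sql, params); cursor.fetchone()[0]
        print(sql, params)
```

A result of false for any table points to a missing SELECT grant for that user.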

Minimal permissions setup

To scan only a subset of the databases on the instance, LightBeam requires the following permissions:

  • CONNECT permissions

  • For each database - CONNECT and SELECT permissions

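Use the following script to create a user with these permissions. In this example, the user is granted permission to connect to and read from the lightbeam database. Replace the example user name and password with your own values.

User with restricted permissions for a single database

-- Create the scan user.
CREATE USER test1 WITH PASSWORD 'lbadmin12345';

-- Allow the user to read all tables in the public schema (run while connected to the lightbeam database).
GRANT SELECT ON ALL TABLES IN SCHEMA public TO test1;

-- Allow the user to connect to the lightbeam database.
GRANT CONNECT ON DATABASE lightbeam TO test1;

-- Allow the user to read metadata from information_schema.
GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO test1;

Use the user you just created to register the PostgreSQL data source.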
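When several databases need the same minimal grants, the statements can be generated rather than typed by hand. The following is a sketch under stated assumptions: the function names are hypothetical, the user is assumed to exist already, and only the public schema is covered by default. Note that `GRANT SELECT ON ALL TABLES IN SCHEMA` affects only the database you are connected to, so each group of statements must be run while connected to its own database.

```python
from typing import Dict, List


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(user: str, databases: List[str],
                     schema: str = "public") -> Dict[str, List[str]]:
    """Map each database name to the grants to run while connected to it."""
    role = quote_ident(user)
    plan: Dict[str, List[str]] = {}
    for db in databases:
        plan[db] = [
            f"GRANT CONNECT ON DATABASE {quote_ident(db)} TO {role};",
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {quote_ident(schema)} TO {role};",
        ]
    return plan


if __name__ == "__main__":
    for db, statements in grant_statements("test1", ["lightbeam"]).items():
        print(f"-- run while connected to {db}")
        print("\n".join(statements))
```

Quoted identifiers are case-sensitive in PostgreSQL, so pass names exactly as they are stored (lowercase for names created without quotes).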

Full permissions setup

To scan all the databases on the instance, you can grant wider-scope permissions instead. LightBeam recommends a full read-only user that can list the databases, connect to every database, and read data.

Validate permissions to the database

Next, validate that the credentials you provided have these permissions on the database. This confirms that the user is authorized to access the database. After validating the permissions, you can configure LightBeam Spectra on the system.

Prerequisite

The following tools need to be installed on the system in order to verify database permissions:

  • Git

  • PSQL tool
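Before starting, you can confirm that both tools are available on the system. This is a minimal sketch that uses only the Python standard library; it assumes the executables are named `git` and `psql` and are expected on the `PATH`.

```python
import shutil
from typing import Dict, Optional, Sequence

# Executable names for the prerequisites listed above (assumed names).
REQUIRED_TOOLS = ("git", "psql")


def locate_tools(tools: Sequence[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in locate_tools().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```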

Steps

  1. First, clone the repository: https://github.com/lightbeamai/lb-installer

  2. Go into the sql_user_check_postgres directory.

  3. Refer to the README.md file in that directory for detailed instructions.


About LightBeam

LightBeam automates Privacy, Security, and AI Governance, so businesses can accelerate their growth in new markets. Leveraging generative AI, LightBeam has rapidly gained customers’ trust by pioneering a unique privacy-centric and automation-first approach to security. Unlike siloed solutions, LightBeam ties together sensitive data cataloging, control, and compliance across structured and unstructured data applications providing 360-visibility, redaction, self-service DSRs, and automated ROPA reporting ensuring ultimate protection against ransomware and accidental exposures while meeting data privacy obligations efficiently. LightBeam is on a mission to create a secure privacy-first world helping customers automate compliance against a patchwork of existing and emerging regulations.

For any questions or suggestions, please get in touch with us at: support@lightbeam.ai

Figure 2. Selection of PostgreSQL Data Source
Figure 2.1. PostgreSQL Data Source
Figure 3. PostgreSQL Datasource - Selecting Database Scanning Type
Figure 4. Basic Details for Live Database Scanning - Basic Authentication
Figure 4.1. Basic Details for Live Database Scanning - AWS Access Keys
Figure 4.2. Basic Details for Live Database Scanning - AWS IAM Role
Figure 6. List of Databases