Structured Datasource Onboarding Questionnaire
Preliminary Questions
Target Database: Which database do we want to scan?
Server Version: What is the database server (SQL Server, MySQL, PostgreSQL) and its version?
Connectivity
Hosting: Where is the database hosted? (Azure, managed instance, VMs on-prem)
Authentication: Does the database require any special type of authentication other than password-based? E.g., AWS IAM Authentication for RDS.
Communication: Connectivity between the database instance and LightBeam cluster services:
IP Whitelisting: Which IPs need to be whitelisted? Is a VPN required?
Firewall: Ensure the RDS instance firewall rules allow LightBeam access.
Network ACLs: Are there NACLs that could prevent access?
Cloud Account Consistency: Are the LightBeam and DB instances in the same cloud account or in different accounts?
Network Access: Ensure LightBeam can access an RDS instance on a private subnet (via VPC peering).
External Connections: How do other services connect to DB instances?
SSL/Certificate: Is SSL/certificate required for LightBeam to connect to the database?
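A basic TCP reachability test, run from a host inside the LightBeam network, can help answer the firewall, NACL, and peering questions above before any credentials are involved. The following is a minimal sketch rather than a LightBeam tool; the function name `can_reach` and the host and port in the usage note are placeholders chosen for illustration.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only proves that the network path is open (firewall, NACLs,
    peering). It does not validate credentials or SSL settings.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_reach("db.example.internal", 5432)` would check the default PostgreSQL port on a hypothetical hostname. A `False` result points to a network-level block or a wrong address, not to a permissions problem.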
Nature of the Database
Primary Keys: Do the DB tables have primary keys?
Database Statistics:
Number of databases.
Number of tables per database.
Max/Average number of rows in tables per database.
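The statistics above can usually be read from the database catalog. The sketch below assumes a PostgreSQL target (other engines expose different catalog views) and uses hypothetical names `STATS_QUERIES` and `summarize_rows`. The queries are plain strings to be run with whatever client is available; the helper only summarizes the per-table row counts once they have been collected.

```python
# Catalog queries for PostgreSQL (assumption: the target is PostgreSQL).
# The "tables" and "rows_per_table" queries apply to the database the
# session is connected to, so they need to be run once per database.
STATS_QUERIES = {
    "databases": "SELECT count(*) FROM pg_database WHERE NOT datistemplate;",
    "tables": (
        "SELECT count(*) FROM information_schema.tables "
        "WHERE table_type = 'BASE TABLE' "
        "AND table_schema NOT IN ('pg_catalog', 'information_schema');"
    ),
    # n_live_tup is an estimate kept by PostgreSQL's statistics system,
    # not an exact count.
    "rows_per_table": "SELECT relname, n_live_tup FROM pg_stat_user_tables;",
}


def summarize_rows(row_counts: dict) -> dict:
    """Summarize a {table_name: row_count} mapping for one database."""
    counts = list(row_counts.values())
    if not counts:
        return {"tables": 0, "max_rows": 0, "avg_rows": 0.0}
    return {
        "tables": len(counts),
        "max_rows": max(counts),
        "avg_rows": sum(counts) / len(counts),
    }
```

Row estimates are normally sufficient for this questionnaire; exact `count(*)` scans on large tables can be slow and are not needed for sizing.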
Permissions
Validation: Validate that the Postgres credentials can access the required databases and schemas.
Verification: For Postgres, use psql to verify the credentials.
Assistance: LightBeam can provide SQL scripts to validate the permissions.
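As an illustration of the psql check, the sketch below assembles a psql command line that asks PostgreSQL whether the connecting user has USAGE on a schema. It is not one of the LightBeam-provided scripts; the function name `psql_check_command` and the host, database, and user values are hypothetical. The password is expected to come from the `PGPASSWORD` environment variable or a `~/.pgpass` file, not from the command line.

```python
import shlex


def psql_check_command(host: str, port: int, dbname: str, user: str,
                       schema: str = "public") -> list:
    """Build a psql argv that tests schema access for the connecting user.

    The schema name is interpolated directly into the SQL, so pass only
    trusted values here.
    """
    sql = f"SELECT has_schema_privilege(current_user, '{schema}', 'USAGE');"
    return [
        "psql",
        "-h", host,        # server host
        "-p", str(port),   # server port
        "-U", user,        # role to connect as
        "-d", dbname,      # database to connect to
        "-c", sql,         # run a single command and exit
    ]


# Hypothetical connection details, for illustration only.
cmd = psql_check_command("db.example.internal", 5432, "appdb", "lightbeam_ro")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell on a host with psql installed. A result of `t` means the role can use the schema; a connection or authentication error means the credentials or network path need attention first.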
Configured Query Limits
Query Restrictions: Does the database have any limits on the runtime or size of queries we can run?
Callout
Latency: If LightBeam is in a different region from the database (e.g., US-WEST-2 and US-EAST-1, 25 ms latency), queries will be slower. It is advisable to start with LightBeam in the same region as the database to avoid latency issues.
Network: Private networks may need additional customer intervention.
Costs & Delays: Ensure database access doesn't lead to unexpected egress costs. Data transfer could add extra runtime to the entity builder.
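To get a feel for why cross-region latency matters, the added wait can be approximated as the number of sequential round trips multiplied by the round-trip latency. The sketch below is back-of-the-envelope arithmetic under that assumption; the function name and the round-trip count are hypothetical and do not describe how LightBeam actually issues queries.

```python
def added_latency_seconds(round_trips: int, latency_ms: float) -> float:
    """Estimate extra wall-clock time from network latency alone.

    Assumes round trips happen strictly one after another; parallel or
    batched queries would reduce the total.
    """
    return round_trips * latency_ms / 1000.0


# Hypothetical workload: one million sequential round trips at the 25 ms
# latency quoted above.
extra = added_latency_seconds(1_000_000, 25)
print(f"{extra:.0f} s (~{extra / 3600:.1f} h)")
```

Under these assumptions the latency alone adds 25,000 seconds, or close to seven hours, which is why starting in the same region as the database is the safer choice.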
Note:
Resources: Describe the resources (CPU, memory) of the database that LightBeam will scan. For instance, databases with 10M records should typically have at least 4 vCPUs and 16 GB of memory.
Additional Details: Include any other relevant details about the database.
Further Details for MongoDB and CosmosDB:
For MongoDB and CosmosDB users, please visit our detailed questionnaire specific to these databases. This will help us tailor our integration process to better suit your setup.
Further Details for Oracle Datasource:
For Oracle Datasource, please visit our detailed questionnaire specific to this datasource. This will help us tailor our integration process to better suit your setup.
About LightBeam
LightBeam automates Privacy, Security, and AI Governance, so businesses can accelerate their growth in new markets. Leveraging generative AI, LightBeam has rapidly gained customers' trust by pioneering a unique privacy-centric and automation-first approach to security. Unlike siloed solutions, LightBeam ties together sensitive data cataloging, control, and compliance across structured and unstructured data applications providing 360-visibility, redaction, self-service DSRs, and automated ROPA reporting ensuring ultimate protection against ransomware and accidental exposures while meeting data privacy obligations efficiently. LightBeam is on a mission to create a secure privacy-first world helping customers automate compliance against a patchwork of existing and emerging regulations.