Snowflake
Connecting Snowflake to LightBeam
LightBeam Spectra users can connect various data sources to the LightBeam application, which then continuously monitors them for PII and PHI data.
Examples: Snowflake, SMB, MySQL, PostgreSQL, etc.
1. Log in to your LightBeam instance.
2. Click on DATASOURCES on the top navigation bar.
Click on “Add a data source”.
Search for “Snowflake”.
3. Fill in the details as shown below and click Next:
Instance Name: A unique name for the data source.
Description: An optional field describing the use of this data source.
Primary owner: Email address of the person responsible for this data source, who receives alerts by default.
Source of Truth: Marks a monitored data source whose data acts as a single point of truth. LightBeam Spectra uses such data sources to look up entities and attributes, which helps verify whether the attributes or entities found in other data sources are accurate. A Source of Truth data set creates entities based on the attributes found in its data.
4. Provide the following details in the Connection section:
Username: The Snowflake account username (e.g., admin).
Password: The password associated with the username.
Account Name: Enter the account name in the format <account_locator>.<region>.<cloud>. For example: rs31112.europe-west4.gcp.
For AWS us-west-2, use only the <account_locator>.
Alternatively, <org_name>-<account_name> can also be used.
Role: The role assigned to the user for accessing the Snowflake instance (e.g., lightbeam_users).
Warehouse: Specify the warehouse (e.g., compute_wh).
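The Account Name rules above can be sketched as a small helper. This is only an illustration of the formats described in this guide: the function names are hypothetical, and the org and account values in the last call are made-up placeholders.

```python
def build_account_name(account_locator: str, region: str = "", cloud: str = "") -> str:
    """Return an account name in the <account_locator>.<region>.<cloud> format."""
    locator = account_locator.strip()
    if not locator:
        raise ValueError("account_locator is required")
    region = region.strip().lower()
    cloud = cloud.strip().lower()
    # Per this guide, AWS us-west-2 uses only the account locator.
    if cloud == "aws" and region == "us-west-2":
        return locator
    if not region or not cloud:
        raise ValueError("region and cloud are required outside AWS us-west-2")
    return f"{locator}.{region}.{cloud}"


def build_org_account_name(org_name: str, account_name: str) -> str:
    """Return the alternative <org_name>-<account_name> form."""
    return f"{org_name.strip()}-{account_name.strip()}"


print(build_account_name("rs31112", "europe-west4", "gcp"))  # rs31112.europe-west4.gcp
print(build_account_name("rs31112", "us-west-2", "aws"))     # rs31112
print(build_org_account_name("myorg", "myaccount"))          # myorg-myaccount
```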
5. Click Test Connection to validate the credentials. If successful, a Test Connection Success message will appear.
Click Next to continue.
6. In this step, choose one of two scan setting options:
i) Show all databases to select
ii) Select specific database(s) that you have permission for
i) To show all databases, select the first scan setting. This will show a list of all the Snowflake databases.
ii) To select specific databases you have permission for, select the second scan setting.
Click on Add database name.
Type the name of the database you would like to scan in the Search box and choose the correct option from the drop-down list.
7. After completing step 6, check the tickboxes next to the databases you would like to add.
Now we are ready to connect to the test database and proceed.
Click on Start Sampling.
This will show you the following message:
Click on Proceed with Sampling.
Now you can browse the updated datasource.
During the scanning process, some databases may be skipped if required permissions or configurations are missing. LightBeam provides a clear way to identify and address these skipped databases.
In the Datasources section, skipped databases are highlighted in the Overview panel:
A yellow notification banner indicates the number of skipped databases (e.g., "12 Skipped") and states the reason, for example: "necessary permissions haven’t been configured."
Click View Details on the yellow banner.
A modal window will appear, listing the names of the skipped databases (e.g., sandbox, automation_stuff, test2).
The modal provides a description explaining that these databases were skipped due to missing permissions or errors.
Once permissions are configured, LightBeam will automatically include these databases in the next scan cycle.
If no data is being scanned and no error is shown, it might be a permission issue. Run a SELECT * query on a table with the same user and check whether you can see the data. If you get a permission denied message, grant the required permission to the user.
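A minimal sketch of how such a check query could be assembled is shown here. The helper names are hypothetical, and the database, schema, and table names are placeholders. Note that double-quoted identifiers in Snowflake are case-sensitive, so names should be passed exactly as they are stored (often uppercase).

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_probe_query(database: str, schema: str, table: str) -> str:
    """Build a one-row SELECT * that exercises read permission on a table."""
    qualified = ".".join(quote_ident(part) for part in (database, schema, table))
    # LIMIT 1 keeps the check cheap; only the permission outcome matters here.
    return f"SELECT * FROM {qualified} LIMIT 1;"


# Placeholder object names, assumed for illustration only.
print(build_probe_query("SANDBOX", "PUBLIC", "CUSTOMERS"))
```

Run the printed statement while logged in as the user and role configured for LightBeam. A permission error there points to missing grants rather than a LightBeam problem.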
Whitelisting IP address
By default, Snowflake allows users to connect to the service from any computer or device. If there is an active policy that allows access only from certain networks, add the public IP addresses of all the nodes where the LightBeam cluster is running.
1. Go to Admin -> Security in the Snowflake UI. Create a new network rule containing the public IP addresses of all LightBeam nodes.
2. Attach this network rule to the active network policy as Allowed.
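The same two steps can also be expressed as SQL instead of UI clicks. The sketch below generates the statements from a list of node IPs. The rule name, policy name, and IP addresses are placeholders (the IPs come from a documentation-only range), and the statement forms should be checked against Snowflake's network rule documentation before use.

```python
import ipaddress


def build_network_rule_sql(rule_name: str, policy_name: str, node_ips: list[str]) -> list[str]:
    """Return SQL that creates an ingress rule for the node IPs and allows it on a policy."""
    if not node_ips:
        raise ValueError("at least one LightBeam node IP is required")
    # IPv4Address raises ValueError on malformed input, so bad IPs fail early.
    values = [f"'{ipaddress.IPv4Address(ip.strip())}'" for ip in node_ips]
    value_list = ", ".join(values)
    return [
        f"CREATE NETWORK RULE {rule_name} TYPE = IPV4 "
        f"VALUE_LIST = ({value_list}) MODE = INGRESS;",
        f"ALTER NETWORK POLICY {policy_name} "
        f"ADD ALLOWED_NETWORK_RULE_LIST = ('{rule_name}');",
    ]


# Placeholder names and documentation-range IPs, assumed for illustration.
for statement in build_network_rule_sql(
    "lightbeam_nodes", "corp_policy", ["203.0.113.10", "203.0.113.11"]
):
    print(statement)
```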
Private link is a feature for securing connectivity between your clients and Snowflake without traversing the public Internet.
To set it up, follow the Snowflake private connectivity documentation for your cloud provider (AWS, Azure, or GCP).
Create a new user with all the permissions LightBeam requires to scan the datasource.
User: A user in Snowflake is an account in the system, generally associated with an individual person. Users can log into Snowflake, issue SQL commands, manage data, and perform other operations. A user is associated with specific properties, such as login name, password, and default role.
Role: A role in Snowflake, on the other hand, is a named set of access privileges that can be granted to users or other roles. These privileges determine what actions a user can perform and on which database objects.
A user can be assigned multiple roles and can switch between them during a session to access different sets of privileges as needed.
In essence, a user is who logs into the system, and a role determines what that user can do once they are logged in. This distinction allows Snowflake to provide flexible and granular control over access to its resources.
The following SQL snippet can be used for creating a role, a user, and assigning permissions to a single database.
If you want to scan more than one database, it is recommended to create a user and assign read permissions to all databases.
Step 1: First, create a user and assign permissions to use a warehouse.
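As a rough illustration of what step 1 involves, the sketch below assembles statements that create a role and a user, link them, and grant warehouse usage. This is an assumption about the general shape of such a setup, not LightBeam's official snippet. The user name and password are placeholders; the role and warehouse names reuse the examples from this guide.

```python
def build_user_setup_sql(role: str, user: str, password: str, warehouse: str) -> list[str]:
    """Return SQL that creates a role and user and lets the role use a warehouse."""
    # Escape single quotes so the password stays a valid SQL string literal.
    safe_password = password.replace("'", "''")
    return [
        f"CREATE ROLE IF NOT EXISTS {role};",
        f"CREATE USER IF NOT EXISTS {user} PASSWORD = '{safe_password}' "
        f"DEFAULT_ROLE = {role} DEFAULT_WAREHOUSE = {warehouse};",
        f"GRANT ROLE {role} TO USER {user};",
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
    ]


# Placeholder user name and password, assumed for illustration.
for statement in build_user_setup_sql(
    "lightbeam_users", "lightbeam_user", "change-me", "compute_wh"
):
    print(statement)
```

Running these statements typically requires a role that is allowed to create users and roles in the account.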
Step 2: Now assign permissions to the role to access all databases in the account.
In the step 2 SQL snippet, replace the ROLE placeholder with the name of the role created in step 1. Run the SQL snippet; it will print a set of SQL statements for granting permissions. Copy the output and run those statements.
Note: If you want to exclude some databases, modify the SQL snippet in step 2 to include the name of the database alongside SNOWFLAKE_SAMPLE_DATA.
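The generate-then-run pattern of step 2 can be sketched in Python as follows. The function takes a list of database names, skips the excluded ones, and prints the grant statements to copy and run. The exact privileges LightBeam needs are an assumption here (database usage, schema usage, and table SELECT), and the function and variable names are hypothetical.

```python
DEFAULT_EXCLUDED = frozenset({"SNOWFLAKE_SAMPLE_DATA"})


def build_read_grants(role: str, databases: list[str], excluded=DEFAULT_EXCLUDED) -> list[str]:
    """Return read-only GRANT statements for every database not in `excluded`."""
    skip = {name.upper() for name in excluded}
    statements = []
    for db in databases:
        # Compare case-insensitively, since unquoted names are stored uppercase.
        if db.upper() in skip:
            continue
        statements.append(f"GRANT USAGE ON DATABASE {db} TO ROLE {role};")
        statements.append(f"GRANT USAGE ON ALL SCHEMAS IN DATABASE {db} TO ROLE {role};")
        statements.append(f"GRANT SELECT ON ALL TABLES IN DATABASE {db} TO ROLE {role};")
    return statements


# Example database names taken from this guide; the role is from step 1.
for statement in build_read_grants(
    "lightbeam_users", ["sandbox", "SNOWFLAKE_SAMPLE_DATA", "test2"]
):
    print(statement)
```

To exclude more databases, pass a larger set, for example `excluded={"SNOWFLAKE_SAMPLE_DATA", "automation_stuff"}`.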
Provide the created Username, Password, Role Name, Warehouse Name, and Account Name to register the Snowflake datasource.
Next, validate these permissions on the database. This confirms that the credentials provided have authorized access to the database. After validating the permissions, you can configure LightBeam Spectra on the system.
Install snowsql on the machine.
Steps
First, clone the repository.
Go into the sql_user_check_snowflake/ directory.
* WAREHOUSE_NAME: Name of the warehouse in your Snowflake instance.
* ACCOUNT_NAME: Name of your Snowflake account.
* ROLE_NAME: Name of the role assigned to the user in Snowflake.
* SF_USERNAME: Username for the Snowflake instance.
* SF_DATABASE: Name of the database in Snowflake to which you wish to establish a connection and validate the permissions.
To validate whether the commands were successful, check the output of the file generated from the commands.
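For orientation, the parameters listed above map onto standard snowsql connection options roughly as sketched below. This is not the script from the sql_user_check_snowflake/ directory, only an assumed equivalent: the function name, the sample values, and the check query are all illustrative. snowsql prompts for the password unless it is supplied another way, such as the SNOWSQL_PWD environment variable.

```python
import shlex


def build_snowsql_command(account_name: str, sf_username: str, role_name: str,
                          warehouse_name: str, sf_database: str, query: str) -> list[str]:
    """Return a snowsql argument list that runs one query with the given context."""
    return [
        "snowsql",
        "-a", account_name,    # ACCOUNT_NAME
        "-u", sf_username,     # SF_USERNAME
        "-r", role_name,       # ROLE_NAME
        "-w", warehouse_name,  # WAREHOUSE_NAME
        "-d", sf_database,     # SF_DATABASE
        "-q", query,
    ]


# Sample values reuse the examples from this guide; the user name is a placeholder.
command = build_snowsql_command(
    "rs31112.europe-west4.gcp", "lightbeam_user", "lightbeam_users",
    "compute_wh", "sandbox",
    "SELECT CURRENT_ROLE(), CURRENT_WAREHOUSE(), CURRENT_DATABASE();",
)
print(shlex.join(command))  # shell-quoted form, ready to paste into a terminal
```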