Snowflake
Connecting Snowflake to LightBeam
Overview
LightBeam Spectra users can connect various data sources to the LightBeam application and these data sources will be continuously monitored for PII, PHI data.
Examples: Snowflake, SMB, MySQL, PostgreSQL, etc.
Onboarding Snowflake Data Source
Login to your LightBeam Instance.
Click on DATASOURCES on the Top Navigation Bar.
Click on “Add a data source”.
Search for “Snowflake”.
3. Fill in the details as shown below and click Next:
Basic Information
Instance Name: This is the unique name given to the data source.
Description: This is an optional field needed to describe the use of this data source.
Primary owner: Email address of the person responsible for this data source which will get alerts by default.
Source of Truth: LightBeam Spectra would have monitored data sources that contain data acting as a single point of truth and that can be used for looking up entities/attributes that help to identify if the other attributes/entities found in any other data source are accurate or not. A Source of Truth data set would create entities based on the attributes found in the data.

Connection Details
Provide the following details in the Connection section:
Username: The Snowflake account username (e.g.,
admin
).Password: The password associated with the username.
Account Name: Enter the account name in the format
<account_locator>.<region>.<cloud>
. For example:rs31112.europe-west4.gcp
.For AWS
us-west-2
, use only the<account_locator>
.Alternatively,
<org_name>-<account_name>
can also be used.
Role: The role assigned to the user for accessing the Snowflake instance (e.g.,
lightbeam_users
).Warehouse: Specify the warehouse (e.g.,
compute_wh
).


5. Click Test Connection to validate the credentials. If successful, a Test Connection Success message will appear.
Click Next to continue.
6. In this step, you can choose either of two scan setting options –
i) Show all databases to select
ii) Select specific database(s) that you have permission for
i) To show all databases, select the first scan setting. This will show a list of all the Snowflake databases.
ii) To select specific databases you have permission for, select the second scan setting.
Click on Add database name.
Type the name of the database you would like to scan in the Search box and choose the correct option from the drop-down list.
7. After completing step 6, check the tickboxes next to the databases you would like to add.
Now we are ready to connect to the test database and proceed.
Click on Start Sampling.
This will show you the following message:
Click on Proceed with Sampling.
Now you can browse the updated datasource.
Viewing Details of Skipped Databases
During the scanning process, some databases may be skipped if required permissions or configurations are missing. LightBeam provides a clear way to identify and address these skipped databases.
In the Datasources section, skipped databases are highlighted in the Overview panel:
A yellow notification banner indicates the number of skipped databases (e.g., "12 Skipped") and states the reason, for example: "necessary permissions haven’t been configured."

Resolving Scan Issues for Skipped Databases
Click View Details on the yellow banner.
A modal window will appear, listing the names of the skipped databases (e.g.,
sandbox
,automation_stuff
,test2
).The modal provides a description explaining that these databases were skipped due to missing permissions or errors.
Once permissions are configured, LightBeam will automatically include these databases in the next scan cycle.

APPENDIX
Troubleshooting
If you don’t see any data being scanned without error, it might be a permission issue. Consider running a SELECT * query on a table and see if you are able to see the data. If you see a message of permission denied, consider granting permission to the user.
Whitelisting IP address
By default, Snowflake allows users to connect to the service from any computer or device. If there is an active policy that allows access only from certain networks, add the public IP of all the nodes where the LightBeam cluster is running. 1. Goto admin -> security from snowflake UI. Create a new network rule containing the Public IP address of all LightBeam nodes.

2. Attach this network rule to the active network policy as Allowed.

Connecting through Private Links
A private link is a feature for securing connectivity between your clients and the Snowflake without traversing the public Internet.
Following these links if you want to setup that:-
Setting up a new user in Snowflake
We need to create a new user containing all permissions required by LightBeam to scan the datasource.
User: A user in Snowflake is an account in the system, generally associated with an individual person. Users can log into Snowflake, issue SQL commands, manage data, and perform other operations. A user is associated with specific properties, such as login name, password, and default role.
Role: A role in Snowflake, on the other hand, is a named set of access privileges that can be granted to users or other roles. These privileges determine what actions a user can perform and on which database objects.
A user can be assigned multiple roles and can switch between them during a session to access different sets of privileges as needed.
In essence, a user is who logs into the system, and a role determines what that user can do once they are logged in. This distinction allows Snowflake to provide flexible and granular control over access to its resources.
The following SQL snippet can be used for creating a role, a user, and assigning permissions to a single database.
-- Create a new role containing all permissions. Replace ROLE placeholder with a role name of your choice
CREATE ROLE <ROLE>
-- Grant access to a warehouse. Replace WAREHOUSE placeholder with an existing warehouse in your instance.
GRANT usage on warehouse <WAREHOUSE> to role <ROLE>;
GRANT operate on warehouse <WAREHOUSE> to role <ROLE>;
-- Grant access to connect to the specific database and schema. Replace DATABASE placeholder with the name of database you want to scan.
GRANT USAGE on DATABASE <DATABASE> to ROLE <ROLE>;
-- Replace SCHEMA placeholder with the name of schema you want to scan. If you have multiple schemas in the database repeat the below SQL statement for every schema.
GRANT USAGE on SCHEMA <SCHEMA> to ROLE <ROLE>;
-- Grant access to scan existing and future tables in a database.
-- Repeat this for all databases that you want to scan.
GRANT SELECT ON ALL TABLES IN DATABASE <DATABASE> to ROLE <ROLE>;
GRANT SELECT ON FUTURE TABLES IN DATABASE <DATABASE> to ROLE <ROLE>;
-- Create a user with a strong password and default warehouse. Replace USERNAME placeholder with the username of your choice.
-- Replace the value of DEFAULT_WAREHOUSE with an existing warehouse in your instance.
CREATE USER <USERNAME> PASSWORD='<PASSWORD>' DEFAULT_WAREHOUSE=<WAREHOUSE>;
-- Assign the role created above to the new user.
GRANT ROLE <ROLE> TO USER <USERNAME>;
If you want to scan more than one database it is recommended to create a user and assign read permissions to all databases.
Step 1: First, create a user and assign permissions to use a warehouse.
-- Create a new role. Replace ROLE placeholder with a role name of your choice
CREATE ROLE <ROLE>
-- Grant access to a warehouse. Replace WAREHOUSE placeholder with an existing warehouse in your instance.
GRANT usage on warehouse <WAREHOUSE> to role <ROLE>;
GRANT operate on warehouse <WAREHOUSE> to role <ROLE>;
-- Create a user with a strong password and default warehouse. Replace USERNAME placeholder with the username of your choice.
-- Replace the value of DEFAULT_WAREHOUSE with an existing warehouse in your instance.
CREATE USER <USERNAME> PASSWORD='<PASSWORD>' DEFAULT_WAREHOUSE=<WAREHOUSE>;
-- Assign the role created above to the new user.
GRANT ROLE <ROLE> TO USER <USERNAME>;
Step 2: Now assign permissions to the role to access all databases in the account.
In the SQL snippet below replace the ROLE
placeholder with the name of the role created in step 1. Run the SQL snippet, it will print a bunch of SQL statements for granting permissions. Copy the output and run those statements again.
SELECT
'grant usage on database ' || DATABASE_NAME || ' to role <ROLE>;'
FROM
information_schema.databases
WHERE
database_name NOT in('SNOWFLAKE', 'SNOWFLAKE_SAMPLE_DATA')
UNION ALL
SELECT
'grant usage on all schemas in database ' || DATABASE_NAME || ' to role <ROLE>;'
FROM
information_schema.databases
WHERE
database_name NOT in('SNOWFLAKE', 'SNOWFLAKE_SAMPLE_DATA')
UNION ALL
SELECT
'grant select on all tables in database ' || DATABASE_NAME || ' to role <ROLE>;'
FROM
information_schema.databases
WHERE
database_name NOT in('SNOWFLAKE', 'SNOWFLAKE_SAMPLE_DATA')
UNION ALL
SELECT
'grant select on future tables in database ' || DATABASE_NAME || ' to role <ROLE>;'
FROM
information_schema.databases
WHERE
database_name NOT in('SNOWFLAKE', 'SNOWFLAKE_SAMPLE_DATA')
Note: If you want to exclude some databases modify the SQL snippet in step 2 to include the name of the database alongside SNOWFLAKE_SAMPLE_DATA
.
Provide the createdUsername
, Password
, Role Name
, Warehouse Name
and Account Name
to register the Snowflake datasource.
Setting up Resource Monitor
A resource monitor in snowflake can help control costs and avoid unexpected credit usage caused by running warehouses. Resource monitor can suspend a warehouse if credit limit for warehouse exceeds than specified limit.
Note: resource monitor can be created from user with ACCOUNTADMIN role only
You can create Resource Monitor by executing these queries from snowflake
CREATE OR REPLACE RESOURCE MONITOR <name of resource limiter>
WITH CREDIT_QUOTA = <credit limit>
FREQUENCY = <DAILY/WEEKLY/MONTHLY/YEARLY>
START_TIMESTAMP = IMMEDIATELY
NOTIFY_USERS = (<list of users to send notification>)
TRIGGERS ON 50 PERCENT DO NOTIFY
ON 60 PERCENT DO SUSPEND
ON 100 PERCENT DO SUSPEND_IMMEDIATE;
To assign the created resource monitor to a warehouse use this query
ALTER WAREHOUSE <name fo warehouse>
SET RESOURCE_MONITOR = <name of resource limiter>;
Creating Resource Monitor from UI
Follow these steps to create resource monitor from snowflake UI
Click on Admin in snowflake UI
Figure 9: select Admin Click on resource monitors tab
Figure 10: List resource monitors Create a resource limiter from Resource monitor button in top right corner. If you want to use exisitng resource monitor skip to step
Figure 11: resource monitor button After clicking add resource monitor button, you should see a pop up this, Fill name, credit quota, monitor type , select warehouse. By default schedule is monthly, you can customize it according to your needs, follow step 5 for schedule
Figure 12: Fill resource monitor information Click on customize and select frequency type and start date from UI . You can provide custom range of time for resource monitor as well.
Figure 13: cutomize frequcny cyle of resource monitor Click on apply and click on create resource monitor button to save the entered resource monitor information. Resource monitor is created successfully, you can see information of resource monitor like this when you click on a resource limiter from list.

Validate permissions to the database
Next, the user needs to validate these permissions to the database. This ensures authorized access to the database by the credentials provided by the user. After validating the permissions to the database, the user can configure LightBeam Spectra on the system.
Prerequisite
Install snowsql
on the machine.
Steps
First, clone the repository https://github.com/lightbeamai/lb-installer
Go into
sql_user_check_snowflake/
directory
Run the script
WAREHOUSE_NAME=<WAREHOUSE NAME> ACCOUNT_NAME=<SNOWFLAKE ACCOUNT NAME> ROLE_NAME=<ROLE ASSIGNED TO USER> SF_USERNAME=<USERNAME> SF_DATABASE=<DATABASE TO CONNECT> bash run.sh
User Credentials
* WAREHOUSE_NAME:
Name of the warehouse in your Snowflake instance.
* ACCOUNT_NAME:
Name of your Snowflake account.
* ROLE_NAME:
Name of the role assigned to the user in Snowflake.
* SF_USERNAME:
Username for the Snowflake instance.
* SF_DATABASE:
Name of the database in Snowflake to which you wish to establish a connection and validate the permissions.
To validate whether the commands were successful, check the output of the file generated from the commands.
About LightBeam
LightBeam automates Privacy, Security, and AI Governance, so businesses can accelerate their growth in new markets. Leveraging generative AI, LightBeam has rapidly gained customers’ trust by pioneering a unique privacy-centric and automation-first approach to security. Unlike siloed solutions, LightBeam ties together sensitive data cataloging, control, and compliance across structured and unstructured data applications providing 360-visibility, redaction, self-service DSRs, and automated ROPA reporting ensuring ultimate protection against ransomware and accidental exposures while meeting data privacy obligations efficiently. LightBeam is on a mission to create a secure privacy-first world helping customers automate compliance against a patchwork of existing and emerging regulations.
For any questions or suggestions, please get in touch with us at: [email protected].
Last updated