Delta Reporting
1. Overview
Delta Reporting gives organizations a detailed view of their data landscape, focusing on sensitive data discovery and trends over a rolling four-week period. It runs automated, daily assessments of both structured and unstructured data sources, identifying sensitive data, tracking changes, and measuring policy compliance. This document outlines the technical specifications, usage instructions, and interpretation guidelines for the Delta Report. The Delta Report helps organizations by:
Providing a regular, automated assessment of an organization's data landscape to identify potential risks and opportunities.
Identifying and quantifying sensitive data across both structured and unstructured data sources, enabling targeted data protection efforts.
Tracking changes in sensitive data metrics over time to monitor progress and identify trends.
Supporting data governance efforts and ensuring compliance with data retention policies and regulatory requirements.
Enabling data-driven decision-making by providing actionable insights into the organization's data ecosystem.
The Delta Report includes percentage change calculations for various metrics, such as sensitive data counts, file counts, and column counts, to provide additional context and help users understand the relative magnitude of changes over time. These percentages are calculated using the exact counts to ensure accuracy, even though the tables and graphs may display abbreviated values.
Percentage Change = (Current Week's Count - Previous Week's Count) / Previous Week's Count * 100
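The formula above, and the reason exact counts matter, can be illustrated with a short Python sketch. The function names percentage_change and abbreviate are hypothetical and are not part of the product. The abbreviation format (one decimal place with a K suffix) and the handling of a zero previous count are assumptions made for illustration, since this document does not describe how the report treats those cases.

```python
from typing import Optional


def percentage_change(current: int, previous: int) -> Optional[float]:
    """Week-over-week change in percent, computed from exact counts."""
    if previous == 0:
        # Assumption: the formula is undefined for a zero baseline, so this
        # sketch returns None instead of guessing how the report displays it.
        return None
    return (current - previous) / previous * 100


def abbreviate(count: int) -> str:
    """Hypothetical display format, e.g. 1240 -> '1.2K' (an assumption)."""
    if count < 1000:
        return str(count)
    return f"{count / 1000:.1f}K"


previous_week, current_week = 1240, 1290

# Exact counts give the change the report is described as using.
exact = percentage_change(current_week, previous_week)

# Recomputing from the abbreviated labels (1.2K -> 1.3K) would overstate it.
from_labels = (1.3 - 1.2) / 1.2 * 100

print(abbreviate(previous_week), "->", abbreviate(current_week))
print(f"exact: {exact:.2f}%  from labels: {from_labels:.2f}%")
```

With these made-up counts, the exact change is roughly 4%, while a calculation based on the abbreviated labels would suggest roughly 8%. This is why a percentage in the report may not match what the displayed values alone imply.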
Report Generation Frequency and Data Coverage
Delta Reports are generated daily, providing a rolling four-week view of the data landscape. Each report covers data from the preceding 28 days, offering an up-to-date analysis of sensitive data within the organization. This frequency ensures that organizations have access to the most recent and relevant information to effectively manage their data assets.
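As a rough illustration of the rolling window, the Python sketch below derives a 28-day reporting period and its four weekly ranges from an end date. It assumes the window is inclusive of both its first and last day and that weeks are consecutive 7-day blocks; the function names are hypothetical. The example period used later in this document (April 17, 2024, to May 14, 2024) is consistent with that assumption.

```python
from datetime import date, timedelta
from typing import List, Tuple


def reporting_window(end: date, days: int = 28) -> Tuple[date, date]:
    """Return the inclusive (start, end) dates of a rolling window."""
    return end - timedelta(days=days - 1), end


def weekly_ranges(end: date, weeks: int = 4) -> List[Tuple[date, date]]:
    """Split the window into consecutive, inclusive 7-day ranges."""
    start, _ = reporting_window(end, days=weeks * 7)
    return [
        (start + timedelta(days=7 * i), start + timedelta(days=7 * i + 6))
        for i in range(weeks)
    ]


# Print the four weekly ranges for a window ending on May 14, 2024.
for week_start, week_end in weekly_ranges(date(2024, 5, 14)):
    print(week_start.isoformat(), "to", week_end.isoformat())
```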
Report Availability and Accessibility Timeframe
Once generated, the Delta Report remains available for download for 24 hours. After this timeframe, a new report will be generated, reflecting the most recent data landscape changes. This ensures that users always have access to the latest information while maintaining a manageable storage footprint.
2. Generating the Delta Report
To download the Delta Report, follow these steps:
Navigate to the Datasources section within the application.
Locate and click on the "Delta Summary (4 weeks)" button. This will initiate the report generation process.
The Delta Report will automatically download to your designated download location, typically within a few seconds.
3. Interpreting the Delta Report
Report Structure Breakdown
The Delta Report is organized into two main sections:
Structured Data Sources: This section focuses on sensitive data findings within structured data sources, such as databases and spreadsheets.
Unstructured Data Sources: This section covers sensitive data discoveries within unstructured data sources, such as documents, emails, and files.
Page-wise Breakdown:
3.1 Table of Contents
The table of contents lists the main sections of the Delta Report and their corresponding page numbers, allowing users to quickly navigate.
3.2 Overview
The overview page provides a high-level summary of key metrics across all monitored data sources for the reporting period (April 17, 2024, to May 14, 2024, in the example report described here). It includes:
Monitored Data Sources: The total number of data sources monitored, and the count of sources added or deleted during the period.
Total Files: The aggregate count of files across all sources, including the number of files added or deleted.
Tables with Sensitive Data: The total number of database tables containing sensitive data, and the count of tables added or deleted.
Columns with Sensitive Data: The total number of database columns with sensitive data, and the count of columns added or deleted.
Sensitive Data Types: The total number of sensitive data types identified, and the count of types added or deleted.
Sensitive Data Count: The total count of sensitive data instances across all sources, including instances added or deleted.
Document Classification: The total number of files analyzed and the count of document classifications identified.
Policies: The number of data handling or security rule sets violated and the total number of rule sets defined.
Data Subject Categories: The number of data subject categories considered at risk based on sensitive data findings and the total number of categories defined.
Percentage Change: The table also includes the Percentage Change in the total count from one week to the next.
This page provides a quick snapshot of the organization's sensitive data landscape and changes during the reporting period.
3.3 Unstructured Datasources
This page focuses on unstructured data sources, such as files and documents. It presents a table with the following columns:
Data Source Name: The name or identifier of each unstructured source.
Added High Sensitivity Instances: The number of high-sensitivity instances added to each source during the period.
Deleted High Sensitivity Instances: The number of high-sensitivity instances deleted from each source during the period.
Percentage Change: The table also includes the Percentage Change in the total number of instances from one week to the next.
The table summarizes the changes in high-sensitivity data within each unstructured source, enabling quick identification of sources with significant additions or deletions of critical sensitive information.
3.4 Total Files
This page provides a detailed analysis of the total number of files across all unstructured sources over the 4-week reporting period. It includes:
4 Weeks Trend: A summary of file count changes, including the current count, files added, and files deleted.
Trend for Total Files: A line graph visualizing the trend of the total file count over the 4-week period.
Delta Changes for PII Files: A bar graph illustrating the number of files added and deleted each week.
Date Range Breakdown Table: A table providing a granular view of file count changes for each week, including the starting and ending counts, files added, and files deleted. The table also includes the Percentage Change in the total number of files from one week to the next.
This page allows users to analyze file volume, growth, and notable changes at different granularities across unstructured sources.
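The Date Range Breakdown Table can be thought of as a running tally. The Python sketch below rebuilds such a table from weekly added and deleted counts. It assumes that each week's ending count equals its starting count plus files added minus files deleted, and that the percentage change compares a week's ending count with the previous week's ending count. The names WeekRow and build_breakdown and all sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class WeekRow:
    label: str
    starting: int
    added: int
    deleted: int
    ending: int
    pct_change: Optional[float]  # None when the starting count is zero


def build_breakdown(
    initial_count: int, weekly_deltas: List[Tuple[str, int, int]]
) -> List[WeekRow]:
    """Build one row per week from (label, added, deleted) tuples."""
    rows: List[WeekRow] = []
    count = initial_count
    for label, added, deleted in weekly_deltas:
        ending = count + added - deleted
        pct = None if count == 0 else (ending - count) / count * 100
        rows.append(WeekRow(label, count, added, deleted, ending, pct))
        count = ending  # this week's ending count starts the next week
    return rows


# Made-up sample data: 1000 files at the start of the window.
rows = build_breakdown(
    1000,
    [("Week 1", 50, 10), ("Week 2", 0, 40), ("Week 3", 25, 0), ("Week 4", 5, 5)],
)
for row in rows:
    change = "n/a" if row.pct_change is None else f"{row.pct_change:+.2f}%"
    print(row.label, row.starting, row.added, row.deleted, row.ending, change)
```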
3.5 Sensitive Data Count: High Sensitivity
This page focuses on the count of high-sensitivity instances within unstructured sources over the 4-week period. It includes:
4 Weeks Trend: A summary of changes in high-sensitivity instances, including the current count, instances added, and instances deleted.
Trend for Total High Sensitivity Instances: A line graph visualizing the trend of high-sensitivity instances over the 4-week period.
Delta Changes for High Sensitivity Instances: A bar graph illustrating the number of high-sensitivity instances added and deleted each week.
Date Range Breakdown Table: A table providing a granular view of high-sensitivity instance changes for each week, including the starting and ending counts, instances added, and instances deleted. The table also includes the Percentage Change in the total number of instances from one week to the next.
This page enables users to analyze the volume, growth, and notable changes of critical sensitive data within unstructured sources at different granularities.
3.6 Sensitive Data Count: Total V/S High
This page compares the total sensitive data count to the high-sensitivity count for unstructured sources over the 4-week period. It includes:
4 Weeks Trend: A summary of changes in total sensitive data instances and high-sensitivity instances, including current counts, instances added, and instances deleted for both categories.
Total vs. High Sensitivity: A stacked bar graph illustrating the comparison between total sensitive data instances and high-sensitivity instances over the 4-week period.
Date Range Breakdown Table: A table offering a weekly comparison of total and high sensitivity counts, allowing users to compare the values and identify significant changes or patterns. The table also includes the Percentage Change in total attributes and high sensitivity attributes from one week to the next.
This page helps users understand the proportion and trends of critical sensitive data within the overall sensitive data landscape for unstructured sources.
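To make the idea of proportion concrete, the following Python sketch computes the share of high-sensitivity instances within the total for each week. It assumes that high-sensitivity instances are a subset of the total count, as the comparison on this page suggests. The function name and the weekly numbers are hypothetical.

```python
def high_sensitivity_share(total: int, high: int) -> float:
    """Percentage of all sensitive data instances that are high sensitivity."""
    if high > total:
        # Assumption: high-sensitivity instances are counted within the total.
        raise ValueError("high-sensitivity count cannot exceed the total")
    if total == 0:
        return 0.0
    return high / total * 100


# Made-up weekly (total, high) counts for one unstructured source.
weekly_counts = [(2000, 500), (2100, 630), (2050, 615), (2200, 550)]

for week, (total, high) in enumerate(weekly_counts, start=1):
    print(f"Week {week}: {high_sensitivity_share(total, high):.1f}% high sensitivity")
```

A rising share with a flat total can indicate that newly discovered data is skewed toward critical types, which the stacked bar graph is meant to surface.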
3.7 Structured Datasources
This page focuses on structured data sources, such as databases. It presents a table with the following columns:
Data Source Name: The name or identifier of each structured source.
Added High Sensitivity Instances: The number of high-sensitivity instances added to each source during the period.
Deleted High Sensitivity Instances: The number of high-sensitivity instances deleted from each source during the period.
The table also includes the Percentage Change in Deleted High Sensitivity instances and Added High Sensitivity instances from one week to the next.
The table summarizes the changes in high-sensitivity data within each structured source, enabling quick identification of sources with significant additions or deletions of critical sensitive information.
3.8 Tables with Sensitive Data
This page provides an in-depth analysis of the number of database tables containing sensitive data within structured sources over the 4-week period. It includes:
4 Weeks Trend: A summary of changes in the number of tables with sensitive data, including the current table count, tables added, and tables deleted.
Trend for Total Tables with Sensitive Data: A line graph visualizing the trend of the total number of tables with sensitive data over the 4-week period.
Delta Changes for PII Tables: A bar graph illustrating the number of tables with sensitive data added and deleted each week.
Date Range Breakdown Table: A table providing a granular view of changes in the number of tables with sensitive data for each week, including the starting and ending counts, tables added, and tables deleted.
The table also includes the Percentage Change in the number of tables with sensitive data added and deleted from one week to the next.
This page enables users to analyze the volume, growth, and notable changes in sensitive tables within structured sources at different granularities.
3.9 Columns with Sensitive Data - All
This page presents an in-depth analysis of the number of database columns containing sensitive data across all structured sources over the 4-week period. It includes:
4 Weeks Trend: A summary of changes in the number of columns with sensitive data, including the current column count, columns added, and columns deleted.
Trend for Total Columns with Sensitive Data: A line graph visualizing the trend of the total number of columns with sensitive data over the 4-week period.
Delta Changes for PII Columns: A bar graph illustrating the number of columns with sensitive data added and deleted each week.
Date Range Breakdown Table: A table providing a granular view of changes in the number of columns with sensitive data for each week, including the starting and ending counts, columns added, and columns deleted. The table also includes the Percentage Change in the number of columns containing sensitive data from one week to the next.
This page enables users to analyze the volume, growth, and notable changes in sensitive columns within structured sources at different granularities.
3.10 Sensitive Data Detected
This section provides a comprehensive overview of the various sensitive data attributes detected across all monitored structured data sources during the analysis period.
Sensitive Data Attribute Inventory: An inventory of all identified sensitive data attributes is presented in a clear and well-organized table format. For each attribute, the following details are provided:
Attribute Name: The label or description assigned to the specific type of sensitive data.
Sensitivity Level: The risk level associated with the attribute, classified as High, Medium, or Low sensitivity.
Total Instances: The total number of instances or occurrences of the attribute detected across all monitored structured sources.
Data Sources Containing Attribute: The number of distinct structured data repositories where instances of the attribute were identified.
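The inventory columns above can be read as a simple aggregation over per-source findings. The Python sketch below shows one way such a table could be derived. The input shape, the function name build_inventory, and the sample counts are assumptions for illustration only and do not describe the product's internal data model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Each finding: (data source name, attribute name, sensitivity level, instances)
Finding = Tuple[str, str, str, int]


def build_inventory(findings: Iterable[Finding]) -> List[dict]:
    """Aggregate findings into one inventory row per attribute."""
    totals: Dict[str, int] = defaultdict(int)
    sources: Dict[str, Set[str]] = defaultdict(set)
    levels: Dict[str, str] = {}
    for source, attribute, level, instances in findings:
        totals[attribute] += instances   # Total Instances
        sources[attribute].add(source)   # distinct sources containing it
        levels[attribute] = level        # Sensitivity Level
    return [
        {
            "attribute": name,
            "sensitivity": levels[name],
            "total_instances": totals[name],
            "data_sources": len(sources[name]),
        }
        for name in sorted(totals)
    ]


# Made-up findings for two structured sources.
sample = [
    ("MySQL", "SSN", "High", 120),
    ("Postgres DS", "SSN", "High", 30),
    ("MySQL", "Name", "Low", 400),
]
for row in build_inventory(sample):
    print(row)
```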
3.11 Document Classification
This section focuses on the results of the automated document classification analysis performed on files within the monitored unstructured data sources.
Document type: The different categories of documents identified during the classification process are listed and briefly described. Examples may be provided for clarity, such as:
Unclassified: Files that do not fit into any defined category.
Identity Documents: Files containing personal identification information (e.g., government IDs, passports, driving licenses).
Financial Documents: Files related to financial data (e.g., invoices, receipts, tax forms, account statements).
Files: For each identified document category, the total number of files belonging to that category is clearly presented.
3.12 Policies Violated
This section highlights any instances where the organization's defined data handling policies or security rules were violated within the monitored data sources during the reporting period.
Policy Violation Summary: A clear and concise summary is provided for each policy violation detected, including:
Policy Rule Set Name: The name or identifier of the violated policy or rule set.
Policy Type: The category or area the violated policy falls under (e.g., Access Controls, Data Discovery & Classification).
Affected Data Source(s): The name(s) of the monitored data source(s) where the violation occurred.
Objects Impacted: The number of data objects (e.g., files, database tables) affected by the policy violation.
Entities Impacted: The number of users, individuals, or other entities potentially impacted due to the violation.
3.13 Unstructured Data Source Overviews (pp. 13, 17, 21, 28)
These pages provide an overview of each monitored unstructured data source, such as "lb-s3" (Page 13), "Gmail Ds" (Page 17), "lb-g-drive" (Page 21), and "lb-outlook" (Page 28).
Each page includes the following key elements:
Data Source Summary: This section provides a high-level summary of the unstructured data source, including the data source name, owner, and any relevant metadata.
Key Metrics Table: This table presents essential metrics related to the data source, such as:
Total Files: The total number of files within the data source.
Sensitive Data Count (Total): The total count of sensitive data instances within the data source.
Sensitive Data Count (High): The count of high sensitivity data instances within the data source.
Entities at Risk: The number of entities (e.g., users, data subjects) associated with the sensitive data in the data source.
Rule Sets Violated: The number of data handling or security rule sets violated within the data source.
Percentage Change: The table also includes the Percentage Change in the total number of files from one week to the next.
The key metrics table provides a quick overview of the data source's size, sensitive data presence, and policy compliance status.
Top Sensitive Data Types Chart: A donut chart visually breaking down the top 5 sensitive data types detected and their instance counts:
Data type names (e.g., ID Number, US Driver License, SSN) displayed
The instance count for each data type shown
Sensitive Data Count - All Chart: A second donut chart showing all sensitive data types identified and their prevalence:
All data type names (e.g., Name, Birth Date, Address) listed
Total instances of each data type displayed
Document Classification - Sensitive Chart: A pie chart highlighting the top 5 sensitive document classifications found:
Document classification names (e.g., Identity, Financial) shown
File counts for each sensitive document category displayed
This overview page allows users to quickly gauge each unstructured source's sensitive data risk profile. The metrics, breakdowns, and visualizations enable identifying areas of concern, prioritizing remediation efforts, and making informed data security decisions specific to each repository.
3.14 Total Files (pp. 14-16, 18-20, 22-24, 29-31)
These sets of pages offer a detailed analysis of each unstructured data source over the 4-week period. For example, pages 14-16 focus on "lb-s3", pages 18-20 on "Gmail Ds", pages 22-24 on "lb-g-drive", and pages 29-31 on "lb-outlook".
Each set of pages includes the following key elements:
Total Files Analysis:
4 Weeks Trend: A summary of the file count changes over the 4-week period, including the current file count, files added, and files deleted.
Trend Graph: A line graph visualizing the trend of the total file count over the 4-week period, allowing users to identify patterns, growth, or decline.
Delta Changes Graph: A bar graph illustrating the number of files added and deleted during each week, providing a visual comparison of file changes across different weeks.
Date Range Breakdown Table: A table offering a granular view of the file count changes for each week, including the starting and ending file counts, as well as the specific numbers of files added and deleted. The table also includes the Percentage Change in the total number of files from one week to the next.
High Sensitivity Instances Analysis:
4 Weeks Trend: A summary of the changes in high sensitivity instances over the 4-week period, including the current count, instances added, and instances deleted.
Trend Graph: A line graph visualizing the trend of high sensitivity instances over the 4-week period, allowing users to identify patterns, growth, or decline.
Delta Changes Graph: A bar graph illustrating the number of high sensitivity instances added and deleted during each week, providing a visual comparison of instance changes across different weeks.
Date Range Breakdown Table: A table offering a granular view of the high sensitivity instance changes for each week, including the starting and ending counts, as well as the specific numbers of instances added and deleted. The table also includes the Percentage Change in high sensitivity instances from one week to the next.
Total vs. High Sensitivity Comparison:
4 Weeks Trend: A summary of the changes in total sensitive data instances and high sensitivity instances over the 4-week period, presenting the current counts, instances added, and instances deleted for both categories.
Comparison Graph: A stacked bar graph illustrating the comparison between total sensitive data instances and high sensitivity instances over the 4-week period, showing the proportion and trends of high sensitivity data within the overall sensitive data landscape.
Date Range Breakdown Table: A table offering a weekly comparison of total and high sensitivity counts, allowing users to compare the values and identify significant changes or patterns. The table also includes the Percentage Change in total attributes and high sensitivity attributes from one week to the next.
3.15 Monitored Data Sources - Structured (Individual Sources) (pp. 25, 32)
These pages provide an overview of each monitored structured data source, such as "MySQL" (Page 25) and "Postgres DS" (Page 32). Each page includes the following key elements:
1. Data Source Summary:
Data Source Name: The specific name assigned to the source (e.g., MySQL, Postgres DS).
Owner: The individual or team responsible for managing and administering the data source.
Metadata: Any additional relevant details about the data source, such as its purpose, database schema information, or access restrictions.
2. Key Metrics Table: This table presents essential metrics related to the sensitive data landscape within the structured source:
Tables with Sensitive Data: The total number of database tables that contain sensitive information.
Columns with Sensitive Data: The total number of columns across all tables that hold sensitive data.
Sensitive Data Count (High): The count of instances identified as highly sensitive data within the source.
Tables/Columns Added: New database tables or columns containing sensitive data added during the reporting period.
Tables/Columns Deleted: Database tables or columns with sensitive data that were removed during the reporting period.
Entities at Risk: The number of individuals or entities potentially impacted by sensitive data exposure within this source.
Rule Sets Violated: The count of data handling or security policies violated within this particular data source.
Percentage Change: The tables also include the Percentage Change in the number of sensitive tables and columns from one week to the next.
3. Sensitive Data Visualizations: This section provides visual representations of the sensitive data types present within the source:
Sensitive Data Count: High Sensitivity (Top 5): A donut chart displaying the top 5 highly sensitive data types and their respective instance counts.
Sensitive Data Count: All (Top 5): A donut chart showing the top 5 overall sensitive data types and their total instance counts.
4. Sensitive Tables Distribution: This section focuses on the distribution of sensitive data across different databases within the structured source:
Distribution of Sensitive Tables Across Databases (Top 5): A pie chart illustrating the top 5 databases and the number of sensitive tables within each.
Total number of sensitive tables identified: The overall count of database tables containing sensitive data within the source.
5. Top Sensitive Data Types: This section provides a detailed breakdown of the most prevalent sensitive data types detected within the source:
List of sensitive data types: A comprehensive list of the sensitive data types identified, such as Social Security Numbers, credit card information, or health records.
Instance counts: The number of instances recorded for each sensitive data type, indicating its prevalence within the source.
Risk assessment: This information helps users understand the potential risks associated with the data source and prioritize data protection efforts accordingly.
3.16 Datasource Analysis - Structured (Individual Sources) (pp. 26-27, 33-34)
These sets of pages offer a detailed analysis of each structured data source over the 4-week period. Pages 26-27 focus on "MySQL", and pages 33-34 cover "Postgres DS".
Each set of pages includes the following key elements:
Tables with Sensitive Data Analysis:
4 Weeks Trend: A summary of the changes in the number of tables containing sensitive data over the 4-week period, including the current table count, tables added, and tables deleted.
Trend Graph: A line graph visualizing the trend of the total number of tables with sensitive data over the 4-week period, allowing users to identify patterns, growth, or decline.
Delta Changes Graph: A bar graph illustrating the number of tables with sensitive data added and deleted during each week, providing a visual comparison of table changes across different weeks.
Date Range Breakdown Table: A table offering a granular view of the changes in the number of tables with sensitive data for each week, including the starting and ending counts, as well as the specific numbers of tables added and deleted. The table also includes the Percentage Change in the number of sensitive tables from one week to the next.
Columns with Sensitive Data Analysis:
4 Weeks Trend: A summary of the changes in the number of columns containing sensitive data over the 4-week period, including the current column count, columns added, and columns deleted.
Trend Graph: A line graph visualizing the trend of the total number of columns with sensitive data over the 4-week period, allowing users to identify patterns, growth, or decline.
Delta Changes Graph: A bar graph illustrating the number of columns with sensitive data added and deleted during each week, providing a visual comparison of column changes across different weeks.
Date Range Breakdown Table: A table offering a granular view of the changes in the number of columns with sensitive data for each week, including the starting and ending counts, as well as the specific numbers of columns added and deleted. The table also includes the Percentage Change in the number of sensitive columns from one week to the next.
3.17 End of Document
This page marks the end of the Delta Report document.
About LightBeam
LightBeam automates Privacy, Security, and AI Governance, so businesses can accelerate their growth in new markets. Leveraging generative AI, LightBeam has rapidly gained customers’ trust by pioneering a unique privacy-centric and automation-first approach to security. Unlike siloed solutions, LightBeam ties together sensitive data cataloging, control, and compliance across structured and unstructured data applications. It provides 360-visibility, redaction, self-service DSRs, and automated ROPA reporting, ensuring ultimate protection against ransomware and accidental exposures while meeting data privacy obligations efficiently. LightBeam is on a mission to create a secure, privacy-first world, helping customers automate compliance against a patchwork of existing and emerging regulations.