Virginia Secure Analytics and Governance Environment (SAGE)
DataSAGE, Virginia’s Secure Analytics and Governance Environment, launched in August 2020. The Commonwealth of Virginia has more than 1,400 data systems holding open and restricted data assets, which include de-identified data as well as personally identifiable information (PII). Linking information about individuals across multiple systems relies on matching PII across those systems in a secure way. To increase efficiency and remove vulnerabilities in sharing PII, Virginia's first Chief Data Officer, Carlos Rivero, created the Commonwealth Data Trust. DataSAGE is essentially the technical implementation of the Commonwealth Data Trust.
PII from the different systems is ingested into a consolidated, secure environment managed by the Office of Data Governance and Analytics, where it is used to build an anonymized crosswalk that assigns a master person identifier to each unique individual. This universal identifier is then mapped to the local identifier for that same individual in each system in which they appear. The anonymized crosswalk table is used to match an individual's de-identified attribute data across multiple systems. De-identified attribute data may include fields such as gender, driver’s license status, and participation in social service programs.
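The crosswalk idea can be illustrated with a small Python sketch. It is not DataSAGE's actual implementation: the matching rule (exact match on normalized name and birth date), the keyed hash, the system names, and the records are all assumptions made up for illustration, and real record linkage usually needs more tolerant matching.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical key; in practice it would never leave the secure environment.
SECRET_KEY = b"example-only-key"


def master_person_id(first_name, last_name, birth_date):
    """Derive a stable keyed hash from normalized PII (assumed matching rule)."""
    normalized = "|".join(p.strip().lower() for p in (first_name, last_name, birth_date))
    digest = hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_crosswalk(pii_records):
    """Map each (system, local_id) pair to a master person identifier."""
    return {
        (r["system"], r["local_id"]): master_person_id(
            r["first_name"], r["last_name"], r["birth_date"]
        )
        for r in pii_records
    }


def link_attributes(crosswalk, attribute_records):
    """Merge de-identified attributes that share a master identifier."""
    linked = defaultdict(dict)
    for rec in attribute_records:
        master_id = crosswalk.get((rec["system"], rec["local_id"]))
        if master_id is not None:
            linked[master_id].update(rec["attributes"])
    return dict(linked)


# Made-up example records from two hypothetical systems.
pii_records = [
    {"system": "licensing", "local_id": "L-001",
     "first_name": "Jane", "last_name": "Doe", "birth_date": "1990-01-01"},
    {"system": "benefits", "local_id": "B-778",
     "first_name": " jane", "last_name": "DOE ", "birth_date": "1990-01-01"},
]
attribute_records = [
    {"system": "licensing", "local_id": "L-001",
     "attributes": {"license_status": "active"}},
    {"system": "benefits", "local_id": "B-778",
     "attributes": {"program_participant": True}},
]

crosswalk = build_crosswalk(pii_records)
linked = link_attributes(crosswalk, attribute_records)
print(linked)
```

Only the crosswalk step touches PII. The attribute records carry local identifiers alone, so they can be joined through the master identifier without exposing names or birth dates.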
DataSAGE facilitates sharing of restricted-use data with research partners, state agencies, Commonwealth localities, and other organizations whose projects have been approved by the appropriate data owners.
Restricted-use data can be aggregated and summarized to allow for public consumption through the Open Data Portal. A variety of different data products can be automatically published directly to the Open Data Portal. This process and these decisions are governed by the . Learn more about the that occurs before the restricted data assets reach DataSAGE.
Access the DataSAGE platform.
Enterprise Data Catalog
The Enterprise Data Catalog is an inventory of the Commonwealth’s data assets. This repository of data combined with metadata is a powerful tool for self-service business intelligence, empowering Commonwealth agencies to make more informed decisions. The Commonwealth Metadata Dictionary is accessible to everyone, supporting data discovery and sharing.
Metadata is essentially “data about data”: the information needed to enable a defined set of users to discover and use data. It includes the names of data tables, the data elements within the tables and views, descriptions of those data elements, and how they relate to other data that individuals might want to use. Metadata provides valuable context about data assets, which enables individuals to share data in useful ways. Well-defined metadata can enable the transformation of data into intelligence and actionable insight. Some metadata contains information relevant to the retention and archiving requirements for data and allows organizations to track data assets for compliance purposes. Finally, when fully documented, metadata contains provenance and sourcing information that enables data analysts and data scientists to accurately cite data for research purposes. Metadata standards are important because they help ensure interoperability between systems and enhance the discovery of, and access to, data.
The Commonwealth Metadata Dictionary provides valuable context about data assets, enabling individuals to share data in useful ways and, ultimately, to transform data into intelligence and actionable insight.
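As a rough illustration of the kinds of metadata described above, the following Python sketch models a catalog entry and a simple keyword search. The class names, field names, and sample entries are hypothetical and do not reflect the actual schema of the Enterprise Data Catalog.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataElement:
    name: str
    description: str


@dataclass
class TableMetadata:
    table_name: str
    description: str
    elements: List[DataElement]
    related_tables: List[str] = field(default_factory=list)
    retention_years: Optional[int] = None  # retention and archiving requirement
    source: Optional[str] = None           # provenance, used for citation

    def is_fully_documented(self) -> bool:
        """True when provenance is recorded and every element is described."""
        return bool(self.source) and all(e.description.strip() for e in self.elements)


def search_catalog(catalog: List[TableMetadata], keyword: str) -> List[str]:
    """Return the names of tables whose metadata mentions the keyword."""
    needle = keyword.lower()
    hits = []
    for table in catalog:
        texts = [table.table_name, table.description]
        texts += [e.name for e in table.elements]
        texts += [e.description for e in table.elements]
        if any(needle in t.lower() for t in texts):
            hits.append(table.table_name)
    return hits


# Made-up catalog entries for illustration only.
catalog = [
    TableMetadata(
        table_name="driver_licenses",
        description="One row per issued license.",
        elements=[DataElement("status", "Current license status")],
        related_tables=["vehicle_registrations"],
        retention_years=7,
        source="Hypothetical licensing system",
    ),
    TableMetadata(
        table_name="program_enrollment",
        description="Enrollment in social service programs.",
        elements=[DataElement("program_code", "")],
    ),
]

print(search_catalog(catalog, "license"))
```

The second entry has an undescribed element and no recorded source, so `is_fully_documented()` reports it as incomplete, which is the kind of gap that limits discovery and citation.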
Virginia Open Data Portal
The Virginia Open Data Portal is a collaboration with the Library of Virginia. It extends access to Commonwealth of Virginia data, empowering our constituents to interpret, analyze, and transform our data into actionable intelligence. Secure and appropriate data sharing is fundamental to the success of our society because information supports engagement. Commonwealth data is a strategic asset that, when leveraged, can drive innovation, increase quality of life, and promote economic growth. The Virginia Open Data Portal provides more than just data access. Within the portal, you can view stories and dashboards, create visualizations, filter data, and access data via APIs (application programming interfaces) to build solutions in web and mobile applications.
Data Documentation Process
The Data Documentation Process enables data stewards from different agencies to connect to the and run a local tool that allows them to connect to their respective databases and then register the metadata for the tables and views they select. This process empowers and engages the data stewards in a meaningful way and gives them the authority to identify and register their highest value data assets.
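The registration step can be pictured with a short Python sketch. It uses an in-memory SQLite database as a stand-in for an agency database; the actual local tool, the databases it supports, and the metadata it records are not described here, so the function name, schema, and output format below are assumptions.

```python
import sqlite3


def register_metadata(conn, selected_names):
    """Collect column metadata for the tables and views a steward selects."""
    registered = []
    for name in selected_names:
        row = conn.execute(
            "SELECT type FROM sqlite_master "
            "WHERE name = ? AND type IN ('table', 'view')",
            (name,),
        ).fetchone()
        if row is None:
            raise ValueError(f"No table or view named {name!r}")
        columns = [
            {"name": col_name, "type": col_type}
            for col_name, col_type in conn.execute(
                "SELECT name, type FROM pragma_table_info(?)", (name,)
            )
        ]
        registered.append({"name": name, "kind": row[0], "columns": columns})
    return registered


# Stand-in for an agency database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE licenses (license_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE audit_log (entry TEXT);
    CREATE VIEW active_licenses AS
        SELECT license_id FROM licenses WHERE status = 'active';
    """
)

# The steward selects only the assets worth registering; audit_log is skipped.
for entry in register_metadata(conn, ["licenses", "active_licenses"]):
    print(entry["kind"], entry["name"], [c["name"] for c in entry["columns"]])
```

Only structural metadata (names, kinds, and column types) is read in this sketch. No rows from the underlying tables are touched.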
Once the data assets are well documented, the team tags them for review, which notifies the agency data steward. If the data steward approves the data asset documentation, it can be published to the Enterprise Data Catalog.
This workflow allows the data asset to be automatically harvested by and incorporated into the Enterprise Data Catalog.
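The review steps described in this section can be sketched as a small state machine in Python. The status names, the `DataAsset` class, and the notification format are hypothetical; the sketch only illustrates the order of the steps (document, tag for review, approve, publish) and is not the actual workflow engine.

```python
from enum import Enum


class Status(Enum):
    DOCUMENTED = "documented"
    IN_REVIEW = "in review"
    APPROVED = "approved"
    PUBLISHED = "published"


class DataAsset:
    def __init__(self, name, steward):
        self.name = name
        self.steward = steward
        self.status = Status.DOCUMENTED
        self.notifications = []

    def _require(self, expected):
        # Guard against skipping a step in the workflow.
        if self.status is not expected:
            raise ValueError(
                f"{self.name} is {self.status.value}, expected {expected.value}"
            )

    def tag_for_review(self):
        """Tagging the asset notifies the agency data steward."""
        self._require(Status.DOCUMENTED)
        self.status = Status.IN_REVIEW
        self.notifications.append(f"{self.steward}: please review {self.name}")

    def approve(self, reviewer):
        """Only the asset's own data steward may approve the documentation."""
        self._require(Status.IN_REVIEW)
        if reviewer != self.steward:
            raise PermissionError("only the agency data steward can approve")
        self.status = Status.APPROVED

    def publish(self, catalog):
        """Approved documentation can be published into the catalog."""
        self._require(Status.APPROVED)
        catalog.append(self.name)
        self.status = Status.PUBLISHED


# Made-up asset and steward address for illustration.
catalog = []
asset = DataAsset("licenses", steward="steward@agency.example")
asset.tag_for_review()
asset.approve("steward@agency.example")
asset.publish(catalog)
print(catalog, asset.status.value)
```

Calling `publish` on an asset that has not been approved raises an error, mirroring the requirement that the data steward signs off before anything reaches the catalog.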