# Tables

Tables map one-to-one with relations in the source system; in some cases they'll contain a subset of the fields from the source table, but they will never include fields that don't exist in the source system.

### Setup

Configure your tables carefully during onboarding so that data quality stays high and privacy is measured correctly. Two things need to be configured for each field: the semantic **content** of the column and any **properties** that apply to the field, including privacy-related properties.

#### Content

Content labels describe what sort of data is contained in each column, which affects how the system models and trains on your data. Content labels will usually be auto-populated, but it's important that you review all labels for correctness.

<table><thead><tr><th width="127.6796875">Content</th><th width="341.60546875">Description</th><th>Examples</th></tr></thead><tbody><tr><td>Categorical</td><td>A nominal field with a discrete set of values</td><td>Gender, ZIP codes, ICD-10 codes</td></tr><tr><td>Numeric</td><td>An ordinal numeric field</td><td>Age, Height</td></tr><tr><td>Datetime</td><td>A date or datetime representation</td><td>7/15/22 10:41:55, August 11 2022</td></tr><tr><td>Currency</td><td>A string that corresponds to a USD ($) amount. Currency symbol must be the first character.</td><td>$99.99, $1.05</td></tr><tr><td>Binary</td><td>Any field that contains 2 unique values</td><td>1/0, yes/no, on/off</td></tr></tbody></table>
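When reviewing auto-populated labels, it can help to spot-check a column against the rules in the table above. The sketch below is only an illustration: the helper names are hypothetical and not part of any Subsalt API, and it assumes that nulls are represented as `None`, that nulls are ignored when counting unique values, and that currency amounts may contain thousands separators.

```python
import re

# Hypothetical helpers for spot-checking labels; not part of any Subsalt API.
# Assumption: a currency value is "$" followed by digits, optional commas,
# and an optional decimal part. The "$" must be the first character.
_USD_PATTERN = re.compile(r"^\$\d[\d,]*(\.\d+)?$")


def looks_like_currency(values):
    """Return True if every non-null value is a string such as "$99.99"."""
    present = [v for v in values if v is not None]
    if not present:
        return False
    return all(isinstance(v, str) and _USD_PATTERN.match(v) is not None
               for v in present)


def looks_like_binary(values):
    """Return True if the non-null values take exactly two distinct values."""
    return len({v for v in values if v is not None}) == 2
```

For example, `looks_like_currency(["$99.99", "$1.05"])` passes, while a column containing `"99.99$"` does not, because the currency symbol is not the first character.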

#### Properties

Properties provide additional metadata that is used for privacy evaluation and other tasks. Subsalt can provide support from third-party auditors for populating HIPAA-compliant privacy labels if necessary.

<table><thead><tr><th width="175.71875">Property</th><th>Description</th><th>Examples</th></tr></thead><tbody><tr><td>Indirect identifier</td><td>A field that, combined with other information, could help single out an individual in a dataset</td><td>Age, Gender, Home state</td></tr><tr><td>Direct identifier</td><td>A field that can be used to directly single out an individual in a dataset</td><td>Names, SSNs, Contact info</td></tr><tr><td>Person's age</td><td>A field that indicates a person's age</td><td>Age, Birthdate</td></tr><tr><td>Foreign key</td><td>A field that can be used to join two or more synthetic tables</td><td>Patient ID, Facility ID</td></tr><tr><td>Medical code</td><td>A field that contains ICD-10 codes or other classification codes</td><td>Diagnoses, procedures</td></tr><tr><td>Entity identifier</td><td>A field that contains unique IDs for entities that need to be modeled over time</td><td>Patient ID</td></tr><tr><td>Context field</td><td>A field that is static for an entity over time</td><td>Birthdate</td></tr><tr><td>Sequence key</td><td>Datetime fields that indicate the sequence of events for the entity</td><td>Visit dates</td></tr></tbody></table>

#### Ineligible fields

The only requirement for any field in a table in Subsalt is that the field must be at least 50% non-null; fields that do not meet this requirement will be automatically marked as ineligible. These fields will not be included in the synthetic database schema, so they will not be visible to or queryable by data consumers.
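To anticipate which fields will be marked ineligible before onboarding, you can estimate the non-null fraction yourself. The following is a minimal sketch, not Subsalt's implementation: the function name is hypothetical, rows are assumed to be dictionaries, and a value counts as null only when it is `None` or the key is missing.

```python
def eligible_fields(rows, min_non_null=0.5):
    """Map each field name to True if at least `min_non_null` of rows are non-null.

    Hypothetical helper: `rows` is a list of dicts, and a missing key or a
    value of None is treated as null.
    """
    if not rows:
        return {}
    # Collect every field name that appears in any row.
    fields = set().union(*(row.keys() for row in rows))
    total = len(rows)
    result = {}
    for field in sorted(fields):
        non_null = sum(1 for row in rows if row.get(field) is not None)
        # "At least 50% non-null" is read as an inclusive threshold.
        result[field] = (non_null / total) >= min_non_null
    return result


sample = [
    {"age": 30, "fax": None},
    {"age": None, "fax": None},
    {"age": 41, "fax": "555-0100"},
    {"age": 5, "fax": None},
]
eligibility = eligible_fields(sample)
```

Here `age` is non-null in three of four rows and would be eligible, while `fax` is non-null in only one of four rows and would be excluded from the synthetic schema.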

### Lookup tables

Lookup tables are static fact tables that contain non-personal information, such as the OMOP Concept tables or a list of ICD-10 codes with their classifications and definitions. These tables have two important properties:

* They have no relationship to patients or patient populations on their own, and therefore carry no privacy risk until they're joined with patient-related information.
* Synthetic patient information must be joinable with accurate lookup table information; the definition of a particular Concept ID shouldn't change from row to row.

Tables that have these two properties can be configured as "lookup tables" during data onboarding; Subsalt copies lookup tables into the Subsalt cluster, and *these tables are not synthesized and are exempt from privacy audits.*

{% hint style="info" %}
You don't need to configure foreign keys from or to a lookup table. Foreign keys are only needed to identify relationships between synthetic tables.
{% endhint %}

Be sure to review potential lookup tables with appropriate stakeholders before marking a table as a lookup table; this setting has significant privacy implications.
