Data Model Overview
When working with the various APIs and SDKs, it can be helpful to understand the MyDataHelps data model. Each interface may present the data in different ways, befitting the use cases for that interface. This section provides you with an overview of the general data concepts.
Projects
In MyDataHelps, a project represents the whole set of participants and all of the points of interaction (surveys, notifications, sensor data collection, etc.) with those participants. A project will contain all of the collected data and responses from participants. Your MyDataHelps organization can have multiple projects.
While many participants will only be involved in one project, they can join as many as they like.
Tip
All MyDataHelps interfaces are project-specific. This means you will only receive participant data, custom fields, survey responses, task information, etc. for the project being queried.
Institutions
Some projects may create institutions to partition participant access. Participants and coordinator users can be assigned to institutions, and coordinators can then only view/manage participants within their institutions. Since API access operates using a service account, it is not affected by institutions. You would need to explicitly filter participants based on their institution code when querying.
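Because a service account sees participants from every institution, any institution scoping has to happen in your own code. The following is a minimal sketch, assuming the participant records have already been fetched into dictionaries and that the institution code is exposed under a hypothetical "institutionCode" key; check the response of the interface you are using for the actual field name.

```python
def filter_by_institution(participants, institution_codes):
    """Keep only participants whose institution code is in the allowed set."""
    allowed = set(institution_codes)
    # "institutionCode" is an assumed key name, used here for illustration only.
    return [p for p in participants if p.get("institutionCode") in allowed]


# Hypothetical records, shaped for illustration only.
participants = [
    {"id": "a1", "institutionCode": "NORTH"},
    {"id": "b2", "institutionCode": "SOUTH"},
    {"id": "c3"},  # participant with no institution assigned
]

north_only = filter_by_institution(participants, ["NORTH"])
```

Participants without an institution are excluded by this filter; adjust that behavior if unassigned participants should be visible to your integration.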
Participants
Participants are all the individuals taking part in a project. Many participants will use the MyDataHelps app or website on their own device, but others will use a shared device in a clinical setting or have coordinators enter data manually on their behalf.
Participant Data
Participant data includes:
- Survey Data (status, tasks, answers)
- Demographic Data (name, date of birth, address, etc.)
- Device and Sensor Data (varies by project and participant; see Device and Sensor Data for details)
- Custom Fields (project-specific data; see Custom Fields for details)
Custom Fields
Projects can define custom data fields for participants. You might use this to store the participant’s cohort/group, a procedure or surgery date, smoking status, and more. Custom fields are stored as name/value pairs. For example, in the REST API participant query it is a simple JSON object:
"customFields": {
"Cohort": "Red",
"Smoker": 0,
"SurgeryDate": "2021-08-10"
}
Through the API or Web View steps, you can store complex data within custom fields using JSON strings.
Tip
Custom fields are internally stored as strings, but each field has a data type defined in the project settings. You need to know this type in order to properly store and interpret the data.
See Using Custom Fields for more information about setting up and using custom fields in your project.
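Since custom field values arrive as strings, client code usually converts them before use. The sketch below assumes you maintain your own mapping from field name to type that mirrors your project settings. The type labels ("string", "number", "date", "json") and the "Medications" field are illustrative assumptions, not names defined by MyDataHelps.

```python
import json
from datetime import date

# Assumed, hand-maintained mirror of the field types defined in project settings.
FIELD_TYPES = {
    "Cohort": "string",
    "Smoker": "number",
    "SurgeryDate": "date",
    "Medications": "json",  # hypothetical field holding a JSON string
}


def decode_custom_fields(raw, field_types=FIELD_TYPES):
    """Convert raw custom field values into Python values by declared type."""
    decoded = {}
    for name, value in raw.items():
        kind = field_types.get(name, "string")  # unknown fields stay strings
        if value is None or str(value) == "":
            decoded[name] = None
        elif kind == "number":
            decoded[name] = float(str(value))
        elif kind == "date":
            decoded[name] = date.fromisoformat(str(value))
        elif kind == "json":
            decoded[name] = json.loads(str(value))
        else:
            decoded[name] = str(value)
    return decoded


fields = decode_custom_fields({
    "Cohort": "Red",
    "Smoker": "0",
    "SurgeryDate": "2021-08-10",
    "Medications": '["aspirin"]',
})
```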
Participant Identifiers
Participants in MyDataHelps have several kinds of identifiers:
Field | Description | Notes
ID (participantID) | Internal, stable, unique identifier for the participant. | This is auto-generated by MyDataHelps and cannot be changed. It is usually the best ID to use as a primary key when storing/finding participants.
Participant Identifier (participantIdentifier) | Project-set, unique identifier for the participant, visible in MyDataHelps Designer. | This can be set in the participant’s invitation to the project, or will be auto-generated when they join. Can be changed by the project coordinators at any time.
Link Identifier (linkIdentifier) | An auto-generated identifier code used only for the legacy feature for completing surveys via link. | This feature allows participants to complete a survey without creating an account or installing the MyDataHelps app. See Completing Surveys with Survey Links for more information about setting up survey links.
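The practical consequence of these identifiers is that local storage should be keyed on participantID, with participantIdentifier treated as an attribute that may change. A minimal in-memory sketch follows; the ParticipantStore class and the sample values are hypothetical.

```python
class ParticipantStore:
    """Toy store keyed by the stable participantID."""

    def __init__(self):
        self._by_id = {}

    def upsert(self, record):
        # The stable ID is the primary key, so a renamed participant
        # overwrites the existing row instead of creating a duplicate.
        self._by_id[record["participantID"]] = dict(record)

    def find_by_identifier(self, participant_identifier):
        # Secondary lookup by the project-set identifier, which can change.
        for record in self._by_id.values():
            if record.get("participantIdentifier") == participant_identifier:
                return record
        return None

    def __len__(self):
        return len(self._by_id)


store = ParticipantStore()
store.upsert({"participantID": "f3a1", "participantIdentifier": "PT-001"})
# A coordinator later changes the project-set identifier.
store.upsert({"participantID": "f3a1", "participantIdentifier": "SITE2-001"})
```

Had the store been keyed on participantIdentifier, the second upsert would have produced a second record for the same person.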
Segments
Segments help to organize your participants into groups. You can use segments to control who receives certain surveys or notifications. See Filtering With Participant Segments for more information about setting up and using participant segments.
Surveys and Tasks
Participants typically contribute data by completing surveys. Surveys can be completed in mobile apps, in web browsers, and even by project coordinators on the participants’ behalf.
Survey Schedules and Delivery
Surveys can be delivered to participants in a variety of ways. Commonly they will be delivered on an automatic schedule or triggered based on certain events (like a participant joining the project). See the scheduling user guide for more information about how survey schedules and triggers can be configured. A project coordinator can also manually deliver surveys to participants, and you can deliver surveys through the API/SDK.
Sometimes surveys are not delivered, per se, but made available to participants to complete on-demand. This can be done using the SDK or survey links.
Survey Tasks
When a survey is assigned to a participant, this is represented by a survey task. Survey tasks show up on the default “tasks” view, but can also be accessed through the API/SDK. A survey task contains additional metadata about the task status.
Tip
Some surveys, such as coordinator-initiated surveys or surveys that can be accessed on-demand from a project dashboard, may not have an associated task.
Since a participant may be assigned the same survey more than once (e.g., a daily mood survey or weekly post-operative follow-up), there will often be multiple survey tasks, with varying status values, associated with a single participant and survey. The example below shows a survey that has been assigned three times to two different participants.
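In code, this relationship is a one-to-many mapping from a (participant, survey) pair to its tasks. The sketch below tallies task statuses per pair. The key names ("participantID", "surveyName", "status") and the status values are assumptions for illustration and may not match the field names returned by a given interface.

```python
from collections import defaultdict

# Hypothetical task records: one survey assigned three times to each of two participants.
tasks = [
    {"participantID": "p1", "surveyName": "Daily Mood", "status": "complete"},
    {"participantID": "p1", "surveyName": "Daily Mood", "status": "complete"},
    {"participantID": "p1", "surveyName": "Daily Mood", "status": "incomplete"},
    {"participantID": "p2", "surveyName": "Daily Mood", "status": "closed"},
    {"participantID": "p2", "surveyName": "Daily Mood", "status": "complete"},
    {"participantID": "p2", "surveyName": "Daily Mood", "status": "incomplete"},
]


def status_counts(tasks):
    """Count task statuses for each (participant, survey) pair."""
    counts = defaultdict(lambda: defaultdict(int))
    for task in tasks:
        key = (task["participantID"], task["surveyName"])
        counts[key][task["status"]] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(by_status) for key, by_status in counts.items()}


summary = status_counts(tasks)
```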
Active Tasks
While many surveys consist of a collection of questions and responses, a survey can also be a vehicle for a ResearchKit Active Task, measuring things like cognition, gait and balance, hearing, reaction time, and more.
Tip
Despite the similar names, don’t confuse Active Tasks with the regular survey tasks. An Active Task is just a special kind of survey. A survey task is a task assignment, as described above.
Survey Names and Identifiers
Each survey has several different names and identifiers:
- ID - An auto-generated identifier.
- Name - A project-assigned name (e.g. “Sleep Survey”).
- Display Name - An optional override used to display the survey name to participants. For example, if the survey is named “Second Week Follow-Up” for your own reference, you may want participants to see “Weekly Check-In.”
- Version Number - Each revision of the survey is differentiated by a sequential version number.
Survey Steps
Survey steps are the individual questions, prompts, and instructions that constitute the survey. Many steps will require responses, which are stored as survey answers.
Survey Results and Answers
A survey result is a collection of all the responses to survey steps in a single completion of a survey. This result may or may not be associated with a task. Each result has a unique identifier. As with survey tasks, you may have multiple results for a given participant and survey if they’ve completed it multiple times.
In a given survey result, there are likely responses to several different survey steps. The responses to each step are grouped into survey answers. Multiple-choice questions and forms can have multiple answers for a single step.
Survey steps have individual step identifiers, e.g., “Sleep Hours” or “painScore”. Form steps also have separate identifiers for each form item. A survey answer’s resultIdentifier field will refer to either the form item identifier (for form steps) or the step identifier (for non-form steps).
Thus, a given answer is associated with a survey, possibly a task, a survey result, a step, and possibly a form item.
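These relationships can be made concrete by grouping answers by survey result and keying each one on its resultIdentifier. In this sketch only resultIdentifier is taken from the description above; the other key names ("surveyResultID", "stepIdentifier", "answers") and the sample form are hypothetical.

```python
from collections import defaultdict

# Hypothetical answer records from two completions of the same survey.
answers = [
    # Non-form step: resultIdentifier equals the step identifier.
    {"surveyResultID": "r1", "stepIdentifier": "Sleep Hours",
     "resultIdentifier": "Sleep Hours", "answers": ["7"]},
    # Form step: resultIdentifier is the form item identifier.
    {"surveyResultID": "r1", "stepIdentifier": "symptomsForm",
     "resultIdentifier": "severity", "answers": ["3"]},
    # Multiple-choice form item with more than one answer.
    {"surveyResultID": "r1", "stepIdentifier": "symptomsForm",
     "resultIdentifier": "symptoms", "answers": ["Headache", "Nausea"]},
    {"surveyResultID": "r2", "stepIdentifier": "Sleep Hours",
     "resultIdentifier": "Sleep Hours", "answers": ["6"]},
]


def answers_by_result(answers):
    """Map each survey result to {resultIdentifier: list of answer values}."""
    grouped = defaultdict(dict)
    for answer in answers:
        grouped[answer["surveyResultID"]][answer["resultIdentifier"]] = answer["answers"]
    return dict(grouped)


grouped = answers_by_result(answers)
```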
Device and Sensor Data
In addition to survey data, participants may contribute data collected from a variety of devices, such as smart watches or fitness trackers. Your project settings determine which device/sensor data will be collected.
Other sensor data, such as weather and air quality results, is also stored as device data even though it does not come from a device. Finally, your project may use the device data API to persist its own data.
Device Data API Versions
There are currently two versions of the Device Data API/SDK: V1 and V2. V2 is a breaking change to the API, with some changes to the fundamental use cases and operation. V2 is still in development, but it is stable and appropriate for production use.
Tip
During this transition period, access to V2 is not enabled by default. Contact support if you wish to enable V2 for your project.
Key differences between V1 and V2 include:
- Both versions support coarse data types, such as step counts, resting heart rates, and activity summaries. V2 also supports more granular data, such as detailed sleep or heart rate tracking and Fitbit intraday data (for authorized projects).
- Both versions support querying for raw data points at the participant or population level. V2 adds support for aggregate queries across customizable time intervals (daily, weekly, etc.).
- Some namespaces and data types are only supported in V1 at present. More will be added to V2 over time.
- The Project namespace (for storing/retrieving custom project data) is currently only supported in V1.
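To illustrate the difference between raw data points and an aggregate over a time interval, the sketch below computes daily totals client-side from a handful of raw points. It is not how V2 performs aggregation, and the key names ("observationDate", "value") are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw step-count data points.
raw_points = [
    {"observationDate": "2024-03-01T08:15:00", "value": "1200"},
    {"observationDate": "2024-03-01T17:40:00", "value": "3400"},
    {"observationDate": "2024-03-02T09:05:00", "value": "2500"},
]


def daily_totals(points):
    """Sum point values into one total per calendar day."""
    totals = defaultdict(float)
    for point in points:
        day = datetime.fromisoformat(point["observationDate"]).date().isoformat()
        totals[day] += float(point["value"])
    return dict(totals)


totals = daily_totals(raw_points)
```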
Note
V1 will continue to remain available for the foreseeable future for backwards compatibility. You may use both versions in parallel, but be aware that V1 and V2 use different data stores. Once V2 is enabled for a project, prior data available in V1 will not be accessible from V2 APIs, and new data recorded by V2 will not be available from V1 APIs.
Contact support for assistance with transitioning between V1 and V2 if you have existing data.
Namespaces
Device data is grouped into namespaces, which represent the source applications that generate the data. Available namespaces depend on whether you are using Device Data API V1 or V2, as well as your project’s sensor data settings. We will continue to add namespaces to Device Data API V2.
Namespace | API V1 | API V2 | Description
Project | Yes | - | Data persisted by project sources and SDKs.
Fitbit | Yes | Yes | Data persisted by a linked Fitbit account. API V2 includes intraday data, but requires explicit approval from Fitbit. Contact support to inquire about enabling intraday access.
AppleHealth | Yes | Yes | Data imported from Apple Health.
GoogleFit | Yes | - | Data imported from Google Fit.
Garmin | Yes | Yes | Data imported from Garmin devices.
AirNowApi | Yes | - | Air quality index data imported from AirNow.gov.
WeatherBit | Yes | - | Weather forecast data imported from WeatherBit.io.
Omron | Yes | - | Data imported from Omron wellness products.
Data Types
The type field for device data points refers to the general category of data it represents: e.g., heart rate or steps. This field is a freeform string controlled by the source system based on its chosen categories. Each namespace has its own naming conventions, and similar concepts may be named differently in different namespaces. For example, Apple Health uses names like HeartRate and Steps, while Fitbit uses names like activities-steps or sleep.
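If your analysis spans namespaces, one option is to maintain your own lookup from (namespace, type) pairs to a common concept. The sketch below uses only the type names mentioned above; the concept labels on the right-hand side are your own choice and are not defined by MyDataHelps.

```python
# Hand-maintained lookup; extend it with the types your project actually collects.
CONCEPTS = {
    ("AppleHealth", "Steps"): "steps",
    ("AppleHealth", "HeartRate"): "heart_rate",
    ("Fitbit", "activities-steps"): "steps",
    ("Fitbit", "sleep"): "sleep",
}


def concept_for(namespace, data_type):
    """Return the common concept label, or None for unmapped types."""
    return CONCEPTS.get((namespace, data_type))


apple_steps = concept_for("AppleHealth", "Steps")
fitbit_steps = concept_for("Fitbit", "activities-steps")
```

Returning None for unmapped types, rather than guessing, makes it easy to spot new types that need a mapping.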
To understand what data types are available for querying in your project’s data, you can look at one or more of the following:
Dates in Device Data
Due to variations and limitations in device data synchronization, it is possible for older data points to turn up in the system unpredictably. All device data has several date/time parameters:
- Observed - When the data was observed/recorded by the device.
- Inserted - When the data first arrived in MyDataHelps.
- Modified - Some devices allow data to be modified after it is entered (e.g., adjusting activity parameters on a fitness tracker app). This date will be updated if that occurs; otherwise it will be the same as the inserted date.
- Started - For data points that represent an interval (e.g., sleep), the observed date will typically represent the end of the interval, and a separate start date will represent the start of the interval.
The device data queries let you query with various combinations of dates (e.g., observed before/after versus modified before/after). Using a “modified after” search is a good way to find new/updated data since a prior query.
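The “modified after” approach amounts to keeping a watermark between runs. The sketch below simulates it against an in-memory list. The fetch_modified_after function stands in for a real device data query, and the key names ("observationDate", "modifiedDate") are assumptions for illustration.

```python
from datetime import datetime, timezone

# Simulated server-side data. Point 2 was observed in February but synced in March.
DATA = [
    {"id": 1, "observationDate": "2024-03-01T08:00:00+00:00",
     "modifiedDate": "2024-03-01T09:00:00+00:00"},
    {"id": 2, "observationDate": "2024-02-20T08:00:00+00:00",
     "modifiedDate": "2024-03-03T10:00:00+00:00"},
    {"id": 3, "observationDate": "2024-03-02T08:00:00+00:00",
     "modifiedDate": "2024-03-02T12:00:00+00:00"},
]


def fetch_modified_after(watermark):
    """Stand-in for a real query with a "modified after" parameter."""
    return [p for p in DATA if datetime.fromisoformat(p["modifiedDate"]) > watermark]


def sync(watermark):
    """Fetch new or updated points and advance the watermark."""
    points = fetch_modified_after(watermark)
    new_watermark = max(
        (datetime.fromisoformat(p["modifiedDate"]) for p in points),
        default=watermark,  # nothing new: keep the old watermark
    )
    return points, new_watermark


last_sync = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
points, last_sync = sync(last_sync)
```

An “observed after” filter with the same cutoff would have missed point 2, because its observed date predates the previous sync even though it only arrived afterwards.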
Data Availability (Real-Time / Intraday)
Device data is not provided to the MyDataHelps APIs/SDKs in real-time. It is periodically synced from the device and/or the source server, so there is often a lag between when the data is recorded by the device and when it is available through MyDataHelps.
Different systems offer different granularity for their data, and this often varies by data type and API version.
Fitbit intraday data offers a higher granularity for steps, heart rate, etc., but requires explicit approval from Fitbit. Contact support to inquire about enabling intraday access.
Data Retention
MyDataHelps stores device data indefinitely, and that data is always available via your project’s Device Data exports or the Export Explorer. However, device data is only accessible through the API for a limited time period. This time depends on the data type, API version, and project. Contact support to get specific information about data availability for your project.
Interpreting Data
The data received by MyDataHelps is ultimately controlled by the application managing the device (e.g., Apple Health or Fitbit). The available fields and their meanings will vary depending on the device. For example, distance walked from an iPhone might present the following fields:
...
"type": "DistanceWalkingRunning",
"value": "7.8970013065263629",
"units": "m",
"properties": {
"PostalCode": "12345"
},
"source": {
"identifier": "6D8C7vR",
"properties": {
"SourceIdentifier": "com.apple.health.E9BBF489-8AE9-4D5F-983C-E66ED450FAE6",
"SourceName": "John's Apple Watch",
"SourceOperatingSystemVersion": "13.5.1",
"DeviceManufacturer": "Apple Inc.",
"DeviceModel": "iPhone",
"UploadingDeviceIdentifier": "07e9524a-2a82-4875-b700-92d7e4024cb5"
}
}
Whereas heart rate from a blood pressure cuff might present very different values:
...
"type": "RestingHeartRate",
"value": "73",
"units": "BPM",
"properties": { "PostalCode": "48104", "Sequence": "32" },
"source": {
"identifier": "BPCuff1",
"properties":
{
"Model": "BP130X",
"Serial": "17117bb6-177d-4f89-a25b-f9274eb8d4ff"
}
}
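Because the value arrives as a string and the properties differ by source, it often helps to normalize each data point defensively before analysis. A minimal sketch using the two fragments above, assuming each data point has been parsed into a dictionary; the output key names are your own choice.

```python
def normalize(point):
    """Extract common fields, tolerating missing or non-numeric values."""
    try:
        value = float(point["value"])
    except (KeyError, TypeError, ValueError):
        value = None  # value absent or not numeric
    properties = point.get("properties") or {}
    source = point.get("source") or {}
    return {
        "type": point.get("type"),
        "value": value,
        "units": point.get("units"),
        "postal_code": properties.get("PostalCode"),
        "source_identifier": source.get("identifier"),
    }


walk = normalize({
    "type": "DistanceWalkingRunning",
    "value": "7.8970013065263629",
    "units": "m",
    "properties": {"PostalCode": "12345"},
    "source": {"identifier": "6D8C7vR", "properties": {"DeviceModel": "iPhone"}},
})
cuff = normalize({
    "type": "RestingHeartRate",
    "value": "73",
    "units": "BPM",
    "properties": {"PostalCode": "48104", "Sequence": "32"},
    "source": {"identifier": "BPCuff1", "properties": {"Model": "BP130X"}},
})
```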
Notifications
MyDataHelps allows custom notifications to be sent to participants via the MyDataHelps app (a “push notification”), SMS/text, or email. You define notifications in your project configuration. Each notification is given an identifier, unique within the project (e.g., SLEEP-Reminder-Email), and other fields (such as subject and body) depending on the type of notification.
Notifications can be sent to participants based on time-based triggers or project-specified triggers (adherence, events, and more). See the scheduling user guide for more information about how notification schedules and triggers can be configured.
External Accounts
Participants frequently have accounts with services external to MyDataHelps, such as:
- Provider - Healthcare providers’ patient accounts, typically through an electronic medical record system.
- Health Plan - Health plans and associated payers.
- Device Manufacturer - Wearable devices and other consumer-facing medical technology.
MyDataHelps can enhance your project with health or sensor data imported from these external services, provided that:
- The service is compatible with MyDataHelps.
- The participant connects their external account with MyDataHelps.
Once connected, MyDataHelps periodically polls and persists data from connected external accounts. Data is downloaded directly from the external provider’s server, and can be used to lead interventions or contextualize research outcomes.
Note
It is not necessary for participants to connect to their Apple Health and Google Fit accounts as external services. This data comes through the participant’s phone, depending on your project’s sensor data settings and the participant’s data sharing settings.
Uploaded Files
Web View steps and Custom Views may allow participants to upload files. Examples include:
- An image of a test strip.
- An audio clip of their voice.
- A PDF of lab results.
Files are associated with the participant who uploaded them.