Data Model Overview

When working with the various APIs and SDKs, it can be helpful to understand the MyDataHelps data model. Each interface may present the data in different ways, befitting the use cases for that interface. This section provides you with an overview of the general data concepts.

Projects

In MyDataHelps, a project represents the whole set of participants and all of the points of interaction (surveys, notifications, sensor data collection, etc.) with those participants. A project will contain all of the collected data and responses from participants. Your MyDataHelps organization can have multiple projects.

Projects - Admin View

While many participants will only be involved in one project, they can join as many as they like.

Projects - Participant View

Institutions

Some projects may create institutions to partition participant access. Participants and coordinator users can be assigned to institutions, and coordinators can then only view/manage participants within their institutions. API access operates using a service account, so it is not restricted by institutions: queries return participants from every institution unless you explicitly filter by institution code.
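As a rough illustration of that explicit filtering, the sketch below filters a list of participant records on the client side. The `institutionCode` key and the sample records are assumptions made for this example, not confirmed field names from the API.

```python
from typing import Iterable


def filter_by_institution(participants: Iterable[dict], institution_code: str) -> list[dict]:
    """Keep only participants assigned to the given institution.

    Assumes each record carries a hypothetical "institutionCode" key;
    records without one are treated as belonging to no institution.
    """
    return [p for p in participants if p.get("institutionCode") == institution_code]


# Hypothetical records, as a service account might see them (all institutions mixed).
participants = [
    {"participantIdentifier": "P-001", "institutionCode": "NORTH"},
    {"participantIdentifier": "P-002", "institutionCode": "SOUTH"},
    {"participantIdentifier": "P-003"},  # not assigned to any institution
]

north_only = filter_by_institution(participants, "NORTH")
```

If the query interface you are using accepts an institution filter directly, filtering on the server side would avoid transferring records you do not need.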

Participants

Participants are all the individuals taking part in a project. Many participants will use the MyDataHelps app or website on their own device, but others will use a shared device in a clinical setting or have coordinators enter data manually on their behalf.

Participant Data

Participant data includes:

  • Survey Data (status, tasks, answers)
  • Demographic Data (name, date of birth, address, etc.)
  • Device and Sensor Data (varies by project and participant; see Device and Sensor Data for details)
  • Custom Fields (project-specific data; see Custom Fields for details)

Custom Fields

Projects can define custom data fields for participants. You might use these to store the participant’s cohort/group, a procedure or surgery date, smoking status, and more. Custom fields are stored as name/value pairs. For example, in a REST API participant query, they appear as a simple JSON object:

"customFields": {
  "Cohort": "Red",
  "Smoker": 0,
  "SurgeryDate": "2021-08-10"
}

Through the API or Web View steps, you can store complex data within custom fields using JSON strings.
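A minimal sketch of the JSON-string approach follows. The `Medications` field name and its contents are hypothetical; the point is only that the structured value is serialized to a string before it is stored and parsed again after it is read.

```python
import json

# Hypothetical structured value that does not fit a simple name/value pair.
medications = [
    {"name": "Ibuprofen", "doseMg": 200},
    {"name": "Metformin", "doseMg": 500},
]

# Serialize to a JSON string so it can be stored as an ordinary custom field value.
custom_fields = {
    "Cohort": "Red",
    "Medications": json.dumps(medications),
}

# When reading the field back, parse the string to recover the structure.
decoded = json.loads(custom_fields["Medications"])
```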

See Using Custom Fields for more information about setting up and using custom fields in your project.

Participant Identifiers

Participants in MyDataHelps have several kinds of identifiers:

  • ID (participantID) - Internal, stable, unique identifier for the participant. This is auto-generated by MyDataHelps and cannot be changed. It is usually the best ID to use as a primary key when storing/finding participants.
  • Participant Identifier (participantIdentifier) - Project-set, unique identifier for the participant, visible in MyDataHelps Designer. This can be set in the participant’s invitation to the project, or will be auto-generated when they join. It can be changed by the project coordinators at any time.
  • Link Identifier (linkIdentifier) - An auto-generated identifier code used only for the legacy feature for completing surveys via a link. This feature allows participants to complete a survey without creating an account or installing the MyDataHelps app. See Completing Surveys with Survey Links for more information about setting up survey links.
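To illustrate why the stable ID makes the better primary key, here is a small sketch of a local store keyed by `participantID`. The ID values are placeholders invented for the example, and the in-memory dictionary stands in for whatever database you actually use.

```python
# Local cache keyed by the stable, auto-generated participantID.
store: dict[str, dict] = {}


def upsert_participant(record: dict) -> None:
    """Insert or update a participant, keyed by participantID.

    Because participantIdentifier can be changed by coordinators,
    keying on it could create a duplicate row after a rename.
    """
    store[record["participantID"]] = record


# The same participant seen twice, before and after a coordinator renamed them.
upsert_participant({"participantID": "id-1", "participantIdentifier": "PT-100"})
upsert_participant({"participantID": "id-1", "participantIdentifier": "PT-100-RENAMED"})
```

After both calls the store still holds a single record, now carrying the updated identifier.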

Segments

Segments help to organize your participants into groups. You can use segments to control who receives certain surveys or notifications. See Filtering With Participant Segments for more information about setting up and using participant segments.

Surveys and Tasks

Participants typically contribute data by completing surveys. Surveys can be completed in mobile apps, in web browsers, and even by project coordinators on the participants’ behalf.

Survey Schedules and Delivery

Surveys can be delivered to participants in a variety of ways. Commonly they will be delivered on an automatic schedule or triggered based on certain events (like a participant joining the project). See the scheduling user guide for more information about how survey schedules and triggers can be configured. A project coordinator can also manually deliver surveys to participants, and you can deliver surveys through the API/SDK.

Sometimes surveys are not delivered, per se, but made available to participants to complete on-demand. This can be done using the SDK or survey links.

Survey Tasks

When a survey is assigned to a participant, this is represented by a survey task. Survey tasks show up on the default “tasks” view, but can also be accessed through the API/SDK. A survey task contains additional metadata about the task status.

Since a participant may be assigned the same survey more than once (e.g., a daily mood survey or weekly post-operative follow-up), there will often be multiple survey tasks, with varying status values, associated with a single participant and survey. The example below shows a survey that has been assigned three times to two different participants.
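The relationship can be sketched with plain records: one survey, two participants, three tasks each. The key names (`participantID`, `surveyName`, `status`) and the status values are assumptions for illustration and may not match the actual task schema.

```python
from collections import Counter, defaultdict

# Hypothetical survey tasks: the same survey assigned three times to two participants.
tasks = [
    {"participantID": "id-1", "surveyName": "Daily Mood", "status": "complete"},
    {"participantID": "id-1", "surveyName": "Daily Mood", "status": "complete"},
    {"participantID": "id-1", "surveyName": "Daily Mood", "status": "incomplete"},
    {"participantID": "id-2", "surveyName": "Daily Mood", "status": "complete"},
    {"participantID": "id-2", "surveyName": "Daily Mood", "status": "incomplete"},
    {"participantID": "id-2", "surveyName": "Daily Mood", "status": "incomplete"},
]

# Count task statuses per (participant, survey) pair.
status_counts: dict[tuple, Counter] = defaultdict(Counter)
for task in tasks:
    key = (task["participantID"], task["surveyName"])
    status_counts[key][task["status"]] += 1
```

Grouping on the participant and survey together is what makes it possible to answer questions such as "how many daily surveys has this participant left unfinished?".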

Active Tasks

While many surveys consist of a collection of questions and responses, a survey can also be a vehicle for a ResearchKit Active Task, measuring things like cognition, gait and balance, hearing, reaction time, and more.

Survey Names and Identifiers

Each survey has several different names and identifiers:

  • ID - An auto-generated identifier.
  • Name - A project-assigned name (e.g. “Sleep Survey”).
  • Display Name - An optional override used to display the survey name to participants. For example, if the survey is named “Second Week Follow-Up” for your own reference, you may want participants to see “Weekly Check-In.”
  • Version Number - Each revision of the survey is differentiated by a sequential version number.
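Since the display name is an optional override, code that shows survey names to participants typically needs a fallback. The sketch below assumes hypothetical `name` and `displayName` keys on a survey record.

```python
def participant_facing_name(survey: dict) -> str:
    """Return the display name if one is set, otherwise the project-assigned name.

    An empty or missing "displayName" falls back to "name".
    """
    return survey.get("displayName") or survey["name"]


with_override = {"name": "Second Week Follow-Up", "displayName": "Weekly Check-In"}
without_override = {"name": "Sleep Survey"}

shown_first = participant_facing_name(with_override)
shown_second = participant_facing_name(without_override)
```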

Survey Steps

Survey steps are the individual questions, prompts, and instructions that constitute the survey. Many steps will require responses, which are stored as survey answers.

Survey Results and Answers

A survey result is a collection of all the responses to survey steps in a single completion of a survey. This result may or may not be associated with a task. Each result has a unique identifier. As with survey tasks, you may have multiple results for a given participant and survey if they’ve completed it multiple times.

In a given survey result, there are likely responses to several different survey steps. The responses to each step are grouped into survey answers. Multiple-choice questions and forms can have multiple answers for a single step.

Survey steps have individual step identifiers, e.g., “Sleep Hours” or “painScore”. Form steps also have separate identifiers for each form item. A survey answer’s resultIdentifier field will refer to either the form item identifier (for form steps) or the step identifier (for non-form steps).

Thus, a given answer is associated with a survey, possibly a task, a survey result, a step, and possibly a form item.
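One way to work with these associations is to collect the answers of a single survey result into a dictionary keyed by `resultIdentifier`. In this sketch only `resultIdentifier` comes from the description above; the other keys (`surveyResultID`, `stepIdentifier`, `answers`) and the sample rows are assumptions.

```python
from collections import defaultdict

# Hypothetical answer rows. The form step "symptomsForm" has a form item "symptoms",
# so its resultIdentifier is the form item identifier, not the step identifier.
rows = [
    {"surveyResultID": "r-1", "stepIdentifier": "painScore",
     "resultIdentifier": "painScore", "answers": ["4"]},
    {"surveyResultID": "r-1", "stepIdentifier": "symptomsForm",
     "resultIdentifier": "symptoms", "answers": ["Nausea", "Fatigue"]},
    {"surveyResultID": "r-2", "stepIdentifier": "painScore",
     "resultIdentifier": "painScore", "answers": ["2"]},
]


def answers_for_result(rows: list[dict], survey_result_id: str) -> dict[str, list[str]]:
    """Group the answers of one survey result by resultIdentifier."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        if row["surveyResultID"] == survey_result_id:
            grouped[row["resultIdentifier"]].extend(row["answers"])
    return dict(grouped)


first_completion = answers_for_result(rows, "r-1")
```

Keeping the values as lists accommodates multiple-choice questions and forms, which can produce more than one answer for a single step.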

Survey Answer Links

Device and Sensor Data

In addition to survey data, participants may contribute data collected from a variety of devices, such as smart watches or fitness trackers. Your project settings determine which device/sensor data will be collected.

Other sensor data, such as weather and air quality results, is also stored as device data even though it does not come from a device. Finally, your project may use the device data API to persist its own data.

Device Data API Versions

There are currently two versions of the Device Data API/SDK: V1 and V2. V2 is a breaking change to the API, with some changes to the fundamental use cases and operation. V2 is still in development, but it is stable and appropriate for production use.

Key differences between V1 and V2 include:

  • Both versions support coarse data types, such as step counts, resting heart rates, and activity summaries. V2 also supports more granular data, such as detailed sleep or heart rate tracking and Fitbit intraday data (for authorized projects).
  • Both versions support querying for raw data points at the participant or population level. V2 adds support for aggregate queries across customizable time intervals (daily, weekly, etc.).
  • Some namespaces and data types are only supported in V1 at present. More will be added to V2 over time.
  • The Project namespace (for storing/retrieving custom project data) is currently only supported in V1.
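To make the idea of an aggregate over a time interval concrete, the sketch below computes daily totals locally from raw data points. It does not call either API version; the `observationDate` key and the sample values are assumptions, and a V2 aggregate query would do comparable work on the server.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw step-count points (values are strings, as in the device data examples).
points = [
    {"type": "Steps", "observationDate": "2024-03-01T08:15:00+00:00", "value": "1200"},
    {"type": "Steps", "observationDate": "2024-03-01T17:40:00+00:00", "value": "3400"},
    {"type": "Steps", "observationDate": "2024-03-02T09:05:00+00:00", "value": "800"},
]


def daily_totals(points: list[dict]) -> dict[str, float]:
    """Sum point values per calendar day of the observation timestamp."""
    totals: dict[str, float] = defaultdict(float)
    for point in points:
        day = datetime.fromisoformat(point["observationDate"]).date().isoformat()
        totals[day] += float(point["value"])
    return dict(totals)


totals = daily_totals(points)
```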

Quick Links:

  • API: V1 API, V2 API
  • JS SDK: V1 SDK, V2 SDK

Namespaces

Device data is grouped into namespaces, which represent the source applications that generate the data. Available namespaces depend on whether you are using Device Data API V1 or V2, as well as your project’s sensor data settings. We will continue to add namespaces to Device Data API V2.

Namespace API V1 API V2 Description
Project - Data persisted by project sources and SDKs.
Fitbit Data persisted by a linked Fitbit account. API V2 includes intraday data, but requires explicit approval from Fitbit. Contact support to inquire about enabling intraday access.
AppleHealth Data imported from Apple Health.
GoogleFit - Data imported from Google Fit.
Garmin Data imported from Garmin devices.
AirNowApi - Air quality index data imported from AirNow.gov.
WeatherBit - Weather forecast data imported from WeatherBit.io.
Omron - Data imported from Omron wellness products.

Data Types

The type field for device data points refers to the general category of data it represents: e.g., heart rate or steps. This field is a freeform string controlled by the source system based on its chosen categories. Each namespace has its own naming conventions, and similar concepts may be named differently in different namespaces. For example, Apple Health uses names like HeartRate and Steps, while Fitbit uses names like activities-steps or sleep.
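If you combine data from several namespaces, a lookup table keyed by namespace and type is one simple way to normalize names. The namespace and type strings below are the ones mentioned above; the concept labels, and the decision to treat Steps and activities-steps as the same concept, are assumptions for illustration.

```python
from typing import Optional

# Illustrative mapping from (namespace, type) to a project-defined concept label.
CONCEPTS = {
    ("AppleHealth", "Steps"): "steps",
    ("Fitbit", "activities-steps"): "steps",
    ("AppleHealth", "HeartRate"): "heart_rate",
    ("Fitbit", "sleep"): "sleep",
}


def concept_for(namespace: str, data_type: str) -> Optional[str]:
    """Return the normalized concept for a data point, or None if it is unmapped."""
    return CONCEPTS.get((namespace, data_type))


apple_steps = concept_for("AppleHealth", "Steps")
fitbit_steps = concept_for("Fitbit", "activities-steps")
```

Returning None for unmapped pairs lets you log and review new types as they appear, rather than guessing at their meaning.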

To understand what data types are available for querying in your project’s data, you can look at one or more of the following:

Dates in Device Data

Due to variations and limitations in device data synchronization, it is possible for older data points to turn up in the system unpredictably. All device data has several date/time parameters:

  • Observed - When the data was observed/recorded by the device.
  • Inserted - When the data first arrived in MyDataHelps.
  • Modified - Some devices allow data to be modified after it is entered (e.g., adjusting activity parameters on a fitness tracker app). This date will be updated if that occurs; otherwise it will be the same as the inserted date.
  • Started - For data points that represent an interval (e.g., sleep), the observed date will typically represent the end of the interval, and a separate start date will represent the start of the interval.

The device data queries let you query with various combinations of dates (e.g., observed before/after versus modified before/after). Using a “modified after” search is a good way to find new/updated data since a prior query.
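The "modified after" pattern amounts to keeping a watermark between syncs. The sketch below applies it to an in-memory list rather than a real query; the `observationDate` and `modifiedDate` key names are assumptions.

```python
from datetime import datetime, timezone


def changed_since(points: list[dict], last_sync: datetime) -> tuple[list[dict], datetime]:
    """Return points modified strictly after last_sync, plus the new watermark."""
    changed = [
        p for p in points
        if datetime.fromisoformat(p["modifiedDate"]) > last_sync
    ]
    # Advance the watermark to the newest modification seen; keep it if nothing changed.
    new_watermark = max(
        (datetime.fromisoformat(p["modifiedDate"]) for p in changed),
        default=last_sync,
    )
    return changed, new_watermark


last_sync = datetime(2024, 3, 10, tzinfo=timezone.utc)
points = [
    # Observed weeks ago but synced late: an "observed after" filter would miss it.
    {"type": "Steps", "observationDate": "2024-02-20T23:59:00+00:00",
     "modifiedDate": "2024-03-11T06:00:00+00:00"},
    # Already seen in the previous sync.
    {"type": "Steps", "observationDate": "2024-03-09T23:59:00+00:00",
     "modifiedDate": "2024-03-09T23:59:30+00:00"},
]

changed, last_sync = changed_since(points, last_sync)
```

Filtering on the modified date picks up both late-arriving and edited points, which an observed-date filter alone would skip.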

Data Availability (Real-Time / Intraday)

Device data is not provided to the MyDataHelps APIs/SDKs in real-time. It is periodically synced from the device and/or the source server, so there is often a lag between when the data is recorded by the device and when it is available through MyDataHelps.

Different systems offer different granularity for their data, and this often varies by data type and API version.

Fitbit intraday data offers a higher granularity for steps, heart rate, etc., but requires explicit approval from Fitbit. Contact support to inquire about enabling intraday access.

Data Retention

MyDataHelps stores device data indefinitely, and that data is always available via your project’s Device Data exports or the Export Explorer. However, device data is only accessible through the API for a limited time period. This time depends on the data type, API version, and project. Contact support to get specific information about data availability for your project.

Interpreting Data

The data received by MyDataHelps is ultimately controlled by the application managing the device (e.g., Apple Health or Fitbit). The available fields and their meanings will vary depending on the device. For example, distance walked from an iPhone might present the following fields:

Distance Walked Data
  ...
  "type": "DistanceWalkingRunning",
  "value": "7.8970013065263629",
  "units": "m",
  "properties": {
    "PostalCode": "12345"
  },
  "source": {
    "identifier": "6D8C7vR",
    "properties": {
      "SourceIdentifier": "com.apple.health.E9BBF489-8AE9-4D5F-983C-E66ED450FAE6",
      "SourceName": "John's Apple Watch",
      "SourceOperatingSystemVersion": "13.5.1",
      "DeviceManufacturer": "Apple Inc.",
      "DeviceModel": "iPhone",
      "UploadingDeviceIdentifier": "07e9524a-2a82-4875-b700-92d7e4024cb5"
    }
  }

Whereas heart rate from a blood pressure cuff might present very different values:

Blood Pressure Data
  ...
  "type": "RestingHeartRate",
  "value": "73",
  "units": "BPM",
  "properties": {
    "PostalCode": "48104",
    "Sequence": "32"
  },
  "source": {
    "identifier": "BPCuff1",
    "properties": {
      "Model": "BP130X",
      "Serial": "17117bb6-177d-4f89-a25b-f9274eb8d4ff"
    }
  }
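Because the fields vary by source, code that consumes device data benefits from defensive access. The sketch below uses trimmed copies of the two examples above; the helper names and the choice to fall back from SourceName to Model are assumptions for illustration.

```python
from typing import Optional


def numeric_value(point: dict) -> Optional[float]:
    """Parse the string "value" field into a float, or return None if it is not numeric."""
    try:
        return float(point["value"])
    except (KeyError, TypeError, ValueError):
        return None


def source_label(point: dict) -> Optional[str]:
    """Pick a human-readable source name from whichever property the device provides."""
    props = point.get("source", {}).get("properties", {})
    return props.get("SourceName") or props.get("Model")


walk = {
    "type": "DistanceWalkingRunning", "value": "7.8970013065263629", "units": "m",
    "source": {"identifier": "6D8C7vR", "properties": {"SourceName": "John's Apple Watch"}},
}
cuff = {
    "type": "RestingHeartRate", "value": "73", "units": "BPM",
    "source": {"identifier": "BPCuff1", "properties": {"Model": "BP130X"}},
}

walk_meters = numeric_value(walk)
cuff_bpm = numeric_value(cuff)
```

Note that values arrive as strings in both examples, so numeric comparisons need an explicit conversion, and the units field should be checked before values from different sources are combined.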

Notifications

MyDataHelps allows custom notifications to be sent to participants via the MyDataHelps app (a “push notification”), SMS/text, or email. You define notifications in your project configuration. Each notification is given an identifier, unique within the project (e.g., SLEEP-Reminder-Email) and other fields (such as subject and body) depending on the type of notification.

Notifications can be sent to participants based on time-based triggers or project-specified triggers (adherence, events, and more). See the scheduling user guide for more information about how schedules and triggers can be configured.

External Accounts

Participants frequently have accounts with services external to MyDataHelps, such as:

  • Provider - Healthcare providers’ patient accounts, typically through an electronic medical record system.
  • Health Plan - Health plans and associated payers.
  • Device Manufacturer - Wearable devices and other consumer-facing medical technology.

MyDataHelps can enhance your project with health or sensor data imported from these external services, provided that:

  1. The service is compatible with MyDataHelps.
  2. The participant connects their external account with MyDataHelps.

Once connected, MyDataHelps periodically polls and persists data from connected external accounts. Data is downloaded directly from the external provider’s server, and can be used to lead interventions or contextualize research outcomes.

Uploaded Files

Web View steps and Custom Views may allow participants to upload files. This could include:

  • An image of a test strip.
  • An audio clip of their voice.
  • A PDF of lab results.
  • etc.

Files are associated with the participant who uploaded them.