Managing Sensitive Research Data at Duke

Definitions

Sensitive Data

Information that, if inadvertently released, could place research participants and/or the institution at risk of harm. Failure to protect sensitive information could result in financial losses, reputational damage, and legal repercussions for the institution.
Harms to participants could include damage to their relationships, status, employability, or insurability. Participants could also face criminal prosecution or civil liability and, in some cases, physical harm. The assessment of risk must take into account the participants’ culture, age, life experience, and any other relevant characteristics.

Individually Identifiable Data

This includes:

  • Direct identifiers – such as participants’ names, email addresses, or other information that explicitly identifies an individual.
  • Indirect identifiers – also known as demographic data, which are combinations of characteristics that could allow someone to deduce a participant’s identity.

Examples of Indirect Identifiers

  • Position, gender, and length of service in a named company
  • Age, major, ethnicity, and year in school
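
To see why a combination of indirect identifiers can single out one person, the following minimal Python sketch counts how many records share each combination of characteristics. The records, field names, and helper functions are made up for illustration; they are not part of any Duke tool or dataset.

```python
from collections import Counter

# Hypothetical records: none of these fields is a direct identifier.
records = [
    {"age": 19, "major": "Physics", "year": "Sophomore"},
    {"age": 19, "major": "Physics", "year": "Sophomore"},
    {"age": 20, "major": "History", "year": "Junior"},
    {"age": 34, "major": "Physics", "year": "Senior"},
]


def combination_counts(rows, fields):
    """Count how many rows share each combination of the given fields."""
    return Counter(tuple(row[f] for f in fields) for row in rows)


def unique_combinations(rows, fields):
    """Return the combinations that appear exactly once in the data."""
    counts = combination_counts(rows, fields)
    return [combo for combo, n in counts.items() if n == 1]


# Combinations held by a single participant carry the highest
# re-identification risk, even though no name is present.
risky = unique_combinations(records, ["age", "major", "year"])
print(risky)
```

In this made-up sample, two of the four participants are the only ones with their particular combination of age, major, and year, so anyone who knows those characteristics could plausibly deduce who they are.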

Data Classification and Protection Plans

  • Consult the Campus Institutional Review Board (IRB) about the classification of your data before/during the development of your research protocol.

  • See the Duke Data Classification Standards.

  • The online SecureIt tool was designed to help researchers identify approved Duke services they can use to collect, store, and analyze research data.

  • The Information Technology Security Office (ITSO) reviews research protocols that involve sensitive data or data governed by contractual agreements. ITSO will notify researchers and the IRB if any changes to the proposed data protection procedures are necessary.

  • ITSO can be contacted directly at security@duke.edu.

NOTE: The University has determined that any research studies that collect or use direct (e.g., names) or indirect (that is, demographic) data about Duke students may meet the “sensitive” data classification. If the data you intend to collect are considered “sensitive,” they must be protected to mitigate risk.


Duke's Research Data Policy

Sensitive, individually identifiable data must be protected during all phases of a research project. The following are some best practices from the Campus IRB and ITSO to prevent an inadvertent breach of individually identifiable, sensitive data during the collection, transfer, storage, analysis, and reporting phases of your project.

Per Duke’s Research Data Policy, the Principal Investigator is responsible for project personnel interacting with sensitive data, including its appropriate management and protection over the course of a project.


Best Practices for Protecting Sensitive Data

Below are recommended best practices from the Campus IRB and Duke's IT Security Office (ITSO) to help prevent inadvertent breaches throughout the research lifecycle:

  • Data Collection
  • Data Transfer
  • Data Storage
  • Data Analysis
  • Reporting

Collection

  • Online data collection should be conducted on a secure platform, such as Qualtrics or Duke Box.
  • If carrying out virtual interviews or focus groups (including recording), researchers must use their Duke-sponsored Zoom accounts.

If limited resources make it necessary to use pen and paper to collect sensitive data in the field, paper documents should be identified using a unique ID number, not the participants’ names or any other direct identifiers.
The key linking direct identifiers to unique ID numbers may be taken into the field on an encrypted device.
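
As one possible illustration of the unique-ID approach, the sketch below assigns each participant a random study ID and keeps the name-to-ID key separate from the coded responses. The function names, the `P` prefix, and the sample data are hypothetical; this is a sketch of the general idea, not a Duke-prescribed procedure.

```python
import secrets


def assign_study_ids(names, prefix="P"):
    """Map each participant name to a random, non-sequential study ID."""
    key = {}
    used = set()
    for name in names:
        if name in key:  # keep one ID per participant
            continue
        study_id = prefix + secrets.token_hex(4)
        while study_id in used:  # regenerate on the rare collision
            study_id = prefix + secrets.token_hex(4)
        used.add(study_id)
        key[name] = study_id
    return key


def deidentify(responses, key):
    """Replace the 'name' field of each response with its study ID."""
    return [
        {"study_id": key[r["name"]],
         **{k: v for k, v in r.items() if k != "name"}}
        for r in responses
    ]


# Hypothetical example data
responses = [
    {"name": "Alex Example", "answer": "yes"},
    {"name": "Sam Sample", "answer": "no"},
]
key = assign_study_ids(r["name"] for r in responses)
coded = deidentify(responses, key)
```

Here `coded` contains only study IDs and answers, so it corresponds to what would appear on the field documents, while `key` is the linking file that belongs on the encrypted device.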


Transfer

  • All sensitive, identifiable data collected in the field should be transferred as soon as possible to a secure environment (e.g., Protected Network for Research, Duke Box) at Duke.
  • Never send sensitive data as email attachments. Instead, upload the data to Duke Box or use an encrypted transfer protocol such as SFTP.

Data Use Agreements (DUAs)

If your research involves storage or analysis of existing data and requires a DUA:

  • Append the DUA to your IRB protocol.
  • The IRB will route it to the Office of Research Support (ORS) for contractual review and to ITSO for data security review.

The Protected Research Data Support team in the Office of Research & Innovation (OR&I) can assist with DUA questions or concerns.
Email to request a consultation: researchdatasupport@duke.edu

Some data providers require a DUA even for non-identifiable data — such data is considered sensitive.

Compliance is the responsibility of the Principal Investigator and Duke personnel involved.


Storage and Analyses

🔒 Protected Network for Research (PNR)

Duke’s most trusted environment for securely handling sensitive research data.

Using the Protected Network for Research (PNR) expedites the IRB/ITSO data security review process.

Benefits:

  • Free tier available for most studies
  • Web browser access on Duke-managed or personal devices
  • No VPN required
  • Easy install of analytics software: Stata, RStudio, SAS, NVivo
  • Integrates with Microsoft OneDrive, Qualtrics, Zoom, and Box
  • Box integration enables self-service export
  • Adheres to the NIST 800-171 standard
  • Built-in controls for de-identification and key storage

Other Options for Secure Storage and Analysis

NOTE: Removal of data from Duke Box for analysis should be temporary.
Avoid using Box Drive for data analysis; it stores files locally.
See Using Duke Box with Sensitive Research Data

If the research involves only qualitative writing—without the use of analysis tools (e.g., SAS, NVivo)—then Duke Box may suffice, and the PNR may not be necessary.


Reporting

To reduce the risk of re-identification, follow these practices:

  • Report aggregate data with large enough groups
  • Use general terms (e.g., income or age ranges)
  • Use pseudonyms instead of real names
  • Use broad roles (e.g., “tradesperson” vs. “carpenter”)
  • Use vague locations (e.g., “a midsize city in Western Africa”)
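
The first two reporting practices (aggregate groups and general ranges) can be sketched in Python as follows. The minimum group size of 5 and the 10-year range width are assumptions chosen for illustration, not Duke-specified values; the appropriate threshold depends on the study and should be agreed with the IRB.

```python
from collections import Counter

MIN_CELL_SIZE = 5  # hypothetical threshold, not a Duke-specified value


def age_range(age, width=10):
    """Generalize an exact age into a range such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregate_with_suppression(ages, min_cell=MIN_CELL_SIZE):
    """Count participants per age range, masking groups below min_cell."""
    counts = Counter(age_range(a) for a in ages)
    return {
        group: (n if n >= min_cell else f"<{min_cell}")
        for group, n in sorted(counts.items())
    }


# Hypothetical ages for nine participants
ages = [21, 22, 23, 25, 27, 29, 31, 34, 58]
summary = aggregate_with_suppression(ages)
print(summary)
# {'20-29': 6, '30-39': '<5', '50-59': '<5'}
```

Small groups are reported only as "<5" so that a reader cannot tell, for example, that exactly one participant was in their fifties.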


Use of Software Services with Sensitive Data

The ITSO and Privacy Office will assess software usage on a case-by-case basis.

NOTE: Inform the IRB if sensitive data is low risk (e.g., de-identified) — this can expedite review.

Assessment & Review Options

Vendor Risk Assessment

Used if the software has not previously been assessed.
Takes approximately one month.
Researchers should consult the Duke Services and Data Classification Guide to determine whether a vendor or tool has already been assessed and approved.

Minimal-Risk Software Review

Used if:

  • The vendor will not undergo a full risk assessment
  • The data involved is not “contractually restricted” or classified as “high-risk sensitive”

This streamlined review is conducted jointly by Duke’s Privacy Office and the Information Technology Security Office (ITSO).
It emphasizes the importance of transparent data practices, particularly the use of informed consent related to how data will be collected, stored, and managed. This process is intended for low-risk use cases where traditional vendor review is not feasible but privacy considerations still apply.


Use of AI Tools with Sensitive Data

When working with sensitive or protected data, researchers must exercise caution when using AI tools such as chatbots, large language models, or other AI-powered services. AI-generated outputs may retain or expose elements of the input data, and many AI platforms store user prompts for future model training, creating a risk of re-identification or data leakage.

To mitigate these risks, Duke recommends using only Duke-managed machines and services when interacting with AI platforms and working with sensitive data for research purposes. The Duke AI Suite and AI Dashboard provide secure, institutionally supported environments for AI use that align with university policies.

Duke’s Data Classification Standard helps researchers identify what constitutes sensitive data. To ensure proper handling, researchers should also consult the Duke Services and Data Classification Guide, which outlines which Duke-approved tools and services are appropriate based on the classification of the data.

Do not input sensitive or identifiable information into public or non-Duke AI tools.

When in doubt, consult with the Information Technology Security Office (security@duke.edu) before using AI tools for research involving protected data.


Use of Personal Devices with Sensitive Data

Use of non-Duke-managed devices for analyzing sensitive data is not recommended and may be prohibited for contractually restricted data. Sensitive data should never be stored on personal devices.
Transfer collected or analyzed data to secure storage (e.g., Duke Box) as soon as possible.

If personal devices are used, researchers must attest to use of the fundamental practices listed below:

The Fundamental Practices (Laptops & Tablets)

Follow these steps to ensure good cyber hygiene for personal devices:


The Expert Practices

After addressing the basics, further enhance your personal security:


Practices to Avoid

When working with research data, avoid these practices:

  • Forwarding your Duke email to a personal email account
  • Using a personal collaboration or storage suite (e.g., Dropbox) instead of a Duke service (e.g., Duke Box, Duke OneDrive)
  • Using a personal account on a development platform (e.g., GitHub) instead of Duke’s approved development tools (e.g., Gitlab.oit.duke.edu)

Mobile Phones

  • Use a passcode, Face ID, or fingerprint to restrict access
  • Turn on encryption
  • Regularly update the operating system (iOS or Android)
  • Use "Find My iPhone" or Prey remote wipe software

Personal Device Security

For more details on how to properly secure personal devices used in research, please refer to the following guide:

Personal Device Security Guide (PDF)

This guide outlines essential best practices for endpoint security, including password management, encryption, software updates, and more.