Research Data Protection Policy and Procedure for Research with Human Subjects

Purpose

Davidson College has developed the human subjects research data protection policy to assist investigators in providing appropriate protections for human subjects data. As the range of data storage options increases, so does the range of possible risks to data security. To ensure that research participants are adequately protected from the risk of harm, investigators affiliated with Davidson College are required to adhere to the policy in planning and conducting their human subjects research studies.

Scope

This research data protection policy covers all data collected and used throughout the human subjects research project.  The policy defines authorized users, the nature of data, and the nature of data storage.

“Privacy,” “confidentiality,” and “anonymity” are three terms frequently used, and often misunderstood, in human subjects research.  Each term is reviewed below, with guidance on using it in human subjects research study protocols and documents.

Privacy

“Privacy” is defined as an individual’s control over the extent, timing, and circumstances of sharing him/herself (physically, behaviorally, or intellectually) with others.

Privacy pertains to people (whereas confidentiality pertains to data), and privacy is a right that can be violated. In the US, there are legal rights to certain types of privacy (e.g., Fourth Amendment, the Health Insurance Portability and Accountability Act [HIPAA], Family Educational Rights and Privacy Act [FERPA]).  International laws regarding privacy vary (e.g., the EU’s General Data Protection Regulation [GDPR], 2018).

In general, researchers must protect research participants’ privacy throughout their participation in a study, including during recruitment and data collection. Data must be stored securely and in a form that prevents, where possible, the identification of individuals by others. 

Some important points:

  • For research with students/student data, FERPA covers most student work generated in courses. Accordingly, students must give written consent for their class assignments (e.g., papers, quizzes) to be used in research.
  • The IRB approved an umbrella protocol, “Pedagogical Research at Davidson College,” also known as the “CTL Umbrella” protocol, on February 11, 2019.  This protocol allows faculty to apply to the Dean of Faculty for inclusion under the protocol so that coursework may be used in pedagogical research.  Authorized faculty must include standard verbiage provided by the Director of the CTL, Dr. Mark Barsoum, in their syllabi for courses included under the protocol and must provide the syllabi including this language to students before they may use student coursework for pedagogical research.  You may visit the Umbrella IRB for Pedagogical Research page or reach out to Dr. Barsoum for more information at mabarsoum@davidson.edu.

Confidentiality

“Confidentiality” is defined as the treatment of information (data) disclosed in a trust relationship and with the expectation that it will not be divulged without permission to others in ways inconsistent with the understanding of the original disclosure.

Confidentiality is an agreement among parties made via the consent process. Researchers must keep participants’ contributions to the research confidential unless participants have agreed otherwise (preferably in writing). Researchers, however, should not guarantee absolute/complete confidentiality and must inform participants of this, particularly when data are identifiable and mandatory reporting laws apply to a study. As with privacy, confidentiality requires that researchers keep data secure, regardless of format (e.g., paper, digital).

Some important points:

  • Investigators and their research teams should consider the security of data during collection, transfer, use, and sharing, and while in storage, both short- and long-term.
  • Investigators and their research teams should treat all participant data and associated study documentation (e.g., list of participant codes) as confidential, and code and store them in a secure manner in accordance with Davidson’s Human Subjects Research Data Protection Policy.

Anonymity

“Anonymity,” in the context of human subjects research, means that researchers are not collecting any identifiers (e.g., name, address, telephone number, IP addresses) that link information/records/samples, either individually or when combined with other variables, to the individual from whom they were obtained.  It is extremely difficult to anonymize data or fully de-identify data, particularly in small, limited/bounded samples. For example, approximately 87% of Americans may be identified by a combination of zip code, date of birth, and gender.

Because data are so difficult to anonymize, participation in the research may present risk if there is a breach of confidentiality or data are re-identified and identifiable information is released without participants’ permission.   

Davidson’s HSIRB therefore recommends against promising anonymity to research participants.  Rather, the HSIRB recommends describing in consent documentation the steps you and your research team will take to protect the confidentiality of the data you collect.

Some important points:

  • Researchers should not describe data collected in person (e.g., via interviews, videorecording) as anonymous.
  • The existence of a list of codes and associated identifiers (e.g., names) means that data are not anonymous. Researchers may characterize data as anonymous (or de-identified) once such a list of codes and identifiers is destroyed if combinations of other variables collected will not identify unique individuals (e.g., a 22-year-old female Asian senior at UNC), particularly in small, bounded/limited samples.

Web-based surveys (e.g., using Qualtrics) can facilitate anonymous data collection. In order to characterize participation in a web-based survey as anonymous, however, researchers must disable the feature allowing the collection of IP (internet protocol) addresses (which may identify the computer user) in addition to not collecting any demographic information sufficient to identify unique individuals, particularly in small, bounded/limited samples (such as a combination of gender, age, race, work‐site, etc.).
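One way to check whether a combination of collected variables could single out an individual is to count how many records share each combination. The following is a minimal sketch of that check, not an HSIRB requirement; the function name `find_unique_combinations`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def find_unique_combinations(records, fields):
    """Return the value combinations of `fields` that occur exactly once."""
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return [combo for combo, n in counts.items() if n == 1]


# Hypothetical survey responses that contain no names or IP addresses.
records = [
    {"gender": "F", "age": 22, "class_year": "senior"},
    {"gender": "F", "age": 22, "class_year": "senior"},
    {"gender": "M", "age": 21, "class_year": "junior"},
]

unique = find_unique_combinations(records, ["gender", "age", "class_year"])
print(unique)  # prints [('M', 21, 'junior')]
```

A non-empty result means at least one participant is singled out by those variables alone. In that case the data should not be described as anonymous unless variables are dropped or grouped into broader categories.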

The following words and phrases are commonly used when discussing information and data protection in human subjects research.

Principal Investigator (PI): The primary or lead investigator who is responsible for the security of the data collected.

Authorized user:  Users who have been granted permission to access data collected during the research project. With the partial exception of Davidson College Technology & Innovation (T&I) as institutional data stewards, all other persons are referred to as “unauthorized persons”.

Human subjects research data: Data collected, stored, or analyzed during a human subjects research project.

Data elements (variables): The types of data being collected, stored, or analyzed (e.g., Name, DOB, Gender, Race, Student ID, SSN) during a research project. Some researchers may be more familiar referring to data elements as variables.

Confidential data: Any human subjects data that might compromise the anonymity or privacy of respondents to the research project, including but not limited to:

A person's first name or first initial and last name in combination with identification numbers including but not limited to Social Security Numbers (SSN), passport numbers, driver’s license number, employer taxpayer-identification numbers; financial information, including checking account numbers, savings account numbers, credit card numbers, debit card numbers, or Personal Identification (PIN) Codes as defined in G.S. 14-113.8(6); digital signatures, biometric or fingerprint identification, geographic identification of areas smaller than Census Division, including, but not limited to state, county, minor civil division, primary sampling unit (PSU), segment, city, place, zip code, tract, block numbering area, enumeration district, block group, or block; combinations of sensitive personally identifiable information (SPII) (e.g., name and date of birth). This definition of Confidential Data aligns with the North Carolina Identity Theft Act of 2005 (https://www.ncleg.net/Sessions/2005/Bills/Senate/HTML/S1048v6.html), amended in 2019 (https://www.ncleg.gov/Sessions/2019/Bills/House/PDF/H904v1.pdf).

Coded data: Data where identifying information (e.g., name) has been replaced with a code (e.g., number, letter, symbol), and a list of codes and associated identifiers exists.
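As an illustration of coded data, the sketch below replaces each participant name with a random code and keeps the list of codes and identifiers as a separate object. It is a minimal example under stated assumptions: the function names, the `participant_code` field, the `P` code prefix, and the sample rows are hypothetical and are not prescribed by this policy.

```python
import secrets


def code_participants(names):
    """Assign each name a random code and return the name-to-code list."""
    key = {}
    used = set()
    for name in names:
        code = "P" + secrets.token_hex(4)
        while code in used:  # regenerate in the unlikely event of a collision
            code = "P" + secrets.token_hex(4)
        used.add(code)
        key[name] = code
    return key


def apply_codes(rows, key, name_field="name"):
    """Return copies of the rows with the name replaced by its code."""
    coded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != name_field}
        new_row["participant_code"] = key[row[name_field]]
        coded.append(new_row)
    return coded


# Hypothetical data set; real names never appear in the coded rows.
rows = [
    {"name": "Participant One", "score": 4},
    {"name": "Participant Two", "score": 5},
]
key = code_participants(row["name"] for row in rows)
coded_rows = apply_codes(rows, key)
```

While the `key` list exists, the data are coded rather than anonymous, so the list itself should be treated as confidential and stored separately from the coded data set.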

Internal data: Data collected as part of a human subjects project that are accessible only to members of an institution or organization, were not collected via public sources, and do not rise to the level of confidential data.

Public data: Human subjects data that were collected via available public data sources.

Sensitive data: Data that were provided to the investigator on the belief that they would not or could not be attributed to the individual, such as participant responses that, if disclosed in an identifiable way, would pose harm to participants, e.g., answers related to sensitive topics such as sexual orientation, sexually-transmitted diseases, incest, rape or date-rape, sexual harassment, molestation, race relations, use of licit or illicit drugs, eating disorders, abortion, contraception or pregnancy, the subjects' own mental health (suicide, depression, compulsive behaviors), religion, illegal conduct, stressful experiences.

Identifiable data: Information that could be used to ascertain a person’s identity.  This may include Identifiable Private Information (see definition below), Confidential Data including information defined by North Carolina law, and other combinations of data points such as graduation date, geographic location, IP address, demographic information, etc.

Identifiable Private Information: Defined in the federal regulations as: “Private information for which the identity of the subject is or may readily be ascertained by the investigator or associated with the information.”

De-identified data: A data set that has identifiable data removed to prevent the identification, either directly or via correlation, of the subject of the data.  These data may be kept in perpetuity and shared in public data repositories in keeping with transparency and reproducibility regulations and practices, consistent with the commitment to participants in the consent forms.

Data retention: The length of time that research data that have not been de-identified will be kept in order to meet contractual, research, funding agency, or legal requirements; e.g., lists of participants that received a participant incentive to provide to the IRS in the event of an audit.

Electronic asset: In the human subjects research context, any device on which research data will be accessed or stored, including, but not limited to, a laptop, smart phone, portable hard drive, USB drive, SD card (video or audio storage).

Encryption: The process of encoding data into another form, or code, so that only people with access to the associated encryption key can decode it. Data can be encrypted when it is “at rest” (i.e., being stored) or “in-transit” (i.e., being moved between locations).

Air gapped device: a device set up by T&I to isolate the device and its data from all other devices (i.e., the device will have no connections to the network or internet). An air gapped device is not always needed but may be required by external data use agreements or data use requirements.

Secure electronic storage location: A file storage location (e.g., secure shared Google Drive) for which security and permissions are maintained by T&I. T&I sets the security settings for these electronic storage locations and these settings are not editable by the user.

Secure physical storage location: A storage location for physical data (e.g., paper lists of participant names, signed consent forms) to which only authorized users have access, such as a locked drawer or file cabinet.  Researchers should follow clean-desk practices, e.g., not leaving documents with sensitive and/or identifiable data on a desk or table where other personnel may inadvertently access them.

When completing the HSIRB Protocol Application, you will be directed to address data protection in the following ways.

Co-Investigators (authorized users): It is important to identify all users that are authorized persons for the project. The users you list will be the only users allowed full access to project data with the exception of T&I staff as Davidson’s institutional data stewards.  You will be prompted to list the name of the PI as well as all co-investigators.

Electronic Asset Inventory: It is important to identify all electronic assets that will be used during the project, including but not limited to computers, flash drives, removable hard drives, etc. 

Please list any device that will be used for human subjects data collection, storage, or access.

We recommend that data be stored within a Davidson secured system (e.g., Computer, Google Drive). Where prompted in the application, also list the location of the device, persons with physical access (other than co-investigators), security provisions, and backup procedures.

Clarifying note: The tag number for a Davidson device will be located on a white sticker on the back or bottom of the unit.  A device name can be found in the Settings > About > Device name on a PC, and in the Apple menu > System Preferences > Sharing Preferences on a Mac. 

Example:

Asset Type    Tag Number / Device Number    Make     Model      Serial Number
Computer      W1111                         Apple    MacBook    123456789

Data Elements: In addition to providing your protocol description and/or uploading your protocol materials that describe the data you will collect, you will be prompted to list separately all sensitive and/or confidential data elements (see above definitions) to assist with data security review. 

All researchers conducting human subjects research at Davidson College are required to comply with the data protection policy described below and to educate their research team members and ensure their compliance.

Scope: This security plan covers all human subjects research data (see above definition) used throughout the human subjects research project.

  • Informed consent forms or documentation containing identifiable data (see above definition) will be stored in a separate location from the corresponding data set.
  • The Investigator(s) will store the collected human subjects research data in a location that meets the standard of security set in Davidson’s data classification section of the Data Security Policy:

    Category: Confidential data
    Definition: Data whose breach or inadvertent disclosure would violate state or federal privacy or data security laws (including certain research grant obligations) and may involve civil or criminal penalties. These data must be shared only with specific individuals who have a need to access. Includes data protected by Gramm-Leach-Bliley, HIPAA, the NC Identity Theft Act, or similar laws.
    Appropriate storage locations: Store and access on an encrypted, air gapped computer or in a secure electronic or physical storage location (e.g., specially requested Davidson Google Drive Storage set up by T&I).

    Category: Restricted data
    Definition: Data that may be shared only with specific individuals who have a business need to access, and where breach or inadvertent disclosure would impact Davidson’s reputation or violate educational privacy requirements (FERPA).
    Appropriate storage locations: Store on an encrypted device or authorized Davidson College Services such as Google Drive, Moodle and Banner.

    Category: Internal data
    Definition: Data that Davidson chooses to restrict to internal access, but where disclosure would not violate state or federal laws or cause reputational harm.
    Appropriate storage locations: Store on a Davidson-approved technology service requiring Davidson login, such as Davidson Google Drive Storage, Davidson or Personal Devices, or other Davidson-approved solutions (e.g., Sona Systems). Access on Davidson-managed devices and employee-owned computers and mobile devices.

    Category: Public data
    Definition: Any data that is permitted to be shared freely with all members of the campus and the general public.
    Appropriate storage locations: Store and process on a Davidson-approved technology service. Access on Davidson-managed devices and employee-owned computers and mobile devices.

  • Access to human subjects research data will be restricted to the authorized users in the investigator(s)’s HSIRB application and institutional data stewards.
    • Note: Principal Investigator(s) accept the responsibility of granting physical and electronic access for the human subjects research data files to authorized users.
  • The creation and storage of backups of human subjects research data will be the sole responsibility of the investigator(s). Backups will be treated in the same manner as the original data. This includes - but is not limited to - the location (e.g., encrypted USB drive) and the users who have access to the backups.
    • Note: T&I recommends that, to prevent data loss from drive failure, cyberattacks, or accidental deletion, investigators should store backup data outside of their production environment (i.e., not the same device on which your original data files are stored).
  • Paper printouts or physical data containing sensitive data will be secured at all times when not in use for the research project (e.g., locked in a desk drawer or filing cabinet).  Printouts or physical data will be disposed of in an appropriate manner (e.g., cross-cut shredding, biohazard incinerator bin).
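
When planning storage, the data categories above can be treated as a simple lookup. The sketch below is illustrative only: the ranking from most to least restrictive, the rule of storing a project at the level of its most restrictive data element, and the element names are assumptions made for this example and are not stated in the policy. The guidance strings are abbreviated from the categories above.

```python
# Assumed ordering from most to least restrictive; the policy itself
# does not state an explicit ranking.
CATEGORIES = ["confidential", "restricted", "internal", "public"]

# Abbreviated storage guidance for each category.
STORAGE_GUIDANCE = {
    "confidential": "encrypted, air gapped computer or secure storage location set up by T&I",
    "restricted": "encrypted device or authorized Davidson College service",
    "internal": "Davidson-approved technology service requiring Davidson login",
    "public": "Davidson-approved technology service",
}


def strictest_category(element_categories):
    """Return the most restrictive category among a project's data elements."""
    return min(element_categories.values(), key=CATEGORIES.index)


# Hypothetical data elements and their assigned categories.
elements = {"survey_response": "internal", "course_grade": "restricted"}
category = strictest_category(elements)
print(category, "->", STORAGE_GUIDANCE[category])
```

Assigning each data element a category is a judgment for the investigator and the HSIRB data security review; the lookup only records the outcome of that judgment.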

Data Retention: De-identified data may be kept in perpetuity for the purposes of open and transparent science practices.  Researchers should define in their HSIRB application a specific length of time to retain data that have not been de-identified in order to meet contractual, research, funding agency, or legal requirements; e.g., lists of participants that received a participant incentive to provide to the IRS in the event of an audit.

  • IMPORTANT NOTE: Student access to their Davidson Google Drive is terminated one year after graduation.  If there is a need to retain the data beyond that period, student researchers should work with their faculty sponsor to transfer the data to an appropriate location (e.g., the faculty sponsor’s Google Drive.)

Before submitting your application, you will be required to affirm the following statement:

“As the primary investigator, I have reviewed, understood my responsibilities, and provided and discussed the Data Protection Policy for this project with all of my research team members.”

T&I has assigned a department member to the Human Subjects Institutional Review Board whose chief responsibility is reviewing applications to ensure compliance with the college’s data security standards and related application content.

Administration of Policy

The Chair and Vice Chair of the IRB, Director of The Office of Sponsored Programs, HSIRB Compliance Manager, and T&I representative assigned to the IRB shall oversee this policy and review it at least once every two years. Changes to this policy shall be made in accordance with the college's Policy on Policies.

Adopted: April 25, 2022

Last Revised: June 23, 2022

Last Reviewed: June 23, 2022