FAQs on CDEs
What is a CDE?
Common data elements (CDEs) are defined as "logical units of data pertaining to one type of information, with specific and clear descriptors." In other words, CDEs are the variables that are typically collected in a preclinical traumatic brain injury (TBI) experiment (e.g., species, sex, age, piston velocity, behavior test outcomes), but with content standards that enable investigators to systematically collect, analyze, and share data.
Why do we need CDEs?
CDEs produce a well-defined and standardized lexicon for describing and reporting how data are collected, reducing variability over the life of a lab. In addition, the creation of CDEs leverages the collective expertise of the scientific community to identify key variables that are recommended to be collected for a complete dataset. CDEs will, therefore, improve rigor (by identifying critical variables), reproducibility (by standardizing format and facilitating sharing), and transparency (as data will be systematically entered with specific and clear descriptors).
Which set of CDEs should I use to map my data?
You will likely see several versions of CDEs on the internet, for example on the NINDS website and on the NIH FITBIR website. We at PRECISE-TBI, in partnership with TBI stakeholders in the Veterans Affairs, the National Institute of Neurological Disorders and Stroke, the Department of Defense, and the US Army for Congressionally Directed Medical Research, have formed an interagency resource center to help enhance rigor, reproducibility, and transparency in preclinical neurotrauma research. As part of that mission, PRECISE-TBI, with major input from the TBI research community, is building on past CDE efforts to develop a comprehensive set of CDEs that are tested and curated, then further refined with input from TBI investigators, before being shared on our website. We are sharing these CDEs with the community for review and testing, after which we will release improved versions. Our goal is a single collection of CDEs for TBI that builds on the past efforts of members of our TBI community.
How are CDEs being developed?
This is an iterative process. It starts with the formation of committees from the TBI community to develop or refine CDEs, which are subsequently adapted based on both user feedback and in-house beta testing. This necessarily includes mapping CDEs to existing datasets (e.g., an individual lab's data) to determine the degree of harmonization (i.e., overlap) between the data collected by end-users and the CDEs. In this way we will identify CDEs that are missing, ill-defined, or simply unused because they are not common.
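The overlap check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the variable names, and the case-insensitive exact-match rule are hypothetical and are not part of any PRECISE-TBI tool.

```python
def harmonization_report(lab_variables, cde_names):
    """Compare a lab's variable names against a CDE list (case-insensitive exact match)."""
    lab = {v.strip().lower() for v in lab_variables}
    cdes = {c.strip().lower() for c in cde_names}
    matched = lab & cdes
    return {
        "matched": sorted(matched),
        "lab_only": sorted(lab - cdes),     # candidates for missing CDEs (or UDEs)
        "unused_cdes": sorted(cdes - lab),  # CDEs this dataset does not use
        "overlap_fraction": len(matched) / len(lab) if lab else 0.0,
    }


# Hypothetical variable names, for illustration only.
report = harmonization_report(
    ["Species", "Sex", "Age", "PistonVelocity", "CageEnrichment"],
    ["Species", "Sex", "Age", "PistonVelocity", "InjuryModel"],
)
print(report["overlap_fraction"])  # 0.8
print(report["lab_only"])          # ['cageenrichment']
```

In practice, names rarely match exactly, so a real mapping would likely need a manually curated lookup rather than string equality.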
Do I need to share all data as CDEs?
Data-sharing requirements vary with the funding mechanism. PRECISE-TBI does not mandate anything, but will try to align with common mandates as they evolve. In this release, we developed a set of core study-level metadata and an initial set of subject-level core CDEs. CDEs adhere to the FAIR principles: data will be readily Findable, Accessible, Interoperable, and Reusable.
Are all data really common or critical?
We have initially designated candidate CDEs at three levels: Core, Recommended, and Supplemental. Variables that are normally published at the abstract level or that are critical for defining the major parameters of the study (e.g., groups, sex, injury model, main outcome variables) are designated as Core CDEs. Other commonly published CDEs that may vary from study to study are designated as Recommended but are not required. Variables that are not considered critical for defining a study, but which would be useful for specialized studies and could be of value for post-analyses, are designated as Supplemental. These may be existing CDEs or new data elements that do not map to any existing CDE. For any other data that do not map to an existing CDE, we encourage you to share them as a UDE (unique data element; see also Q9). If enough users publish the same UDE, it will eventually become a CDE, making data sharing easier and the data more available for further multi-site analysis (but only once either it is published with a DOI for citation, or the laboratory/owner grants permission for its use).
What format are the PRECISE-TBI CDEs provided in?
The CDEs are provided in two Excel spreadsheet formats. (1) The “Data Dictionary” is a column format in which the CDE variable names are listed in one column and are supported by descriptive information (e.g., units, datatype) in adjacent columns on the same row. (2) The “CDE entry” format is simply the transposition of the Data Dictionary into a row format, so that the CDE variables are all aligned along one row. This allows the user to enter information for each subject in additional rows beneath the CDE column headers, so that information from a single subject or trial is contained within one row. Additional CDEs for a particular study can be added along the top row.
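The relationship between the two formats can be sketched as follows. This is a minimal illustration that uses CSV text in place of Excel files; the column names (`VariableName`, `Units`, `DataType`) and the example CDE names are assumptions and may not match the actual PRECISE-TBI spreadsheets.

```python
import csv
import io

# Hypothetical Data Dictionary: one CDE per row, descriptors in adjacent columns.
data_dictionary = [
    {"VariableName": "GUID", "Units": "", "DataType": "string"},
    {"VariableName": "Species", "Units": "", "DataType": "string"},
    {"VariableName": "AgeAtInjury", "Units": "weeks", "DataType": "integer"},
]


def entry_header(dictionary_rows):
    """Transpose the variable-name column into a single header row."""
    return [row["VariableName"] for row in dictionary_rows]


def write_entry_sheet(dictionary_rows, subject_rows):
    """Return CSV text with CDE names as the header and one subject per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=entry_header(dictionary_rows), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(subject_rows)
    return buffer.getvalue()


sheet = write_entry_sheet(
    data_dictionary,
    [{"GUID": "S001", "Species": "rat", "AgeAtInjury": 12}],
)
print(sheet)
```

The header of the entry sheet is derived from the dictionary, so the two files stay consistent when a variable is added to the dictionary.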
How do I enter information from multiple time-points for each subject?
Simply add an additional row for each pre/post-injury time-point, making sure to use the same GUID (study-specific subject number) and to enter the time post-injury under the correct pre/post-injury CDE. An example Excel spreadsheet for all the animal subject data can be downloaded from our website here.
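As a rough sketch of this one-row-per-time-point layout, the Python below stores repeated measurements as separate rows that share a GUID and then groups them back into a per-subject time course. The column names `TimepointDays` and `BeamWalkScore`, and the convention that a negative value marks a pre-injury time-point, are assumptions made for illustration and are not taken from the PRECISE-TBI CDEs.

```python
from collections import defaultdict

# Each row is one subject at one time-point; the GUID repeats across rows.
rows = [
    {"GUID": "S001", "TimepointDays": -1, "BeamWalkScore": 6},
    {"GUID": "S001", "TimepointDays": 7, "BeamWalkScore": 3},
    {"GUID": "S002", "TimepointDays": -1, "BeamWalkScore": 6},
    {"GUID": "S001", "TimepointDays": 1, "BeamWalkScore": 2},
]


def timecourse_by_subject(rows, time_key="TimepointDays"):
    """Group rows by GUID and sort each subject's rows by time-point."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["GUID"]].append(row)
    return {
        guid: sorted(items, key=lambda r: r[time_key])
        for guid, items in grouped.items()
    }


timecourses = timecourse_by_subject(rows)
print([r["TimepointDays"] for r in timecourses["S001"]])  # [-1, 1, 7]
```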
How do I map my data to CDEs?
First, download both the Data Dictionary and the CDE example spreadsheet. Assign each subject in your dataset a study-specific identification number (GUID: globally unique identifier). Then paste the data into the CDE row-format spreadsheet, with each subject on its own row and each value aligned under the column whose header contains the corresponding CDE. You can map existing datasets to the CDE spreadsheet, and you can also define a new spreadsheet for prospective studies.
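For labs that prefer to script this step rather than paste by hand, the mapping might look something like the sketch below. Everything named here is hypothetical: the lab column names, the CDE names, the `STUDY1` prefix, and the sequential ID scheme are illustrative assumptions, not a PRECISE-TBI convention.

```python
def map_to_cdes(lab_rows, column_map, cde_header, study_prefix):
    """Build one CDE-format row per subject, assigning a study-specific ID."""
    mapped = []
    for index, lab_row in enumerate(lab_rows, start=1):
        row = {cde: "" for cde in cde_header}    # unmapped CDEs stay blank
        row["GUID"] = f"{study_prefix}-{index:03d}"
        for lab_column, cde in column_map.items():
            if lab_column in lab_row:
                row[cde] = lab_row[lab_column]
        mapped.append(row)
    return mapped


# Hypothetical lab data and column mapping, for illustration only.
lab_rows = [
    {"animal_sex": "F", "velocity_m_s": 4.0},
    {"animal_sex": "M", "velocity_m_s": 4.0},
]
column_map = {"animal_sex": "Sex", "velocity_m_s": "PistonVelocity"}
cde_header = ["GUID", "Sex", "PistonVelocity", "InjuryModel"]

mapped = map_to_cdes(lab_rows, column_map, cde_header, "STUDY1")
print(mapped[0])
```

CDEs with no matching lab column are left blank, which makes gaps in the dataset easy to spot.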
My data do not map to any CDE, what do I do?
Although not required, you are welcome to add the variable or information as a UDE (unique data element) to your Data Dictionary file, following the descriptor format of the source document. Then add the UDE to your CDE file along with its data. If you wish, you can download the preliminary set of CDEs that was created by a team of people in the TBI field in conjunction with NIH-NINDS/DoD to help you create any UDEs. (We are currently working with TBI researchers to update and release some of these CDEs in the very near future.) Current CDEs can be downloaded here.
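The two steps above (describe the UDE in the Data Dictionary, then add its data to the CDE file) could be sketched as follows. The dictionary columns, including the `ElementType` column used to flag the new entry as a UDE, are assumptions for illustration; the real Data Dictionary may use different descriptors.

```python
def add_ude(dictionary_rows, data_rows, name, values_by_guid, **descriptors):
    """Return new dictionary and data tables with one UDE added to each."""
    columns = list(dictionary_rows[0].keys())
    unknown = set(descriptors) - set(columns)
    if unknown:
        raise ValueError(f"Descriptors not in the dictionary format: {sorted(unknown)}")
    # Reuse the existing descriptor columns so the UDE follows the same format.
    entry = {column: descriptors.get(column, "") for column in columns}
    entry["VariableName"] = name
    entry["ElementType"] = "UDE"
    new_dictionary = dictionary_rows + [entry]
    new_data = [{**row, name: values_by_guid.get(row["GUID"], "")} for row in data_rows]
    return new_dictionary, new_data


# Hypothetical dictionary layout and data, for illustration only.
dictionary = [
    {"VariableName": "GUID", "Units": "", "DataType": "string", "ElementType": "CDE"},
    {"VariableName": "Sex", "Units": "", "DataType": "string", "ElementType": "CDE"},
]
data = [{"GUID": "S001", "Sex": "F"}, {"GUID": "S002", "Sex": "M"}]

new_dictionary, new_data = add_ude(
    dictionary, data, "CageEnrichment", {"S001": "yes"}, DataType="string"
)
print(new_dictionary[-1])
print(new_data)
```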
What happens to UDEs that are uploaded? Will these become CDEs?
Following collective input from the user base (i.e., you!), we will establish criteria for CDE-level status: once the number of publications or the number of labs that use and report a UDE exceeds an agreed-upon threshold, the UDE will qualify as a commonly used variable (i.e., a CDE). Similarly, as science progresses and techniques and/or CDEs become less commonly used, they will become UDEs and will no longer be part of the base set of CDEs for reporting study data. In this way, the full list of CDEs and UDEs will remain searchable, while the list of required or recommended CDEs will remain relevant and easy to find.
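To make the promotion rule concrete, here is a minimal sketch in Python. The threshold values are placeholders chosen purely for illustration; no actual thresholds have been set, and the function name is hypothetical.

```python
def qualifies_as_cde(n_publications, n_labs, publication_threshold, lab_threshold):
    """Return True if either usage count exceeds its agreed-upon threshold."""
    return n_publications > publication_threshold or n_labs > lab_threshold


# Placeholder thresholds for illustration only; real criteria are not yet defined.
print(qualifies_as_cde(12, 3, publication_threshold=10, lab_threshold=5))  # True
print(qualifies_as_cde(10, 5, publication_threshold=10, lab_threshold=5))  # False
```

The same comparison run in reverse (usage falling back below the threshold) would correspond to a CDE returning to UDE status.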
Where can I upload my data for sharing and what do I upload?
PRECISE-TBI partners with the Open Data Commons-TBI (ODC-TBI) database, which is funded by the VA and NIH to share data and is freely available at: https://odc-tbi.org. A number of alternative sharing platforms are available, and you are not required to use ODC-TBI for data sharing. Please upload both the CDE data format file and the corresponding Data Dictionary file with the CDEs that were used to create your study dataset, including any UDEs that have been added.
Can I be a part of the effort to develop CDEs?
Yes! Contact us via the contact form below and request to be a part of CDE work. We would love to hear from you!