What is common about Common Data Elements
- M Martone
- Jul 28
- 4 min read
We returned last month from the National Neurotrauma Society annual meeting in Philadelphia, where several of us attended a pre-meeting workshop held by the NIH to bring together working groups creating Common Data Elements (CDEs) for preclinical TBI, spinal cord injury and post-traumatic epilepsy. We were brought together under the auspices of the NINDS Preclinical Common Data Elements (CDEs) and Data Standards (NT-PRECEDS) program. Although each domain is developing CDEs for its own disease area, there are a lot of metadata elements, experimental variables, tests, etc., that are common across the three. NT-PRECEDS is therefore looking to “standardize the standards”, to ensure that common CDEs are common across all three disease areas.
As you can see from the previous sentence, it is easy to throw around the word “common” when talking about CDEs. But because the word “common” has many meanings, a major topic (and point of confusion) at the workshop was: What exactly is common about a Common Data Element? Here, I’ve set down a few thoughts on this question that were inspired by the workshop but (disclaimer) do not reflect the discussions or any decisions made by the NT-PRECEDS program.
First, let’s start with the definition of a data element. A data element, according to NIST, is “A basic unit of information that has a unique meaning and subcategories (data items) of distinct value.” The easiest way to think of a data element is as a single variable collected in a spreadsheet, e.g., sex, species, temperature. So far so good.
Now let’s look at what might be common about a data element by considering some definitions of “common”. There are quite a few, but here are some of the relevant ones from Merriam-Webster:
1. Belonging to or shared by two or more individuals or things or by all members of a group, e.g., a common border
2. Occurring or appearing frequently, e.g., the most common species of bird in New York City
3. Widespread, general, e.g., common knowledge
4. Just satisfying accustomed criteria, e.g., Common Core or common decency
How might these definitions apply to data elements in preclinical neurotrauma?
Definition 1 implies that the same thing is shared by more than one individual. In the case of CDEs, then, I would say that these are data elements that conform to a common standard. A data element is said to be common to two or more datasets if it shares and conforms to a set of standard attributes such as its name, definition, data type, proper usage and perhaps permissible values. Example: Age of subject is always coded as AgeVal, its data type is a number and it is always accompanied by a unit.
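To make the “shared attributes” idea concrete, here is a minimal Python sketch of a data element specification and a conformance check. Only the name AgeVal and the rule “a number accompanied by a unit” come from the example above; the `DataElement` class, the `conforms` function, the `AgeUnit` column name and the definition wording are my own assumptions for illustration, not part of any NT-PRECEDS standard.

```python
from dataclasses import dataclass
from numbers import Real
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataElement:
    """Hypothetical spec holding the attributes a CDE is expected to share."""
    name: str
    definition: str
    data_type: type
    permissible_values: Optional[Tuple[str, ...]] = None
    unit_field: Optional[str] = None  # column that must carry the unit, if any


# "AgeVal" is from the post; the unit column name and definition are assumed.
AGE = DataElement(
    name="AgeVal",
    definition="Age of the subject",
    data_type=Real,
    unit_field="AgeUnit",
)


def conforms(element: DataElement, record: dict) -> bool:
    """Return True if one row of a dataset conforms to the element's spec."""
    if element.name not in record:
        return False
    value = record[element.name]
    # bool is technically a number in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, element.data_type):
        return False
    if (element.permissible_values is not None
            and value not in element.permissible_values):
        return False
    # A unit is required: the unit column must be present and non-empty.
    if element.unit_field is not None and not record.get(element.unit_field):
        return False
    return True
```

With this sketch, `conforms(AGE, {"AgeVal": 12, "AgeUnit": "weeks"})` passes, while a row that stores the age as the string "12 weeks", or that omits the unit, does not.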
Definition 2 implies that a CDE is a data element that is in frequent use in preclinical neurotrauma datasets. Given that there are likely thousands and thousands of variables collected, the requirement that a data element appear “frequently” in current studies provides a basis on which to designate a given data element as common within preclinical neurotrauma.
Definition 3 suggests that perhaps the bar for CDEs should be higher than “frequent” and represent those data elements that are generally used across the 3 domains, e.g., subject characteristics like species, age, weight, strain. Such CDEs are often designated by those working on CDEs as “General Core”, reflecting their common usage within biomedicine.
Definition 4 suggests that perhaps current usage should not be the sole determinant of whether a data element is common. Rather, it implies that a data element can become a CDE based not just on how often it is used, but on how often it should be used. For example, we at the ODC-TBI project recommend that investigators supply the RRID (an unambiguous persistent identifier) for each animal strain used, particularly genetically modified animals, to make them easier to identify, find and sort. Virtually no one currently uses them in their datasets, but we would certainly like the RRID for animal strain to be part of every relevant dataset. So perhaps some CDEs would be designated as “Required” in that they can and should be collected and reported for every preclinical study. Such CDEs would therefore be considered “Common” and “Core” as in “Common Core: the set of skills that every student should have”.
After reflecting on the many discussions at this workshop and in other contexts, I think it is fair to say that the “common” in CDE encompasses all of these definitions, depending on the use case:
Shared: A CDE is common when it has a clearly specified meaning and a set of shared attributes across all studies. This type of standardization makes it easier to compare and harmonize across datasets.
Frequent: Creating such CDEs takes work and therefore having some basis for selection makes sense, e.g., setting a minimum threshold for how often a data element must be used before it is considered “common”.
General/routine: Some CDEs are generally collected as a part of an experiment independent of any model type and therefore are commonly collected across neurotrauma research, and indeed across most of biomedicine.
Expected: Certain CDEs are critical to interpreting and using a dataset. In many CDEs and other types of standards, these are usually designated as “Required”, meaning that they are expected to be present in a dataset. In this way, certain CDEs may comprise a reporting standard that specifies critical information that should be present in all neurotrauma research.
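The “Frequent” and “Expected” senses above can be combined in a simple selection rule. The Python sketch below is only an illustration under my own assumptions: the function names, the threshold value and every variable name other than AgeVal are hypothetical, and no working group has adopted this rule. It counts how often each variable appears across a set of datasets, keeps those at or above a minimum usage fraction, and then adds any elements designated as required regardless of current usage (such as an RRID for animal strain).

```python
from collections import Counter


def usage_fraction(datasets):
    """Map each variable name to the fraction of datasets that contain it."""
    counts = Counter()
    for columns in datasets:
        counts.update(set(columns))  # count each variable once per dataset
    total = len(datasets)
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


def candidate_cdes(datasets, min_fraction=0.5, required=()):
    """Variables used often enough, plus those that should always be present."""
    usage = usage_fraction(datasets)
    frequent = {name for name, frac in usage.items() if frac >= min_fraction}
    return frequent | set(required)


# Hypothetical column lists from three datasets.
datasets = [
    ["Species", "AgeVal", "Weight"],
    ["Species", "AgeVal", "InjurySeverity"],
    ["Species", "Strain"],
]

# "StrainRRID" is rarely used today but is designated as required here.
selected = candidate_cdes(datasets, min_fraction=0.6, required=["StrainRRID"])
```

Here `selected` contains Species and AgeVal because they clear the threshold, and StrainRRID because it is required even though no dataset currently includes it.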
But regardless of what definitions we use in the process of creating CDEs, ultimately the resultant CDEs cannot truly be considered common in any sense of the word if they are not used in preclinical research! If the CDEs are too rigidly defined, too cumbersome to work with or too numerous to comply with, they will likely be ignored. The key challenge facing preclinical neurotrauma research is to set reasonable boundaries on the creation and expected use of CDEs to ensure their uptake is shared, frequent, routine and expected across preclinical neurotrauma research. In other words, truly COMMON!
Cross posted at ODC-TBI



