The Clinical Data Interchange Standards Consortium (CDISC) defines and manages industry-level data standards that are widely used in the analysis, reporting and regulatory submission of clinical data.
Three key CDISC standards, CDASH, SDTM and ADaM, are vital for efficient processing and analysis of clinical trial data. This article focuses on ADaM and what you need to know about it in 2024.
One of CDISC’s standards for clinical trial submission is the Analysis Data Model (ADaM). ADaM standards govern the creation of analysis datasets and their corresponding metadata. ADaM datasets make the statistical programming of tables, figures and listings (TFLs) more efficient and improve traceability. These benefits can in turn shorten the time regulatory reviewers need to assess a submission.
ADaM specifications outline the standards for creating ADaM datasets and their associated metadata, which are used in statistical analyses of clinical trials. These specifications detail various aspects such as variable names, labels, data types, lengths, display formats, controlled terminology, and any derivations or programming notes needed for proper dataset construction. In addition, CDISC set five core principles of ADaM:
ADaM specifications draw on several documents, such as the Statistical Analysis Plan (SAP), the TFL shells and the study protocol. Variables required for endpoint analysis are added according to the analyses the study needs. The specifications remain an evolving document until the TFLs are finalised.
ADaM datasets are created from Study Data Tabulation Model (SDTM) data. It's important to understand that a single ADaM dataset can be derived from multiple SDTM datasets. For instance, the ADTTE (time-to-event) dataset might include data gathered from several different SDTM datasets. Any such data processing can be documented with Define-XML, the CDISC standard for data definition files.
While SDTM domain classes are determined according to data type, such as special purpose domains, interventions, events or findings, their ADaM equivalents are classified by analysis approach. There are a number of standard data structures for different purposes, for example analysis of continuous data values or categorical analyses. There is also a subject-level analysis dataset that needs to be created for every study where ADaM is used.
All ADaM datasets are named ADxxxx, where xxxx is sponsor-defined and often carries over the name of the source SDTM domain. For example, an ADaM domain called ADLB would use the LB SDTM domain as its data source. This one-to-one domain mapping is not mandatory though and the required number of ADaM domains depends on the needs of any study data analysis or data review. An ADaM domain may use more than one SDTM domain as its source and carry a unique name that reflects this.
For ADaM variables, the naming conventions should follow the standardised variable names defined in the ADaM Implementation Guide. Any variables copied directly from SDTM data into an ADaM domain shall be used unchanged, with no change made either to their attributes (name, label, type, length, etc.) or their contents. Sponsor-defined variable names can be given to any other analysis variable that is not defined within the ADaM or SDTM standards. Following these conventions will provide clarity for the reviewer.
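The rule that variables copied from SDTM keep their attributes can be checked mechanically. The sketch below is a minimal illustration, not a validation tool: the `VarMeta` type, the function name and the example metadata are all hypothetical, and it compares only label, type and length, not the data values themselves.

```python
from typing import NamedTuple


class VarMeta(NamedTuple):
    """Attributes of one dataset variable (illustrative subset)."""
    label: str
    dtype: str
    length: int


def altered_sdtm_variables(sdtm: dict[str, VarMeta], adam: dict[str, VarMeta]) -> list[str]:
    """Return names present in both datasets whose attributes differ."""
    return sorted(
        name for name, meta in adam.items()
        if name in sdtm and sdtm[name] != meta
    )


# Hypothetical metadata for an LB source domain and an ADLB analysis dataset.
sdtm_lb = {
    "USUBJID": VarMeta("Unique Subject Identifier", "Char", 40),
    "LBTESTCD": VarMeta("Lab Test or Examination Short Name", "Char", 8),
}
adam_adlb = {
    "USUBJID": VarMeta("Unique Subject Identifier", "Char", 40),
    "LBTESTCD": VarMeta("Lab Test Code", "Char", 8),  # label changed: flagged
    "AVAL": VarMeta("Analysis Value", "Num", 8),      # not an SDTM variable: ignored
}

altered = altered_sdtm_variables(sdtm_lb, adam_adlb)
print(altered)
```

Variables that exist only in the ADaM dataset, such as `AVAL` here, are deliberately skipped because the rule applies only to variables carried over from SDTM.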
There are several types of ADaM Datasets which can be explored in more detail:
The ADaM subject-level analysis dataset is called ADSL. It contains one record per subject, with variables that hold key information on subject disposition, demographics and baseline characteristics. Other ADSL variables hold, for example, planned or actual treatment group information, key dates and times of the subject’s participation in the study, randomisation information and/or stratification factors. Not all ADSL variables are used directly for analysis; some are used in conjunction with other datasets for display or grouping purposes, or are included as variables of interest for review. Because ADSL is intended to hold variables that describe subjects, the analysis populations and treatment groups to which they belong, and prognostic factors, subject-level efficacy information should not be added here but placed in another domain. Variables from ADSL may be added to other ADaM domains where doing so aids output creation or data review.
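As a rough illustration of these two points, the sketch below checks that an ADSL has no repeated subjects and then copies selected ADSL variables onto another dataset. The helper names and example rows are hypothetical, and datasets are represented as plain lists of dictionaries purely for brevity.

```python
from collections import Counter

# Hypothetical example rows; real datasets would be read from files.
adsl = [
    {"USUBJID": "S-001", "TRT01P": "Drug A", "SAFFL": "Y"},
    {"USUBJID": "S-002", "TRT01P": "Placebo", "SAFFL": "Y"},
]
advs = [
    {"USUBJID": "S-001", "PARAMCD": "SYSBP", "AVAL": 120.0},
    {"USUBJID": "S-002", "PARAMCD": "SYSBP", "AVAL": 135.0},
    {"USUBJID": "S-001", "PARAMCD": "DIABP", "AVAL": 80.0},
]


def duplicate_subjects(rows):
    """Return subject identifiers that appear on more than one record."""
    counts = Counter(row["USUBJID"] for row in rows)
    return sorted(subject for subject, n in counts.items() if n > 1)


def add_adsl_variables(rows, adsl_rows, variables):
    """Copy the named ADSL variables onto each record, matched by USUBJID."""
    lookup = {row["USUBJID"]: row for row in adsl_rows}
    return [
        {**row, **{var: lookup[row["USUBJID"]][var] for var in variables}}
        for row in rows
    ]


duplicates = duplicate_subjects(adsl)
merged = add_adsl_variables(advs, adsl, ["TRT01P", "SAFFL"])
print(duplicates)
print(merged[0])
```

The lookup assumes every subject in the second dataset is present in ADSL; a missing subject would raise a `KeyError`, which is arguably the desired behaviour for a consistency check.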
Another main class of ADaM datasets is the Basic Data Structure (BDS). A BDS dataset contains one or more records per subject, per analysis parameter, per analysis timepoint. Derived analysis parameters can be added if an analysis requires them. An example would be where a derivation uses results from a number of different parameters or where a mean is calculated at subject level from all the values collected for a subject. Derived records may also be added to support Last Observation Carried Forward (LOCF), Worst Observation Carried Forward (WOCF) or Baseline Observation Carried Forward (BOCF) analyses.
The BDS is especially useful for continuous value analyses such as presenting mean, median, standard deviation and other summary statistics. This is not its only use, but for a domain to comply with the BDS standard it must at the very least contain variables for study and subject identifiers, analysis parameter name and code, and analysis values. If any of these are absent, the dataset does not fit the BDS description.
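To make the idea of derived LOCF records concrete, here is a minimal sketch for one subject and one parameter. It assumes the BDS variables `AVISITN` (analysis visit number), `AVAL` and `DTYPE` (derivation type), with `DTYPE` left blank on observed records; the function name and data are hypothetical, and real imputation rules (for example, whether baseline may be carried forward) come from the SAP.

```python
def add_locf_records(records, scheduled_visits):
    """Fill missing scheduled visits with the last observed value.

    `records` holds the observed rows for ONE subject and ONE parameter.
    Imputed rows are copies of the last observed row, flagged DTYPE="LOCF".
    """
    observed = {row["AVISITN"]: row for row in records}
    result = []
    last_observed = None
    for visit in scheduled_visits:
        if visit in observed:
            last_observed = observed[visit]
            result.append(last_observed)
        elif last_observed is not None:
            # Nothing collected at this visit: carry the last value forward.
            result.append({**last_observed, "AVISITN": visit, "DTYPE": "LOCF"})
    return result


# Hypothetical observed data: visits 2 and 4 were missed.
observed_rows = [
    {"USUBJID": "S-001", "PARAMCD": "SYSBP", "AVISITN": 1, "AVAL": 120.0, "DTYPE": ""},
    {"USUBJID": "S-001", "PARAMCD": "SYSBP", "AVISITN": 3, "AVAL": 126.0, "DTYPE": ""},
]

with_locf = add_locf_records(observed_rows, scheduled_visits=[1, 2, 3, 4])
for row in with_locf:
    print(row["AVISITN"], row["AVAL"], row["DTYPE"])
```

Keeping the imputed rows as separate, flagged records rather than overwriting anything is what preserves traceability: the observed data remain identifiable by their blank `DTYPE`.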
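The minimum content described above lends itself to a simple structural check. The sketch below assumes the standard BDS variable names `STUDYID`, `USUBJID`, `PARAMCD`, `PARAM`, and `AVAL` or `AVALC` for the analysis value; the function name is hypothetical and the check covers only presence of variables, not their contents.

```python
REQUIRED_VARIABLES = ("STUDYID", "USUBJID", "PARAMCD", "PARAM")
ANALYSIS_VALUE_VARIABLES = ("AVAL", "AVALC")  # at least one must be present


def missing_bds_variables(columns):
    """List the minimum BDS variables absent from a dataset's column names."""
    present = set(columns)
    missing = [var for var in REQUIRED_VARIABLES if var not in present]
    if not present.intersection(ANALYSIS_VALUE_VARIABLES):
        missing.append("AVAL or AVALC")
    return missing


complete = missing_bds_variables(["STUDYID", "USUBJID", "PARAMCD", "PARAM", "AVAL", "AVISIT"])
incomplete = missing_bds_variables(["STUDYID", "USUBJID", "PARAMCD"])
print(complete)
print(incomplete)
```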
A variant of the BDS is available for Time to Event (TTE) analyses, which are commonly used in therapeutic areas such as oncology. It additionally contains variables for the origin date of risk, used as the start time in any TTE analysis, and for censoring of subjects in whom the event of interest is not observed.
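A minimal sketch of how one TTE record might be derived is shown below. It assumes the commonly used variable names `STARTDT` (origin date), `ADT` (event or censoring date), `AVAL` (time in days) and `CNSR` (0 for an event, 1 for censoring). The function name, the choice of censoring at the last known date and the "+1 day" convention are illustrative assumptions; the actual rules are defined per study.

```python
from datetime import date


def derive_tte_record(startdt, event_date, last_known_date):
    """Build one time-to-event record from an origin date and an optional event.

    If no event was observed, the subject is censored at `last_known_date`.
    """
    if event_date is not None:
        adt, cnsr = event_date, 0
    else:
        adt, cnsr = last_known_date, 1
    return {
        "STARTDT": startdt,
        "ADT": adt,
        "AVAL": (adt - startdt).days + 1,  # assumed convention: day 1 is the origin date
        "CNSR": cnsr,
    }


event_record = derive_tte_record(date(2024, 1, 1), date(2024, 1, 31), date(2024, 2, 15))
censored_record = derive_tte_record(date(2024, 1, 1), None, date(2024, 1, 15))
print(event_record["AVAL"], event_record["CNSR"])
print(censored_record["AVAL"], censored_record["CNSR"])
```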
In February 2016, CDISC published the Occurrence Data Structure (OCCDS) for use in categorical analyses where summaries of frequencies and percentages of occurrence are planned. The OCCDS is used for occurrence analysis, that is, counting subjects with a record or term, and often uses dictionary coding categories to standardise the data and allow meaningful analysis. It extends the previously published ADAE structure with extra variables for use with concomitant medication or medical history data. Data from other SDTM domains in the events or interventions classes may be mapped into OCCDS if it fulfils their analysis needs. Some domains, such as exposure data, may be mapped to either BDS or OCCDS depending on the analysis, and may even be split into two ADaM domains in studies where both categorical and continuous analyses are required.
Currently, ADaM supports the majority of analysis needs for clinical data. It may not be as prescriptive as SDTM, but it offers flexibility while still allowing a sponsor to put a set of analysis data standards in place. Like SDTM datasets, ADaM datasets can be submitted to a regulatory agency. They have built-in traceability and are compatible with Define-XML, so machine-readable data definitions can be supplied along with any computational details.
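The kind of occurrence summary this structure supports can be sketched as follows: each subject is counted once per term, however many records they have, and the percentage uses a subject-level denominator (typically taken from ADSL). The function name and example rows are hypothetical; `AEDECOD` is assumed as the dictionary-derived term variable.

```python
def subject_counts(occurrences, term_variable, n_subjects):
    """Count subjects with at least one record per term, with percentages."""
    subjects_by_term = {}
    for row in occurrences:
        subjects_by_term.setdefault(row[term_variable], set()).add(row["USUBJID"])
    return {
        term: (len(subjects), round(100 * len(subjects) / n_subjects, 1))
        for term, subjects in sorted(subjects_by_term.items())
    }


# Hypothetical adverse event records; S-001 reports the same term twice.
adae = [
    {"USUBJID": "S-001", "AEDECOD": "Headache"},
    {"USUBJID": "S-001", "AEDECOD": "Headache"},
    {"USUBJID": "S-002", "AEDECOD": "Headache"},
    {"USUBJID": "S-003", "AEDECOD": "Nausea"},
]

summary = subject_counts(adae, "AEDECOD", n_subjects=4)
print(summary)
```

Collecting subject identifiers in a set is what prevents the repeated headache record for S-001 from inflating the count.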
Oncology ADaM development depends heavily on RECIST criteria. The Response Evaluation Criteria in Solid Tumours (RECIST) is a standard way to measure how well a cancer patient responds to treatment. RECIST-based statistical analysis endpoints involve complex data collection and derivations that pose challenges for statistical programming.
In compliance with SDTM IG 3.2 or later, tumour identification, assessments and responses are assigned to the TU, TR and RS domains. These SDTM domains, along with others, serve as the input data for deriving the RECIST 1.1 endpoints in ADaM. This process results in five ADaM datasets: ADTR, ADRS, ADINTEV, ADEFFSUM and ADTTE. All of these datasets use the BDS: ADTR and ADRS hold by-visit data, ADINTEV and ADEFFSUM hold subject-level summary data, and ADTTE holds TTE data.
The diagram below illustrates the sequence for deriving the five datasets once SDTM and ADSL are ready. It also highlights the data dependencies among these datasets, with dotted arrows representing conditional dependencies.
Let’s review these oncology datasets in more detail.
The Tumour Assessment Analysis Data (ADTR) includes records with valid results from both baseline and post-baseline time points sourced from the TR domain. This dataset is used to generate individual tumour assessment and sum of diameters (SoD) parameters.
The Tumour Timepoint Response Analysis Data (ADRS) includes records with valid results exclusively from post-baseline time points. These records originate from the RS domain when timepoint responses are provided by investigators, or are derived from ADTR when the sponsor generates the timepoint responses. As a result, ADRS relies on ADTR in cases where the responses are sponsor-derived.
Intermediate Event Analysis Data derives intermediate progression-free survival (PFS) events, including censoring events that are crucial for calculating endpoints such as Clinical Benefit Rate (CBR), Duration of Response (DOR), and different types of PFS. The ADINTEV dataset relies on ADRS.
Efficacy Summary Analysis Data derives categorical endpoints like Best Overall Response (BOR), Objective Response Rate (ORR), and CBR. The ADEFFSUM dataset is dependent on ADRS, and if the study involves CBR endpoints, it also relies on ADINTEV.
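As a heavily simplified illustration of these categorical endpoints, the sketch below picks each subject's best timepoint response from ADRS-like records and computes ORR as the proportion of subjects whose best response is CR or PR. It is not a RECIST 1.1 implementation: it ignores response confirmation and minimum duration requirements for SD, and it assumes the input contains only assessments up to and including first progression. The function names, the ranking table and the data are hypothetical; `AVALC` is assumed to hold the response.

```python
# Lower rank = better response (assumed ordering for this sketch).
RESPONSE_RANK = {"CR": 0, "PR": 1, "SD": 2, "PD": 3, "NE": 4}


def best_overall_response(adrs_rows):
    """Return each subject's best (unconfirmed) timepoint response."""
    best = {}
    for row in adrs_rows:
        subject, response = row["USUBJID"], row["AVALC"]
        if subject not in best or RESPONSE_RANK[response] < RESPONSE_RANK[best[subject]]:
            best[subject] = response
    return best


def objective_response_rate(bor_by_subject, n_subjects):
    """Proportion of subjects whose best response is CR or PR."""
    responders = sum(1 for response in bor_by_subject.values() if response in ("CR", "PR"))
    return responders / n_subjects


# Hypothetical timepoint responses for four subjects.
adrs = [
    {"USUBJID": "S-001", "AVALC": "SD"},
    {"USUBJID": "S-001", "AVALC": "PR"},
    {"USUBJID": "S-001", "AVALC": "PD"},
    {"USUBJID": "S-002", "AVALC": "SD"},
    {"USUBJID": "S-002", "AVALC": "PD"},
    {"USUBJID": "S-003", "AVALC": "PD"},
    {"USUBJID": "S-004", "AVALC": "PR"},
    {"USUBJID": "S-004", "AVALC": "CR"},
]

bor = best_overall_response(adrs)
orr = objective_response_rate(bor, n_subjects=4)
print(bor)
print(orr)
```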
Time-to-event Analysis Data (ADTTE) derives endpoints such as DOR and PFS.
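The dependencies described for these five datasets can be turned into a build order with a topological sort. The sketch below uses Python's standard `graphlib` module. The two conditional dependencies follow the text (ADRS on ADTR when responses are sponsor-derived, ADEFFSUM on ADINTEV when CBR endpoints exist); the dependency of ADTTE on ADINTEV is an assumption made for illustration, since the text does not state ADTTE's inputs, and the function name is hypothetical.

```python
from graphlib import TopologicalSorter


def derivation_order(sponsor_derived_responses, has_cbr_endpoints):
    """Return one valid order in which to build the oncology ADaM datasets."""
    # Each key maps to the set of datasets that must exist before it.
    depends_on = {
        "ADTR": {"ADSL"},
        "ADRS": {"ADSL"},
        "ADINTEV": {"ADRS"},
        "ADEFFSUM": {"ADRS"},
        "ADTTE": {"ADINTEV"},  # assumption: not stated in the text
    }
    if sponsor_derived_responses:
        depends_on["ADRS"].add("ADTR")
    if has_cbr_endpoints:
        depends_on["ADEFFSUM"].add("ADINTEV")
    return list(TopologicalSorter(depends_on).static_order())


order = derivation_order(sponsor_derived_responses=True, has_cbr_endpoints=True)
print(order)
```

Several orders can satisfy the same constraints, so the exact sequence printed may vary; what is guaranteed is that every dataset appears after the datasets it depends on.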
Understanding and implementing CDISC ADaM standards is crucial for efficient clinical trial data analysis and regulatory submission. ADaM ensures that datasets are analysis-ready, which enhances the accuracy and traceability of statistical outputs. ADaM’s structured yet flexible approach helps streamline the creation of critical analysis datasets, facilitating faster and more reliable regulatory reviews.
For organisations looking to optimise their clinical trial data processes, embracing efficient methodology within statistical programming services is essential. Quanticate can offer automated SDTM dataset production to streamline your ADaM dataset creation and your whole statistical programming solution. Our automation processes save time and money. By leveraging our expertise, you can ensure compliance with CDISC standards, improve the quality of your data submissions, and accelerate your time to market.
Contact Quanticate today to learn how our cutting-edge statistical programming services can transform your clinical trial data management and drive your research success.
There are key differences between SDTM and ADaM in terms of their roles and purposes in clinical data analysis.
CDISC SDTM (Study Data Tabulation Model) is used to organise and format the data collected in human clinical trials; its SEND implementation covers nonclinical animal studies. SDTM organises the collected data into specific domains following defined standards. This structured organisation ensures uniformity and data consistency across studies, making it easier to submit data to authorities like the FDA. The SDTM Implementation Guide (IG) provides detailed instructions for creating high-quality SDTM datasets, which reduce errors and facilitate smooth regulatory reviews.
ADaM (Analysis Data Model) is different from SDTM in that it focuses on making the data analysis-ready. While SDTM organizes raw data, ADaM transforms this data to support various types of statistical analyses, such as descriptive statistics, regression analysis, and subgroup analysis. The ADaM Implementation Guide offers flexibility for customizing datasets to meet specific analytical needs while maintaining a standard that ensures clarity and comprehensibility. It is mandatory to use SDTM data to develop ADaM datasets, which are essential for generating Tables, Figures, and Listings (TFLs) in study reports.
In summary, SDTM is about standardising the tabulation of collected clinical trial data, while ADaM prepares this data for detailed analysis. Both play crucial roles in different stages of clinical data management.
ADaM datasets support a wide range of analyses, including descriptive statistics, regression analysis, subgroup analysis, survival analysis, and more. They are designed to be flexible and adaptable to various analytical needs.
While not always mandatory, using ADaM is highly recommended and often expected by regulatory authorities like the FDA. It ensures that the data is presented in a clear and standardized manner, facilitating efficient review and approval processes.
The ADaM Implementation Guide (IG) provides comprehensive guidelines on how to create ADaM datasets. It includes detailed instructions on dataset structure, variable naming conventions, data derivation processes, and more.
ADaM allows some flexibility to accommodate specific study needs while maintaining standardisation. Customisations can be made to meet the unique requirements of different analyses, as long as they adhere to the core principles outlined in the ADaM IG.
This blog was first published in October 2012 and has been regularly maintained and updated since then.