Quality control (QC) in clinical trials is essential for ensuring the integrity and reliability of data. Leveraging SAS (Statistical Analysis System) for QC processes offers a powerful solution for clinical researchers. SAS provides comprehensive tools for data analysis, reporting, and validation, making it an ideal choice for QC tasks. By integrating SAS with robust QC methodologies, clinical trials can achieve higher standards of accuracy and compliance.
In clinical trials, double programming is a widely accepted practice aimed at enhancing the reliability of the data analysis process. This method involves two independent programmers separately developing the same statistical programs, such as Study Data Tabulation Model (SDTM) datasets, Analysis Data Model (ADaM) datasets, or Tables, Listings, and Figures (TLFs). The two sets of outputs are then compared, and any discrepancies are investigated and resolved.
Double programming plays a crucial role in identifying and correcting errors or inconsistencies before final data analysis and reporting. This practice not only enhances data reliability but also ensures that any issues are addressed early in the process, thereby improving the overall quality of the clinical trial data.
This blog post explores how SAS and UNIX can be used to create a QC batch program for clinical trials, highlighting best practices and practical applications to enhance data quality and reliability. Unlike the double programming method, which relies on manual, independent development and comparison of outputs by two programmers to ensure data accuracy, QC batch programs leverage automation to streamline the QC process. By using SAS scripts and UNIX batch programming, repetitive QC tasks such as re-running QC programs, checking for standard outputs, and validating data consistency across datasets can be automated, reducing manual effort and the likelihood of human error.
SAS is a powerful tool for data management, statistical analysis, and reporting, commonly used in clinical trials and other fields. Typically, SAS programs are designed to operate on a single file and are not well suited to changing a directory's folder structure or applying consistent modifications across multiple files in a single directory. SAS UNIX refers to the use of SAS software on UNIX-based operating systems, such as Linux, AIX, or Solaris, which provides enhanced capabilities for batch processing, automation, and the integration of SAS programs with other powerful UNIX tools. For instance, the %SYSEXEC macro statement (used in the QC example later in this post) issues operating environment commands, allowing users to execute UNIX commands directly within a SAS session. This approach makes it possible to handle tasks like file manipulation, data processing, and automation without leaving the SAS environment.
This QC example in SAS is based on a client request to build a robust system which reacts to display creation, finds and runs the related QC programs automatically, and compiles the information in one central place. This needs to work for multiple studies with differing standards.
To start the process, a statistical programmer must first consider all scenarios and tease out the nuances of the request before creating the system (the batch program).
To build any system, a lot of in-depth thought is required before any programming begins. Programmers take the requirements and build a robust system to address each scenario that may arise, including some which should not arise.
All possible questions, issues and other considerations should be brainstormed. This approach helps focus the programmer on the requirements and enables identification of areas for the client to consider. This is a fundamental part of a programmer's role before starting a request.
Some of the considerations to a request are:
After all the brainstorming, we gather all the questions and put them in order, i.e. in terms of how the quality control batch program will work. The client should be kept informed of progress and any issues discussed to ensure the initial top-level requirements have been interpreted correctly. Without the client's involvement at this stage, there is a risk of wasting time building an improved system that was not required in the first place.
Therefore we break down the items as follows, which will be considered separately throughout this article:
The main components of the process include:
To begin with, we need to set up the environment. Before doing so, we first need to consider how and where the environment information will be defined so that the setup can be done with ease.
Having one key program defined (GLOBPROT.SAS) with all the relevant information means that we can use this key program to set up the environment and also utilize it in every other program thereafter. This means that we can have a suite of standard programs which call the specific GLOBPROT.SAS which has all the settings for that specific reporting delivery.
What information do we require to be included in this specific GLOBPROT.SAS for our client?
When developing a directory structure, keep it simple: you want the users to become familiar with the new directory structure immediately. There will be a directory called QC_MASTER and, underneath it, the subdirectories PROGRAMS, OUTPUT, LOG etc.
Note: The master copies of the programs reside under QC_MASTER/programs.
For each reporting delivery there will be a directory called QC_snapdate (e.g. QC_26SEP2011), in which the same subdirectories will be created, mirroring the QC_MASTER directory.
The easiest way to explain this is diagrammatically:
Figure 1: Diagrammatic representation of the directory/sub-directory structure
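The mirroring described above can be sketched in plain shell. This is a minimal illustration only, not the batch program's own code: the function name mirror_qc_dirs, the temporary demo root and the three subdirectory names are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: create QC_<snapdate> with the same subdirectories as QC_MASTER.
# ASSUMPTION: mirror_qc_dirs, the demo root and the subdirectory names
# (programs, output, log) are illustrative, not taken from the batch program.
set -eu

mirror_qc_dirs() {
    root=$1        # parent directory that holds QC_MASTER
    snapdate=$2    # reporting effort snap date, e.g. 26SEP2011
    for sub in "$root"/QC_MASTER/*/; do
        [ -d "$sub" ] || continue    # skip if QC_MASTER has no subdirectories
        mkdir -p "$root/QC_$snapdate/$(basename "$sub")"
    done
}

# Demo in a throwaway directory
demo_root=$(mktemp -d)
mkdir -p "$demo_root/QC_MASTER/programs" \
         "$demo_root/QC_MASTER/output" \
         "$demo_root/QC_MASTER/log"
mirror_qc_dirs "$demo_root" 26SEP2011
ls "$demo_root/QC_26SEP2011"
```

Looping over whatever exists under QC_MASTER, rather than hard-coding the list, keeps the snapshot directory in step with the master area if a subdirectory is added later.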
Again, the method for naming the QC programs needs to be simple so that it can be applied to every display with ease.
For this particular client the display name followed a specific pattern so it was easy to apply. By taking the original display name to be QCed and stripping this back to create a linking variable, the QC program name can then be assigned.
Figure 2: Diagrammatic representation of how the QC programming naming convention was derived
The linking variable is a key variable within the program. Without it the batch program will not function properly. This variable enables us to define the status flag for each display within the batch program, which will be discussed later.
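As a rough illustration of the idea, the sketch below strips a display file name back to a linking value and builds a QC program name from it. The client's actual naming pattern is not reproduced here, so the display name t_ae_summary.rtf and the functions qc_link and qc_program are hypothetical; only the qc_ prefix and .sas extension follow the QC program names used later in this post.

```shell
#!/bin/sh
# Sketch: derive a linking value and QC program name from a display name.
# ASSUMPTION: displays are named <name>.<ext>; qc_link, qc_program and the
# example display name are hypothetical.
set -eu

qc_link() {
    base=${1##*/}                 # drop any directory part
    printf '%s\n' "${base%.*}"    # drop the file extension
}

qc_program() {
    printf 'qc_%s.sas\n' "$(qc_link "$1")"
}

link=$(qc_link "output/t_ae_summary.rtf")
prog=$(qc_program "output/t_ae_summary.rtf")
echo "$link -> $prog"
```

Because the same linking value can be derived from the display, the QC program and the QC log, it can serve as the key that joins these sources together.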
There are six main components of the process:
The batch program was developed with the above in mind, and each of the components can be driven independently using macro variables, which gives as much flexibility with the code as possible. The macro variables are highlighted in bold below:
Note when P2-P5 are run, the QC spreadsheet is updated after each of these tasks has been performed. This ensures the QC spreadsheet always reflects the latest information gathered.
For the remainder of this post we will concentrate on P1 and P2, and discuss the remaining components together when identifying the status of each display.
This is driven by macro variable P1_NEWDIRYN.
At this point in the post we introduce examples of the code from the batch program, specifically the UNIX commands. These need to be wrapped in the SAS code %sysexec %str( ... ); as in each example below, so that the UNIX commands run as part of the SAS batch program.
The code below creates the directories (mkdir) then copies (cp -p) the globprot.sas from the MASTER area to the reporting effort retaining the file permissions.
%sysexec %str(
  cd &m_lib;                /* QC_MASTER/programs */
  cd ../../;
  mkdir QC_&snapdt2;        /* QC_26SEP2011 */
  cd QC_&snapdt2;
  mkdir programs;           /* QC_26SEP2011/programs */
  /* further mkdir commands for the remaining subdirectories */
  cp -p &m_lib.globprot.sas &q_lib.globprot.sas;
);
The code below copies the qc_plan.csv template file from the MASTER area to the reporting effort ONLY if it doesn’t already exist in the reporting effort area. It also renames it to include the reporting effort snap date.
%if %sysfunc(fileexist(&q_lib.qc_plan_&snapdt2d..csv)) eq 0
%then %do;
  %sysexec %str(
    cp -p &m_lib.qc_plan.csv &q_lib.qc_plan_&snapdt2d..csv;
  );
%end;
The code below navigates to the display output directory (cd), then opens each display and extracts the date/time stamp of when the file was run (perl). We did not use the UNIX date/time stamp because the displays could have been copied to the display folder, in which case it would not accurately reflect when the display was created.
%sysexec %str(
  cd &outdir_lib;           /* display directory */
  /* internal date/time stamp */
  perl -nle 'print "$ARGV: $&" if /(0[1-9]|[12][0-9]|3[01])(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(11|12)\ ([0-1][0-9]|2[0-3]):([0-5][0-9])/' *.* > &q_lib.dd.txt;
);
Note: The perl code was provided by a colleague which saved us writing additional SAS code.
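To see what the pattern picks up, the sketch below applies the same regular expression to a small sample file using grep instead of perl. It is an illustration only: the sample file, its contents and the variable names are made up, and it assumes a grep that supports the -H and -o options (GNU and BSD grep do). Note that grep prints file:match without the space that the perl version adds after the colon.

```shell
#!/bin/sh
# Sketch: extract an internal DDMONYY HH:MM stamp from a display file.
# ASSUMPTION: the sample file and its contents are invented for the demo;
# grep -H/-o are GNU/BSD extensions rather than POSIX.
set -eu

stamp_re='(0[1-9]|[12][0-9]|3[01])(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(11|12) ([01][0-9]|2[0-3]):[0-5][0-9]'

demo_dir=$(mktemp -d)
printf 'Table 1\nProgram run: 26SEP11 14:05\n' > "$demo_dir/t_demo.txt"

# -H prints the file name, -o prints only the matching part of the line
stamps=$(cd "$demo_dir" && grep -HoE "$stamp_re" *.txt)
echo "$stamps"    # t_demo.txt:26SEP11 14:05
```

As in the original pattern, the year is restricted to 11 or 12, so the expression would need widening before being reused for other years.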
The code below navigates to the reporting effort directory, then creates a list of the QC programs only (ls -1) and a long list of the QC logs, including date/times, in this area (ls -el). It also navigates to the MASTER directory and creates a list of the QC programs available there.
%sysexec %str(
  cd &q_lib;                              /* QC snap directory */
  ls -1 qc_*.sas > &q_lib.qcprog.txt;     /* 2b] QC programs */
  ls -el qc_*.log > &q_lib.qclog.txt;     /* 2c] QC log */
  cd &m_lib;                              /* QC MASTER directory */
  ls -1 qc_*.sas > &q_lib.qcmaster.txt;   /* 2d] QC programs */
);
The data from the QC plan spreadsheet is imported into the batch program to maintain existing information. The following is a list of some of the variables within the spreadsheet, with example entries.
Automated:
Manual:
Here is a summary of all the files created so far, which will be used in the batch program to identify the status of each display:
(*) The date/time derived variables are used to also identify the status and to ensure the QC information occurs AFTER the display has been created.
The table below provides an example for each status setting:
[Table: DISPLAYS TO QC / QC PROGRAMS AREAS]
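The date/time comparison behind the status can be sketched as follows. This is a simplified illustration, not the batch program's logic: the three status labels, the functions sort_key and qc_status, and the example stamps are all hypothetical, and the real system distinguishes more situations than shown here. The only rule taken from the text is that QC counts as current only when it occurred after the display was created.

```shell
#!/bin/sh
# Sketch: flag whether QC ran after the display was created.
# ASSUMPTION: status labels, function names and stamps are hypothetical.
set -eu

# Convert "26SEP11 14:05" (DDMONYY HH:MM) to a sortable YYMMDDHHMM number.
# Two-digit years are assumed to fall in the same century.
sort_key() {
    stamp=$1
    dd=$(printf '%s' "$stamp" | cut -c1-2)
    mon=$(printf '%s' "$stamp" | cut -c3-5)
    yy=$(printf '%s' "$stamp" | cut -c6-7)
    hhmm=$(printf '%s' "$stamp" | cut -c9-13 | tr -d ':')
    case $mon in
        JAN) mm=01 ;; FEB) mm=02 ;; MAR) mm=03 ;; APR) mm=04 ;;
        MAY) mm=05 ;; JUN) mm=06 ;; JUL) mm=07 ;; AUG) mm=08 ;;
        SEP) mm=09 ;; OCT) mm=10 ;; NOV) mm=11 ;; DEC) mm=12 ;;
        *) return 1 ;;
    esac
    printf '%s%s%s%s\n' "$yy" "$mm" "$dd" "$hhmm"
}

# $1 = display stamp, $2 = QC log stamp (empty if no QC log exists)
qc_status() {
    display_stamp=$1
    qc_stamp=${2:-}
    if [ -z "$qc_stamp" ]; then
        echo "QC NOT RUN"
    elif [ "$(sort_key "$qc_stamp")" -gt "$(sort_key "$display_stamp")" ]; then
        echo "QC UP TO DATE"       # QC ran after the display was created
    else
        echo "QC OUT OF DATE"      # display was re-created after the last QC
    fi
}

qc_status "26SEP11 14:05" "26SEP11 15:30"
qc_status "26SEP11 14:05" "25SEP11 09:00"
qc_status "26SEP11 14:05" ""
```

The stamps are converted to a year-first number before comparing because DDMONYY strings do not sort chronologically as text.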
Although the initial request was simple, it developed into a robust system which could be used by all. To ensure the system can be used easily by everyone, it needs to be clearly documented. Any assumptions need to be explained and specific limitations of the system need to be identified. Examples of how to use the system will help users work with the batch program. This should be a living document which is updated alongside any system modifications.
In addition to the standard requirements, we have created an audit file which collates information every time the batch program is run. This way we can monitor the use of the program and the type of information which was required using the following 5 categories: AUDIT DATE, AUDIT TIME, USER, TASK and ADDITIONAL INFORMATION.
Examples of the contents of TASK are as follows:
The additional information for P3 and P4 is the QC program names.
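An audit file of this kind can be sketched as a simple append-only CSV. The column headings come from the five categories listed above; everything else (the audit_log function, the CSV layout, the date format, and the use of P3 and P4 as TASK values with a made-up QC program name) is an assumption for illustration.

```shell
#!/bin/sh
# Sketch: append one audit record per batch run.
# ASSUMPTION: audit_log, the CSV layout and the example task values are
# hypothetical; only the five column names come from the text.
set -eu

audit_log() {
    audit_file=$1
    task=$2
    info=${3:-}
    user=${USER:-$(id -un 2>/dev/null || echo unknown)}
    # Write the header once, when the file is missing or empty
    if [ ! -s "$audit_file" ]; then
        echo "AUDIT DATE,AUDIT TIME,USER,TASK,ADDITIONAL INFORMATION" >> "$audit_file"
    fi
    printf '%s,%s,%s,%s,%s\n' \
        "$(date +%d%b%Y)" "$(date +%H:%M:%S)" "$user" "$task" "$info" \
        >> "$audit_file"
}

audit_demo=$(mktemp)
audit_log "$audit_demo" P3 qc_t_ae_summary.sas
audit_log "$audit_demo" P4 qc_t_ae_summary.sas
cat "$audit_demo"
```

Appending rather than rewriting the file means earlier records are never lost, which is what makes the file usable for monitoring how the batch program is used over time.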
The benefits of the new robust system to the client were:
To achieve high-quality results, it is essential for programming teams to strategically plan their QC processes and regularly assess the quality of both their deliverables and their methods. This article demonstrated that using SAS quality control and automated batch programs can help meet these needs. Communication and involvement by all stakeholders will ensure that the resulting system is easy to use, relevant, and includes all appropriate steps to ensure quality and consistency. Ultimately, while delivering accurate and timely results is important, maintaining a reliable and consistent QC process is equally crucial in the pharmaceutical industry.
© 2025 Quanticate