An efficacy endpoint in oncology is a characteristic or variable that measures how a treatment benefits a patient’s feeling, function, and survival in a clinical trial. Such endpoints are essential for assessing whether new cancer therapies are safe and efficacious[1]. Endpoints may serve different purposes, from evaluating survival rates to monitoring tumour size and patient symptoms. While overall survival (OS) is regarded as the “gold standard” primary efficacy endpoint, trials that measure it can be time consuming, pushing developers to explore surrogate endpoints such as objective response rate (ORR) or progression free survival (PFS). This can speed up the market approval of new agents with the potential to save or extend lives[2]. Additionally, more value is now being placed on outcomes such as quality of life and treatment failure, with the use of such endpoints becoming increasingly prevalent in oncology clinical trials. Although efficacy endpoints in oncology have historically been used to evaluate therapies in phase III clinical trials, their use in early-phase clinical trials is becoming more frequent with the development of novel immunotherapies.
This blog explores the roles of endpoints in oncology, as well as their benefits, their drawbacks, and how they may be optimally used. Different categories of endpoints are discussed – events-based endpoints, tumour assessment endpoints, and symptom assessment endpoints.
Overall survival (OS) is the time from randomisation to death from any cause, with patients who are lost to follow-up or still alive at the time of evaluation being censored. As the main goal of cancer treatment is extending survival, OS is the “gold standard” efficacy endpoint. It is easy to measure, objective, and resistant to researcher bias. Although it is the preferred endpoint in oncology, it has some drawbacks. OS requires long-term patient follow-up and a larger patient population, so the study will need more financial support. OS is also less useful in diseases that progress more slowly and have longer-term survival expectations. OS can be influenced by other treatments, which can make it difficult to attribute the observed effect to the specific medical intervention. Additionally, OS can be influenced by non-cancer deaths, as the endpoint counts death from any cause.
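To make the censoring rule concrete, here is a minimal sketch of how an OS record might be derived for a single patient. The function name, field names, and dates are hypothetical and chosen only for illustration; a real trial would follow the derivation rules in its statistical analysis plan.

```python
from datetime import date
from typing import NamedTuple, Optional


class OsRecord(NamedTuple):
    days: int    # time from randomisation to death or censoring
    event: bool  # True = death observed, False = censored


def overall_survival(randomised: date,
                     death: Optional[date],
                     last_known_alive: date) -> OsRecord:
    """Derive an OS record for one patient (illustrative only).

    Death from any cause counts as the event. Patients with no recorded
    death are censored at the date they were last known to be alive.
    """
    if death is not None:
        return OsRecord((death - randomised).days, True)
    return OsRecord((last_known_alive - randomised).days, False)


# Hypothetical patients
died = overall_survival(date(2022, 1, 10), date(2023, 3, 1), date(2023, 2, 1))
lost = overall_survival(date(2022, 1, 10), None, date(2022, 9, 15))
print(died)  # death observed 415 days after randomisation
print(lost)  # censored 248 days after randomisation
```

The censored patient still contributes 248 days of follow-up to the analysis; the record simply states that death had not been observed by that point.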
Time to next treatment (TTNT) is defined as the time from initiating treatment to the beginning of the next line of therapy. TTNT is a meaningful endpoint in incurable diseases where patients will require many interventions to extend survival, being used as a measure of duration of treatment efficacy in primary cutaneous T-cell lymphomas[3]. It is a surrogate marker for duration of clinical benefit and requires validation before serving as a standalone marker to assess treatment efficacy[1].
Milestone survival is defined as the survival probability at a given time point, being classified as an endpoint related to OS. It remains a possible surrogate endpoint for OS in late-stage drug development and requires further validation[4].
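As an illustration, milestone survival can be read off a survival curve at the chosen time point. The sketch below assumes the Kaplan-Meier estimator, which is a common choice but is not specified in this article, and uses made-up data; the function name `km_survival_at` is hypothetical.

```python
from typing import List, Tuple


def km_survival_at(records: List[Tuple[float, bool]], milestone: float) -> float:
    """Kaplan-Meier estimate of the survival probability at `milestone`.

    records: (time, event) pairs, where event=True means death was observed
    and event=False means the patient was censored at that time.
    """
    # Distinct death times up to and including the milestone, in order
    death_times = sorted({t for t, event in records if event and t <= milestone})
    survival = 1.0
    for t in death_times:
        # Patients still under observation just before time t
        at_risk = sum(1 for time, _ in records if time >= t)
        deaths = sum(1 for time, event in records if event and time == t)
        survival *= 1.0 - deaths / at_risk
    return survival


# Hypothetical data: (months from randomisation, death observed?)
data = [(3, True), (5, False), (8, True), (12, False), (14, True), (20, False)]
print(km_survival_at(data, 12))  # 12-month milestone survival: 0.625
```

Censored patients leave the risk set without counting as deaths, which is why the estimate differs from the naive fraction of patients known to be alive at 12 months.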
Response Evaluation Criteria in Solid Tumours (RECIST) is a classification system for solid tumours, providing a simplified set of criteria for evaluating tumour response to treatment. It relies only on linear measurements, allowing for straightforward monitoring of tumour size. In its most recent version, RECIST selects target lesions (tumours) by size and defines them as representative lesions of all involved organs[1]. These lesions must be easy to measure via imaging, with a minimum size of 10 mm by CT scan; a maximum of 2 lesions per organ and 5 in total are considered baseline target lesions, while all others are non-target[5]. The longest diameter of each lesion is taken, and the sum of the longest diameters (SLD) is calculated. There are 4 response categories in the RECIST system[6]: complete response (CR), partial response (PR), stable disease (SD), and progressive disease (PD).
Computed tomography (CT) is the preferred method of measurement as it is easily reproducible, and the results can be independently reviewed. Spiral CT and MRI can also be used, whereas X-rays and especially ultrasound tend not to be used. Once a method of measurement has been chosen, it should be maintained throughout the study. In general, the largest lesions are selected to follow as target lesions.
In some cases, the largest lesions may not be easily measured and are not suitable for follow-up because of their configuration. In this case, identification of the largest most reproducible lesions is advised. Some lesions, however, are considered non-measurable, either because they are simply too small (longest diameter < 10 mm, or pathological lymph nodes with a short axis of at least 10 mm but < 15 mm) or because they are truly non-measurable (e.g. leptomeningeal disease, ascites, pleural or pericardial effusion, inflammatory breast disease)[6]. A useful test when assessing patients for unequivocal progression is to consider whether the increase in overall disease burden, based on the change in non-measurable disease, is comparable in magnitude to the increase that would be required to declare progressive disease (PD) for measurable disease. Examples include an increase in a pleural effusion from ‘trace’ to ‘large’, or an increase in lymphangitic disease from localised to widespread. The nature of non-measurable disease means that there are no objective criteria for progression, so an increase in disease burden must be substantial in order to declare PD.
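For measurable disease, the SLD-based assessment can be sketched in a few lines. This is a deliberately simplified illustration, not a full RECIST implementation: the 30% decrease threshold for partial response and the 5 mm absolute increase for progression are taken from the published RECIST 1.1 guideline rather than from this article, and the sketch ignores non-target lesions, the lymph node short-axis rule, and response confirmation. All names and measurements are hypothetical.

```python
from typing import Sequence


def sum_longest_diameters(lesions_mm: Sequence[float]) -> float:
    """Sum of the longest diameters (SLD) of the target lesions, in mm."""
    return float(sum(lesions_mm))


def target_response(baseline_sld: float, nadir_sld: float, current_sld: float,
                    new_lesions: bool = False) -> str:
    """Classify target-lesion response as 'CR', 'PR', 'SD' or 'PD' (simplified)."""
    # Progression: new lesions, or an increase over the smallest SLD on study
    # (the nadir) of at least 20% and at least 5 mm in absolute terms
    growth = current_sld - nadir_sld
    if new_lesions or (growth >= 5.0 and growth >= 0.20 * nadir_sld):
        return "PD"
    if current_sld == 0:
        return "CR"  # all target lesions have disappeared
    if current_sld <= 0.70 * baseline_sld:
        return "PR"  # at least a 30% decrease from baseline
    return "SD"      # neither sufficient shrinkage nor sufficient growth


# Hypothetical patient with three target lesions (mm)
baseline = sum_longest_diameters([22, 18, 15])  # 55 mm
visit_1 = sum_longest_diameters([12, 10, 9])    # 31 mm
visit_2 = sum_longest_diameters([16, 13, 10])   # 39 mm

print(target_response(baseline, nadir_sld=baseline, current_sld=visit_1))  # PR
print(target_response(baseline, nadir_sld=visit_1, current_sld=visit_2))   # PD
```

Note that progression is judged against the nadir rather than the baseline, so a patient who responded well can be classified as PD while the SLD is still below its starting value.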
Progression free survival (PFS) is defined as the time from randomisation until first evidence of disease progression or death, whichever occurs first. It is a popular surrogate endpoint as fewer patients are needed in order to obtain data, and data will also become available earlier in the trial than for OS. Additionally, PFS has drawn more attention as a clinical endpoint for its ability to assess treatment paradigms that include multi-stage therapies. PFS can assess the short-term, incremental changes of each round of treatment, where OS fails to do this[7]. For a given sample size, the magnitude of effect on PFS can be larger than the effect on OS, although data are usually insufficient to allow a robust evaluation of the correlation between effects on OS and PFS. The use of PFS as an efficacy endpoint has therefore been the subject of debate, because prolonged PFS does not always result in extended survival[8].
Time to progression (TTP) is defined as the time from randomisation until first evidence of disease progression. The precise definition of tumour progression is important and should be carefully detailed in the protocol. Compared with TTP, PFS is the preferred efficacy endpoint as it registers progression and/or death, whereas in TTP analysis, deaths are always censored.
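The difference between the two endpoints comes down to how a death without documented progression is handled. The sketch below derives both outcomes for one patient under stated assumptions: times are in days from randomisation, patients without an event are censored at their last tumour assessment, and for TTP a death is censored at the date of death (a protocol might instead censor at the last assessment before death). The function and parameter names are hypothetical.

```python
from typing import Optional, Tuple

Outcome = Tuple[int, bool]  # (days from randomisation, event observed?)


def pfs_and_ttp(progression_day: Optional[int],
                death_day: Optional[int],
                last_assessment_day: int) -> Tuple[Outcome, Outcome]:
    """Derive (PFS, TTP) outcomes for one patient (illustrative only)."""
    # PFS: progression or death, whichever occurs first, is the event
    observed = [d for d in (progression_day, death_day) if d is not None]
    pfs = (min(observed), True) if observed else (last_assessment_day, False)

    # TTP: only progression is an event; death without progression is censored
    if progression_day is not None:
        ttp = (progression_day, True)
    elif death_day is not None:
        ttp = (death_day, False)  # assumption: censor at the date of death
    else:
        ttp = (last_assessment_day, False)
    return pfs, ttp


print(pfs_and_ttp(120, 200, 180))    # progressed, then died
print(pfs_and_ttp(None, 200, 180))   # died without documented progression
print(pfs_and_ttp(None, None, 300))  # progression-free at last assessment
```

Only the second patient is treated differently: the death counts as a PFS event but is censored in the TTP analysis.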
Disease-free survival (DFS) is defined as the time from randomisation until evidence of disease recurrence or death from any cause, being most frequently used in an adjuvant setting after definitive surgery or radiotherapy. DFS can be an important efficacy endpoint when a large percentage of patients achieve complete remissions with chemotherapy. DFS is a surrogate endpoint, requiring a smaller patient number and shorter follow-up than OS. Because of this, DFS is regarded as an important efficacy endpoint for cancers with a prolonged overall survival. A similar endpoint is event-free survival (EFS), where randomisation takes place before definitive surgery or radiotherapy. EFS is defined as the time from randomisation to an event which could include disease progression, discontinuation of the treatment, or death. While EFS and DFS used to be interchangeable, EFS is now the efficacy endpoint reserved for neoadjuvant settings (treatment given before surgery) whereas DFS is applied in adjuvant settings (treatment given after surgery). Like DFS, EFS is a surrogate endpoint that can be used in the place of a primary efficacy endpoint to reduce sample size, costs, and follow-up duration. DFS has been the primary basis of approval for adjuvant breast cancer hormonal therapy, adjuvant colon cancer therapy, and adjuvant cytotoxic breast cancer therapy, whereas EFS is an appropriate endpoint for the evaluation of neoadjuvant breast cancer therapy[9].
Time to treatment failure (TTF) is the time from initiation of treatment to the first of the following: progression, discontinuation of treatment due to an adverse event or progression, start of any new anticancer therapy, withdrawal of patient consent, or death. When TTF is used as a primary efficacy endpoint, secondary endpoints are strategically chosen to explore the proportion of patients who discontinued treatment due to disease progression compared with other reasons.
Objective response rate (ORR) is a measure of how a specific treatment impacts tumour burden in a patient with a history of solid tumours. It is defined as the proportion of patients with a tumour size reduction of a specified amount, for a minimum time period. Response duration is usually measured from the time of initial response until documented tumour progression. The World Health Organisation (WHO) was the first to develop criteria to evaluate ORR as an efficacy endpoint in oncology trials, but these criteria were criticised for the interobserver variability in the number of lesions and the selection of measurable targets[10]. They were replaced by the Response Evaluation Criteria in Solid Tumours (RECIST), as discussed above. RECIST can also be used to assess PFS, defining progression as a 20% increase in the sum of target lesion longest diameters (SLD). The use of RECIST makes ORR a standard efficacy endpoint across multiple clinical trial locations, although ORR fails to capture stable disease, which is more accurately assessed by TTP or PFS endpoints. Although the transition to RECIST helped to eliminate some interobserver variability, human error cannot be completely avoided when measuring tumours via CT or MRI. Concerns also remain that ORR does not adequately reflect efficacy endpoints such as PFS, DFS, and OS despite tumour regression. ORR provides the greatest benefit in trials evaluating neoadjuvant therapies, especially those in breast cancer patients[8]. Special consideration for ORR as a primary clinical endpoint is also given in single-arm trials of patients with refractory tumours and no current therapy options[11].
Duration of response (DOR) is defined as the length of time a tumour will respond to treatment without growing or metastasising, and time to response (TTR) is the time that it takes for this to happen. DOR and TTR are typically evaluated as secondary endpoints in early-stage clinical studies where efficacy is presented as objective response rate (ORR).
Complete response (CR) is defined as the lack of any detectable evidence of a tumour. This is generally measured through imaging studies such as CT scans, or through histopathological assessment such as a bone marrow biopsy or breast cancer resection specimens. CR can be used as a surrogate or primary efficacy oncology endpoint depending on the specific disease and context of use[12]. CR has been used as a clinical endpoint for traditional approval for haematological malignancies such as acute leukaemia, and has proven to be clinically relevant in the setting of multiple myeloma therapy, conveying a survival advantage with improved OS and prolonged EFS[13]. Pathologic complete response (pCR) has been used as a surrogate endpoint in breast cancer and is defined as the absence of residual invasive cancer upon evaluation of the resected breast tissue and regional lymph nodes[14]. This efficacy endpoint is commonly used in trials of neoadjuvant chemotherapy for breast cancer patients under the FDA’s accelerated approval program. In studies following these patients, pCR was associated with improved OS and EFS[15, 16].
Disease control rate (DCR) and clinical benefit rate (CBR) are defined as the percentage of patients with advanced cancer whose therapeutic intervention has led to a complete response (CR), partial response (PR), or stable disease (SD). DCR is related to ORR and may best be used as an oncology efficacy endpoint for therapies that have tumoristatic rather than tumoricidal effects. However, there are no comprehensive analyses demonstrating that CBR and DCR add to the value of traditional response/activity endpoints in early clinical trials, and data suggest that DCR and CBR provide ambiguous information that likely exaggerates the anticancer activity of the therapy[17].
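The relationship between ORR and DCR is easiest to see side by side. The sketch below assumes each patient has already been assigned a best overall response, and takes ORR as the proportion of patients with CR or PR, which is the usual convention. The cohort is invented and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def response_rates(best_responses: Iterable[str]) -> Tuple[float, float]:
    """Return (ORR, DCR) as proportions from best overall responses."""
    counts = Counter(best_responses)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one patient is required")
    # ORR counts only patients whose tumours shrank (CR or PR)
    orr = (counts["CR"] + counts["PR"]) / n
    # DCR additionally counts patients with stable disease
    dcr = (counts["CR"] + counts["PR"] + counts["SD"]) / n
    return orr, dcr


# Hypothetical cohort of 10 patients
cohort = ["CR", "PR", "PR", "SD", "SD", "SD", "PD", "PD", "PD", "PD"]
orr, dcr = response_rates(cohort)
print(f"ORR = {orr:.0%}, DCR = {dcr:.0%}")  # ORR = 30%, DCR = 60%
```

In this made-up cohort, counting stable disease doubles the headline figure, which illustrates why DCR can make a therapy look more active than ORR alone would suggest.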
Health-related quality of life (HRQoL) is an important patient-reported measure that demonstrates the clinical benefit of the treatment. It evaluates patient quality of life with respect to health status over time. HRQoL is often used as a secondary clinical endpoint to compare treatments with similar effects but potential differences in toxicity, although it can also be used as a co-primary endpoint with OS[18]. It is usually assessed with a set of four core questions developed by the Centers for Disease Control and Prevention (CDC). These questions include concepts related to overall health, physical health, mental health, and daily living activities, providing results that are easy for the assessor to interpret[19].
Time to progression of cancer symptoms, a similar endpoint to TTP, is a direct measure of clinical benefit rather than a potential surrogate endpoint. There are several difficulties that arise from measuring this endpoint. Progression can be difficult to track if assessments are missed, and these symptom assessments can also be subject to response bias as few cancer trials are blinded. There can also be a delay between progression of a tumour and the onset of cancer symptoms, and other treatments may be initiated during the trial, making the cause of certain symptoms difficult to ascertain. Additionally, it can be difficult to differentiate tumour symptoms from drug toxicity.
A composite symptom endpoint or symptom scale should contain components of similar clinical importance, and an analysis of the contribution of each component should be submitted with the primary analysis of the overall composite endpoint. An example is the composite symptom scale that includes several important symptoms of myelofibrosis[9]. Drugs have been approved based on delaying the time to skeletal-related events such as pathological fractures, radiation therapy to bone, surgery to bone, and spinal cord compression.
Generally, blood-based biomarkers have not served as primary endpoints for cancer drug approval, although paraprotein levels in blood and urine have been used as part of myeloma response criteria[9]. The FDA has accepted blood-based markers as elements of a composite endpoint. For example, the occurrence of certain clinical events in conjunction with marked increases in the protein mucin-16 (also known as CA-125) was considered progression in ovarian cancer patients. Blood-based biomarkers can also be useful in helping to identify prognostic factors and in the selection of patients and stratification factors to be considered in study designs.
As missing data can complicate endpoint analysis, methodology for analysing incomplete and/or missing follow-up visits and censoring methods should be specified in the protocol. Data should be monitored accurately, and checks must be in place to quickly identify inconsistencies in the collected data. A more sophisticated assessment of the appropriateness of the data can be achieved by establishing periodic medical and statistical supervision. The aim of such a process is to identify potential sources of bias that could lead to incorrect assessment of the oncology efficacy endpoints. For example, in a trial where PFS is the primary efficacy endpoint, it is important to monitor the number of PFS events and the tumour assessment schedule, so that Investigator sites where incorrect schedules occurred more frequently can be detected, allowing corrective actions to be put in place for future enrolled subjects and future tumour assessments[2].
Many potential patients may not understand the difference between overall survival (OS) and progression free survival (PFS). FDA researchers conducted two studies where participants were randomly assigned to watch one of five television adverts for fictional prescription oncology drugs: an ad with an overall survival claim; ads with an overall response rate (ORR) claim with and without a disclosure; and ads with a progression free survival claim with and without a disclosure[20]. After watching the ad, they completed a survey that included open-ended questions, true/false questions, and questions about the extent to which the ad affected their perceptions and intentions. Ninety percent of the participants who watched the ad with the ORR claim without a disclosure thought the drug would help people live longer, compared with 30 percent of participants who watched the ad with a disclosure. Among people in the PFS ad group without a disclosure, 93 percent thought the drug would help people live longer, versus 51 percent of the participants who watched the PFS ad with a disclosure. When researchers asked about the disclosure specifically, they found that participants noted it and understood its purpose. Across the studies, approximately 60 to 70 percent of participants who saw an ad with a disclosure understood that the drug’s effect on survival is currently unknown.
From these studies, it can be concluded that disclosures are an important tool to reduce misunderstanding around the exact implications of different oncology efficacy endpoints, as people may incorrectly infer that a drug has the potential to extend life when it has only been shown to improve overall response rate (ORR) or progression free survival (PFS).
In the 1970s, the FDA usually approved cancer drugs based on objective response rate (ORR), determined by tumour assessments from radiological tests or physical examinations[21]. Then in the early 1980s, the FDA determined that cancer drug approval should be based more on direct evidence of clinical benefit, such as improvement in survival, improvement in a patient’s quality of life, or improved tumour-related symptoms[22], as these benefits may not always be predicted by, or correlate with, ORR. In 2000, Response Evaluation Criteria in Solid Tumours (RECIST) was first introduced to simplify and standardise the measurement of tumours.
As the number of trials using overall survival (OS) as a primary endpoint decreased, progression-free survival (PFS) and disease-free survival (DFS) became more frequently used, partly due to the financial and time constraints of OS, which requires longer trials with larger numbers of patients. From 2005 to 2013, DFS was used as a primary endpoint in 5 out of the 8 US approved adjuvant or curative drugs in solid tumours[23]. The FDA now recognises the clinical benefit of both DFS and PFS and allows for their use as primary endpoints in trials seeking regulatory approval. Over time, larger improvements in tumour reduction and delay in tumour growth have been achieved, and tumour measurement endpoints are used to support both traditional and accelerated approval. A large improvement in PFS or a high, substantiated durable ORR has been used to support traditional approval in select malignancies, but magnitude of effect, relief of tumour-related symptoms, and drug toxicity should also be considered[22, 11]. Improvement in tumour-related symptoms in conjunction with an improved ORR and adequate response duration has supported traditional approval in several clinical settings.
Endpoints will continue to evolve as new therapies are developed and as imaging and detection modalities improve, a shift already demonstrated by the development of RECIST. As more endpoints are developed for specific types of cancer and their therapies, it is imperative that studies clearly define which endpoints they employ and differentiate them from other endpoints. Overall survival remains the “gold standard” primary clinical endpoint, as it is easy to measure and validate and is widely accepted in the medical community. However, it is important to continue exploring the value that other endpoints add to assessing adjuvant and neoadjuvant therapies[1]. Surrogate endpoints have the potential to lower costs and reduce the resources needed. Efficacy endpoints in oncology will be designed with novel immunotherapies in mind and, as survival increases, qualitative endpoints will become critical secondary endpoints in the assessment of clinical benefit.
© 2024 Quanticate