Additional background to the matter arising
In the March 2006 pre-PBAC response, the sponsor stated “while we agree that indirect comparisons are not ideal, the indirect comparison method is not without benefits in cases where a head-to-head randomised controlled trial is not feasible. To ignore this comparison as providing only weak evidence ignores the following points:
- The FIT-VFA and GHAC trials have the same design
- The FIT-VFA and GHAC trials had the same inclusion/exclusion criteria
- The FIT-VFA and GHAC trials used the same primary endpoint to ascertain efficacy
- The patient populations in the FIT-VFA and GHAC trials were matched on all major risk factors for fracture – age, BMD, prior fractures.”
In addition to the tabled material already presented, Table 5 displays a summary of the key results for the indirect data comparison derived from the GHAC trial (full trial data and SQ3 subgroup for teriparatide) and the FIT-VFA trial (for alendronate). These data have been reproduced without amendment from the agenda papers of the March 2006 PBAC meeting.
Table 5: Indirect analysis of morphometric vertebral fracture rates using placebo as common reference
| Population | Teriparatide: Active | Teriparatide: Placebo | Teriparatide: RR (95% CI) | Alendronate: Active | Alendronate: Placebo | Alendronate: RR (95% CI) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 – Teriparatide & alendronate: all randomised populations | 22/541 | 64/544 | 0.346 (0.22, 0.55) | 78/1022 | 145/1005 | 0.529 (0.41, 0.69) |
| Risk difference (indirect comparison) = -0.426 (-0.96, 0.11) | | | | | | |
| Relative risk reduction (indirect comparison) = 0.65 (0.38, 1.12), p=0.1206 | | | | | | |
| 2 – Teriparatide & alendronate: paired radiograph populations | 22/444 | 64/448 | 0.347 (0.22, 0.55) | 78/981 | 145/965 | 0.529 (0.41, 0.69) |
| Risk difference (indirect comparison) = -0.422 (-0.96, 0.11) | | | | | | |
| Relative risk reduction (indirect comparison) = 0.66 (0.38, 1.12), p=0.1211 | | | | | | |
| 3 – Teriparatide: SQ3 sub-group population; alendronate: paired radiograph population | 5/86 | 27/95 | 0.205 (0.08, 0.51) | 78/981 | 145/965 | 0.529 (0.41, 0.69) |
| Risk difference (indirect comparison) = -0.950 (-1.90, 0.01) | | | | | | |
| Relative risk reduction (indirect comparison) = 0.39 (0.15, 0.99), p=0.0487 | | | | | | |
NB: The March 2006 PBAC short minutes included the comment: “The calculation of the risk differences for the indirect comparison of teriparatide and alendronate could not be reproduced and the values are implausibly large.”
Reviewer’s opinion
The preferred method for assessing the comparative effects of medical interventions is the head-to-head randomized controlled trial (RCT). Concerns have been expressed over the use of indirect comparisons of treatments. The Cochrane Collaboration’s guidance to authors states that indirect comparisons are not randomized, but are “observational studies across trials, and may suffer the biases of observational studies, for example confounding.” Some investigators suggest that indirect comparisons may systematically over-estimate the effects of treatments as “randomization is not sufficient for comparability”.
However, in the absence of direct comparator data, other types of data evaluation may be considered, although these methods are scientifically less robust than head-to-head RCTs. There is emerging evidence in the literature that when no head-to-head evidence is available, the method of “adjusted indirect comparisons” may be useful to estimate the relative efficacy of competing interventions. Bucher and colleagues developed a model for making adjusted indirect comparisons of the magnitude of treatment effects that produces an unbiased estimate of the relative efficacy of the treatments.
In fact, several recent papers have demonstrated that results from adjusted indirect comparisons usually, but not always, agree with the results from head-to-head (direct) comparisons. For example, a recent paper by Song et al demonstrated that only 3 of 44 treatment comparisons showed a significant discrepancy (p<0.05) between the direct result and the adjusted indirect estimate. The patients involved in these studies covered a diverse range of medical conditions, including increased risk of vascular occlusion, HIV infection, chronic Hepatitis C virus infection, gastro-oesophageal reflux disease, post-operative pain, heart failure, and cigarette smoking.
However, the methodological process involved in the indirect comparison (i.e. when two or more interventions are compared through their relative effect versus a common comparator) is crucial. Some authors have used a naïve (unadjusted) indirect comparison, in which the results of individual treatment arms from different trials are compared as if they were from a single trial. Simulation studies and empirical evidence indicate that the naïve indirect comparison is liable to bias and produces over-precise estimates. As such, the naïve indirect comparison should be avoided. Suitable statistical methods for comparing multiple treatments that fully respect randomization have, however, been available for some time. They have not been widely used, although their application is increasing in US-based medical journals and in medical decision making. In this particular matter, the sponsor states it has applied the methods described by Bucher et al and Song et al to the common comparator analysis of the GHAC and FIT-VFA trials in deriving the data on the primary and secondary fracture endpoints. The workings of this analysis were not located in the submissions but are assumed to have been performed correctly.
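The adjusted indirect comparison discussed above can be illustrated with a minimal sketch. It assumes the log relative risk formulation commonly attributed to Bucher et al (difference of log relative risks, with the variances of the two trial estimates added) and uses the all randomised counts from Table 5. The function names are illustrative only, and this is not the sponsor's own working, which was not located.

```python
import math


def log_rr_and_se(events_active, n_active, events_placebo, n_placebo):
    """Log relative risk (active vs placebo) and its standard error from 2x2 counts."""
    rr = (events_active / n_active) / (events_placebo / n_placebo)
    se = math.sqrt(1 / events_active - 1 / n_active
                   + 1 / events_placebo - 1 / n_placebo)
    return math.log(rr), se


def adjusted_indirect_comparison(trial_a, trial_b, z_crit=1.96):
    """Indirect A-vs-B comparison through a common placebo comparator.

    Assumption: each trial is given as
    (events_active, n_active, events_placebo, n_placebo).
    """
    log_a, se_a = log_rr_and_se(*trial_a)
    log_b, se_b = log_rr_and_se(*trial_b)
    diff = log_a - log_b                      # difference of log relative risks
    se = math.sqrt(se_a ** 2 + se_b ** 2)     # variances add across independent trials
    low, high = diff - z_crit * se, diff + z_crit * se
    p_value = math.erfc(abs(diff / se) / math.sqrt(2))  # two-sided normal p-value
    return {
        "log_rr_diff": diff,
        "log_rr_diff_ci": (low, high),
        "rr": math.exp(diff),
        "rr_ci": (math.exp(low), math.exp(high)),
        "p_value": p_value,
    }


# All randomised populations from Table 5: teriparatide (GHAC) vs alendronate (FIT-VFA)
result = adjusted_indirect_comparison((22, 541, 64, 544), (78, 1022, 145, 1005))
print(result)
```

Under these assumptions the sketch gives an indirect relative risk of about 0.65 (95% CI about 0.38 to 1.12), in line with Table 5. The difference in log relative risks is about -0.43, which suggests that the rows labelled “Risk difference” in Table 5 may be expressed on the log relative risk scale rather than as absolute risk differences.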
In addition to the statistical process, the validity of the adjusted indirect comparison method depends on an assessment of the study characteristics (in particular, similarity and internal validity) that relate to the exchangeability of results across trials, such as patient characteristics, methodological quality, endpoint definitions, outcome measures, and adherence rates. In general, the trial designs, primary endpoint, patient inclusion and exclusion criteria, patient demographic and baseline characteristics, and fracture rates in the placebo groups are comparable between the GHAC and FIT-VFA studies. This provides a reasonable level of assurance that the results are valid when compared via a common comparator analysis. However, there are some issues with the GHAC trial that detract from both the internal validity and generalisability of the data. The GHAC study had originally been designed to continue for 3 years, but was terminated after a median of 19 months of treatment (mean 18 months; maximal treatment period 2 years and 20 days) because a long term carcinogenicity study in rats revealed the occurrence of skeletal proliferative lesions, including osteosarcoma. This finding was later determined to be unlikely to have significant predictive ability in humans, but the premature termination may nonetheless introduce a bias into the comparison. The sponsor believes that comparing teriparatide, with a much shorter active treatment phase (median 19 months), against alendronate (mean follow-up of 2.9 years in the FIT-VFA study) is biased against teriparatide. This interpretation is probably true.
However, other potential sources of bias related to patient disposition, trial design, and analysis were observed. Vertebral fracture analyses were based on 801 (placebo=398; teriparatide 20 µg/day=403) of the 1085 women (73.8%) who were originally randomized to the two treatment arms of interest in the GHAC trial. There were multiple reasons for the loss of evaluable subjects. A significant proportion of patients (17.8% in total; 193/1085; placebo=96; teriparatide=97) lacked an adequate baseline or follow-up radiograph. Some placebo treated (n=50) and teriparatide treated (n=41) subjects were subsequently judged by the central reader not to have prevalent vertebral fractures. The study protocol required that patients have one or more prevalent vertebral fractures at screening, as assessed by the investigative site, but a discrepancy clearly arose between the site and central assessments in some patients. Because the vertebral fracture status of these patients was unclear, their data were excluded from the analysis. The sponsor does not believe these missing values affected the results, as baseline characteristics were not statistically different between placebo and teriparatide treated subjects. Moreover, prior to the sponsor’s decision to terminate the GHAC trial, 18.9% (205/1085) of patients had discontinued from the study. In addition, the central radiograph readers were unaware of patient treatment assignment but were not blinded to radiograph temporal sequence. It is also noteworthy that neither study (GHAC or FIT-VFA) limited enrolment to patients with SQ3 grade (severe) vertebral fractures at baseline, which would be a desirable characteristic of the supporting clinical evidence.
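The patient disposition figures quoted above can be cross-checked with a short sketch. It assumes that the per-arm randomised totals are those shown in Table 5 (placebo=544; teriparatide=541) and that the two exclusion categories do not overlap; the variable names are illustrative only.

```python
# Per-arm counts for the GHAC arms of interest.
# Assumption: randomised totals are taken from Table 5, and no patient
# is counted in both exclusion categories.
randomised = {"placebo": 544, "teriparatide": 541}
no_paired_radiograph = {"placebo": 96, "teriparatide": 97}
no_prevalent_fracture = {"placebo": 50, "teriparatide": 41}

# Evaluable patients per arm after both exclusions
evaluable = {
    arm: randomised[arm] - no_paired_radiograph[arm] - no_prevalent_fracture[arm]
    for arm in randomised
}

total_randomised = sum(randomised.values())
total_evaluable = sum(evaluable.values())
evaluable_pct = 100 * total_evaluable / total_randomised

print(evaluable)
print(f"{total_evaluable}/{total_randomised} evaluable ({evaluable_pct:.1f}%)")
```

Under these assumptions the two exclusion categories account for the whole gap between the randomised and analysed populations, and the per-arm numbers with paired radiographs (448 and 444) agree with the paired radiograph populations in Table 5.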
Furthermore, because the severe vertebral fracture subgroup is small (n=86 for teriparatide, n=95 for placebo) and the absolute number of new morphometric fractures is also small (n=5 for teriparatide, n=27 for placebo), this result may be prone to over-interpretation in the setting of low statistical power.
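The effect of the small event counts on precision can be illustrated with a brief sketch. It assumes the usual large-sample standard error of a log relative risk and compares the SQ3 subgroup with the paired radiograph population of the GHAC trial, using the counts in Table 5; the function name is illustrative only.

```python
import math


def rr_summary(events_active, n_active, events_placebo, n_placebo, z_crit=1.96):
    """Relative risk, 95% CI and log-scale standard error from 2x2 counts."""
    rr = (events_active / n_active) / (events_placebo / n_placebo)
    var_active = 1 / events_active - 1 / n_active      # active arm contribution
    var_placebo = 1 / events_placebo - 1 / n_placebo   # placebo arm contribution
    se = math.sqrt(var_active + var_placebo)
    return {
        "rr": rr,
        "ci": (rr * math.exp(-z_crit * se), rr * math.exp(z_crit * se)),
        "se_log_rr": se,
        "active_share_of_variance": var_active / (var_active + var_placebo),
    }


sq3 = rr_summary(5, 86, 27, 95)        # SQ3 subgroup, teriparatide vs placebo
paired = rr_summary(22, 444, 64, 448)  # paired radiograph population

print(sq3)
print(paired)
print(sq3["se_log_rr"] / paired["se_log_rr"])  # relative loss of precision
```

On these counts the log-scale standard error in the SQ3 subgroup is roughly twice that of the paired radiograph population, and most of the subgroup variance comes from the five events in the teriparatide arm, so one or two additional fractures would shift the estimate noticeably.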
Reviewer’s summary
The sponsor’s use of an indirect data comparison across placebo controlled trials to infer the superiority of teriparatide treatment over alendronate is scientifically less robust than the gold standard methodology of head-to-head randomized controlled studies. The sponsor has undertaken the appropriate statistical measures to optimize the validity of the indirect data comparison. However, the validity of adjusted indirect comparisons depends on the internal validity and similarity of the trials involved. Some concerns remain regarding the internal validity and generalisability of the pivotal GHAC study, including the unanticipated premature termination of the study and the significant loss of evaluable subjects due to patient drop-out and the lack of appropriate paired radiographs. As such, the data are open to varying interpretations, which clearly detracts from the scientific robustness of the treatment claim.