Inserm, CESP Centre for research in Epidemiology and Population Health, U1018, Biostatistics Team, F-94807 Villejuif, France
Univ Paris-Sud, UMRS1018, F-94807 Villejuif, France
Université de Toulouse-INSA, IMT UMR CNRS 5219, Toulouse, France
Département de pharmacologie, Centre de pharmacovigilance, CHU de Bordeaux, Bordeaux, France
INSERM U657, Bordeaux, France
Abstract
Background
Analyzing time-to-onset of adverse drug reactions from treatment exposure contributes to meeting pharmacovigilance objectives,
Methods
Both approaches, naive or taking right truncation into account, were compared with a simulation study. We used twelve scenarios for the exponential distribution and twenty-four each for the Weibull and log-logistic distributions. These scenarios are defined by a set of parameters: the parameters of the time-to-onset distribution, the probability of this distribution falling within an observable values interval and the sample size. An application to reported lymphoma after anti TNF
Results
The simulation study shows that the bias and the mean squared error may in some instances be unacceptably large when right truncation is not considered, whereas the truncation-based estimator always performs better, often satisfactorily, and the gap between the two may be large. For the real dataset, the estimated expected time-to-onset differs by at least 58 weeks between the two approaches, which is not negligible. This minimum difference is obtained for the Weibull model, under which the estimated probability of this distribution falling within an observable values interval is not far from 1.
Conclusions
It is necessary to take right truncation into account when estimating time-to-onset of adverse drug reactions from spontaneous reporting databases.
Background
Identifying and preventing adverse drug reactions are major objectives of pharmacovigilance. Owing to design constraints, premarketing clinical trials fail to identify rare events, which has led in recent decades to an increased focus on the development of postmarketing surveillance methods.
The data consisting of the time-to-onset among patients who were reported to have potentially developed an adverse drug reaction are right-truncated. Truncation arises because some patients who were exposed to the drug and who will eventually develop the adverse drug reaction may do so after the time of analysis (Figure
Right truncation and data on time-to-onset of adverse drug reactions from spontaneous reporting databases. Some patients who were exposed to the drug and who will eventually develop the adverse drug reaction may do so after the time of analysis. Here, in these hypothetical examples, the patient on the top line is included in the database because he experienced the adverse drug reaction before the time of analysis,
This paper investigates parametric maximum likelihood estimation of the time-to-onset distribution of adverse drug reactions from spontaneous reporting data for different types of hazard functions likely to be encountered in pharmacovigilance. Awareness of the methods developed for right-truncated data is not widespread, and these methods have never been used in pharmacovigilance. No simulation studies are available on the accuracy of their estimates. Furthermore, a naive approach that ignores the right truncation inherent in spontaneous reports and uses classical parametric methods instead of appropriate ones may lead to misleading estimates. We consider the two approaches,
Methods
Proper estimation of the time-to-onset distribution
We consider a given time of analysis and the population of exposed patients who will eventually experience the adverse drug reaction before they die. Let
We consider a parametric model for the time-to-onset
When right truncation,
maximizing this likelihood yields the naive estimator of
When right truncation is considered, the likelihood is modified. Observed times-to-onset consist of
the maximum likelihood estimator from this likelihood,
The nonparametric maximum likelihood estimation for right-truncated data was developed and used to estimate the incubation period distribution for AIDS
where the
In a parametric framework, the unconditional distribution is completely specified by a parameter
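As a rough illustration of the two likelihoods discussed in this section, the following Python sketch contrasts the naive and the truncation-based log-likelihood for an exponential model. It assumes that each observed time-to-onset `x[i]` comes with a truncation time `t[i]` (the delay between the start of treatment and the time of analysis) and that the truncation-based contribution is the density divided by the distribution function evaluated at the truncation time. The function names, the rate symbol `lam` and the optimizer bounds are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def naive_loglik(lam, x):
    """Exponential log-likelihood that ignores right truncation."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.log(lam) - lam * x)


def truncated_loglik(lam, x, t):
    """Log-likelihood conditional on x[i] <= t[i] (assumed form: f(x) / F(t))."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    log_f = np.log(lam) - lam * x
    log_F = np.log(-np.expm1(-lam * t))  # log(1 - exp(-lam * t))
    return np.sum(log_f - log_F)


def fit_exponential(x, t):
    """Return (naive estimate, truncation-based estimate) of the rate."""
    x = np.asarray(x, dtype=float)
    naive = 1.0 / x.mean()  # closed form, no iteration needed
    # Optimize over log(lam) so the rate stays positive; the bounds are arbitrary.
    res = minimize_scalar(
        lambda log_lam: -truncated_loglik(np.exp(log_lam), x, t),
        bounds=(-15.0, 5.0),
        method="bounded",
    )
    return naive, float(np.exp(res.x))
```

Because the correction term `-log F(t)` decreases as the rate grows, the truncation-based estimate of the rate in this sketch is always smaller than the naive one, which is the direction of correction one would expect when late-onset cases are missing from the data.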
Simulation study
Some adverse reactions have a very short time-to-onset, from several minutes to several hours after the beginning of treatment. Others occur only after several days, weeks, months or even years of exposure. This variation depends on numerous factors such as the pharmacokinetics of the drug and its metabolites, or the pathophysiological mechanism of the effect. The multiplicity of the underlying mechanisms results in a range of possible hazard functions that can be observed in pharmacovigilance
Table: density, support and parameter(s) of the exponential, Weibull and log-logistic distributions.
The times-to-onset were generated from these three distributions. Two values of
Parametric likelihood maximization, with and without considering right truncation, was performed for each generated sample. An iterative algorithm is necessary to solve this optimization problem except for the naive exponential estimation. Calculations were made with the R
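The data-generating step can be sketched as follows. The sampling scheme below is an assumption made for illustration: truncation times are drawn uniformly between 0 and the quantile of order `p` of the time-to-onset distribution, and only pairs with onset before truncation are kept, so that `p` plays the role of the probability that the time-to-onset falls within the observable interval. The names `lam`, `beta`, `make_distribution` and `simulate_right_truncated`, as well as the choice of SciPy parametrizations, are hypothetical and do not reproduce the R code used for the study.

```python
import numpy as np
from scipy import stats


def make_distribution(name, lam, beta=None):
    """Frozen SciPy distribution with scale 1/lam (assumed parametrization)."""
    if name == "exponential":
        return stats.expon(scale=1.0 / lam)
    if name == "weibull":
        return stats.weibull_min(c=beta, scale=1.0 / lam)
    if name == "loglogistic":
        return stats.fisk(c=beta, scale=1.0 / lam)  # fisk is SciPy's log-logistic
    raise ValueError(f"unknown distribution: {name}")


def simulate_right_truncated(dist, p, n, rng):
    """Draw n pairs (onset time, truncation time) with onset <= truncation."""
    tau = dist.ppf(p)  # upper end of the observable interval, F(tau) = p
    xs, ts, kept = [], [], 0
    while kept < n:
        batch = 4 * n
        x = dist.rvs(size=batch, random_state=rng)
        t = rng.uniform(0.0, tau, size=batch)
        keep = x <= t  # only cases that occurred before the time of analysis
        xs.append(x[keep])
        ts.append(t[keep])
        kept += int(keep.sum())
    return np.concatenate(xs)[:n], np.concatenate(ts)[:n]
```

A sample produced this way could then be passed to a naive fit and to a truncation-based fit to compare the two estimates on the same data.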
Application study
We analyzed 64 French cases of lymphoma that occurred after anti TNF
For all anti TNF agents taken together, we derived the parametric maximum likelihood estimates and, secondarily, the corresponding estimated mean times, with and without considering right truncation, for the exponential, Weibull and log-logistic distributions. For completeness, we also derived the nonparametric maximum likelihood estimate.
The French pharmacovigilance database is developed by the French drug agency (
Results
Simulation study
For each set of simulation parameters, for both approaches and for both parameters, the bias and the mean squared error of the parametric maximum likelihood estimator, based on the 1000 replications, were calculated, as well as the proportion of replications where the estimate is larger than the true value. As the iterative algorithm may fail to find a maximum, these three quantities were actually calculated on the replications where there was no problem of maximization. The mean squared error is a measure of the dispersion of the estimator around the true value of the parameter (the smaller the better) and is used for global comparative purposes between two estimation procedures, as it incorporates both the variance of the estimator and its bias. The proportion of replications where the estimate is larger than the true value indicates whether the estimators tend to systematically overestimate or underestimate the true value of the parameter.
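The three summary quantities described above can be computed along the following lines. This is a minimal sketch: the function name `summarize_estimates` and the convention of encoding a failed maximization as `NaN` are assumptions made for the example.

```python
import numpy as np


def summarize_estimates(estimates, true_value):
    """Bias, MSE and proportion of overestimation over successful replications."""
    est = np.asarray(estimates, dtype=float)
    est = est[np.isfinite(est)]  # drop replications where maximization failed
    errors = est - true_value
    return {
        "bias": float(errors.mean()),
        "mse": float(np.mean(errors ** 2)),  # equals variance + bias**2
        "prop_larger": float(np.mean(est > true_value)),
        "n_used": int(est.size),
    }
```

A proportion close to 50% suggests no systematic over- or underestimation, whereas a value of 100% means that every successful replication overestimated the parameter.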
For both approaches, for all distributions and for both parameters, the smaller is
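| Parameter | Probability within observable interval | Sample size | Naive estimator BIAS | Naive estimator MSE | TBE BIAS | TBE MSE | NPM |
|---|---|---|---|---|---|---|---|
| 0.05 | 0.25 | 100 | 0.498 | 0.250 | 0.030 | 0.005 | 224 |
| 0.05 | 0.25 | 500 | 0.498 | 0.248 | 0.007 | 0.001 | 79 |
| 0.05 | 0.50 | 100 | 0.195 | 0.038 | 0.008 | 0.001 | 85 |
| 0.05 | 0.50 | 500 | 0.193 | 0.037 | <0.001 | <0.001 | 1 |
| 0.05 | 0.80 | 100 | 0.073 | 0.005 | <0.001 | <0.001 | 2 |
| 0.05 | 0.80 | 500 | 0.072 | 0.005 | <0.001 | <0.001 | 0 |
| 1 | 0.25 | 100 | 10.06 | 102 | 0.462 | 2.17 | 72 |
| 1 | 0.25 | 500 | 9.95 | 99 | 0.046 | 0.48 | 10 |
| 1 | 0.50 | 100 | 3.91 | 15.4 | 0.126 | 0.49 | 29 |
| 1 | 0.50 | 500 | 3.86 | 14.9 | 0.022 | 0.12 | 0 |
| 1 | 0.80 | 100 | 1.45 | 2.16 | 0.004 | 0.11 | 0 |
| 1 | 0.80 | 500 | 1.45 | 2.11 | 0.004 | 0.02 | 0 |

The mean squared error formula is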
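| Parameter 1 | Parameter 2 | Probability within observable interval | Sample size | Naive BIAS (param. 1) | Naive MSE (param. 1) | Naive BIAS (param. 2) | Naive MSE (param. 2) | TBE BIAS (param. 1) | TBE MSE (param. 1) | TBE BIAS (param. 2) | TBE MSE (param. 2) | NPM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.5 | 0.25 | 100 | 4.04 | 16.7 | 0.200 | 0.044 | 0.465 | 0.51 | 0.046 | 0.007 | 312 |
| 0.05 | 0.5 | 0.25 | 500 | 3.95 | 15.6 | 0.195 | 0.039 | 0.106 | 0.04 | 0.013 | 0.001 | 201 |
| 0.05 | 0.5 | 0.50 | 100 | 0.762 | 0.60 | 0.167 | 0.031 | 0.068 | 0.018 | 0.024 | 0.005 | 172 |
| 0.05 | 0.5 | 0.50 | 500 | 0.747 | 0.56 | 0.164 | 0.028 | 0.015 | 0.003 | 0.003 | 0.001 | 22 |
| 0.05 | 0.5 | 0.80 | 100 | 0.160 | 0.027 | 0.119 | 0.017 | 0.008 | 0.002 | 0.009 | 0.004 | 9 |
| 0.05 | 0.5 | 0.80 | 500 | 0.156 | 0.025 | 0.113 | 0.013 | 0.001 | <0.001 | 0.001 | <0.001 | 0 |
| 1 | 0.5 | 0.25 | 100 | 80.4 | 6612 | 0.201 | 0.044 | 8.68 | 183 | 0.046 | 0.007 | 300 |
| 1 | 0.5 | 0.25 | 500 | 78.9 | 6249 | 0.194 | 0.038 | 2.07 | 17 | 0.012 | 0.001 | 186 |
| 1 | 0.5 | 0.50 | 100 | 15.0 | 233 | 0.174 | 0.034 | 1.53 | 7.99 | 0.031 | 0.006 | 163 |
| 1 | 0.5 | 0.50 | 500 | 15.0 | 225 | 0.164 | 0.028 | 0.32 | 1.17 | 0.003 | 0.001 | 24 |
| 1 | 0.5 | 0.80 | 100 | 3.20 | 10.8 | 0.117 | 0.017 | 0.16 | 0.67 | 0.007 | 0.004 | 13 |
| 1 | 0.5 | 0.80 | 500 | 3.15 | 10.0 | 0.112 | 0.013 | 0.041 | 0.15 | <0.001 | <0.001 | 0 |
| 0.05 | 2 | 0.25 | 100 | 0.121 | 0.015 | 0.354 | 0.16 | <0.001 | 0.002 | 0.097 | 0.075 | 8 |
| 0.05 | 2 | 0.25 | 500 | 0.120 | 0.014 | 0.333 | 0.12 | 0.004 | 0.001 | 0.020 | 0.016 | 2 |
| 0.05 | 2 | 0.50 | 100 | 0.065 | 0.004 | 0.278 | 0.11 | 0.004 | <0.001 | 0.047 | 0.074 | 6 |
| 0.05 | 2 | 0.50 | 500 | 0.064 | 0.004 | 0.264 | 0.08 | 0.002 | <0.001 | 0.004 | 0.016 | 0 |
| 0.05 | 2 | 0.80 | 100 | 0.032 | 0.001 | 0.182 | 0.063 | <0.001 | <0.001 | 0.046 | 0.063 | 1 |
| 0.05 | 2 | 0.80 | 500 | 0.032 | 0.001 | 0.157 | 0.031 | <0.001 | <0.001 | 0.008 | 0.014 | 0 |
| 1 | 2 | 0.25 | 100 | 2.41 | 5.84 | 0.364 | 0.17 | 0.090 | 0.79 | 0.10 | 0.075 | 1 |
| 1 | 2 | 0.25 | 500 | 2.41 | 5.79 | 0.336 | 0.12 | 0.082 | 0.38 | 0.02 | 0.015 | 0 |
| 1 | 2 | 0.50 | 100 | 1.29 | 1.68 | 0.283 | 0.12 | 0.073 | 0.33 | 0.052 | 0.069 | 3 |
| 1 | 2 | 0.50 | 500 | 1.29 | 1.65 | 0.261 | 0.07 | 0.065 | 0.12 | 0.002 | 0.017 | 0 |
| 1 | 2 | 0.80 | 100 | 0.638 | 0.41 | 0.186 | 0.065 | 0.024 | 0.086 | 0.045 | 0.064 | 0 |
| 1 | 2 | 0.80 | 500 | 0.636 | 0.40 | 0.154 | 0.030 | 0.007 | 0.014 | 0.004 | 0.013 | 0 |

The mean squared error formula is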
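| Parameter 1 | Parameter 2 | Probability within observable interval | Sample size | Naive BIAS (param. 1) | Naive MSE (param. 1) | Naive BIAS (param. 2) | Naive MSE (param. 2) | TBE BIAS (param. 1) | TBE MSE (param. 1) | TBE BIAS (param. 2) | TBE MSE (param. 2) | NPM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.5 | 0.25 | 100 | 6.45 | 44 | 0.384 | 0.16 | 0.258 | 0.25 | 0.041 | 0.008 | 217 |
| 0.05 | 0.5 | 0.25 | 500 | 6.33 | 40 | 0.372 | 0.14 | 0.043 | 0.01 | 0.005 | 0.001 | 52 |
| 0.05 | 0.5 | 0.50 | 100 | 1.05 | 1.2 | 0.319 | 0.108 | 0.045 | 0.012 | 0.020 | 0.006 | 22 |
| 0.05 | 0.5 | 0.50 | 500 | 1.02 | 1.1 | 0.308 | 0.096 | 0.009 | 0.001 | 0.003 | 0.001 | 0 |
| 0.05 | 0.5 | 0.80 | 100 | 0.165 | 0.031 | 0.195 | 0.041 | 0.008 | 0.001 | 0.008 | 0.004 | 0 |
| 0.05 | 0.5 | 0.80 | 500 | 0.158 | 0.026 | 0.189 | 0.036 | 0.001 | <0.001 | 0.001 | <0.001 | 0 |
| 1 | 0.5 | 0.25 | 100 | 129 | 17533 | 0.383 | 0.15 | 5.06 | 87 | 0.042 | 0.008 | 207 |
| 1 | 0.5 | 0.25 | 500 | 127 | 16217 | 0.374 | 0.14 | 1.01 | 6 | 0.008 | 0.001 | 41 |
| 1 | 0.5 | 0.50 | 100 | 21.0 | 467 | 0.317 | 0.106 | 0.93 | 5.0 | 0.019 | 0.006 | 43 |
| 1 | 0.5 | 0.50 | 500 | 20.5 | 426 | 0.308 | 0.096 | 0.20 | 0.6 | 0.004 | 0.001 | 0 |
| 1 | 0.5 | 0.80 | 100 | 3.31 | 12 | 0.201 | 0.044 | 0.209 | 0.55 | 0.016 | 0.005 | 0 |
| 1 | 0.5 | 0.80 | 500 | 3.17 | 10 | 0.190 | 0.037 | 0.037 | 0.09 | 0.002 | <0.001 | 0 |
| 0.05 | 2 | 0.25 | 100 | 0.150 | 0.022 | 1.06 | 1.2 | <0.001 | 0.001 | 0.08 | 0.085 | 4 |
| 0.05 | 2 | 0.25 | 500 | 0.149 | 0.022 | 1.04 | 1.1 | 0.001 | <0.001 | 0.01 | 0.018 | 0 |
| 0.05 | 2 | 0.50 | 100 | 0.079 | 0.006 | 0.932 | 0.94 | <0.001 | <0.001 | 0.06 | 0.094 | 5 |
| 0.05 | 2 | 0.50 | 500 | 0.078 | 0.006 | 0.903 | 0.83 | <0.001 | <0.001 | 0.01 | 0.017 | 0 |
| 0.05 | 2 | 0.80 | 100 | 0.035 | 0.001 | 0.665 | 0.50 | <0.001 | <0.001 | 0.03 | 0.078 | 0 |
| 0.05 | 2 | 0.80 | 500 | 0.035 | 0.001 | 0.649 | 0.43 | <0.001 | <0.001 | 0.01 | 0.013 | 0 |
| 1 | 2 | 0.25 | 100 | 2.99 | 9.0 | 1.07 | 1.2 | 0.024 | 0.57 | 0.08 | 0.089 | 0 |
| 1 | 2 | 0.25 | 500 | 2.98 | 8.9 | 1.04 | 1.1 | 0.028 | 0.20 | 0.01 | 0.020 | 0 |
| 1 | 2 | 0.50 | 100 | 1.57 | 2.49 | 0.943 | 0.96 | 0.007 | 0.19 | 0.063 | 0.095 | 1 |
| 1 | 2 | 0.50 | 500 | 1.56 | 2.45 | 0.896 | 0.82 | 0.013 | 0.04 | 0.004 | 0.018 | 0 |
| 1 | 2 | 0.80 | 100 | 0.702 | 0.50 | 0.668 | 0.50 | 0.004 | 0.042 | 0.045 | 0.072 | 0 |
| 1 | 2 | 0.80 | 500 | 0.693 | 0.48 | 0.648 | 0.43 | 0.004 | 0.007 | 0.015 | 0.013 | 0 |

The mean squared error formula is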
For both approaches, for all distributions and for both parameters, Tables
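| Parameter | Probability within observable interval | Sample size | Naive estimator | TBE |
|---|---|---|---|---|
| 0.05 | 0.25 | 100 | 100% | 61.6% |
| 0.05 | 0.25 | 500 | 100% | 55.3% |
| 0.05 | 0.50 | 100 | 100% | 55.3% |
| 0.05 | 0.50 | 500 | 100% | 50.4% |
| 0.05 | 0.80 | 100 | 100% | 51.1% |
| 0.05 | 0.80 | 500 | 100% | 51.7% |
| 1 | 0.25 | 100 | 100% | 54.8% |
| 1 | 0.25 | 500 | 100% | 50.7% |
| 1 | 0.50 | 100 | 100% | 53.2% |
| 1 | 0.50 | 500 | 100% | 48.0% |
| 1 | 0.80 | 100 | 100% | 50.0% |
| 1 | 0.80 | 500 | 100% | 51.0% |

Calculations were made on the replications where there was no problem of maximization.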
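| Parameter 1 | Parameter 2 | Probability within observable interval | Sample size | Naive (param. 1) | Naive (param. 2) | TBE (param. 1) | TBE (param. 2) |
|---|---|---|---|---|---|---|---|
| 0.05 | 0.5 | 0.25 | 100 | 100% | 100% | 81.4% | 71.9% |
| 0.05 | 0.5 | 0.25 | 500 | 100% | 100% | 64.6% | 64.5% |
| 0.05 | 0.5 | 0.50 | 100 | 100% | 100% | 63.3% | 60.1% |
| 0.05 | 0.5 | 0.50 | 500 | 100% | 100% | 53.4% | 51.0% |
| 0.05 | 0.5 | 0.80 | 100 | 100% | 99.6% | 52.0% | 53.3% |
| 0.05 | 0.5 | 0.80 | 500 | 100% | 100% | 48.6% | 51.6% |
| 1 | 0.5 | 0.25 | 100 | 100% | 100% | 79.3% | 76.0% |
| 1 | 0.5 | 0.25 | 500 | 100% | 100% | 62.0% | 61.2% |
| 1 | 0.5 | 0.50 | 100 | 100% | 100% | 65.9% | 64.6% |
| 1 | 0.5 | 0.50 | 500 | 100% | 100% | 53.8% | 51.8% |
| 1 | 0.5 | 0.80 | 100 | 100% | 99.5% | 52.7% | 52.2% |
| 1 | 0.5 | 0.80 | 500 | 100% | 100% | 51.9% | 50.6% |
| 0.05 | 2 | 0.25 | 100 | 100% | 98.1% | 52.1% | 61.6% |
| 0.05 | 2 | 0.25 | 500 | 100% | 100% | 52.2% | 53.7% |
| 0.05 | 2 | 0.50 | 100 | 100% | 94.2% | 51.6% | 53.3% |
| 0.05 | 2 | 0.50 | 500 | 100% | 100% | 50.6% | 51.0% |
| 0.05 | 2 | 0.80 | 100 | 100% | 85.4% | 56.1% | 55.8% |
| 0.05 | 2 | 0.80 | 500 | 100% | 97.9% | 52.2% | 49.6% |
| 1 | 2 | 0.25 | 100 | 100% | 98.2% | 56.2% | 62.5% |
| 1 | 2 | 0.25 | 500 | 100% | 99.9% | 50.1% | 54.8% |
| 1 | 2 | 0.50 | 100 | 100% | 94.3% | 53.9% | 54.2% |
| 1 | 2 | 0.50 | 500 | 100% | 99.9% | 47.1% | 48.1% |
| 1 | 2 | 0.80 | 100 | 100% | 85.3% | 54.1% | 54.2% |
| 1 | 2 | 0.80 | 500 | 100% | 97.9% | 52.7% | 52.2% |

Calculations were made on the replications where there was no problem of maximization.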
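| Parameter 1 | Parameter 2 | Probability within observable interval | Sample size | Naive (param. 1) | Naive (param. 2) | TBE (param. 1) | TBE (param. 2) |
|---|---|---|---|---|---|---|---|
| 0.05 | 0.5 | 0.25 | 100 | 100% | 100% | 67.2% | 67.7% |
| 0.05 | 0.5 | 0.25 | 500 | 100% | 100% | 53.6% | 52.0% |
| 0.05 | 0.5 | 0.50 | 100 | 100% | 100% | 55.4% | 57.5% |
| 0.05 | 0.5 | 0.50 | 500 | 100% | 100% | 51.1% | 52.0% |
| 0.05 | 0.5 | 0.80 | 100 | 100% | 100% | 51.1% | 53.2% |
| 0.05 | 0.5 | 0.80 | 500 | 100% | 100% | 50.8% | 51.5% |
| 1 | 0.5 | 0.25 | 100 | 100% | 100% | 67.7% | 66.1% |
| 1 | 0.5 | 0.25 | 500 | 100% | 100% | 55.9% | 56.1% |
| 1 | 0.5 | 0.50 | 100 | 100% | 100% | 54.9% | 57.2% |
| 1 | 0.5 | 0.50 | 500 | 100% | 100% | 53.4% | 53.4% |
| 1 | 0.5 | 0.80 | 100 | 100% | 100% | 55.1% | 56.5% |
| 1 | 0.5 | 0.80 | 500 | 100% | 100% | 51.9% | 52.0% |
| 0.05 | 2 | 0.25 | 100 | 100% | 100% | 53.2% | 55.9% |
| 0.05 | 2 | 0.25 | 500 | 100% | 100% | 51.8% | 51.8% |
| 0.05 | 2 | 0.50 | 100 | 100% | 100% | 55.0% | 54.2% |
| 0.05 | 2 | 0.50 | 500 | 100% | 100% | 53.3% | 52.2% |
| 0.05 | 2 | 0.80 | 100 | 100% | 100% | 50.3% | 51.5% |
| 0.05 | 2 | 0.80 | 500 | 100% | 100% | 53.9% | 54.4% |
| 1 | 2 | 0.25 | 100 | 100% | 100% | 52.7% | 56.1% |
| 1 | 2 | 0.25 | 500 | 100% | 100% | 53.3% | 51.0% |
| 1 | 2 | 0.50 | 100 | 100% | 100% | 54.3% | 56.4% |
| 1 | 2 | 0.50 | 500 | 100% | 100% | 50.1% | 49.5% |
| 1 | 2 | 0.80 | 100 | 100% | 100% | 52.0% | 53.7% |
| 1 | 2 | 0.80 | 500 | 100% | 100% | 52.9% | 55.0% |

Calculations were made on the replications where there was no problem of maximization.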
Application study
Table
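| Distribution | Naive: parameter 1 | Naive: parameter 2 | Naive: expectation (weeks) | TBE: parameter 1 | TBE: parameter 2 | TBE: estimated probability within observable interval | TBE: expectation (weeks) | TBE: 95% confidence interval* |
|---|---|---|---|---|---|---|---|---|
| Exponential | 0.00739 | | 135 | 0.00172 | | 0.60 | 581 | [264, 7528] |
| Weibull | 0.00666 | 1.55 | 135 | 0.00468 | 1.49 | 0.98 | 193 | [150, 432] |
| Log-logistic | 0.00890 | 2.06 | 171 | 0.00408 | 1.53 | 0.76 | 567 | [207, 1.8 × 10^{12}] |

*95% confidence intervals calculated using BCa simple bootstrap method based on 5000 replicates.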
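The link between the estimated parameters and the estimated expected times in this table can be checked with a few lines of Python. The sketch below assumes a rate-type parametrization, with survival function exp(-(lam t)^beta) for the Weibull model and 1 / (1 + (lam t)^beta) for the log-logistic model; the symbols `lam` and `beta` and the function names are labels chosen for this example. The parameter values and reported expectations are copied from the table.

```python
import math


def mean_exponential(lam):
    return 1.0 / lam


def mean_weibull(lam, beta):
    # Assumed survival function: exp(-(lam * t) ** beta)
    return math.gamma(1.0 + 1.0 / beta) / lam


def mean_loglogistic(lam, beta):
    # Assumed survival function: 1 / (1 + (lam * t) ** beta); mean is finite only if beta > 1
    if beta <= 1.0:
        return math.inf
    a = math.pi / beta
    return a / (lam * math.sin(a))


MEAN_FUNCTIONS = {
    "exponential": mean_exponential,
    "weibull": mean_weibull,
    "loglogistic": mean_loglogistic,
}

# (distribution, approach, parameters from the table, reported expectation in weeks)
TABLE_ROWS = [
    ("exponential", "naive", (0.00739,), 135),
    ("exponential", "TBE", (0.00172,), 581),
    ("weibull", "naive", (0.00666, 1.55), 135),
    ("weibull", "TBE", (0.00468, 1.49), 193),
    ("loglogistic", "naive", (0.00890, 2.06), 171),
    ("loglogistic", "TBE", (0.00408, 1.53), 567),
]


def recomputed_means():
    """Return (distribution, approach, recomputed mean, reported mean) tuples."""
    return [
        (name, approach, MEAN_FUNCTIONS[name](*params), reported)
        for name, approach, params, reported in TABLE_ROWS
    ]


if __name__ == "__main__":
    for name, approach, mean, reported in recomputed_means():
        print(f"{name:12s} {approach:6s} recomputed={mean:8.1f} reported={reported}")
```

Under these assumptions the recomputed means should agree with the reported expectations up to the rounding of the published parameter estimates, which supports the assumed parametrization without confirming it.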
Figure
Right truncation-based estimations of time-to-onset of lymphoma that occurred after anti TNF
Figure
Naive and right truncation-based estimations of time-to-onset of lymphoma that occurred after anti TNF
Discussion and conclusions
In drug safety assessment, the temporal relationship between drug administration and time-to-onset is of utmost relevance. A better understanding of the mechanism underlying the occurrence of an adverse effect is crucial, as it could allow the identification of particular groups of patients at risk and of particular risk time windows in the course of a treatment, and could lead to preventing adverse reactions or diagnosing them earlier. In this framework, the time-to-onset of an adverse drug reaction constitutes an essential feature to be analyzed. Its accurate estimation and modeling could help in understanding the mechanism of a drug’s action.
As rare adverse effects are generally identified not by cohort studies of exposed patients but through spontaneous reporting systems, we investigated with a simulation study the accuracy of estimates that can be obtained from these data in a parametric framework. As one can only estimate a conditional distribution function in a nonparametric setting, the nonparametric maximum likelihood estimator is of rather limited interest for pharmacovigilance practitioners. For a finite sample size, the simulations show that, whatever the approach, naive or truncation-based, the parametric maximum likelihood estimator may be positively biased and that this bias and the corresponding mean squared error increase when the theoretical probability
The probability
Problems of maximization may arise when right truncation is taken into account. The smaller is
For the 64 cases of lymphoma after anti TNF
Finally, improvement of time-to-onset distribution assessment could make it possible to compare two drug profiles or, more generally, to assess risk factors with regression models.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
FL, JYD and PTB conceived and designed the work. FL implemented the simulations, performed data analysis and wrote the initial draft of the manuscript. HT and FH extracted the data from the national pharmacovigilance database. All authors contributed to the interpretation of the results of the data analysis. All authors reviewed and revised the draft version of the manuscript. All authors read and approved the final version of the manuscript.
Acknowledgements
This work was supported by the Fondation ARC (fellowship DOC20121206119 to Fanny Leroy).