Department of Clinical Epidemiology and Clinical Research, Institut Bergonié, Regional Comprehensive Cancer Centre, Bordeaux, France

Department of Pathology, Institut Bergonié, Regional Comprehensive Cancer Centre, Bordeaux, France

Department of Medical Oncology, Institut Bergonié, Regional Comprehensive Cancer Centre, Bordeaux, France

Department of Surgery, Institut Bergonié, Regional Comprehensive Cancer Centre, Bordeaux, France

Unité INSERM 897, Université Victor Segalen Bordeaux 2, Bordeaux, France

Abstract

Background

The Cox model relies on the proportional hazards (PH) assumption, implying that the factors investigated have a constant impact on the hazard - or risk - over time. We emphasize the importance of this assumption and the misleading conclusions that can be inferred if it is violated; this is particularly essential in the presence of long follow-ups.

Methods

We illustrate our discussion by analyzing prognostic factors of metastases in 979 women treated for breast cancer with surgery. Age, tumour size and grade, lymph node involvement, peritumoral vascular invasion (PVI), status of hormone receptors (HRec), Her2, and Mib1 were considered.

Results

Median follow-up was 14 years; 264 women developed metastases. The conventional Cox model suggested that all factors but HRec, Her2, and Mib1 status were strong prognostic factors of metastases. Additional tests indicated that the PH assumption was not satisfied for some variables of the model. Tumour grade had a significant time-varying effect, but although its effect diminished over time, it remained strong. Interestingly, while the conventional Cox model did not show any significant effect of the HRec status, tests provided strong evidence that this variable had a non-constant effect over time. Negative HRec status increased the risk of metastases early but became protective thereafter. This reversal of effect may explain non-significant hazard ratios provided by previous conventional Cox analyses in studies with long follow-ups.

Conclusions

Investigating time-varying effects should be an integral part of Cox survival analyses. Detecting and accounting for time-varying effects provide insights on some specific time patterns, and on valuable biological information that could be missed otherwise.

Background

Survival analysis, or time-to-event data analysis, is widely used in oncology since we are often interested in studying a delay, such as the time from cancer diagnosis or treatment initiation to cancer recurrence or death. Thanks to the improvement of cancer treatments, and the induced longer life expectancy, we observe an increasing number of studies with long follow-up periods. Statistical models to analyze such data should thus adequately account for the increasing duration of follow-ups. The Cox proportional hazards (PH) model allows one to describe the survival time as a function of multiple prognostic factors

Although the Cox model has been widely used (more than 25 000 citations since the publication of the original paper by Cox

Assessing whether the assumption of proportional hazards holds is a central theme in survival analysis, and as such is discussed in several statistical textbooks

Our objective is to inform clinicians, as well as those who read and write manuscripts in medical journals, about the importance of the underlying PH assumption, the misleading conclusions that can be inferred if it is violated, and the additional information provided by verifying it. After a theoretical introduction, we describe techniques to assess whether this assumption is violated, and modelling strategies to account for and describe time-dependency. We illustrate our discussion with a study on prognostic factors in breast cancer.

Methods and results

Survival analysis

In many studies, the primary variable of interest is a delay, such as the time from cancer diagnosis to a particular event of interest. This event may be death, and for this reason the analysis of such data is often referred to as survival analysis. The event of interest may not have occurred at the time of the statistical analysis, and similarly, a subject may be lost to follow-up before the event is observed. In such cases, data are said to be censored at the time of the analysis or at the time the patient was lost to follow-up. Censored data still bring some information: although we do not know the exact date of the event, we know that it occurred later than the censoring time.
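To make the handling of censored observations concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimate, written with the Python standard library only. The function name `kaplan_meier` and the six follow-up times are hypothetical and chosen for illustration; they are not taken from the study data.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before t. Censored subjects
        # stay in the risk set up to their censoring time.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up times in years; 0 marks a censored observation.
times = [2, 3, 3, 5, 8, 10]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
```

The subject censored at 3 years contributes to the risk sets at 2 and 3 years but not afterwards, which is how a censored observation still brings information.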

Both the Kaplan-Meier method and the Cox proportional hazards (PH) model allow one to analyze censored data

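The instantaneous hazard rate at time t, also called instantaneous incidence, death, or failure rate, or risk, is the instantaneous probability of experiencing an event at time t, given that the event has not occurred before t. If we consider covariates X_{1}, X_{2}, ..., X_{n}, then the hazard is given by:

$$h(t) = h_{0}(t)\exp(\beta_{1}x_{1} + \beta_{2}x_{2} + \dots + \beta_{n}x_{n})$$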

The baseline hazard rate h_{0}(t) is an unspecified non-negative function of time. It is the time-dependent part of the hazard and corresponds to the hazard rate when all covariate values are equal to zero. β_{1}, β_{2}, ..., β_{n} are the coefficients of the regression function β_{1}x_{1} + β_{2}x_{2} + ... + β_{n}x_{n}. Suppose that we are interested in a single covariate X; then the hazard is:

$$h(t) = h_{0}(t)\exp(\beta x)$$

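The hazards for two subjects with covariate values x_{1} and x_{2} are thus given respectively by h_{x1}(t) = h_{0}(t) exp(βx_{1}) and h_{x2}(t) = h_{0}(t) exp(βx_{2}), and the hazard ratio is expressed as:

$$HR = \frac{h_{x_{2}}(t)}{h_{x_{1}}(t)} = \frac{h_{0}(t)\exp(\beta x_{2})}{h_{0}(t)\exp(\beta x_{1})} = \exp(\beta(x_{2} - x_{1}))$$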

Taking x_{2} = x_{1} + 1, the hazard ratio reduces to HR = exp(β) and corresponds to the effect of a one-unit increase in the explanatory variable X on the risk of event. Since β = log(HR), β is referred to as the log hazard ratio. Although the hazard rate h_{x}(t) is allowed to vary over time, the hazard ratio HR is constant; this is the assumption of proportional hazards. A HR greater than 1 (β > 0) indicates an increased risk for subjects with covariate value x_{2} compared to subjects with covariate value x_{1}, while a HR lower than 1 (β < 0) indicates a decreased risk. When the HR is not constant over time, the variable is said to have a time-varying effect; for example, the effect of a treatment can be strong immediately after treatment but fade with time. This should not be confused with a time-varying covariate, which is a variable whose value is not fixed over time, such as smoking status. Indeed, a person can be a non-smoker, then a smoker, then a non-smoker. Note, however, that a variable may be both time-varying and have an effect that changes over time.
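The distinction between a hazard that varies over time and a hazard ratio that does not can be checked numerically. The sketch below assumes a hypothetical increasing baseline hazard and an assumed coefficient β = log(2); neither is estimated from the study data.

```python
import math

def hazard(t, x, beta, baseline):
    """Cox hazard h_x(t) = h_0(t) * exp(beta * x)."""
    return baseline(t) * math.exp(beta * x)

def baseline(t):
    # Hypothetical baseline hazard that rises with time (illustration only).
    return 0.02 * (1.0 + t)

beta = math.log(2.0)  # assumed log hazard ratio, so HR = exp(beta) = 2

# Ratio of hazards for x = 1 versus x = 0 at several follow-up times (years).
ratios = [hazard(t, 1, beta, baseline) / hazard(t, 0, beta, baseline)
          for t in (0.5, 1, 5, 10, 14)]
```

Each hazard changes with t, but the baseline term cancels in the ratio, so every entry of `ratios` equals exp(β) up to rounding error.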

In a Cox PH model, the HR is estimated by considering each time

Example

We applied some of the presented methods to breast cancer patients as time-varying effects have been reported, such as for nodal or hormone receptor status,

Characteristics of the study population.

| | **N** | **(%)** |
|---|---|---|
| **Year of diagnosis** | | |
| 1989 | 231 | 23.6 |
| 1990 | 207 | 21.1 |
| 1991 | 182 | 18.6 |
| 1992 | 189 | 19.3 |
| 1993 | 170 | 17.4 |
| **Metastases following surgery** | | |
| Yes | 264 | 27.0 |
| No | 715 | 73.0 |
| **Age at diagnosis** | | |
| ≤ 40 years | 76 | 7.8 |
| > 40 years | 903 | 92.2 |
| **SBR Grade** | | |
| Grade I | 275 | 28.1 |
| Grade II | 444 | 45.3 |
| Grade III | 260 | 26.6 |
| **Tumor size** | | |
| ≤ 20 mm | 753 | 76.9 |
| > 20 mm | 226 | 23.1 |
| **Lymph node involvement** | | |
| No | 554 | 56.6 |
| Yes | 425 | 43.4 |
| **Peritumoral vascular invasion** | | |
| No | 700 | 71.5 |
| Yes | 279 | 28.5 |
| **Hormone Receptor status** | | |
| Both ER- and PR- | 178 | 18.2 |
| At least ER+ or PR+ | 801 | 81.8 |
| **Her2 status** | | |
| Positive | 100 | 10.2 |
| Negative | 879 | 89.8 |
| **Mib1 status** | | |
| Negative | 691 | 70.6 |
| Positive | 288 | 29.4 |

Working example

The prognostic factors were initially selected based on current knowledge regarding risk of metastases. They were next analyzed using a conventional Cox regression model; all were statistically significant at the 5% level in the univariate analyses, and were then entered into a multivariate Cox model. The risk of metastases was increased for women with younger age compared to older age; grade II and III tumours compared to grade I tumours; large compared to small tumour sizes; lymph node involvement compared to no involvement; and PVI compared to no PVI (Additional file

Estimated log hazard ratios (log(HR)), and hazard ratios (HR = exp(

Assessing non-proportionality: Graphical strategy

In the presence of a categorical variable, one can plot the Kaplan-Meier survival distribution, S(t), as a function of the survival time, for each level of the covariate. If the PH assumption is satisfied, the curves should steadily drift apart. One can also apply a transformation of the Kaplan-Meier survival curves and plot the function log(-log(S(t))) as a function of the log survival time, where log represents the natural logarithm function. If the hazards are proportional, the stratum-specific log-minus-log plots should exhibit constant differences, that is, be approximately parallel. These visual methods are simple to implement but have limitations. When the covariate has more than two levels, Kaplan-Meier plots are not useful for discerning non-proportionality because the graphs become too cluttered
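The reason log-minus-log curves are parallel under proportional hazards is that S_{x}(t) = S_{0}(t)^{HR}, so log(-log(S_{x}(t))) differs from the baseline curve by the constant log(HR). The following sketch checks this on known survival functions rather than on Kaplan-Meier estimates; the exponential baseline, the rate of 0.05 per year, and the group whose excess risk fades after 2 years are all hypothetical.

```python
import math

def cloglog(s):
    """Log-minus-log transform of a survival probability 0 < s < 1."""
    return math.log(-math.log(s))

def survival_ph(t, hr, rate=0.05):
    # Exponential baseline survival exp(-rate * t), raised to the power hr:
    # under proportional hazards S_x(t) = S_0(t) ** HR.
    return math.exp(-rate * t) ** hr

def survival_nonph(t, rate=0.05):
    # Hypothetical group whose excess risk fades: hazard 3 * rate before
    # year 2 and rate afterwards, so the hazard ratio is not constant.
    cum_hazard = 3 * rate * min(t, 2.0) + rate * max(t - 2.0, 0.0)
    return math.exp(-cum_hazard)

grid = [1, 2, 5, 10, 14]  # follow-up times in years
# Vertical gap between each group's curve and the baseline curve.
gap_ph = [cloglog(survival_ph(t, 2.0)) - cloglog(survival_ph(t, 1.0)) for t in grid]
gap_nonph = [cloglog(survival_nonph(t)) - cloglog(survival_ph(t, 1.0)) for t in grid]
```

In `gap_ph` the gap stays at log(2) for every time point, whereas in `gap_nonph` it starts at log(3) and shrinks, which is the pattern a converging log-minus-log plot would show.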

Statistical software

| | **R/Splus©** | **SAS©** | **SPSS©** | **Stata©** |
|---|---|---|---|---|
| **Graphical checks** | survfit function | lifetest procedure | Survt command | sts command |
| **Time-by-covariate interactions** | programming required. | phreg procedure (definition of interactions)/test statement. | time program command (definition of interactions)/cox reg command. | tvc option/stcox command |
| **Scaled Schoenfeld residuals** | cox.zph function | phreg procedure/ressch option | Not directly available/programming required | stphtest command |
| **Cumulative residuals** | Timereg/gof libraries/cum.residuals function | phreg procedure/assess statement/ph option | Not directly available/programming required | Not directly available/programming required |

Working example (cont')

Kaplan-Meier survival curves and log-minus-log plots are shown for some variables (Figures

Kaplan-Meier survival curves for SBR grade, tumour size, PVI, hormone receptor status

Log(-log(survival)) curves as a function of time (log scale) for SBR grade, tumour size, PVI, hormone receptor status

Assessing non-proportionality: Modelling and testing strategies

Graphical methods for checking the PH assumption do not provide a formal diagnostic test, and confirmatory approaches are required. Multiple options for testing and accounting for non-proportionality are available.

Cox proposed assessing departure from proportionality by introducing a constructed time-dependent variable, that is, adding an interaction term that involves time to the Cox model, and testing for its significance. The interaction is formed as the product of the covariate and a function of time f(t) (for example t, t^{2}, log(t), ...). Adding this interaction to the model (equation

With the interaction term γ.x.f(t) added to the regression function, the hazard ratio is given by HR(t) = h_{x+1}(t)/h_{x}(t) = exp[β + γ.f(t)] for a unit increase in the variable X, and is time-dependent through the function f(t). If γ > 0 (γ < 0), then the HR increases (decreases) over time. Testing for non-proportionality of the hazards is equivalent to testing whether γ is significantly different from zero. One can use different time functions such as polynomial or exponential decay, but often very simple fixed functions of time such as linear or logarithmic functions are preferred
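As an illustration of how a time-by-covariate interaction translates into a hazard ratio that changes over follow-up, the sketch below assumes the interaction term γ.x.f(t) with f(t) = log(t), so that a one-unit increase in X multiplies the hazard by exp(β + γ.log(t)). The coefficients β = 0.8 and γ = -0.5 are hypothetical and are not estimates from the working example.

```python
import math

def hazard_ratio(t, beta, gamma, f=math.log):
    """HR(t) = exp(beta + gamma * f(t)) for a one-unit increase in X."""
    return math.exp(beta + gamma * f(t))

def crossover_time(beta, gamma):
    """Time at which HR(t) = 1 when f(t) = log(t).

    Solves beta + gamma * log(t) = 0, which requires gamma != 0.
    """
    return math.exp(-beta / gamma)

# Hypothetical coefficients: harmful early (beta > 0), fading over time (gamma < 0).
beta, gamma = 0.8, -0.5
hr_by_year = {t: round(hazard_ratio(t, beta, gamma), 2) for t in (1, 2, 5, 10)}
t_star = crossover_time(beta, gamma)
```

With these assumed values the hazard ratio is above 1 in the first years and drops below 1 after `t_star`, the kind of reversal of effect that a single averaged hazard ratio cannot express.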

Working example (cont')

We created time-by-covariate interactions for each variable of the model, by introducing products between the variables and a linear function of time. As shown in Additional File

Departure from proportionality can also be investigated using the residuals of the model. A residual measures the difference between the observed data and the data expected under the assumptions of the model. Schoenfeld residuals are calculated and reported at every failure time under the PH assumption, and as such are not defined for censored subjects
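For a single covariate, an unscaled Schoenfeld residual can be sketched as the covariate value of the subject who fails minus the risk-weighted average of the covariate among subjects still at risk. The code below is a simplified illustration under stated assumptions: one covariate, a coefficient β supplied by the user rather than estimated, no special handling of tied event times, and no scaling by the variance of the coefficient. The data are hypothetical.

```python
import math

def schoenfeld_residuals(times, events, x, beta):
    """Unscaled Schoenfeld residuals for one covariate.

    At each event time, the residual is the covariate value of the subject
    who failed minus the weighted mean of the covariate among subjects
    still at risk, with weights exp(beta * x).
    """
    residuals = []
    for t_i, e_i, x_i in zip(times, events, x):
        if e_i != 1:
            continue  # residuals are not defined for censored subjects
        risk_set = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        weights = [math.exp(beta * x_j) for x_j in risk_set]
        expected = sum(w * x_j for w, x_j in zip(weights, risk_set)) / sum(weights)
        residuals.append((t_i, x_i - expected))
    return sorted(residuals)

# Hypothetical data: binary covariate, no tied event times.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 0]
x = [1, 1, 0, 0, 1, 0]
res = schoenfeld_residuals(times, events, x, beta=0.0)
```

Under proportional hazards these residuals should show no trend when plotted against time; a systematic drift suggests that the effect of the covariate changes over follow-up.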

Working example (cont')

For each covariate, scaled Schoenfeld residuals were plotted over time, and tests for a zero slope were performed. The corresponding p-values, as well as the p-value associated with a global test of non-proportionality are reported in Table

Test for non-proportionality based on the scaled Schoenfeld residuals from the conventional Cox model (see table 1).

| **Variable** | **p-value** |
|---|---|
| Age | 0.10 |
| Grade II | <0.01 |
| Grade III | <0.01 |
| Size | 0.32 |
| Lymph node involvement | 0.22 |
| PVI | 0.05 |
| Hormone receptor | 0.05 |
| Her2 | 0.08 |
| Mib1 | 0.07 |
| **GLOBAL** | <0.01 |

Scaled Schoenfeld residuals for SBR grade, PVI, and hormone receptor status (with 95% confidence interval)

The cumulative sum of Schoenfeld residuals, or equivalently the observed score process can also be used to assess proportional hazards

Working example (cont')

Tests based on cumulative residuals are presented in Table

Test for non-proportionality based on the Cumulative residuals from the conventional Cox model (see table 1).

| **Variable** | **p-value** |
|---|---|
| Age | 0.97 |
| Grade II | 0.02 |
| Grade III | <0.01 |
| Size | 0.16 |
| Lymph node involvement | 0.75 |
| PVI | 0.11 |
| Hormone receptor | <0.01 |
| Her2 | <0.01 |
| Mib1 | <0.01 |

Observed score process for SBR grade, lymph node involvement, and hormone receptor status (with 95% confidence interval)

Another simple approach for testing time-varying effects of covariates involves fitting different Cox models for different time periods. Indeed, although the PH assumption may not hold over the complete follow-up period, it may hold over a shorter time window. Unless there is an interest in a particular cut-off time value, two subsets of data can be created based on the median event time

Working example (cont')

The median event time was 4.3 years. A Cox model was applied censoring everyone still at risk after 4.3 years, while only those subjects still at risk beyond this time point were included in another model (Additional file
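The construction of the two datasets used for the period-specific models can be sketched as follows. The helper names `median_event_time` and `split_at_cutoff` and the six observations are hypothetical; fitting the Cox models themselves is left to the statistical software.

```python
import statistics

def median_event_time(times, events):
    """Median of the follow-up times at which an event was observed."""
    return statistics.median([t for t, e in zip(times, events) if e == 1])

def split_at_cutoff(times, events, cutoff):
    """Split follow-up into an early and a late dataset at `cutoff`.

    early: every subject, with follow-up censored at the cutoff
    late : only subjects still at risk beyond the cutoff, unchanged
    Returns two lists of (time, event) pairs.
    """
    early, late = [], []
    for t, e in zip(times, events):
        if t > cutoff:
            early.append((cutoff, 0))  # still at risk: censor at the cutoff
            late.append((t, e))
        else:
            early.append((t, e))
    return early, late

# Hypothetical follow-up in years (1 = metastasis observed, 0 = censored).
times = [1.2, 3.0, 4.3, 6.5, 9.0, 14.0]
events = [1, 0, 1, 1, 0, 0]
cutoff = median_event_time(times, events)
early, late = split_at_cutoff(times, events, cutoff)
```

Each dataset contains fewer events than the full follow-up, so the period-specific estimates are less precise than those of a single model.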

Estimated hazard ratios (exp(

It is also possible to account for non-proportionality by partitioning the time axis as proposed by Moreau et al.

Abandoning the assumption of proportional hazards, and as such, the Cox model, is another option. Indeed, other powerful statistical models are available to account for time-varying effects, including additive models, accelerated failure time models, regression splines models or fractional polynomials

Finally, one can perform a statistical analysis stratified by the variable suspected to have a time-varying effect; this variable should thus be categorical or be categorized. Each stratum k has its own baseline hazard h_{0k}(t), while the regression coefficients are shared across strata, so that the hazard in stratum k is h_{k}(t) = h_{0k}(t) exp(βx). Stratifying assumes that the other covariates are acting in the same way in each stratum, that is, HRs are similar across strata. Although stratification is effective in removing the problem of non-proportionality and simple to implement, it has some disadvantages. Most importantly, stratification by a non-proportional variable precludes estimating and testing the effect of that variable within the Cox model. Thus, this approach should be selected if one is not directly interested in quantifying the effect of the variable used for stratification. Moreover, a stratified Cox model can lead to a loss of power, because more of the data are used to estimate separate hazard functions; this impact will depend on the number of subjects and strata

Discussion

While ensuring that the PH assumption holds is part of the modeling process, it is also useful in providing valuable information on time-varying effects. In our illustrative example, the conventional Cox model suggested that all factors but HRec, Her2, and Mib1 status were strong prognostic factors of metastases. Additional tests indicated that the PH assumption was not satisfied for some variables of the model. Tumour grade had a significant time-varying effect, but although its effect diminished over time, it remained strong. According to the conventional model hormone receptor status did not significantly impact relapses. Additional tests provided strong evidence of a time-varying effect. Importantly, both tests based on residuals suggested that negative hormone receptor status increased the risk of metastases early but became protective thereafter, in accordance with the analysis partitioned on event time. This reversal of effect may explain the non-significant averaged hazard ratio provided by the conventional Cox model and reported earlier

Applying a Cox model without ensuring that its underlying assumptions are validated can lead to negative consequences on the resulting estimates

Once non-proportionality is established, time-dependency can be accounted for in different ways. The strategy will depend on the study objectives. If there is no interest in longer time periods, one can shorten the follow-up time, as non-proportionality is less likely to be an issue on short time intervals. If there is no particular interest in the variable with the time-varying effect, one could stratify on this variable in the statistical analysis; however, no association between the stratification variable and survival can then be tested. If one wants to describe the effect of the variable over time, it is possible to rely on time-by-covariate interactions or on plots of residuals to estimate relative risks at different time points. Methods to test and account for non-proportionality are available in most standard statistical software (Table

It is difficult to propose definite guidelines for the best strategy for testing for non-proportionality. Each method has its advantages and limitations, and depending on the study objective some approaches might be preferred. The study objectives, as well as the statistical tests that will be employed, should be clearly stated before any statistical modeling is performed. Departure from proportionality can be investigated using graphical and numerical approaches. Plotting methods involve visualizing the Kaplan-Meier survival curves for the variable tested for non-proportionality. These graphical methods require categorical variables and are particularly appropriate for binary data; however, they do not provide formal diagnostic tests. Numerical tests involve, for example, testing for covariate-by-time interactions or for the presence of a trend in the residuals of the model. Including a covariate-by-time interaction is particularly simple within the Cox model; however, results are strongly dependent on the choice of the functional form of the time function. Tests based on cumulative residuals tend to have better statistical properties than those based on the Schoenfeld residuals. As a result, performing a test based on the cumulative residuals seems to be a more powerful approach in detecting covariates with time-varying effects.

Note that the Cox model involves multiple types of residuals including the martingale, deviance, score and Schoenfeld residuals, which can be particularly useful as additional regression diagnostics for the Cox model. Martingale residuals are useful for determining the functional form of a covariate to be included in the model and deviance residuals can be used to examine model accuracy. Additional details can be found in

Statistical testing raises the issue of power, that is, the ability of tests to find true effects. We have seen for example that some simple strategies, such as shortening the observation period can suffer from reduced power as fewer events are considered. This might be a limitation with small datasets. Simulations have shown that stratified Cox modeling usually leads to wider confidence intervals, that is, reduced power compared to unstratified analysis

Since its original publication in 1972, the Cox proportional-hazards model has gained widespread use and has become a popular tool for the analysis of survival data in medicine. After performing an online search, we found that the original paper by Cox had been cited approximately 25,000 times, with about 8,000 citations in oncology papers

Our objective was to familiarize the reader with the PH assumption. We also highlighted that detecting and accounting for time-varying effects provide insights on some specific time patterns and valuable biological information that could be missed otherwise. Given the possible consequences on parameter estimates, checking the proportionality of hazards should be an integral part of a survival analysis based on a Cox model. In the presence of variables with time-varying risks, plots should be used to augment the results and indicate where non-proportionality is present. This seems particularly appropriate in the context of oncology studies, as long follow-ups are common and non-constant hazards have already been reported.

Conclusions

Investigating time-varying effects should be an integral part of Cox survival analyses. Detecting and accounting for time-varying effects provide insights on some specific time patterns, and on valuable biological information that could be missed otherwise.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

CB conceived the study, performed the statistical analysis and drafted the manuscript. GMG carried out the immunoassays. MD provided clinical expertise in oncology. CTL provided clinical expertise in surgery. VB was responsible for data management. SMP participated in the design of the study. All authors read and approved the final manuscript.

Acknowledgements

The tissue microarray was financed by the Comités départementaux de la Gironde, Dordogne, Charente, Charente Maritime, Landes, by la Ligue Nationale contre le Cancer, and by Lyons Club de Bergerac, France.
