### Modeling background

Here, we focus on a binary variable which allocates the patients in two groups *i*=0,1 (with *n*_{
i
} subjects in group *i*(*n*=*n*_{0} + *n*_{1})). For each patient *j*, *G*_{
j
}denotes the indicator variable of group 1. For the lung cancer dataset, this variable indicates the presence/absence of Kras mutation. In the following, a tumor is modeled as a set of clonogens, with identical properties and independent evolution. For each patient *j* in group *i*, let the random variables ${T}_{\mathrm{ij}}^{k}$ associated to the *k*^{
th
} latent (unobservable) clonogen, be the time-to-progression until a detectable recurrence with (clonogenic) survival function *A*_{
i
}(*t*). Let *K*_{
ij
} be the number of latent clonogens that survived the treatment for patient *j* in group *i*. We suppose that for the two groups, *K*_{
ij
} is distributed with probability mass function *Φ*_{0}*Φ*_{1}and *K*_{
ij
} is supposed to be independent of ${T}_{\mathrm{ij}}^{k}$. Let denote ${T}_{\mathrm{ij}}^{\ast}=\mathrm{mi}{n}_{1\le k\le {K}_{\mathrm{ij}}}\left({T}_{\mathrm{ij}}^{k}\right)$ the time-to-event of the earliest clonogen and *C*_{
ij
} the censoring time. We assume that ${T}_{\mathrm{ij}}^{\ast}$ and *C*_{
ij
}satisfy the condition of independent censoring [10]. For each subject, the data consist of ${X}_{\mathrm{ij}}=\mathrm{min}({T}_{\mathrm{ij}}^{\ast},{C}_{\mathrm{ij}})$ the observed time of follow-up, ${\delta}_{\mathrm{ij}}={1}_{({X}_{\mathrm{ij}}={T}_{\mathrm{ij}}^{\ast})}$ the indicator of the occurence of the earliest clonogen and *G*_{
j
} the indicator variable of group 1. We also denote *Y*_{
ij
}(*t*)=1_{(t≤}_{
X
ij
})the indicator of being at risk for an event at time *t*.

For each patient

*j* in group

*i* with

*K*_{
ij
}latent clonogens, the conditional (patient-specific) survival function is expressed as:

$\begin{array}{ll}{S}_{\mathrm{ij}}\left(t\right|{K}_{\mathrm{ij}})& =\text{Pr}\left({T}_{\mathrm{ij}}^{\ast}>t\right)\phantom{\rule{2em}{0ex}}\\ =\text{Pr}\left({T}_{\mathrm{ij}}^{1}>t,\dots ,{T}_{\mathrm{ij}}^{{K}_{\mathrm{ij}}}>t\right)={A}_{i}{\left(t\right)}^{{K}_{\mathrm{ij}}}\phantom{\rule{2em}{0ex}}\end{array}$

Thus, the marginal (population) survival function (for group

*i*) is given by:

${S}_{i}\left(t\right)=\sum _{k=0}^{\infty}{S}_{\mathrm{ij}}\left(t\right|k\left)\underset{{\Phi}_{i}}{\text{Pr}}\right(k)=\sum _{k=0}^{\infty}{A}_{i}{\left(t\right)}^{k}\underset{{\Phi}_{i}}{\text{Pr}}(k)$

Assuming that the number of clonogens in treated tumors is following for the two groups a Poisson distribution [2–4], the marginal distribution is such as : *S*_{
i
}(*t*)=exp {−*ξ*_{
i
}[1−*A*_{
i
}(*t*)]} where *ξ*_{
i
} (i.e. the Poisson parameter) is the mean number of clonogens and *exp*(−*ξ*_{
i
}) is the probability of having no surviving clonogen (cure fraction). From this framework, one can modelize short and long-term effects of a marker [6–8]. The short-term effect (linked to *A*_{
i
}(*t*)) formulates the shape of the difference between the (clonogenic) latent survival functions. The long-term effect (linked to *ξ*_{
i
}) quantifies the difference in the long-term survivors rates. It is straighforward to see that a same cure fraction between the different groups (no long-term effect) implies a same distribution for the number of surviving clonogens.

In the following, we consider a family of discrete distributions proposed by Katz [11] for which the Poisson distribution is considered as the benchmark model (null model). This family allows to consider different conditional probability mass functions for the number of surviving clonogens (${\text{Pr}}_{{\Phi}_{i}}({K}_{\mathrm{ij}}=u|{K}_{\mathrm{ij}}>0)$) with a same cure fraction ${\text{Pr}}_{{\Phi}_{i}}({K}_{\mathrm{ij}}=0)$.

### Distribution of the number of clonogens

We recall that Katz [

11,

12] proposed a family of discrete distributions with the property that successive count probabilities satisfy the following first-order recurrence formula:

$\frac{\text{Pr}(x+1)}{\text{Pr}\left(x\right)}=\frac{\omega +\mathrm{\theta x}}{1+x};\phantom{\rule{2.77695pt}{0ex}}x=0,\dots ,{\infty}_{+}$

where *ω*>0 and *θ*<1.

Katz showed that the probability generating function is such as:

$\begin{array}{c}g(s;\omega ,\theta )={\left[{\left(1-\theta \right)}^{-1}\times \left(1-\mathrm{\theta s}\right)\right]}^{-\frac{\omega}{\theta}}\phantom{\rule{0.3em}{0ex}}\text{for}\phantom{\rule{0.3em}{0ex}}\theta \ne 0\\ \phantom{\rule{1em}{0ex}}\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{2.77695pt}{0ex}}g(s;\omega ,\theta )=\text{exp}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\left[-\omega \left(1-s\right)\right]\phantom{\rule{0.3em}{0ex}}\text{for}\phantom{\rule{0.3em}{0ex}}\theta =0\end{array}$

with |*s*|≤1.

It follows that the initial probability is equal to: $\text{Pr}\left(0\right)={p}_{0}={\left(1-\theta \right)}^{\frac{\omega}{\theta}}$ for *θ*≠0 (*p*_{0}=*e*^{−ω} for *θ*=0). Thus, this family allows us to consider different conditional probability mass functions (Pr(*x*|*x*>0)) with a same *p*_{0}.

Moreover, it is worth noting that *ω*=*μ*^{2}/*σ*^{2}and *θ* is linked to the dispersion index (variance-to-mean ratio) such as : *σ*^{2}/*μ*= (1−*θ*)^{−1}. This family covers various distributions with the property of being under-dispersed (*θ*<0), over-dispersed (*θ*>0) or equi-dispersed (*θ*=0). This latter case corresponds to the Poisson distribution. For *θ*<0, it includes Binomial distributions (*N*=−*ω*/*θ*;*p*=*θ*/ (*θ*−1)) whereas for *θ*>0 it includes Negative Binomial distributions (*u*=*ω*/*θ*;*P*=*θ*/ (1−*θ*)).

Relying on this family of distributions, we propose to consider the following semi-parametric cure model.

### Improper survival function

According to the above results, a semi-parametric improper cure model, which encompasses the Poisson cure model, is obtained as follows:

The marginal survival function is defined such as:

${S}_{i}\left(t\right)=\sum _{k=0}^{\infty}{S}_{\mathrm{ij}}\left(t\right|k\left)\underset{{\Phi}_{i}}{\text{Pr}}\right(k)=\sum _{k=0}^{\infty}{A}_{i}{\left(t\right)}^{k}\underset{{\Phi}_{i}}{\text{Pr}}(k)$

where ${\text{Pr}}_{{\Phi}_{i}}\left(k\right)$ is the Katz probability mass function and *A*_{
i
}(*t*) is a decreasing function such as 1≥*A*_{
i
}(*t*)≥0.

Thus, we have the following general survival functions in group

*i*=0,1:

$\begin{array}{c}\phantom{\rule{1em}{0ex}}\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{2.77695pt}{0ex}}{S}_{0}\left(t\right)=exp\left\{-{\omega}_{0}\left[1-{A}_{0}\left(t\right)\right]\right\}\\ {S}_{1}\left(t\right)={\left[{\left(1-\theta \right)}^{-1}\times \left(1-\theta {A}_{1}\left(t\right)\right)\right]}^{-\frac{{\omega}_{1}}{\theta}}\end{array}$

The corresponding cumulative hazard function and hazard function are noted *Θ*_{
i
}(*t*)=−log [*S*_{
i
}(*t*)] and ${\lambda}_{i}\left(t\right)=\frac{\partial}{\mathrm{\partial t}}{\Theta}_{i}\left(t\right)$, respectively. It is straighforward to see that *S*_{0}(*t*) and *S*_{1}(*t*) are improper survival functions with cure fractions ${S}_{0}\left({\infty}_{+}\right)={e}^{-{\omega}_{0}}$ and ${S}_{1}\left({\infty}_{+}\right)={\left(1-\theta \right)}^{\frac{{\omega}_{1}}{\theta}}$, respectively. Here, *A*_{0}(*t*) and *A*_{1}(*t*) are arbitrary latent survival functions decreasing with time from one to zero. We can give different shapes by modeling the function such as *A*_{1}(*t*)=*A*_{0}(*t*,*α*) where ${D}_{0}(t,\alpha )=-\frac{\partial}{\mathrm{\partial t}}{A}_{0}(t,\alpha )$ refers to the corresponding density function and *α* is a real parameter with *A*_{0}(*t*,0)=*A*_{0}(*t*). In the following section, we will consider a classical log-linear relationship such as ${A}_{0}(t,\alpha )={A}_{0}{\left(t\right)}^{{e}^{\alpha}}$. Thus, the parameter *α*formulates the shape of the difference between the clonogenic survival functions for group 0 and 1. When *α*≥0 (resp. *α*≤0) patients belonging to groupe 1 have earlier (resp. later) relapses as compared to group 0. Here, the Poisson model is considered as the reference one which leads to the marginal survival *S*_{0}(*t*). Changes in the distribution of the number of clonogens are interpreted with regard to this model. It is worth noting that the Poisson cure model can also be considered as representing an homogeneous multi-clonogenic model and departure from this model can be interpreted as either an under-dispersed (single clonogenic model) or over-dispersed (heterogeneous multi-clonogenic model) situation.

It is useful for the following to write the ratio of the hazard functions

*λ*_{0}(

*t*) and

*λ*_{1}(

*t*) deduced from model (1) so that:

$\begin{array}{ll}{\lambda}_{1}\left(t\right)=& \phantom{\rule{0.3em}{0ex}}{\lambda}_{0}\left(t\right)exp\left\{log\left[{\omega}_{1}/{\omega}_{0}\right]+log\left[{D}_{0}(t,\alpha )/{D}_{0}\left(t\right)\right]\right.\phantom{\rule{2em}{0ex}}\\ \left.-log\left[1-\theta {A}_{0}(t,\alpha )\right]\right\}.\phantom{\rule{2em}{0ex}}\end{array}$

In the following, we denote *γ*=log [*ω*_{1}/*ω*_{0}]. From a biological perspective, belonging to group 1 is associated with changes in the cure fraction, the conditional distribution of the number of surviving clonogens or the latent survival (tumor progression) through the parameters of interest *γ*, *θ* and *α*. If *α*=0, the latent (clonogenic) survival curves are identical between the two groups whatever the distribution of the number of clonogens. If *θ*=0, there is a same probability distribution family (Poisson) for the number of clonogens whatever the dynamic of the clonogens (*α*) or the cure fraction (*γ*). This latter case corresponds to the classical Poisson cure rate model. If *θ*=*α*=0, it corresponds to the proportional hazards hypothesis whereby the relative risk is constant over time but the improper survival distributions converges to different cure fractions. Moreover, it should be noted that using a different parametrization and constraining the quantity *θ*/*ω*_{1}to lie on [0,1] leads to the transformation cure model [9].

In this work, the general null hypothesis to be tested *H*_{0}:*θ*=*α*=*γ*=0 is the lack of survival difference between the two groups.

#### The proposed statistic

In the following, we derive a score statistic which is optimal under a classical log-linear relationship such as

${A}_{0}(t,\alpha )={A}_{0}{\left(t\right)}^{{e}^{\alpha}}$ so that the ratio of the hazard functions between the two groups is such as:

$\begin{array}{ll}{\lambda}_{1}\left(t\right)=& \phantom{\rule{0.3em}{0ex}}{\lambda}_{0}\left(t\right)exp\left\{\right.\gamma +\alpha +log\left[{A}_{0}\left(t\right)\right]\left({e}^{\alpha}-1\right)\phantom{\rule{2em}{0ex}}\\ \left.\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{1em}{0ex}}\phantom{\rule{2em}{0ex}}\phantom{\rule{2em}{0ex}}-log\left[1-\theta {A}_{0}{\left(t\right)}^{{e}^{\alpha}}\right]\right\}\phantom{\rule{1em}{0ex}}\phantom{\rule{2em}{0ex}}\end{array}$

Thus, the log-partial likelihood derived under this multiplicative model is such as:

$\begin{array}{ll}logL\left(\theta ,\alpha ,\gamma ;G\right)=& \phantom{\rule{0.3em}{0ex}}{\Sigma}_{j=1}^{n}{\delta}_{j}\left\{\phantom{\rule{-24.0pt}{0ex}}\upsilon \left({t}_{j}\right){G}_{j}\right.\phantom{\rule{2em}{0ex}}\\ \left.\phantom{\rule{1em}{0ex}}\phantom{\rule{2em}{0ex}}-log\left[\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right){e}^{\upsilon \left({t}_{j}\right){G}_{k}}\right]\right\}\phantom{\rule{2em}{0ex}}\end{array}$

where $\upsilon \left(t\right)=\gamma +\alpha +log\left[{A}_{0}\left(t\right)\right]\left({e}^{\alpha}-1\right)-log\left[1-\theta {A}_{0}{\left(t\right)}^{{e}^{\alpha}}\right]$

The score vector is derived from the first derivative of the log-partial likelihood with respect to *θ*, *α* and *γ* evaluated under *H*_{0}:*θ*=*α*=*γ*=0.

The score vector is deduced under the null hypothesis (

*H*_{0}:

*θ*=

*α*=

*γ*=0). The three components are as follows:

$\begin{array}{ll}{\widehat{V}}_{{H}_{0},\alpha}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}\left[1+log\left(1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{{\omega}_{0}}\right)\right]\left\{{G}_{j}-\frac{\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right){G}_{k}}{\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right)}\right\}\phantom{\rule{2em}{0ex}}\\ {\widehat{V}}_{{H}_{0},\theta}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}\left[1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{{\omega}_{0}}\right]\left\{{G}_{j}-\frac{\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right){G}_{k}}{\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right)}\right\}\phantom{\rule{2em}{0ex}}\\ {\widehat{V}}_{{H}_{0},\gamma}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}\left\{{G}_{j}-\frac{\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right){G}_{k}}{\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right)}\right\}\phantom{\rule{2em}{0ex}}\end{array}$

For computing the score statistic, we should substitute *Θ*_{0}(*t*) and *ω*_{0} by efficient estimators ${\widehat{\Theta}}_{0}\left(t\right)$ and ${\widehat{\omega}}_{0}$ computed under the null hypothesis *H*_{0}. Here, ${\widehat{\Theta}}_{0}\left(t\right)=\sum _{j=1}^{n}\underset{0}{\overset{t}{\int}}{\left\{\sum _{k=1}^{n}{Y}_{k}\right(s\left)\right\}}^{-1}d{N}_{j}\left(s\right)$, where *N*_{
j
}_{(t)=1}_{{X j}_{≤t,δ}_{
j
}_{=1}} is the left-continuous version of the Nelson-Aalen estimator for the cumulative hazard [13] obtained by using the pooled sample and ${\widehat{\omega}}_{0}={\widehat{\Theta}}_{0}\left({t}_{max}\right)$ is the maximum value of this estimator computed at the last observed failure time *t*_{max}. In our problem, the limiting distribution of the proposed statistic where *ω*_{0} is replaced by ${\widehat{\omega}}_{0}$ is obtained by using the results of Pierce [14] in the context of improper survival distribution [8]. Here, ${\widehat{\omega}}_{0}$ is an efficient estimator of *ω*_{0}if the upper bound of the domain for the survival distribution is less or equal to the upper bound of the domain for the censoring distribution [8, 14]. In practice, this latter condition expresses the fact that the uncured patients should experience the event within the maximum length of follow-up. This condition is assumed to be verified and is required for establishing the limiting distribution of the proposed statistic.

The corresponding information matrix

$\stackrel{\u02c6}{\mathit{I}}$ is such as:

$\begin{array}{ll}\frac{{\partial}^{2}logL}{{\partial}^{2}\alpha}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}{\left[1+log\left(1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{{\omega}_{0}}\right)\right]}^{2}\left\{{\Delta}_{j}\right\}\phantom{\rule{2em}{0ex}}\\ \frac{{\partial}^{2}logL}{{\partial}^{2}\theta}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}{\left[1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{{\omega}_{0}}\right]}^{2}\left\{{\Delta}_{j}\right\};\frac{{\partial}^{2}logL}{{\partial}^{2}\gamma}=\sum _{j=1}^{n}{\delta}_{j}\left\{{\Delta}_{j}\right\}\phantom{\rule{2em}{0ex}}\end{array}$

and

$\begin{array}{ll}\frac{{\partial}^{2}logL}{\mathrm{\partial \alpha \partial \theta}}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}\left[1+log\left(1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{{\omega}_{0}}\right)\right]\left[1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{\omega}\right]\left\{{\Delta}_{j}\right\}\phantom{\rule{2em}{0ex}}\\ \frac{{\partial}^{2}logL}{\mathrm{\partial \gamma \partial \theta}}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}\left[1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{{\omega}_{0}}\right]\left\{{\Delta}_{j}\right\}\phantom{\rule{2em}{0ex}}\\ \frac{{\partial}^{2}logL}{\mathrm{\partial \alpha \partial \gamma}}\phantom{\rule{0.3em}{0ex}}& =\phantom{\rule{0.3em}{0ex}}\sum _{j=1}^{n}{\delta}_{j}\left[1+log\left(1-\frac{{\Theta}_{0}\left({t}_{j}\right)}{{\omega}_{0}}\right)\right]\left\{{\Delta}_{j}\right\}\phantom{\rule{2em}{0ex}}\end{array}$

with ${\Delta}_{j}={\left[\frac{{S}^{\left(1\right)}(0,0,0,t)}{{S}^{\left(0\right)}(0,0,0,{t}_{j})}\right]}^{2}-\left[\frac{{S}^{\left(2\right)}(0,0,0,{t}_{j})}{{S}^{\left(0\right)}(0,0,0,{t}_{j})}\right]$

*where*${S}^{\left(r\right)}(0,0,0,t)={n}^{-1}\sum _{k=1}^{n}{Y}_{k}\left({t}_{j}\right){G}_{j}^{r}$ with *r*=0,1,2.

The elements of the score vector and of the information matrix (${I}_{{H}_{0}}$) are computed by using efficient estimators of *Θ*_{0}(*t*_{
j
}) and *ω*_{0}as given above.

Finally, the statistic

${S}_{{H}_{0}}=\left({\widehat{V}}_{{H}_{0},\alpha},{\widehat{V}}_{{H}_{0},\theta},{\widehat{V}}_{{H}_{0},\gamma}\right){\xce}_{{H}_{0}}^{-1}{\left({\widehat{V}}_{{H}_{0},\alpha},{\widehat{V}}_{{H}_{0},\theta},{\widehat{V}}_{{H}_{0},\gamma}\right)}^{\u2033}$

is asymptotically distributed under *H*_{0}as a chi-square with three degrees of freedom.