Consider a two-player turn-based game in which the two players take their turns at different rates \(\alpha>0\) and \(\beta>0\). One might ask, what is the distribution of the number of turns taken up to a given time \(t\)? We can answer this question by leveraging an interesting connection between two superficially unrelated concepts: pure-birth processes and approximations of the exponential function.

Pure-Birth Processes, Divided Differences, and Hermite-Padé Forms
A pure-birth process (or simply birth process) is a particular type of continuous-time Markov chain taking values in \(\mathbb{N}\) which is increasing and makes only jumps of size 1. The value of the process at time \(t\) is the “number of births” which have occurred by that time, and it is determined by its inter-birth times, which are independent exponentially distributed random variables. Here we will use the explicit construction of such a process described below.
Definition 1. Let \(\lambda\in\mathbb{R}_{+}^{\infty}\), take
\(\tau_{i}\sim\text{Exp}(\lambda_{i})\)
independent and define \(S_{k}^{\lambda} =
\sum_{i=1}^{k}\tau_{i}\) with \(S_{0}^{\lambda} = 0\). We define the
pure-birth process associated with \(\lambda\) by
\[X_{t}^{\lambda} := \max\{k\geq 0 :
S_{k}^{\lambda}\leq t\}\]
for all \(t\geq 0\).
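To make the construction concrete, here is a minimal simulation sketch in Python (the function name `sample_birth_process` and the use of NumPy are my own choices, not from any source): draw the inter-birth times \(\tau_{i}\sim\text{Exp}(\lambda_{i})\), accumulate the partial sums \(S_{k}^{\lambda}\), and return the largest \(k\) with \(S_{k}^{\lambda}\leq t\).

```python
import numpy as np

def sample_birth_process(rates, t, rng=None):
    """Draw one sample of X_t^lambda using a finite prefix `rates` of the rate vector.

    The prefix must be long enough that the partial sums S_k exceed t;
    otherwise we raise an error rather than silently truncate.
    """
    if rng is None:
        rng = np.random.default_rng()
    s, k = 0.0, 0
    for lam in rates:
        s += rng.exponential(1.0 / lam)  # tau_{k+1} ~ Exp(lambda_{k+1}), i.e. mean 1/lambda_{k+1}
        if s > t:
            return k                     # S_k <= t < S_{k+1}, so X_t = k
        k += 1
    raise ValueError("rate vector too short to cover [0, t]")

# Example: alternating rates (1, 2, 1, 2, ...) observed at time t = 5
print(sample_birth_process([1.0, 2.0] * 50, 5.0, np.random.default_rng(0)))
```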
A typical way to analyze a pure-birth process would be to compute its infinitesimal generator and use the Kolmogorov equations to show that its transition probabilities satisfy a certain set of recurrence relations. From there, if the inter-birth rates follow some kind of regular pattern (e.g. alternating back and forth), it may be possible to use probability generating functions to obtain explicit formulas for the transition probabilities [1]. However, even for the simplest cases, such as the two-player game described above, this method is extremely computation-heavy.
Another route is to leverage a connection between pure-birth processes
and approximations of the function \(f(t,x) =
e^{-tx}\), utilizing the concept of divided differences. Recall
the generalized divided difference \(f[x_{1},\ldots,x_{n+1}]\), defined
recursively by:
\[f[x_{1},\ldots,x_{n+1}] = \begin{cases}
\dfrac{f[x_{2},\ldots,x_{n+1}] -
f[x_{1},\ldots,x_{n}]}{x_{n+1} - x_{1}} &\:\:\:\: x_{1}\neq x_{n+1}
\\[10pt]
\lim\limits_{h\rightarrow 0}\dfrac{f[x_{2},\ldots,x_{n+1}+h]
- f[x_{1},\ldots,x_{n}]}{h} &\:\:\:\: x_{1} = x_{n+1}
\end{cases}\]
with \(f[x_{i}] = f(x_{i})\). Notably,
\(f[x_{1},\ldots, x_{n+1}]\) is
symmetric in its arguments, and if \(x_{1}\leq
x_{2}\leq\cdots\leq x_{n}\leq x_{n+1}\), then
\[f[x_{1},\ldots,x_{n+1}] = \begin{cases}
\dfrac{f[x_{2},\ldots,x_{n+1}] -
f[x_{1},\ldots,x_{n}]}{x_{n+1} - x_{1}} &\:\:\:\: x_{1}\neq x_{n+1}
\\[10pt]
\dfrac{f^{(n)}(x_{1})}{n!} &\:\:\:\: x_{1} = x_{n+1}.
\end{cases}\]
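Since we will only ever apply this to \(f_{t}(x) = e^{-tx}\), whose \(n^{\text{th}}\) derivative is \((-t)^{n}e^{-tx}\), the sorted form of the recursion above translates directly into code. Here is a short sketch in Python (the name `dd_exp` is mine); the naive recursion is exponential in the number of arguments, but that is harmless for the small cases we need.

```python
import math

def dd_exp(t, xs):
    """Divided difference f_t[x_1, ..., x_{n+1}] for f_t(x) = exp(-t*x).

    Sorts the arguments and applies the recursion from the text,
    using f_t^{(n)}(x_1)/n! when all arguments coincide.
    """
    xs = sorted(xs)
    n = len(xs) - 1
    if n == 0:
        return math.exp(-t * xs[0])
    if xs[0] == xs[-1]:  # all arguments equal after sorting
        return (-t) ** n * math.exp(-t * xs[0]) / math.factorial(n)
    return (dd_exp(t, xs[1:]) - dd_exp(t, xs[:-1])) / (xs[-1] - xs[0])

# Sanity check: f_t[a, b] = (exp(-t*a) - exp(-t*b)) / (a - b)
t, a, b = 0.5, 1.0, 2.0
print(dd_exp(t, [a, b]), (math.exp(-t * a) - math.exp(-t * b)) / (a - b))
```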
Generalized divided differences usually appear as the coefficients of
the Hermite interpolating polynomial for a function \(f\) evaluated at the arguments of the divided
difference. They also have an integral representation due to Hermite and
Genocchi [2]:
Theorem 1 (Hermite-Genocchi). For any \(n\)-times continuously differentiable \(f\) and any \(x_{1},\ldots,x_{n+1}\) (not necessarily
distinct) we have
\[f[x_{1},\ldots, x_{n+1}] =
\underset{\Omega_{n}}{\int\cdots\int} f^{(n)}(r_{1}x_{1}+\cdots +
r_{n+1}x_{n+1})dr_{n}\ldots dr_{1}\]
where
\[\Omega_{n} = \left\{(r_{1},\ldots,r_{n}) :
r_{i}\geq 0\:\:\forall i,\:\: \sum_{i=1}^{n}r_{i}\leq
1\right\},\:\:\:\:\:\:\:\: r_{n+1}= 1 -
\sum_{i=1}^{n}r_{i}.\]
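As a quick sanity check of this integral representation for \(f_{t}(x)=e^{-tx}\), here is a hedged Monte Carlo sketch (the setup is mine): a flat Dirichlet sample is uniform on the probability simplex, and \(\Omega_{n}\) has volume \(1/n!\), so averaging \(f_{t}^{(n)}\) over such samples and dividing by \(n!\) approximates the integral. For distinct arguments the divided difference also has the exact form \(\sum_{i}f(x_{i})/\prod_{j\neq i}(x_{i}-x_{j})\), which we compare against.

```python
import numpy as np
from math import exp, factorial, prod

t = 0.7
xs = np.array([1.0, 2.0, 3.5])  # three distinct arguments, so n = 2
n = len(xs) - 1
rng = np.random.default_rng(0)

# Monte Carlo over Omega_n: Dirichlet(1,...,1) is uniform on the simplex,
# and Omega_n has volume 1/n!, so the integral is mean(f^{(n)}) / n!.
r = rng.dirichlet(np.ones(n + 1), size=200_000)   # rows (r_1, ..., r_{n+1})
nth_deriv = (-t) ** n * np.exp(-t * (r @ xs))     # f_t^{(n)} at the convex combinations
mc = nth_deriv.mean() / factorial(n)

# Exact divided difference for distinct nodes: sum_i f(x_i) / prod_{j != i} (x_i - x_j)
exact = sum(exp(-t * xi) / prod(xi - xj for xj in xs if xj != xi) for xi in xs)

print(mc, exact)  # the two values should agree to a few decimal places
```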
In words, Theorem 1 says that the generalized divided difference of a function is obtained by integrating the function’s \(n^{\text{th}}\) derivative over all convex combinations of its arguments. We can exploit this fact to get a general formula for the p.m.f. of any pure-birth process:
Theorem 2. Let \(\lambda\in \mathbb{R}_{+}^{\infty}\) with
corresponding pure-birth process \((X_{t}^{\lambda})\). For any time \(t\geq 0\), we then have
\[\mathbb{P}(X_{t}^{\lambda} = k) =
(-1)^{k}\left(\prod_{i=1}^{k}\lambda_{i}\right)f_{t}[\lambda_{1},\ldots,\lambda_{k+1}],\:\:\:\:\:\:\:\:
f_{t}(x) = e^{-tx}\]
for \(k\in\mathbb{N}\).
Proof. By construction, for each \(\lambda\), \(k\), and \(t\), we have
\[\begin{aligned}
\mathbb{P}(X_{t}^{\lambda} = k) &=
\mathbb{P}(S_{k}^{\lambda}\leq t, S_{k+1}^{\lambda}>t) \\[5pt]
&= \int_{0}^{t}\int_{0}^{t-s_{1}}\cdots\int_{0}^{t -
\sum_{i=1}^{k-1}s_{i}}\int_{t -
\sum_{i=1}^{k}s_{i}}^{\infty}\left(\prod_{i=1}^{k+1}\lambda_{i}e^{-\lambda_{i}s_{i}}\right)ds_{k+1}\ldots
ds_{1} \\[5pt]
&= \int_{0}^{t}\int_{0}^{t-s_{1}}\cdots\int_{0}^{t -
\sum_{i=1}^{k-1}s_{i}}e^{-\lambda_{k+1}(t -
\sum_{i=1}^{k}s_{i})}\left(\prod_{i=1}^{k}\lambda_{i}e^{-\lambda_{i}s_{i}}\right)ds_{k}\ldots
ds_{1} \\[5pt]
&=
\left(\prod_{i=1}^{k}\lambda_{i}\right)\int_{0}^{t}\int_{0}^{t-s_{1}}\cdots\int_{0}^{t
- \sum_{i=1}^{k-1}s_{i}}e^{-\lambda_{k+1}(t - \sum_{i=1}^{k}s_{i}) -
\sum_{i=1}^{k}\lambda_{i}s_{i}}ds_{k}\ldots ds_{1} \\[5pt]
&=
(-1)^{k}\left(\prod_{i=1}^{k}\lambda_{i}\right)\int_{0}^{1}\int_{0}^{1-r_{1}}\cdots\int_{0}^{1
- \sum_{i=1}^{k-1}r_{i}}(-t)^{k}e^{-
t\sum_{i=1}^{k+1}\lambda_{i}r_{i}}dr_{k}\ldots dr_{1}
\end{aligned}\]
where \(r_{i} = s_{i}/t\) for \(i = 1,\ldots,k\) and \(r_{k+1} = 1 - \sum_{i=1}^{k}r_{i}\). Thus,
noting that \(f_{t}^{(n)}(x) =
(-t)^{n}e^{-tx}\), the result follows directly from Theorem
1.
◻
This is a remarkably clean formula considering that it works for any pure-birth process, and it does not seem to have appeared in the literature before. As a quick sanity check, when all rates equal \(\lambda\) we have \(f_{t}[\lambda^{(k+1)}] = (-t)^{k}e^{-\lambda t}/k!\), so the formula reduces to \((\lambda t)^{k}e^{-\lambda t}/k!\), the familiar Poisson distribution. However, the formula’s aesthetic appeal is tempered by the generally poor behavior of the generalized divided difference; though easy to compute recursively for any specific set of inputs, it is very hard to analyze in general. Luckily, this difficulty is mitigated in our setting by the specific properties of the exponential.
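To illustrate the “easy to compute recursively” part, here is a sketch that checks Theorem 2 against a Monte Carlo simulation for an arbitrary, patternless rate vector (all names are mine; the helper `dd_exp` is the same one sketched earlier, repeated so this snippet runs on its own).

```python
import math
import numpy as np

def dd_exp(t, xs):
    # divided difference f_t[x_1,...,x_{n+1}] for f_t(x) = exp(-t*x); see the earlier sketch
    xs = sorted(xs)
    n = len(xs) - 1
    if n == 0:
        return math.exp(-t * xs[0])
    if xs[0] == xs[-1]:
        return (-t) ** n * math.exp(-t * xs[0]) / math.factorial(n)
    return (dd_exp(t, xs[1:]) - dd_exp(t, xs[:-1])) / (xs[-1] - xs[0])

def birth_pmf(rates, t, k):
    # Theorem 2: P(X_t = k) = (-1)^k * (lambda_1 ... lambda_k) * f_t[lambda_1, ..., lambda_{k+1}]
    return (-1) ** k * math.prod(rates[:k]) * dd_exp(t, rates[:k + 1])

rates = [0.5, 1.3, 0.9, 2.0, 0.7, 1.1, 1.8, 0.6, 1.4, 2.2]  # enough rates to cover t = 1
t, trials = 1.0, 200_000
rng = np.random.default_rng(1)
counts = np.zeros(len(rates) + 1, dtype=int)
for _ in range(trials):
    s, k = 0.0, 0
    for lam in rates:
        s += rng.exponential(1.0 / lam)
        if s > t:
            break
        k += 1
    counts[k] += 1

for k in range(5):
    print(k, counts[k] / trials, birth_pmf(rates, t, k))  # empirical frequency vs. Theorem 2
```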
In the particular case of the exponential function, divided
differences are naturally connected to the notion of Hermite-Padé
approximation. Given a sequence of integers \(\textbf{r} = (r_{0},\ldots,r_{m})\), it is
a well-known result of Hermite that there exist unique (up to a constant
multiple) polynomials \(A_{r_{0}}^{\textbf{r}},\ldots,
A_{r_{m}}^{\textbf{r}}\) such that each \(A_{r_{i}}^{\textbf{r}}\) has degree at most
\(r_{i}\) and
\[\sum_{i=0}^{m}A_{r_{i}}^{\textbf{r}}(t)e^{it} =
O(t^{n+1})\]
where \(n = \sum_{i=0}^{m}r_{i} + m -
1\). The left-hand side in the expression above is called a
Hermite-Padé form (HPF) of degree \(m\) and order \(n\). The best-known such form is \(P_{p/q}(x) - Q_{p/q}(x)e^{x}\), where \(P_{p/q}(x)\) and \(Q_{p/q}(x)\) are the Padé approximant
polynomials for \(e^{x}\) of degrees
\(p\) and \(q\) respectively, given by
\[\begin{aligned}
P_{p/q}(t) &= \frac{1}{p!q!}\sum_{i=0}^{p}(p+q-i)!{p \choose
i}t^{i} \\[5pt]
Q_{p/q}(t) &= \frac{1}{p!q!}\sum_{j=0}^{q}(-1)^{j}(p+q-j)!{q
\choose j}t^{j}.
\end{aligned}\]
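To see the defining \(O(t^{p+q+1})\) property in action, here is a small SymPy sketch (the helper names `pade_P` and `pade_Q` are mine) that builds the two polynomials from the formulas above and expands \(P_{p/q}(t) - Q_{p/q}(t)e^{t}\) as a power series; every term below \(t^{p+q+1}\) should cancel.

```python
import sympy as sp

t = sp.symbols('t')

def pade_P(p, q):
    # P_{p/q}(t) as defined in the display above
    return sum(sp.factorial(p + q - i) * sp.binomial(p, i) * t**i
               for i in range(p + 1)) / (sp.factorial(p) * sp.factorial(q))

def pade_Q(p, q):
    # Q_{p/q}(t) as defined in the display above
    return sum((-1)**j * sp.factorial(p + q - j) * sp.binomial(q, j) * t**j
               for j in range(q + 1)) / (sp.factorial(p) * sp.factorial(q))

for p, q in [(1, 1), (2, 3), (4, 2)]:
    form = pade_P(p, q) - pade_Q(p, q) * sp.exp(t)
    # The printed series should start at t**(p+q+1), confirming the order of contact.
    print((p, q), sp.series(form, t, 0, p + q + 2))
```

For example, for \(p=q=1\) the printed series is \(t^{3}/6 + O(t^{4})\), exactly the claimed order of vanishing.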
For the function \(g_{t}(x) = e^{tx}\),
the connection between divided differences and Hermite-Padé forms was
codified by Sablonnière [3].
Theorem 3 (Sablonnière). The Hermite-Padé form
for a given \(\textbf{r}\in\mathbb{N}^{m+1}\) has the
following representation as a divided difference:
\[\sum_{i=0}^{m}A_{r_{i}}^{\textbf{r}}(t)e^{it} =
g_{t}[0^{(r_{0}+1)}, 1^{(r_{1}+1)},\ldots,m^{(r_{m}+1)}]\]
where \(k^{(r_{k}+1)}\) means that the
argument \(k\) has multiplicity \(r_{k}+1\) in the divided difference
above.
Application to the Two-Player Game
A special case of Theorem 3 is when \(m=1\), in which case we have
\[g_{t}[0^{(p+1)}, 1^{(q+1)}] =
(-1)^{q+1}(P_{p/q}(t) - Q_{p/q}(t)e^{t}).\]
In Sablonnière’s paper, the result is only proved for the inputs 0 and 1
to the divided difference, and only if they each appear with
multiplicity at least 1. We can rework this result for general inputs
\(\alpha\) and \(\beta\) with multiplicity \(\geq 0\), also swapping \(g_{t}(x)\) for the function \(f_{t}(x) = e^{-tx}\).
Theorem 4. For any \(\alpha,\beta > 0\) with \(\alpha\neq\beta\) and any \(n,m\in\mathbb{N}\), we have
\[f_{t}[\alpha^{(n)},\beta^{(m)}] =
\frac{(-1)^{n-1}}{(\alpha - \beta)^{n+m-1}}\left(e^{-\alpha
t}P_{n-1/m-1}(\alpha t - \beta t) - e^{-\beta t}Q_{n-1/m-1}(\alpha t -
\beta t)\right)\]
where \(P_{p/q}(t)\) and \(Q_{p/q}(t)\) are defined as above for \(p,q\geq 0\), and in addition we take
\[\begin{aligned}
P_{-1/q}(t) = 0\:\:\:\:\forall q\geq-1,& &\:\:\:\:\:
Q_{p/-1}(t) = 0\:\:\:\:\forall p\geq-1 \\[5pt]
P_{p/-1}(t) = \frac{t^{p}}{p!}\:\:\:\:\forall p\geq 0,&
&\:\:\:\:\: Q_{-1/q}(t) = \frac{t^{q}}{q!}\:\:\:\:\forall q\geq 0.
\end{aligned}\]
Proof. For the case where \(n=0\) or \(m=0\), the result follows from the
recursive definition of the divided difference and the definitions of
\(P\) and \(Q\); assuming without loss of generality
that \(m=0\), we have
\[\begin{aligned}
f_{t}[\alpha^{(n)}] &= \frac{(-t)^{n-1}e^{-\alpha
t}}{(n-1)!} \\[5pt]
&= \frac{(-1)^{n-1}e^{-\alpha t}}{(\alpha -
\beta)^{n-1}}\frac{(\alpha t - \beta t)^{n-1}}{(n-1)!} \\[5pt]
&= \frac{(-1)^{n-1}e^{-\alpha t}}{(\alpha -
\beta)^{n-1}}P_{n-1/-1}(\alpha t - \beta t).
\end{aligned}\]
For the case where \(n=1\) or \(m=1\), we again assume without loss of
generality that \(m=1\) and proceed by
induction. When \(n=m=1\), we
have
\[f_{t}[\alpha, \beta] = \frac{e^{-\alpha t}
- e^{-\beta t}}{\alpha - \beta} = \frac{1}{\alpha - \beta}(e^{-\alpha
t}P_{0/0}(\alpha t - \beta t) - e^{-\beta t}Q_{0/0}(\alpha t - \beta
t)).\]
For \(n\geq 2\), we assume the result
holds for \(n-1\), in which case we
have
\[\begin{aligned}
f_{t}[\alpha^{(n)}, \beta] &= \frac{f_{t}[\alpha^{(n)}]
- f_{t}[\alpha^{(n-1)}, \beta]}{\alpha - \beta} \\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha -
\beta)^{n}}\left(e^{-\alpha t}P_{n-1/-1}(\alpha t - \beta t) +
\left(e^{-\alpha t}P_{n-2/0}(\alpha t - \beta t) - e^{-\beta
t}Q_{n-2/0}(\alpha t - \beta t)\right)\right) \\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha -
\beta)^{n}}\left(e^{-\alpha t}\left(P_{n-1/-1}(\alpha t - \beta t) +
P_{n-2/0}(\alpha t - \beta t)\right) - e^{-\beta t}Q_{n-2/0}(\alpha t -
\beta t)\right).
\end{aligned}\]
By definition, \(Q_{p/0}(t) = 1\) for
all \(p\geq 0\), and in particular \(Q_{n-2/0}(t) = Q_{n-1/0}(t)\). Moreover,
for any \(t\) we have
\[\begin{aligned}
P_{n-1/-1}(t) + P_{n-2/0}(t) &= \frac{t^{n-1}}{(n-1)!} +
\frac{1}{(n-2)!}\sum_{i=0}^{n-2}(n-2-i)!{n-2 \choose i}t^{i} \\[5pt]
&= \frac{t^{n-1}}{(n-1)!} +
\frac{1}{(n-1)!}\sum_{i=0}^{n-2}(n-1)(n-2-i)!{n-2 \choose i}t^{i}
\\[5pt]
&= \frac{t^{n-1}}{(n-1)!} +
\frac{1}{(n-1)!}\sum_{i=0}^{n-2}(n-1-i)!{n-1 \choose i}t^{i} \\[5pt]
&= \frac{1}{(n-1)!}\sum_{i=0}^{n-1}(n-1-i)!{n-1 \choose
i}t^{i} \\[5pt]
&= P_{n-1/0}(t).
\end{aligned}\]
Hence, we have
\[f_{t}[\alpha^{(n)}, \beta] =
\frac{(-1)^{n-1}}{(\alpha - \beta)^{n}}\left(e^{-\alpha
t}P_{n-1/0}(\alpha t - \beta t) - e^{-\beta t}Q_{n-1/0}(\alpha t - \beta
t)\right)\]
as desired. For the general case where \(n,m\geq 2\), we again argue by induction.
The possible base cases are covered by the computations above, so we can
go straight to the induction step; assume the result holds for \(n, m-1\) and \(n-1,m\). We then have
\[\begin{aligned}
f_{t}[\alpha^{(n)}, \beta^{(m)}] &=
\frac{f_{t}[\alpha^{(n)}, \beta^{(m-1)}] - f_{t}[\alpha^{(n-1)},
\beta^{(m)}]}{\alpha - \beta}\\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha -
\beta)^{n+m-1}}\left(e^{-\alpha t}P_{n-1/m-2}(\alpha t - \beta t) -
e^{-\beta t}Q_{n-1/m-2}(\alpha t - \beta t)\right)\\
&\:\:\:\:\:\:\:\:\:\:\:+\frac{(-1)^{n-1}}{(\alpha -
\beta)^{n+m-1}}\left(e^{-\alpha t}P_{n-2/m-1}(\alpha t - \beta t) -
e^{-\beta t}Q_{n-2/m-1}(\alpha t - \beta t)\right)\nonumber\\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha - \beta)^{n+m-1}}e^{-\alpha
t}\left(P_{n-1/m-2}(\alpha t - \beta t) + P_{n-2/m-1}(\alpha t - \beta
t)\right)\\
&\:\:\:\:\:\:\:\:\:\:\:-\frac{(-1)^{n-1}}{(\alpha -
\beta)^{n+m-1}}e^{-\beta t}\left(Q_{n-1/m-2}(\alpha t - \beta t) +
Q_{n-2/m-1}(\alpha t - \beta t)\right).
\end{aligned}\]
The fact that \(P_{n-1/m-2}(t) +
P_{n-2/m-1}(t) = P_{n-1/m-1}(t)\) and \(Q_{n-1/m-2}(t) + Q_{n-2/m-1}(t) =
Q_{n-1/m-1}(t)\) is proved in [3], and thus we
conclude
\[f_{t}[\alpha^{(n)}, \beta^{(m)}] =
\frac{(-1)^{n-1}}{(\alpha - \beta)^{n+m-1}}\left(e^{-\alpha
t}P_{n-1/m-1}(\alpha t - \beta t) - e^{-\beta t}Q_{n-1/m-1}(\alpha t -
\beta t)\right)\]
as desired.
◻
Now, noting that the two-player game described above is simply the pure-birth process associated with \(\lambda = (\alpha,\beta,\alpha,\beta,\ldots)\), the desired p.m.f. falls out as a corollary of Theorems 2 and 4.
Corollary 1. Let \((X_{t}^{\alpha,\beta})\) denote the
pure-birth process associated with the rate vector \(\lambda =
(\alpha,\beta,\alpha,\beta,\ldots)\), where \(\alpha,\beta > 0\) with \(\alpha\neq\beta\). We then have
\[\mathbb{P}(X_{t}^{\alpha,\beta} = k) =
(-1)^{k+k_{3}}\frac{\alpha^{k_{1}}\beta^{k_{2}}}{(\alpha-\beta)^{k}}\left(e^{-\alpha
t}P_{k_{3}/k_{4}}(\alpha t - \beta t) - e^{-\beta
t}Q_{k_{3}/k_{4}}(\alpha t - \beta t)\right)\]
where
\[k_{1} = \left\lceil
\frac{k}{2}\right\rceil,\:\:\:\:k_{2} = \left\lfloor
\frac{k}{2}\right\rfloor,\:\:\:\:k_{3} = \left\lceil
\frac{k-1}{2}\right\rceil, \:\:\:\:k_{4} = \left\lfloor
\frac{k-1}{2}\right\rfloor.\]
Let’s check that this formula actually works. We can code up a simulation of the two-player game with \(\alpha=1\) and \(\beta=2\), record the empirical distribution of the number of turns taken up to time \(t=5\) over 1,000,000 trials, and compare it to the plot given by the formula above.
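Here is one way such a check might look in Python (all helper names are mine; the Padé helpers include the edge-case conventions from Theorem 4). It simulates the game turn by turn and prints the empirical frequencies next to the values from Corollary 1; the pure-Python loop over a million trials takes a little while, but it keeps the sketch simple.

```python
import math
import numpy as np

def pade_P(p, q, x):
    # P_{p/q}(x), with the conventions P_{-1/q} = 0 and P_{p/-1}(x) = x^p / p!
    if p == -1:
        return 0.0
    if q == -1:
        return x ** p / math.factorial(p)
    return sum(math.factorial(p + q - i) * math.comb(p, i) * x ** i
               for i in range(p + 1)) / (math.factorial(p) * math.factorial(q))

def pade_Q(p, q, x):
    # Q_{p/q}(x), with the conventions Q_{p/-1} = 0 and Q_{-1/q}(x) = x^q / q!
    if q == -1:
        return 0.0
    if p == -1:
        return x ** q / math.factorial(q)
    return sum((-1) ** j * math.factorial(p + q - j) * math.comb(q, j) * x ** j
               for j in range(q + 1)) / (math.factorial(p) * math.factorial(q))

def alternating_pmf(k, t, a, b):
    """P(X_t^{a,b} = k) from Corollary 1 (requires a != b)."""
    k1, k2 = math.ceil(k / 2), k // 2
    k3, k4 = math.ceil((k - 1) / 2), (k - 1) // 2
    x = (a - b) * t
    pref = (-1) ** (k + k3) * a ** k1 * b ** k2 / (a - b) ** k
    return pref * (math.exp(-a * t) * pade_P(k3, k4, x)
                   - math.exp(-b * t) * pade_Q(k3, k4, x))

# Simulate the two-player game: player 1 has rate a, player 2 has rate b.
a, b, t, trials = 1.0, 2.0, 5.0, 1_000_000
rng = np.random.default_rng(42)
counts = {}
for _ in range(trials):
    s, k = 0.0, 0
    while True:
        s += rng.exponential(1.0 / (a if k % 2 == 0 else b))  # turn k+1 has rate a when k is even
        if s > t:
            break
        k += 1
    counts[k] = counts.get(k, 0) + 1

for k in range(12):
    print(k, counts.get(k, 0) / trials, alternating_pmf(k, t, a, b))  # empirical vs. Corollary 1
```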
[Plots: empirical distribution of turns from the simulation vs. the p.m.f. from Corollary 1, for \(\alpha=1\), \(\beta=2\), \(t=5\).]
From these plots, it appears we’ve successfully found a formula for the p.m.f. of a pure-birth process with alternating rates. Nice! Now it just needs a catchy name; my suggestions are the “interlaced Poisson distribution” or the “dual Poisson distribution”, but I will allow the scholars of the future the freedom to decide what best to call it.
References
[1] Antonio Di Crescenzo, Antonella Iuliano, and Barbara Martinucci. On a bilateral birth-death process with alternating rates. Ricerche di Matematica, 61(1):157–169, November 2011.
[2] Kendall E. Atkinson. An Introduction to Numerical Analysis. John Wiley & Sons, New York, second edition, 1989.
[3] Paul Sablonnière. An algorithm for the computation of Hermite–Padé approximations to the exponential function: divided differences and Hermite–Padé forms. Numerical Algorithms, 33:443–452, 2003.