Consider a two-player turn-based game in which the two players take their turns at different rates \(\alpha>0\) and \(\beta>0\). One might ask, what is the distribution of the number of turns taken up to a given time \(t\)? We can answer this question by leveraging an interesting connection between two superficially unrelated concepts: pure-birth processes and approximations of the exponential function.

Pure-Birth Processes, Divided Differences, and Hermite-Padé Forms
A pure-birth process (or simply birth process) is a particular type of continuous-time Markov chain taking values in \(\mathbb{N}\) which is increasing and makes only jumps of size 1. The value of the process at time \(t\) is the “number of births” which have occurred by that time, and it is determined by its inter-birth times, which are independent exponentially distributed random variables. Here we will use the explicit construction of such a process described below.
Definition 1. Let \(\lambda\in\mathbb{R}_{+}^{\infty}\), take
\(\tau_{i}\sim\text{Exp}(\lambda_{i})\)
independent and define \(S_{k}^{\lambda} =
\sum_{i=1}^{k}\tau_{i}\) with \(S_{0}^{\lambda} = 0\). We define the
pure-birth process associated with \(\lambda\) by
\[X_{t}^{\lambda} := \max\{k\geq 0 :
S_{k}^{\lambda}\leq t\}\]
for all \(t\geq 0\).
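To make the construction concrete, here is a minimal simulation sketch in Python (the function name `sample_birth_process` and the use of NumPy are my own choices, not from any source): draw the inter-birth times \(\tau_{i}\sim\text{Exp}(\lambda_{i})\), accumulate the partial sums \(S_{k}^{\lambda}\), and return the largest \(k\) with \(S_{k}^{\lambda}\leq t\).

```python
import numpy as np

def sample_birth_process(rates, t, rng=None):
    """Draw one sample of X_t^lambda using a finite prefix `rates` of the rate vector.

    The prefix must be long enough that the partial sums S_k exceed t;
    otherwise we raise an error rather than silently truncate.
    """
    if rng is None:
        rng = np.random.default_rng()
    s, k = 0.0, 0
    for lam in rates:
        s += rng.exponential(1.0 / lam)  # tau_{k+1} ~ Exp(lambda_{k+1}), i.e. mean 1/lambda_{k+1}
        if s > t:
            return k                     # S_k <= t < S_{k+1}, so X_t = k
        k += 1
    raise ValueError("rate vector too short to cover [0, t]")

# Example: alternating rates (1, 2, 1, 2, ...) observed at time t = 5
print(sample_birth_process([1.0, 2.0] * 50, 5.0, np.random.default_rng(0)))
```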
A typical way to analyze a pure-birth process would be to compute its infinitesimal generator and use the Kolmogorov equations to show that its transition probabilities satisfy a certain set of recurrence relations. From there, if the inter-birth rates follow some kind of regular pattern (e.g. alternating back and forth), it may be possible to use probability generating functions to obtain explicit formulas for the transition probabilities [1]. However, even for the simplest cases, such as the two-player game described above, this method is extremely computation-heavy.
Another route is to leverage a connection between pure-birth processes
and approximations of the function \(f(t,x) =
e^{-tx}\), utilizing the concept of divided differences. Recall
the generalized divided difference \(f[x_{1},\ldots,x_{n+1}]\), defined
recursively by:
\[f[x_{1},\ldots,x_{n+1}] = \begin{cases}
\dfrac{f[x_{2},\ldots,x_{n+1}] -
f[x_{1},\ldots,x_{n}]}{x_{n+1} - x_{1}} &\:\:\:\: x_{1}\neq x_{n+1}
\\[10pt]
\lim\limits_{h\rightarrow 0}\dfrac{f[x_{2},\ldots,x_{n+1}+h]
- f[x_{1},\ldots,x_{n}]}{h} &\:\:\:\: x_{1} = x_{n+1}
\end{cases}\]
with \(f[x_{i}] = f(x_{i})\). Notably,
\(f[x_{1},\ldots, x_{n+1}]\) is
symmetric in its arguments, and if \(x_{1}\leq
x_{2}\leq\cdots\leq x_{n}\leq x_{n+1}\), then
\[f[x_{1},\ldots,x_{n+1}] = \begin{cases}
\dfrac{f[x_{2},\ldots,x_{n+1}] -
f[x_{1},\ldots,x_{n}]}{x_{n+1} - x_{1}} &\:\:\:\: x_{1}\neq x_{n+1}
\\[10pt]
\dfrac{f^{(n)}(x_{1})}{n!} &\:\:\:\: x_{1} = x_{n+1}.
\end{cases}\]
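Since we will only ever apply this to \(f_{t}(x) = e^{-tx}\), whose \(n^{\text{th}}\) derivative is \((-t)^{n}e^{-tx}\), the sorted form of the recursion above translates directly into code. Here is a short sketch in Python (the name `dd_exp` is mine); the naive recursion is exponential in the number of arguments, but that is harmless for the small cases we need.

```python
import math

def dd_exp(t, xs):
    """Divided difference f_t[x_1, ..., x_{n+1}] for f_t(x) = exp(-t*x).

    Sorts the arguments and applies the recursion from the text,
    using f_t^{(n)}(x_1)/n! when all arguments coincide.
    """
    xs = sorted(xs)
    n = len(xs) - 1
    if n == 0:
        return math.exp(-t * xs[0])
    if xs[0] == xs[-1]:  # all arguments equal after sorting
        return (-t) ** n * math.exp(-t * xs[0]) / math.factorial(n)
    return (dd_exp(t, xs[1:]) - dd_exp(t, xs[:-1])) / (xs[-1] - xs[0])

# Sanity check: f_t[a, b] = (exp(-t*a) - exp(-t*b)) / (a - b)
t, a, b = 0.5, 1.0, 2.0
print(dd_exp(t, [a, b]), (math.exp(-t * a) - math.exp(-t * b)) / (a - b))
```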
Generalized divided differences usually appear as the coefficients of
the Hermite interpolating polynomial for a function \(f\) evaluated at the arguments of the divided
difference. They also have an integral representation due to Hermite and
Genocchi [2]:
Theorem 1 (Hermite-Genocchi). For any \(n\)-times continuously differentiable \(f\) and any \(x_{1},\ldots,x_{n+1}\) (not necessarily
distinct) we have
\[f[x_{1},\ldots, x_{n+1}] =
\underset{\Omega_{n}}{\int\cdots\int} f^{(n)}(r_{1}x_{1}+\cdots +
r_{n+1}x_{n+1})dr_{n}\ldots dr_{1}\]
where
\[\Omega_{n} = \left\{(r_{1},\ldots,r_{n}) :
r_{i}\geq 0\:\:\forall i,\:\: \sum_{i=1}^{n}r_{i}\leq
1\right\},\:\:\:\:\:\:\:\: r_{n+1}= 1 -
\sum_{i=1}^{n}r_{i}.\]
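As a quick sanity check of this integral representation for \(f_{t}(x)=e^{-tx}\), here is a hedged Monte Carlo sketch (the setup is mine): a flat Dirichlet sample is uniform on the probability simplex, and \(\Omega_{n}\) has volume \(1/n!\), so averaging \(f_{t}^{(n)}\) over such samples and dividing by \(n!\) approximates the integral. For distinct arguments the divided difference also has the exact form \(\sum_{i}f(x_{i})/\prod_{j\neq i}(x_{i}-x_{j})\), which we compare against.

```python
import numpy as np
from math import exp, factorial, prod

t = 0.7
xs = np.array([1.0, 2.0, 3.5])  # three distinct arguments, so n = 2
n = len(xs) - 1
rng = np.random.default_rng(0)

# Monte Carlo over Omega_n: Dirichlet(1,...,1) is uniform on the simplex,
# and Omega_n has volume 1/n!, so the integral is mean(f^{(n)}) / n!.
r = rng.dirichlet(np.ones(n + 1), size=200_000)   # rows (r_1, ..., r_{n+1})
nth_deriv = (-t) ** n * np.exp(-t * (r @ xs))     # f_t^{(n)} at the convex combinations
mc = nth_deriv.mean() / factorial(n)

# Exact divided difference for distinct nodes: sum_i f(x_i) / prod_{j != i} (x_i - x_j)
exact = sum(exp(-t * xi) / prod(xi - xj for xj in xs if xj != xi) for xi in xs)

print(mc, exact)  # the two values should agree to a few decimal places
```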
In words, Theorem 1 says that the generalized divided difference of a function is obtained by integrating the function’s \(n^{\text{th}}\) derivative over all convex combinations of its arguments. We can exploit this fact to get a general formula for the p.m.f. of any pure-birth process:
Theorem 2. Let \(\lambda\in \mathbb{R}_{+}^{\infty}\) with
corresponding pure-birth process \((X_{t}^{\lambda})\). For any time \(t\geq 0\), we then have
\[\mathbb{P}(X_{t}^{\lambda} = k) =
(-1)^{k}\left(\prod_{i=1}^{k}\lambda_{i}\right)f_{t}[\lambda_{1},\ldots,\lambda_{k+1}],\:\:\:\:\:\:\:\:
f_{t}(x) = e^{-tx}\]
for \(k\in\mathbb{N}\).
Proof. By construction, for each \(\lambda\), \(k\), and \(t\), we have
\[\begin{aligned}
\mathbb{P}(X_{t}^{\lambda} = k) &=
\mathbb{P}(S_{k}^{\lambda}\leq t, S_{k+1}^{\lambda}>t) \\[5pt]
&= \int_{0}^{t}\int_{0}^{t-s_{1}}\cdots\int_{0}^{t -
\sum_{i=1}^{k-1}s_{i}}\int_{t -
\sum_{i=1}^{k}s_{i}}^{\infty}\left(\prod_{i=1}^{k+1}\lambda_{i}e^{-\lambda_{i}s_{i}}\right)ds_{k+1}\ldots
ds_{1} \\[5pt]
&= \int_{0}^{t}\int_{0}^{t-s_{1}}\cdots\int_{0}^{t -
\sum_{i=1}^{k-1}s_{i}}e^{-\lambda_{k+1}(t -
\sum_{i=1}^{k}s_{i})}\left(\prod_{i=1}^{k}\lambda_{i}e^{-\lambda_{i}s_{i}}\right)ds_{k}\ldots
ds_{1} \\[5pt]
&=
\left(\prod_{i=1}^{k}\lambda_{i}\right)\int_{0}^{t}\int_{0}^{t-s_{1}}\cdots\int_{0}^{t
- \sum_{i=1}^{k-1}s_{i}}e^{-\lambda_{k+1}(t - \sum_{i=1}^{k}s_{i}) -
\sum_{i=1}^{k}\lambda_{i}s_{i}}ds_{k}\ldots ds_{1} \\[5pt]
&=
(-1)^{k}\left(\prod_{i=1}^{k}\lambda_{i}\right)\int_{0}^{1}\int_{0}^{1-r_{1}}\cdots\int_{0}^{1
- \sum_{i=1}^{k-1}r_{i}}(-t)^{k}e^{-
t\sum_{i=1}^{k+1}\lambda_{i}r_{i}}dr_{k}\ldots dr_{1}
\end{aligned}\]
where \(r_{i} = s_{i}/t\) for \(i = 1,\ldots,k\) and \(r_{k+1} = 1 - \sum_{i=1}^{k}r_{i}\). Thus,
noting that \(f_{t}^{(n)}(x) =
(-t)^{n}e^{-tx}\), the result follows directly from Theorem
1.
◻
This is a remarkably clean formula considering that it works for any pure-birth process, and it does not seem to have appeared in the literature before. As a quick sanity check, when all rates equal \(\lambda\) we have \(f_{t}[\lambda^{(k+1)}] = (-t)^{k}e^{-\lambda t}/k!\), so the formula reduces to \((\lambda t)^{k}e^{-\lambda t}/k!\), the familiar Poisson distribution. However, the formula’s aesthetic appeal is tempered by the generally poor behavior of the generalized divided difference; though easy to compute recursively for any specific set of inputs, it is very hard to analyze in general. Luckily, this difficulty is mitigated in our setting by the specific properties of the exponential.
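To illustrate the “easy to compute recursively” part, here is a sketch that checks Theorem 2 against a Monte Carlo simulation for an arbitrary, patternless rate vector (all names are mine; the helper `dd_exp` is the same one sketched earlier, repeated so this snippet runs on its own).

```python
import math
import numpy as np

def dd_exp(t, xs):
    # divided difference f_t[x_1,...,x_{n+1}] for f_t(x) = exp(-t*x); see the earlier sketch
    xs = sorted(xs)
    n = len(xs) - 1
    if n == 0:
        return math.exp(-t * xs[0])
    if xs[0] == xs[-1]:
        return (-t) ** n * math.exp(-t * xs[0]) / math.factorial(n)
    return (dd_exp(t, xs[1:]) - dd_exp(t, xs[:-1])) / (xs[-1] - xs[0])

def birth_pmf(rates, t, k):
    # Theorem 2: P(X_t = k) = (-1)^k * (lambda_1 ... lambda_k) * f_t[lambda_1, ..., lambda_{k+1}]
    return (-1) ** k * math.prod(rates[:k]) * dd_exp(t, rates[:k + 1])

rates = [0.5, 1.3, 0.9, 2.0, 0.7, 1.1, 1.8, 0.6, 1.4, 2.2]  # enough rates to cover t = 1
t, trials = 1.0, 200_000
rng = np.random.default_rng(1)
counts = np.zeros(len(rates) + 1, dtype=int)
for _ in range(trials):
    s, k = 0.0, 0
    for lam in rates:
        s += rng.exponential(1.0 / lam)
        if s > t:
            break
        k += 1
    counts[k] += 1

for k in range(5):
    print(k, counts[k] / trials, birth_pmf(rates, t, k))  # empirical frequency vs. Theorem 2
```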
In the particular case of the exponential function, divided
differences are naturally connected to the notion of Hermite-Padé
approximation. Given a sequence of integers \(\textbf{r} = (r_{0},\ldots,r_{m})\), it is
a well-known result of Hermite that there exist unique (up to a constant
multiple) polynomials \(A_{r_{0}}^{\textbf{r}},\ldots,
A_{r_{m}}^{\textbf{r}}\) such that each \(A_{r_{i}}^{\textbf{r}}\) has degree at most
\(r_{i}\) and
\[\sum_{i=0}^{m}A_{r_{i}}^{\textbf{r}}(t)e^{it} =
O(t^{n+1})\]
where \(n = \sum_{i=0}^{m}r_{i} + m -
1\). The left-hand side in the expression above is called a
Hermite-Padé form (HPF) of degree \(m\) and order \(n\). The best-known such form is \(P_{p/q}(x) - Q_{p/q}(x)e^{x}\), where \(P_{p/q}(x)\) and \(Q_{p/q}(x)\) are the Padé approximant
polynomials for \(e^{x}\) of degrees
\(p\) and \(q\) respectively, given by
\[\begin{aligned}
P_{p/q}(t) &= \frac{1}{p!q!}\sum_{i=0}^{p}(p+q-i)!{p \choose
i}t^{i} \\[5pt]
Q_{p/q}(t) &= \frac{1}{p!q!}\sum_{j=0}^{q}(-1)^{j}(p+q-j)!{q
\choose j}t^{j}.
\end{aligned}\]
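To see the defining \(O(t^{p+q+1})\) property in action, here is a small SymPy sketch (the helper names `pade_P` and `pade_Q` are mine) that builds the two polynomials from the formulas above and expands \(P_{p/q}(t) - Q_{p/q}(t)e^{t}\) as a power series; every term below \(t^{p+q+1}\) should cancel.

```python
import sympy as sp

t = sp.symbols('t')

def pade_P(p, q):
    # P_{p/q}(t) as defined in the display above
    return sum(sp.factorial(p + q - i) * sp.binomial(p, i) * t**i
               for i in range(p + 1)) / (sp.factorial(p) * sp.factorial(q))

def pade_Q(p, q):
    # Q_{p/q}(t) as defined in the display above
    return sum((-1)**j * sp.factorial(p + q - j) * sp.binomial(q, j) * t**j
               for j in range(q + 1)) / (sp.factorial(p) * sp.factorial(q))

for p, q in [(1, 1), (2, 3), (4, 2)]:
    form = pade_P(p, q) - pade_Q(p, q) * sp.exp(t)
    # The printed series should start at t**(p+q+1), confirming the order of contact.
    print((p, q), sp.series(form, t, 0, p + q + 2))
```

For example, for \(p=q=1\) the printed series is \(t^{3}/6 + O(t^{4})\), exactly the claimed order of vanishing.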
For the function \(g_{t}(x) = e^{tx}\),
the connection between divided differences and Hermite-Padé forms was
codified by Sablonnière [3].
Theorem 3 (Sablonnière). The Hermite-Padé form
for a given \(\textbf{r}\in\mathbb{N}^{m+1}\) has the
following representation as a divided difference:
\[\sum_{i=0}^{m}A_{r_{i}}^{\textbf{r}}(t)e^{it} =
g_{t}[0^{(r_{0}+1)}, 1^{(r_{1}+1)},\ldots,m^{(r_{m}+1)}]\]
where \(k^{(r_{k}+1)}\) means that the
argument \(k\) has multiplicity \(r_{k}+1\) in the divided difference
above.
Application to the Two-Player Game
A special case of Theorem 3 is when \(m=1\), in which case we have
\[g_{t}[0^{(p+1)}, 1^{(q+1)}] =
(-1)^{q+1}(P_{p/q}(t) - Q_{p/q}(t)e^{t}).\]
In Sablonnière’s paper, the result is only proved for the inputs 0 and 1
to the divided difference, and only if they each appear with
multiplicity at least 1. We can rework this result for general inputs
\(\alpha\) and \(\beta\) with multiplicity \(\geq 0\), also swapping \(g_{t}(x)\) for the function \(f_{t}(x) = e^{-tx}\).
Theorem 4. For any \(\alpha,\beta > 0\) with \(\alpha\neq\beta\) and any \(n,m\in\mathbb{N}\), we have
\[f_{t}[\alpha^{(n)},\beta^{(m)}] =
\frac{(-1)^{n-1}}{(\alpha - \beta)^{n+m-1}}\left(e^{-\alpha
t}P_{n-1/m-1}(\alpha t - \beta t) - e^{-\beta t}Q_{n-1/m-1}(\alpha t -
\beta t)\right)\]
where \(P_{p/q}(t)\) and \(Q_{p/q}(t)\) are defined as above for \(p,q\geq 0\), and in addition we take
\[\begin{aligned}
P_{-1/q}(t) = 0\:\:\:\:\forall q\geq-1,& &\:\:\:\:\:
Q_{p/-1}(t) = 0\:\:\:\:\forall p\geq-1 \\[5pt]
P_{p/-1}(t) = \frac{t^{p}}{p!}\:\:\:\:\forall p\geq 0,&
&\:\:\:\:\: Q_{-1/q}(t) = \frac{t^{q}}{q!}\:\:\:\:\forall q\geq 0.
\end{aligned}\]
Proof. For the case where \(n=0\) or \(m=0\), the result follows from the
recursive definition of the divided difference and the definitions of
\(P\) and \(Q\); assuming without loss of generality
that \(m=0\), we have
\[\begin{aligned}
f_{t}[\alpha^{(n)}] &= \frac{(-t)^{n-1}e^{-\alpha
t}}{(n-1)!} \\[5pt]
&= \frac{(-1)^{n-1}e^{-\alpha t}}{(\alpha -
\beta)^{n-1}}\frac{(\alpha t - \beta t)^{n-1}}{(n-1)!} \\[5pt]
&= \frac{(-1)^{n-1}e^{-\alpha t}}{(\alpha -
\beta)^{n-1}}P_{n-1/-1}(\alpha t - \beta t).
\end{aligned}\]
For the case where \(n=1\) or \(m=1\), we again assume without loss of
generality that \(m=1\) and proceed by
induction. When \(n=m=1\), we
have
\[f_{t}[\alpha, \beta] = \frac{e^{-\alpha t}
- e^{-\beta t}}{\alpha - \beta} = \frac{1}{\alpha - \beta}(e^{-\alpha
t}P_{0/0}(\alpha t - \beta t) - e^{-\beta t}Q_{0/0}(\alpha t - \beta
t)).\]
For \(n\geq 2\), we assume the result
holds for \(n-1\), in which case we
have
\[\begin{aligned}
f_{t}[\alpha^{(n)}, \beta] &= \frac{f_{t}[\alpha^{(n)}]
- f_{t}[\alpha^{(n-1)}, \beta]}{\alpha - \beta} \\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha -
\beta)^{n}}\left(e^{-\alpha t}P_{n-1/-1}(\alpha t - \beta t) +
\left(e^{-\alpha t}P_{n-2/0}(\alpha t - \beta t) - e^{-\beta
t}Q_{n-2/0}(\alpha t - \beta t)\right)\right) \\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha -
\beta)^{n}}\left(e^{-\alpha t}\left(P_{n-1/-1}(\alpha t - \beta t) +
P_{n-2/0}(\alpha t - \beta t)\right) - e^{-\beta t}Q_{n-2/0}(\alpha t -
\beta t)\right).
\end{aligned}\]
By definition, \(Q_{p/0}(t) = 1\) for
all \(p\geq 0\), and in particular \(Q_{n-2/0}(t) = Q_{n-1/0}(t)\). Moreover,
for any \(t\) we have
\[\begin{aligned}
P_{n-1/-1}(t) + P_{n-2/0}(t) &= \frac{t^{n-1}}{(n-1)!} +
\frac{1}{(n-2)!}\sum_{i=0}^{n-2}(n-2-i)!{n-2 \choose i}t^{i} \\[5pt]
&= \frac{t^{n-1}}{(n-1)!} +
\frac{1}{(n-1)!}\sum_{i=0}^{n-2}(n-1)(n-2-i)!{n-2 \choose i}t^{i}
\\[5pt]
&= \frac{t^{n-1}}{(n-1)!} +
\frac{1}{(n-1)!}\sum_{i=0}^{n-2}(n-1-i)!{n-1 \choose i}t^{i} \\[5pt]
&= \frac{1}{(n-1)!}\sum_{i=0}^{n-1}(n-1-i)!{n-1 \choose
i}t^{i} \\[5pt]
&= P_{n-1/0}(t).
\end{aligned}\]
Hence, we have
\[f_{t}[\alpha^{(n)}, \beta] =
\frac{(-1)^{n-1}}{(\alpha - \beta)^{n}}\left(e^{-\alpha
t}P_{n-1/0}(\alpha t - \beta t) - e^{-\beta t}Q_{n-1/0}(\alpha t - \beta
t)\right)\]
as desired. For the general case where \(n,m\geq 2\), we again argue by induction.
The possible base cases are covered by the computations above, so we can
go straight to the induction step; assume the result holds for \(n, m-1\) and \(n-1,m\). We then have
\[\begin{aligned}
f_{t}[\alpha^{(n)}, \beta^{(m)}] &=
\frac{f_{t}[\alpha^{(n)}, \beta^{(m-1)}] - f_{t}[\alpha^{(n-1)},
\beta^{(m)}]}{\alpha - \beta}\\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha -
\beta)^{n+m-1}}\left(e^{-\alpha t}P_{n-1/m-2}(\alpha t - \beta t) -
e^{-\beta t}Q_{n-1/m-2}(\alpha t - \beta t)\right)\\
&\:\:\:\:\:\:\:\:\:\:\:+\frac{(-1)^{n-1}}{(\alpha -
\beta)^{n+m-1}}\left(e^{-\alpha t}P_{n-2/m-1}(\alpha t - \beta t) -
e^{-\beta t}Q_{n-2/m-1}(\alpha t - \beta t)\right)\nonumber\\[5pt]
&= \frac{(-1)^{n-1}}{(\alpha - \beta)^{n+m-1}}e^{-\alpha
t}\left(P_{n-1/m-2}(\alpha t - \beta t) + P_{n-2/m-1}(\alpha t - \beta
t)\right)\\
&\:\:\:\:\:\:\:\:\:\:\:-\frac{(-1)^{n-1}}{(\alpha -
\beta)^{n+m-1}}e^{-\beta t}\left(Q_{n-1/m-2}(\alpha t - \beta t) +
Q_{n-2/m-1}(\alpha t - \beta t)\right).
\end{aligned}\]
The fact that \(P_{n-1/m-2}(t) +
P_{n-2/m-1}(t) = P_{n-1/m-1}(t)\) and \(Q_{n-1/m-2}(t) + Q_{n-2/m-1}(t) =
Q_{n-1/m-1}(t)\) is proved in [3], and thus we
conclude
\[f_{t}[\alpha^{(n)}, \beta^{(m)}] =
\frac{(-1)^{n-1}}{(\alpha - \beta)^{n+m-1}}\left(e^{-\alpha
t}P_{n-1/m-1}(\alpha t - \beta t) - e^{-\beta t}Q_{n-1/m-1}(\alpha t -
\beta t)\right)\]
as desired.
◻
Now, noting that the two-player game described above is simply the pure-birth process associated with \(\lambda = (\alpha,\beta,\alpha,\beta,\ldots)\), the desired p.m.f. falls out as a corollary of Theorems 2 and 4.
Corollary 1. Let \((X_{t}^{\alpha,\beta})\) denote the
pure-birth process associated with the rate vector \(\lambda =
(\alpha,\beta,\alpha,\beta,\ldots)\), where \(\alpha,\beta > 0\) with \(\alpha\neq\beta\). We then have
\[\mathbb{P}(X_{t}^{\alpha,\beta} = k) =
(-1)^{k+k_{3}}\frac{\alpha^{k_{1}}\beta^{k_{2}}}{(\alpha-\beta)^{k}}\left(e^{-\alpha
t}P_{k_{3}/k_{4}}(\alpha t - \beta t) - e^{-\beta
t}Q_{k_{3}/k_{4}}(\alpha t - \beta t)\right)\]
where
\[k_{1} = \left\lceil
\frac{k}{2}\right\rceil,\:\:\:\:k_{2} = \left\lfloor
\frac{k}{2}\right\rfloor,\:\:\:\:k_{3} = \left\lceil
\frac{k-1}{2}\right\rceil, \:\:\:\:k_{4} = \left\lfloor
\frac{k-1}{2}\right\rfloor.\]
Let’s check that this formula actually works. We can code up a simulation of the two-player game with \(\alpha=1\) and \(\beta=2\), record the empirical distribution of the number of turns taken up to time \(t=5\) over 1,000,000 trials, and compare it to the plot given by the formula above.
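Here is one way such a check might look in Python (all helper names are mine; the Padé helpers include the edge-case conventions from Theorem 4). It simulates the game turn by turn and prints the empirical frequencies next to the values from Corollary 1; the pure-Python loop over a million trials takes a little while, but it keeps the sketch simple.

```python
import math
import numpy as np

def pade_P(p, q, x):
    # P_{p/q}(x), with the conventions P_{-1/q} = 0 and P_{p/-1}(x) = x^p / p!
    if p == -1:
        return 0.0
    if q == -1:
        return x ** p / math.factorial(p)
    return sum(math.factorial(p + q - i) * math.comb(p, i) * x ** i
               for i in range(p + 1)) / (math.factorial(p) * math.factorial(q))

def pade_Q(p, q, x):
    # Q_{p/q}(x), with the conventions Q_{p/-1} = 0 and Q_{-1/q}(x) = x^q / q!
    if q == -1:
        return 0.0
    if p == -1:
        return x ** q / math.factorial(q)
    return sum((-1) ** j * math.factorial(p + q - j) * math.comb(q, j) * x ** j
               for j in range(q + 1)) / (math.factorial(p) * math.factorial(q))

def alternating_pmf(k, t, a, b):
    """P(X_t^{a,b} = k) from Corollary 1 (requires a != b)."""
    k1, k2 = math.ceil(k / 2), k // 2
    k3, k4 = math.ceil((k - 1) / 2), (k - 1) // 2
    x = (a - b) * t
    pref = (-1) ** (k + k3) * a ** k1 * b ** k2 / (a - b) ** k
    return pref * (math.exp(-a * t) * pade_P(k3, k4, x)
                   - math.exp(-b * t) * pade_Q(k3, k4, x))

# Simulate the two-player game: player 1 has rate a, player 2 has rate b.
a, b, t, trials = 1.0, 2.0, 5.0, 1_000_000
rng = np.random.default_rng(42)
counts = {}
for _ in range(trials):
    s, k = 0.0, 0
    while True:
        s += rng.exponential(1.0 / (a if k % 2 == 0 else b))  # turn k+1 has rate a when k is even
        if s > t:
            break
        k += 1
    counts[k] = counts.get(k, 0) + 1

for k in range(12):
    print(k, counts.get(k, 0) / trials, alternating_pmf(k, t, a, b))  # empirical vs. Corollary 1
```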
[Plots: empirical distribution of turns from the simulation vs. the p.m.f. from Corollary 1, for \(\alpha=1\), \(\beta=2\), \(t=5\).]
From these plots, it appears we’ve successfully found a formula for the p.m.f. of a pure-birth process with alternating rates. Nice! Now it just needs a catchy name; my suggestions are the “interlaced Poisson distribution” or the “dual Poisson distribution”, but I will allow the scholars of the future the freedom to decide what best to call it.
References
[1] Antonio Di Crescenzo, Antonella Iuliano, and Barbara Martinucci. On a bilateral birth-death process with alternating rates. Ricerche di Matematica, 61(1):157–169, November 2011.
[2] Kendall E. Atkinson. An Introduction to Numerical Analysis. John Wiley & Sons, New York, second edition, 1989.
[3] Paul Sablonnière. An algorithm for the computation of Hermite–Padé approximations to the exponential function: divided differences and Hermite–Padé forms. Numerical Algorithms, 33:443–452, 2003.