# Call option price function in Bernstein polynomial basis with no-arbitrage inequality constraints

## Abstract

We propose an efficient method for constructing an arbitrage-free call option price function from observed call price quotes. The no-arbitrage theory of option pricing places several shape constraints on the option price function. For each available maturity on a given trading day, the proposed method estimates the option price as a function of strike price using a Bernstein polynomial basis. Using the properties of this basis, we transform the constrained functional regression problem into a finite-dimensional least-squares problem and reduce sufficient conditions for no-arbitrage pricing to a set of linear constraints. The resulting linearly constrained least-squares problem can be solved with an efficient quadratic programming algorithm. The proposed method is easy to use and constructs a smooth call price function that is arbitrage-free on the entire strike domain from any finite number of observed call price quotes. We test the method empirically on S&P 500 option price data and compare the results with the cubic spline smoothing method to assess its applicability.

## 1 Introduction

The arbitrage-free option price function, defined across strike prices and estimated from the available quotes, has been studied extensively by researchers and practitioners (see, e.g., ). This function contains valuable information about the risk inherent in the underlying asset. Policy makers and investors estimate the state price density from option prices, which reflects the probabilities that market participants ascribe to future asset price movements . Practitioners frequently compute the implied volatility, that is, the volatility at which the Black-Scholes model value equals the observed price of the option. The implied volatility is then used to price more complex exotic options that are either thinly traded or have no recorded quotes.

There are three essential difficulties in constructing an arbitrage-free option price function from the available bid-ask quotes. First, on a given trading day, the bid-ask quotes of an option contingent on an underlying asset are known only for a limited number of unevenly distributed strike prices. Second, high bid-ask spreads are observed for the options with the strike price far from the current underlying asset price which makes finding the right price more challenging. Third, the observed quotes are not always free from an arbitrage opportunity. It is crucial to have an option price function which does not allow for any arbitrage, to avoid any mispricing.

An option price function is arbitrage-free in the strike domain if it is monotonic and convex and satisfies certain bounds on itself and on its first derivative [2, 7]. An arbitrage-free option price function must therefore meet these shape restrictions even though its functional form is not known a priori. The only other information available is that it should ideally fall within the observed bid and ask quotes. Mathematically, estimating such a function is a generalized constrained functional regression problem.

Several approaches have been suggested for the construction of the arbitrage-free option price function. Bates  was the first to use constrained cubic spline fitting to interpolate an arbitrage-free option price function from observed option transaction data. Kahalé  proposed a technique that uses piecewise convex polynomial interpolation to approximate the call price function. However, this algorithm requires the input data to be made arbitrage-free beforehand, which may lead to a substantial loss of information.

Aït-Sahalia and Duarte  proposed a two-step method to estimate the arbitrage-free call price function. First, they use a constrained least-squares procedure that incorporates the no-arbitrage shape constraints of monotonicity and convexity, and then they smooth the result using local polynomials. In the same vein, Birke and Pilz  and Fan and Mancini  suggested alternative kernel regression estimators for the arbitrage-free call price function. The drawback of kernel-based regression is that it is computationally intensive.

Fengler  proposed a procedure based on quadratic programming using the cubic spline smoothing on the call price data to get an arbitrage-free implied volatility. Along the same line, more spline-based models are described in the literature . Unfortunately, selecting the optimal number and location of the knots for spline fitting under shape constraints leads to a highly nonlinear optimization problem .

In this study, we use a Bernstein polynomial basis  to estimate the arbitrage-free option price function. The choice of this basis is natural because the constraints of monotonicity and convexity, together with the bounds on the function and its first derivative over the entire domain, lead to a set of linear inequality constraints on the unknown parameters. The advantage is that we only need to solve a finite-dimensional least-squares problem with a few linear constraints. Moreover, the basis yields a smooth estimate that can be obtained as the unique solution of a quadratic programming problem, which makes the method computationally attractive and efficient.

The class of Bernstein polynomials is dense in the space of continuous functions on any bounded closed interval, $$\mathcal{C}^{0}[a, b]$$, under the supremum norm, and all of the derivatives possess the same convergence properties . Furthermore, Chak et al.  proved that the univariate Bernstein polynomial estimator is consistent not only for the true function but also for its first and second derivatives. In the literature, Bernstein polynomial regression has been employed with several shape constraints, including non-negativity, unimodality, monotonicity, and convexity (or concavity), either alone or in combination (see, e.g., ).

The paper is organized as follows. Section 2 presents the no-arbitrage inequalities on the call price function. Section 3 describes our proposed methodology for approximating the call price function in a Bernstein polynomial basis under the inequality constraints arising from the no-arbitrage conditions, and derives the quadratic programming formulation of the estimation problem. In Section 4, we examine the empirical applicability of the proposed model using S&P 500 option price data. We compare our computed results with the cubic spline smoothing method in Section 5. Finally, we conclude in Section 6.

## 2 No-arbitrage inequalities on the call price function

On a single underlying asset $$S_{t}$$, at current time t, the price of a European call option $$C(K)$$ with strike price K, at fixed expiry time T, time to maturity $$\tau= T-t$$, risk-free interest rate r, and deterministic dividend rate q, is given by

$$C(K) = e^{-r\tau} \int_{0}^{\infty} \max(S_{T}-K,0)g(S_{T}) \,dS_{T},$$
(1)

provided the probability density function $$g(S_{T})$$ of the state price $$S_{T}$$ exists. Under the assumption of no-arbitrage, the fundamental theorem of asset pricing ensures the existence of $$g(S_{T})$$, which is usually called the state price density (SPD) (see [23, 24]).

Differentiating (1) w.r.t. the strike price K we get

$$\frac{\partial C}{\partial K} = -e^{-r\tau} \int_{K}^{\infty}g(S_{T})\,dS_{T}.$$
(2)

Since the state price density $$g(S_{T})$$ is a probability density function, it is non-negative and integrates to one. Now,

$$g(S_{T}) \geq0 \quad\Rightarrow\quad \frac{\partial C}{\partial K} \leq0 \quad\mbox{and}\quad \int_{0}^{\infty}g(S_{T})\,dS_{T} = 1 \quad\Rightarrow\quad \frac{\partial C}{\partial K} \geq-e^{-r\tau}.$$

So we have

$$-e^{-r\tau}\leq\frac{\partial C}{\partial K} \leq0.$$
(3)

Thus, the call option price function needs to be monotonically non-increasing and further the slope of the function must be bounded below uniformly.

Differentiating (2) again w.r.t. the strike price K, we get the famous relation given by Breeden and Litzenberger ,

$$\frac{\partial^{2} C}{\partial K^{2}} = e^{-r\tau}g(K).$$
(4)

Equation (4) implies that the second derivative of the call price function is proportional to the state price density,

$$\Rightarrow\quad \frac{\partial^{2} C}{\partial K^{2}} \geq0.$$
(5)

Hence, the call price function has to be globally convex as any local non-convexity of the call price function implies a negative state price density.

In addition, the no-arbitrage theory also imposes the following bounds on the call price function:

\begin{aligned} \max\bigl(e^{-q\tau}S_{t} - e^{-r\tau}K,0 \bigr)\leq C(K) \leq S_{t}e^{-q\tau}. \end{aligned}
(6)

Therefore, an arbitrage-free call price function must satisfy the set of inequality constraints given by (3), (5), and (6).
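As a rough illustration, inequalities (3), (5), and (6) can be screened on a discrete set of quotes. The sketch below is not part of the proposed method: the function name `check_no_arbitrage` and the tolerance `tol` are hypothetical, and finite-difference slopes stand in for $$\partial C/\partial K$$, so passing the check is necessary but not sufficient for the continuous conditions.

```python
import math

def check_no_arbitrage(strikes, calls, spot, r, q, tau, tol=1e-10):
    """Screen (3), (5), (6) on a strictly increasing strike grid.

    Slopes between neighbouring strikes are finite-difference proxies
    for dC/dK. Returns a dict of booleans, one per group of conditions.
    """
    disc_r = math.exp(-r * tau)
    disc_q = math.exp(-q * tau)
    slopes = [
        (calls[i + 1] - calls[i]) / (strikes[i + 1] - strikes[i])
        for i in range(len(strikes) - 1)
    ]
    # (3): every slope must lie in [-exp(-r*tau), 0].
    slope_ok = all(-disc_r - tol <= s <= tol for s in slopes)
    # (5): convexity means the slopes are non-decreasing in strike.
    convex_ok = all(
        slopes[i] <= slopes[i + 1] + tol for i in range(len(slopes) - 1)
    )
    # (6): lower and upper price bounds at each quoted strike.
    bounds_ok = all(
        max(disc_q * spot - disc_r * k, 0.0) - tol <= c <= disc_q * spot + tol
        for k, c in zip(strikes, calls)
    )
    return {"slope": slope_ok, "convex": convex_ok, "bounds": bounds_ok}
```

A call such as `check_no_arbitrage([90.0, 100.0, 110.0], [12.0, 5.0, 1.0], 100.0, 0.01, 0.0, 0.5)` uses made-up quotes and reports all three groups as satisfied.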

## 3 Problem formulation

In this section, we describe the mathematical formulation used to construct the arbitrage-free call price function $$\hat{C}:[K_{\min}, K_{\max}] \rightarrow \mathbb{R}$$ from a given discrete set of observed data points. For a fixed expiry date, we assume that a finite sample of observations $$\{(K_{i}, C_{i}): K_{i} \in[K_{\min}, K_{\max}], i = 1:n\}$$ is available in the market, where the ordered pair $$(K_{i}, C_{i})$$ gives the price $$C_{i}$$ of the European call option with strike price $$K_{i}$$ on a single underlying asset.

Let us consider the regression model,

$$C_{i} = f(K_{i})+\epsilon_{i},\quad i = 1, 2, \ldots, n,$$

where the error term $$\epsilon_{i}$$ has zero mean with finite variance. The set of ordered pairs $$\{(K_{i},C_{i})\}_{i=1}^{n}$$ are assumed to be independent and identically distributed. Here, the unknown call price function $$f(K)$$ is assumed to belong to a class of smooth functions having some restrictions on their shapes and can be estimated by minimizing the empirical $$L_{2}$$ norm, i.e.,

$$\hat{C}(K)= \arg \min _{f(K) \in\Im_{c} }\frac{1}{n}\sum _{i = 1}^{n}\bigl(f(K_{i})- C_{i} \bigr)^{2}.$$
(7)

Here $$\Im_{c} = \{f(K) \in \mathcal{C}^{2}[K_{\min}, K_{\max}]: f(K) \mbox{ satisfies the inequality constraints (3), (5), and (6)}\}$$.

Nevertheless, the constrained functional regression problem (7) has some basic mathematical challenges, such as

• only a finite sample of noisy and unevenly distributed observations are available;

• the actual functional form of the regression function f is not known a priori;

• the constraints arising from no-arbitrage conditions must be satisfied everywhere in the domain;

• the constraints are imposed not only on the function but also on its first and second derivative.

As the actual functional form of the regression function f is not known a priori we propose to approximate the regression function in a Bernstein polynomial basis. Using the properties of the Bernstein polynomial basis, we now transform the constrained functional regression problem (7) into a finite-dimensional least-squares problem with linear constraints.

### 3.1 Finite-dimensional problem in Bernstein polynomial basis

Let us apply the linear transformation $$K \longmapsto x$$, where $$x = \frac{K-K_{\min}}{K_{\max}-K_{\min}}$$, to map the domain $$[K_{\min}, K_{\max}]$$ to $$[0, 1]$$. The inequality constraints on f given by (3), (5), and (6) then transform to the following conditions:

\begin{aligned}& -(K_{\max}-K_{\min})e^{-r\tau} \leq \frac{\partial f}{\partial x} \leq0, \end{aligned}
(8)
\begin{aligned}& \frac{\partial^{2} f}{\partial x^{2}} \geq0, \end{aligned}
(9)
\begin{aligned}& \max\bigl(e^{-q\tau}S_{t} - e^{-r\tau} \bigl((K_{\max}- K_{\min}) x + K_{\min}\bigr), 0\bigr) \leq f(x) \leq S_{t}e^{-q\tau}. \end{aligned}
(10)
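The scaling in (8) and (9) follows from the chain rule applied to the substitution above. As a short worked step, writing $$f(x) = C(K)$$ with $$K = K_{\min} + (K_{\max}-K_{\min})x$$:

$$\frac{\partial f}{\partial x} = (K_{\max}-K_{\min})\frac{\partial C}{\partial K}, \qquad \frac{\partial^{2} f}{\partial x^{2}} = (K_{\max}-K_{\min})^{2}\frac{\partial^{2} C}{\partial K^{2}}.$$

Multiplying (3) by the positive factor $$(K_{\max}-K_{\min})$$ gives (8), and since $$(K_{\max}-K_{\min})^{2} > 0$$ the sign condition (5) carries over unchanged to (9).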

Now, for any continuous function f in $$[0, 1]$$, the approximating Bernstein polynomial of order N is given by

$$B_{N}(x;f) = \sum_{k=0}^{N}f(k/N) \binom{N}{k}x^{k}(1-x)^{N-k} = \sum _{k=0}^{N}\beta_{k}b_{k}(x,N),$$
(11)

where $$\{ b_{k}(x, N)=\binom{N}{k}x^{k}(1-x)^{N-k}, k=0,1, \ldots, N \}$$ is the Bernstein basis for polynomials of degree at most N and $$\boldsymbol {\beta} _{N} =\{\beta_{k}: \beta_{k} = f(\frac{k}{N}), k = 0, 1, \ldots,N\}$$ are the corresponding coefficients.
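A minimal sketch of evaluating the basis and the polynomial in (11) may help fix the notation. The function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def bernstein_basis(x, N):
    """Return [b_0(x, N), ..., b_N(x, N)] for x in [0, 1]."""
    return [comb(N, k) * x**k * (1.0 - x) ** (N - k) for k in range(N + 1)]

def bernstein_poly(beta, x):
    """Evaluate sum_k beta_k * b_k(x, N), with N = len(beta) - 1."""
    N = len(beta) - 1
    return sum(b * w for b, w in zip(beta, bernstein_basis(x, N)))

def to_unit_interval(K, K_min, K_max):
    """Map a strike K in [K_min, K_max] to x in [0, 1]."""
    return (K - K_min) / (K_max - K_min)
```

With coefficients `beta_k = f(k / N)` this reproduces the approximating polynomial in (11); in a regression setting the same evaluation applies with the coefficients treated as free parameters.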

So, if we approximate $$f(x)$$ in the Bernstein polynomial basis of order N, the constrained functional regression problem (7) transforms into the following finite-dimensional problem of estimation of $$\boldsymbol {\beta}_{N}$$:

$$\hat{C}(x) = \arg \min _{B_{N}(x,f)\in\Im_{N}}\frac{1}{n}\sum _{i=1}^{n} \bigl(B_{N}(x_{i},f) -C_{i} \bigr)^{2},$$
(12)

where

$$\Im_{N}=\bigl\{ B_{N}(x,f): B_{N}(x,f) \mbox{ satisfies the inequality constraints } (8), (9), \mbox{ and } (10) \bigr\} .$$

By the Weierstrass theorem, $$B_{N}(x;f) \rightarrow f(x)$$ uniformly over $$[0, 1]$$ as $$N\rightarrow\infty$$ . Additionally, the derivatives of $$B_{N}(x;f)$$ satisfy the existing bounds of the corresponding derivatives of f (see ) and possess the same convergence properties (see ). Moreover, using the properties of the Bernstein polynomial basis, the inequality constraints on $$B_{N}(x;f)$$ and its derivatives can be transformed into linear inequality constraints involving $$\boldsymbol {\beta}_{N}$$ only.

### 3.2 Constraints in Bernstein polynomial basis

Bernstein polynomial basis functions are non-negative on $$x \in[0, 1 ]$$. Also, they form a partition of unity, i.e., $$\sum_{k=0}^{N} b_{k}(x, N) = 1$$. The first and second derivatives of $$B_{N}(x;f)$$, $$N \geq2$$ can be written

\begin{aligned}& B_{N}^{\prime}(x;f) = \sum _{k=0}^{N}\beta_{k}b_{k}^{\prime}(x,N) = N \sum_{k=0}^{N-1}(\beta_{k+1}- \beta_{k})b_{k}(x,N-1), \end{aligned}
(13)
\begin{aligned}& B_{N}^{\prime\prime}(x;f) = \sum_{k=0}^{N} \beta_{k}b_{k}^{\prime\prime }(x,N)= N(N-1)\sum _{k=0}^{N-2}(\beta_{k+2}-2 \beta_{k+1}+\beta_{k})b_{k}(x,N-2). \end{aligned}
(14)
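Formulas (13) and (14) say that differentiating a Bernstein polynomial amounts to differencing its coefficients and lowering the degree. A minimal sketch with hypothetical function names:

```python
from math import comb

def _basis(x, N):
    """Bernstein basis values b_0(x, N), ..., b_N(x, N)."""
    return [comb(N, k) * x**k * (1.0 - x) ** (N - k) for k in range(N + 1)]

def bernstein_value(beta, x):
    """Evaluate the Bernstein polynomial with coefficients beta at x."""
    return sum(b * w for b, w in zip(beta, _basis(x, len(beta) - 1)))

def bernstein_d1(beta, x):
    """First derivative via (13): N * sum_k (beta_{k+1} - beta_k) b_k(x, N-1)."""
    N = len(beta) - 1
    first_diffs = [beta[k + 1] - beta[k] for k in range(N)]
    return N * bernstein_value(first_diffs, x)

def bernstein_d2(beta, x):
    """Second derivative via (14), using second differences (needs N >= 2)."""
    N = len(beta) - 1
    second_diffs = [beta[k + 2] - 2.0 * beta[k + 1] + beta[k] for k in range(N - 1)]
    return N * (N - 1) * bernstein_value(second_diffs, x)
```

Because the lower-degree basis functions are non-negative, the sign of each derivative on $$[0, 1]$$ is controlled entirely by the signs of these coefficient differences.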

Since $$b_{k}(x, N-2)$$ is non-negative on $$[0,1]$$,

$$(\beta_{k+2}-2\beta_{k+1}+\beta_{k}) \geq0, \quad\forall k = 0:N-2 \quad\Rightarrow\quad B_{N}^{\prime\prime}(x;f) \geq0,\quad \forall x \in [0, 1].$$

The above set of inequalities can also be written $$\beta_{1}-\beta _{0}\leq\cdots\leq\beta_{N}-\beta_{N-1}$$. Then

\begin{aligned}& (\beta_{N}-\beta_{N-1})\leq0\\& \quad\Rightarrow\quad \beta_{1}-\beta_{0}\leq\cdots\leq \beta_{N}-\beta _{N-1}\leq0 \\& \quad\Rightarrow \quad B_{N}^{\prime}(x;f) \leq0,\quad \forall x \in[0, 1]. \end{aligned}
(15)

Further, as $$\sum_{k=0}^{N-1}b_{k}(x,N-1)=1$$,

\begin{aligned}& \beta_{1}-\beta_{0} \geq \frac{-(K_{\max}-K_{\min})e^{-r\tau}}{N} \\& \quad\Rightarrow\quad \frac{-(K_{\max}-K_{\min})e^{-r\tau}}{N} \leq \beta_{1}-\beta_{0} \leq \cdots\leq \beta_{N}-\beta_{N-1} \\& \quad\Rightarrow\quad N \sum_{k=0}^{N-1}( \beta_{k+1}-\beta_{k})b_{k}(x,N-1) \geq -(K_{\max}-K_{\min})e^{-r\tau}\sum _{k=0}^{N-1}b_{k}(x,N-1) \\& \quad\Rightarrow\quad B_{N}^{\prime}(x;f) \geq -(K_{\max}-K_{\min})e^{-r\tau},\quad \forall x \in[0, 1]. \end{aligned}
(16)

Furthermore,

\begin{aligned}& \beta_{1}-\beta_{0} \geq -(K_{\max}-K_{\min})e^{-r\tau}/N\\& \quad\Rightarrow\quad \beta_{1} \geq \beta_{0} -(K_{\max}-K_{\min})e^{-r\tau}/N. \end{aligned}

So,

\begin{aligned}& \beta_{0} \geq e^{-q\tau}S_{t} - e^{-r\tau}K_{\min}\\& \quad\Rightarrow\quad \beta_{1} \geq e^{-q\tau}S_{t} -e^{-r\tau }\biggl((K_{\max}- K_{\min})\frac{1}{N}+ K_{\min}\biggr). \end{aligned}

Similarly, for $$k = 2, 3, \ldots, N$$,

$$\beta_{k} \geq e^{-q\tau}S_{t} -e^{-r\tau} \biggl((K_{\max}- K_{\min})\frac {k}{N}+ K_{\min} \biggr).$$

Then

$$B_{N}(x;f) \geq \sum_{k=0}^{N} \biggl(e^{-q\tau}S_{t} -e^{-r\tau }\biggl((K_{\max}- K_{\min})\frac{k}{N}+ K_{\min}\biggr) \biggr) b_{k}(x,N).$$

The right-hand side of the above inequality is the Bernstein polynomial representation of the linear function $$e^{-q\tau}S_{t} - e^{-r\tau} ((K_{\max}- K_{\min}) x + K_{\min})$$, which the basis reproduces exactly. Thus, we can write

$$B_{N}(x;f) \geq e^{-q\tau}S_{t} - e^{-r\tau} \bigl((K_{\max}- K_{\min}) x + K_{\min}\bigr), \quad\forall x \in[0,1].$$
(17)
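The step leading to (17) relies on the linear precision of the Bernstein basis. A short derivation, using only the identity $$\frac{k}{N}\binom{N}{k} = \binom{N-1}{k-1}$$ and the binomial theorem:

$$\sum_{k=0}^{N}\frac{k}{N}\binom{N}{k}x^{k}(1-x)^{N-k} = x\sum_{k=1}^{N}\binom{N-1}{k-1}x^{k-1}(1-x)^{N-k} = x\bigl(x+(1-x)\bigr)^{N-1} = x.$$

Combined with the partition of unity, this shows that a linear function $$a + bx$$ has Bernstein coefficients $$a + b\frac{k}{N}$$ and is reproduced exactly for every $$N \geq 1$$.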

Now,

$$B_{N}^{\prime}(x;f) \leq 0 \quad\Rightarrow \quad B_{N}(0;f) \geq B_{N}(x;f) \geq B_{N}(1;f),\quad \forall x \in[0,1].$$

Moreover, we have $$B_{N}(0;f) = \beta_{0}$$, and $$B_{N}(1;f) = \beta _{N}$$.

Then

$$\beta_{N} \geq 0 \quad\Rightarrow \quad B_{N}(x;f) \geq 0, \quad \forall x \in[0,1],$$

and

$$\beta_{0} \leq S_{t}e^{-q\tau} \quad \Rightarrow\quad B_{N}(x;f) \leq S_{t}e^{-q\tau}, \quad \forall x \in[0,1].$$

Hence, $$B_{N}(x;f)$$ satisfies (8), (9), and (10) if the following conditions hold:

\begin{aligned}& (\beta_{k}-2\beta_{k+1}+\beta_{k+2}) \geq 0,\quad \forall k=0:N-2, \end{aligned}
(18)
\begin{aligned}& (\beta_{N-1}-\beta_{N}) \geq 0, \end{aligned}
(19)
\begin{aligned}& (-\beta_{0} +\beta_{1}) \geq \frac{-e^{-r\tau}(K_{\max}- K_{\min})}{N}, \end{aligned}
(20)
\begin{aligned}& \beta_{N} \geq 0, \end{aligned}
(21)
\begin{aligned}& \beta_{0} \geq e^{-q\tau}S_{t} - e^{-r\tau}K_{\min}, \end{aligned}
(22)
\begin{aligned}& -\beta_{0} \geq -e^{-q\tau}S_{t} . \end{aligned}
(23)

The finite-dimensional problem (12) can be written as the following least-squares minimization problem with linear constraints:

$$\hat{\beta}_{N} = \arg \min _{\boldsymbol {\beta}_{N} \in\mathfrak {B}_{N}}\frac{1}{n}\sum _{i=1}^{n}\bigl(\mathbf{b}_{N}(x_{i}) \boldsymbol {\beta} _{N}^{\top}- C_{i}\bigr)^{2},$$
(24)

where $$\mathfrak{B}_{N}=\{\boldsymbol {\beta}_{N} \in\mathbb{R}^{N+1} : \mathbf {A}_{N}\boldsymbol {\beta}_{N}^{\top}\geq\mathbf{d}_{N}\}$$ and $$\mathbf{b}_{N}(x) = (b_{0}(x,N),b_{1}(x,N),\ldots,b_{N}(x,N))$$ is a row vector of order $$(N+1)$$. Here, $$\mathbf{A}_{N}$$ is a matrix of order $$(N+4)\times(N+1)$$ and $$\mathbf{d}_{N}$$ is a column vector of order $$(N+4)$$ given by

\begin{aligned} \mathbf{A}_{N} = \begin{pmatrix} 1 & -2 & 1 \\ &1 & -2 & 1 \\ & & & \cdots& & & & \\ & & & & & 1& -2& 1\\ & & & & & & 1& -1 \\ & & & & & & & 1\\ -1& 1& & & & & & \\ 1& & & & & & & \\ -1& & & & & & & \end{pmatrix},\qquad \mathbf{d}_{N} = \begin{pmatrix} 0 \\ 0 \\ \vdots\\ \vdots\\ 0 \\ \frac{-e^{-r\tau}(K_{\max}-K_{\min})}{N} \\ e^{-q\tau}S_{t} - e^{-r\tau}K_{\min} \\ -e^{-q\tau}S_{t} \end{pmatrix}. \end{aligned}
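The constraint system can be assembled row by row in the same order as the matrix above. The following is a minimal sketch under stated assumptions: the function names are hypothetical, plain nested lists stand in for matrices, and the feasibility helper uses an arbitrary tolerance.

```python
import math

def build_constraints(N, spot, r, q, tau, K_min, K_max):
    """Return (A, d) as lists so that A @ beta >= d encodes (18)-(23)."""
    rows, rhs = [], []

    def add(entries, bound):
        row = [0.0] * (N + 1)
        for idx, val in entries:
            row[idx] = val
        rows.append(row)
        rhs.append(bound)

    disc_r = math.exp(-r * tau)
    disc_q = math.exp(-q * tau)
    # (18): convexity, N - 1 second-difference rows.
    for k in range(N - 1):
        add([(k, 1.0), (k + 1, -2.0), (k + 2, 1.0)], 0.0)
    # (19): last first difference is non-positive.
    add([(N - 1, 1.0), (N, -1.0)], 0.0)
    # (21): non-negative value at the right end point.
    add([(N, 1.0)], 0.0)
    # (20): lower slope bound on the first difference.
    add([(0, -1.0), (1, 1.0)], -disc_r * (K_max - K_min) / N)
    # (22): lower price bound at K_min.
    add([(0, 1.0)], disc_q * spot - disc_r * K_min)
    # (23): upper price bound.
    add([(0, -1.0)], -disc_q * spot)
    return rows, rhs

def is_feasible(A, d, beta, tol=1e-9):
    """True if every row satisfies A_i . beta >= d_i - tol."""
    return all(
        sum(a * b for a, b in zip(row, beta)) >= di - tol
        for row, di in zip(A, d)
    )
```

The result has $$(N-1) + 5 = N+4$$ rows, matching the stated order of $$\mathbf{A}_{N}$$, and can be passed to any quadratic programming routine that accepts inequality constraints of this form.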

The above optimization problem can also be written as a general quadratic programming problem with linear inequality constraints,

$$\textstyle\begin{array}{@{}l@{\quad}l} \min_{\boldsymbol {\beta}_{N}} & \displaystyle-\mathbf{f}_{N}^{\top}\boldsymbol {\beta}_{N}^{\top}+ \frac{1}{2}\boldsymbol {\beta}_{N} \mathbf{H}_{N} \boldsymbol {\beta}_{N}^{\top}\\ \mbox{subject to } & \mathbf{A}_{N}\boldsymbol {\beta}_{N}^{\top}\geq\mathbf{d}_{N}; \end{array}$$
(25)

where $$\mathbf{f}_{N} = [f_{0}, f_{1}, \ldots, f_{N}]^{\top}$$ is a column vector of order $$(N+1)$$ with elements $$f_{i} = \sum_{j=1}^{n}C_{j}b_{i}(x_{j},N)$$, and the Hessian $$\mathbf{H}_{N}$$ is a symmetric matrix of order $$(N+1)$$ with elements

$$h_{i,j} = \sum_{k=1}^{n}b_{i}(x_{k},N)b_{j}(x_{k},N), \quad i,j = 0, 1, \ldots, N.$$

With this scaling, the objective in (25) equals $$n/2$$ times the objective in (24) up to an additive constant, so both problems have the same minimizer.
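A minimal sketch of assembling the quadratic programming terms from the data, taking $$\mathbf{H}_{N} = \mathbf{B}^{\top}\mathbf{B}$$ and $$\mathbf{f}_{N} = \mathbf{B}^{\top}\mathbf{C}$$, where $$\mathbf{B}$$ is the $$n \times (N+1)$$ matrix with rows $$\mathbf{b}_{N}(x_{i})$$. Under this scaling the quadratic objective equals half the residual sum of squares minus the constant $$\frac{1}{2}\sum_{i} C_{i}^{2}$$. The function names and the sample data in the usage note are hypothetical.

```python
from math import comb

def design_matrix(xs, N):
    """Rows are (b_0(x_i, N), ..., b_N(x_i, N))."""
    return [
        [comb(N, k) * x**k * (1.0 - x) ** (N - k) for k in range(N + 1)]
        for x in xs
    ]

def qp_terms(xs, calls, N):
    """Return (H, f) with H = B^T B and f = B^T C."""
    B = design_matrix(xs, N)
    H = [
        [sum(row[i] * row[j] for row in B) for j in range(N + 1)]
        for i in range(N + 1)
    ]
    f = [sum(c * row[i] for row, c in zip(B, calls)) for i in range(N + 1)]
    return H, f

def qp_objective(H, f, beta):
    """Evaluate 0.5 * beta^T H beta - f^T beta."""
    m = len(beta)
    quad = sum(beta[i] * H[i][j] * beta[j] for i in range(m) for j in range(m))
    return 0.5 * quad - sum(fi * bi for fi, bi in zip(f, beta))

def sse(xs, calls, beta):
    """Residual sum of squares of the Bernstein fit at the points xs."""
    B = design_matrix(xs, len(beta) - 1)
    return sum(
        (sum(b * w for b, w in zip(beta, row)) - c) ** 2
        for row, c in zip(B, calls)
    )
```

For any coefficient vector, `0.5 * sse(xs, calls, beta) - qp_objective(H, f, beta)` returns the same constant, which is a quick way to confirm that the two formulations share a minimizer.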

For a given N, if the Hessian matrix $$\mathbf{H}_{N}$$ is strictly positive definite, the quadratic programming problem (25) is well posed and can be solved in polynomial time . Optimal N may be selected adaptively based on information criteria such as $$AIC$$ used in  and defined by

\begin{aligned} AIC(N) = n \log(SSE/n)+ 2 edf, \end{aligned}
(26)

where $$SSE = \sum_{i=1}^{n}(B_{N}(x_{i};f)-C_{i})^{2}$$, and $$edf = N+1$$ is the effective degree of freedom. From equation (26), we choose the optimal N as $$N_{\mathrm{opt}} = \arg \min _{N \in\mathbb{N}} AIC(N)$$.
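The selection rule in (26) can be sketched as follows. The helper names are hypothetical, the search is assumed to run over a finite set of candidate orders for which the constrained fit has already been computed, and the SSE values in the usage note are made-up numbers for illustration only.

```python
import math

def aic(sse, n, N):
    """AIC(N) = n * log(SSE / n) + 2 * edf, with edf = N + 1 as in (26)."""
    return n * math.log(sse / n) + 2 * (N + 1)

def select_order(sse_by_order, n):
    """Return the order N with the smallest AIC from a dict {N: SSE}."""
    return min(sse_by_order, key=lambda N: aic(sse_by_order[N], n, N))
```

For instance, `select_order({2: 40.0, 3: 10.0, 4: 9.9}, 30)` picks order 3: moving from 3 to 4 lowers the SSE only slightly, which does not offset the penalty of 2 per additional coefficient.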

## 4 Empirical applications

In this paper, we test the empirical applicability of our proposed method on S&P 500 Index call option data. S&P 500 Index options are among the most liquid exchange-traded options in the world and are well suited as a test case, since numerous empirical studies have been performed on these data [2, 12]. We use the end-of-day quotes of the S&P 500 Index call options obtained from the Chicago Board Options Exchange (CBOE). For each trading day, the quotes comprise the last bid and last ask prices of options on the S&P 500 Index with different strikes and maturities.

For the exposition, we choose the call options with maturity dates (T) December 31, 2010 and February 19, 2011, as recorded on t = December 1, 2010. The S&P 500 Index closed at $$S_{t} = 1{,}206.07$$ on that day. As reported in , we take the annualized risk-free rate r to be 0.30% for the time to maturity $$\tau=30~\mbox{days}$$ and 0.37% for $$\tau= 80~\mbox{days}$$. We also take the implied forward $$F_{t}^{\tau} = S_{t}e^{(r-q)\tau }$$ to be 1,204.85 and 1,201.95 for the respective maturities. Using the risk-free rates, we calculate the dividend yield $$q = r - \frac{1}{\tau} \log (\frac{F_{t}^{\tau}}{S_{t}})$$ as 1.53% and 1.93% for the respective maturities. Since the call price can fluctuate around the midpoint of the bid and ask prices , we minimize the estimation error relative to the mid price of the bid-ask quotes, as suggested by Glaser and Heider . To eliminate error-prone observations and possible biases, we use a level 1 filter suggested by . After filtering, the number of remaining observations is $$n = 34$$ for $$\tau= 30~\mbox{days}$$ and $$n = 57$$ for $$\tau=80~\mbox{days}$$.
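The dividend yields quoted above can be checked with a few lines. The sketch assumes a 365-day year when converting days to maturity into years, which is not stated in the text, and the function name is hypothetical.

```python
import math

def implied_dividend_yield(spot, forward, r, tau):
    """q = r - log(F / S) / tau, obtained from F = S * exp((r - q) * tau)."""
    return r - math.log(forward / spot) / tau

spot = 1206.07
# Assumption: tau in years is days / 365.
q_30 = implied_dividend_yield(spot, 1204.85, 0.0030, 30 / 365)
q_80 = implied_dividend_yield(spot, 1201.95, 0.0037, 80 / 365)
print(f"{q_30:.2%}  {q_80:.2%}")
```

Under the 365-day assumption the two values round to 1.53% and 1.93%, in line with the figures reported in the text.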

We apply our proposed method to the data and solve the resulting quadratic programming problem with the interior-point-convex algorithm of the quadprog function in MATLAB. Using equation (26), for times to maturity τ equal to 30 days and 80 days, we get $$N_{\mathrm{opt}}$$ as 17 and 15, respectively.

To validate the computed results we compare the outcome with the popular cubic spline smoothing method for a single time to maturity as suggested by Fengler . The natural cubic spline smoothing estimator can be obtained by minimizing the penalized sum of squares under the linear constraints given in Section 3.2 in ,

$$\sum_{i=1}^{n} \bigl(C_{i} - f(K_{i})\bigr)^{2} + \lambda \int _{K_{\min}}^{K_{\max}}\bigl(f^{\prime\prime}(K) \bigr)^{2}\,dK,$$
(27)

where λ is a smoothing parameter. As the constraints on the first and second derivatives of the call price function act as a strong smoothing device, the choice of λ is of secondary importance, and so we fix λ at $$10^{-7}$$ as suggested in . Finding an optimal number of knots and locations of these knots in the presence of constraints leads to a nonlinear optimization problem . So, for the sake of simplicity, we fix the knots at the observation points.

Since practitioners regularly use the implied volatility to compare options prices across various strikes, expiries, and underlying assets, we also compare the implied volatility function corresponding to the estimated call price curves. In general, the implied volatilities are obtained by inverting the call option price function using the Black-Scholes equation  given by

$$C(S_{t},t;K,\tau,r,q,\hat{\sigma}) = S_{t}e^{-q\tau}\Phi(d_{1}) - K e^{-r\tau} \Phi(d_{2}),$$
(28)

where σ̂ is the implied volatility, Φ is the CDF of the standard normal distribution and $$d_{1} = \frac{\log{(S_{t}/K)} + (r-q+\hat{\sigma}^{2}/2)\tau}{\hat{\sigma}\sqrt{\tau}}$$, and $$d_{2} = d_{1} - \hat{\sigma}\sqrt{\tau}$$. It is well known that the implied volatility curve will be arbitrage-free if it is deduced from the call price curve satisfying no-arbitrage conditions .
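Inverting (28) for the volatility has no closed form, so a root finder is needed. The following is a minimal sketch using bisection, which works because the call price is increasing in volatility. The function names, the search bracket `[1e-6, 5.0]`, and the tolerance are assumptions made for illustration.

```python
import math

def norm_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_call(spot, strike, tau, r, q, sigma):
    """Black-Scholes call price as in equation (28)."""
    vol_sqrt = sigma * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (r - q + 0.5 * sigma**2) * tau) / vol_sqrt
    d2 = d1 - vol_sqrt
    return (spot * math.exp(-q * tau) * norm_cdf(d1)
            - strike * math.exp(-r * tau) * norm_cdf(d2))

def implied_vol(price, spot, strike, tau, r, q, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; returns None if no root lies in [lo, hi]."""
    p_lo = bs_call(spot, strike, tau, r, q, lo)
    p_hi = bs_call(spot, strike, tau, r, q, hi)
    if not (p_lo <= price <= p_hi):
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, tau, r, q, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The `None` branch covers prices below the lower bound $$e^{-q\tau}S_{t} - e^{-r\tau}K$$, for which no implied volatility exists.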

## 5 Results and discussion

The numerical results are compared with those of the cubic spline smoothing (CS) method to validate the performance. Figure 1 depicts the estimated arbitrage-free call price function for both maturities and shows that both our proposed estimator and the CS method match the mid price of the observed bid-ask quotes well. Figure 1 also confirms that both estimators satisfy the price bound constraints defined in equation (10) globally.

Figure 2 demonstrates that, even in the region of a deep out-of-money strike price (i.e., $$K \gg S_{t}$$), the estimated curves from both of the methods lie between the bid and ask quotes. Estimates computed using the CS method are a bit closer to the mid price of the observed bid-ask quotes.

To show how close the curve is to the mid price, we present the price residuals in Figure 3. For both maturities, larger deviations are observed for near-the-money options, which is consistent with the observation reported in .

Figure 4 displays the arbitrage-free implied volatility curve corresponding to the estimated call price function. We also show the implied volatility computed from the bid and ask prices. For a deep in-the-money strike price $$K_{\mathrm{deep}}$$ ($$\ll S_{t}$$), the implied volatility for the corresponding bid price $$C_{\mathrm{bid}}$$ does not exist because $$C_{\mathrm{bid}} < (e^{-q\tau}S_{t} - e^{-r\tau }K_{\mathrm{deep}})$$, i.e., the time value of the bid price is negative. Still, both methods produce a reasonable estimate of the implied volatility in that region. Again, the estimates from the CS method are slightly closer to the implied volatility corresponding to the mid price of the observed bid-ask quotes. On the other hand, our proposed method produces a slightly smoother estimate.

Figure 5 plots pointwise implied volatility residuals. We observe larger deviations for in-the-money and out-of-money options where the first derivative of the implied volatility with respect to call price (known as the inverse vega) is sensitive.

To verify that the no-arbitrage inequalities are implemented correctly, we compute the first derivative of the estimated call price function. It is obtained using equation (13) and transforming back to the strike domain. We display the derivative across strikes in Figure 6, which shows that the estimated derivative from both methods is monotonically increasing and satisfies the bounds given by equation (3). However, unlike the cubic spline method, our proposed method produces smooth estimates of the first derivative.

We also estimate the state price density function by computing the second derivative of the estimated call price function for both of the maturities. Figure 7 presents the estimated state price density function using both of the methods. The estimated state price density functions are positive in the entire domain, which shows that our proposed estimator satisfies the inequality arising from the convexity constraint. Since a relatively low penalty is added to the objective function to minimize equation (27), the contribution of the smoothness part becomes relatively negligible, which leads to a rough estimate of first and second derivatives of the cubic spline estimator. It is noticeable that unlike the CS method our proposed estimator produces a smooth state price density function.
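For the Bernstein estimator, the state price density follows from (4) and (14) once the second derivative is mapped back from $$x$$ to the strike domain. A minimal sketch, assuming fitted coefficients `beta` are already available; the function name is hypothetical.

```python
import math
from math import comb

def state_price_density(beta, K, K_min, K_max, r, tau):
    """g(K) = exp(r * tau) * d^2 C / dK^2, using (14) and the chain rule."""
    N = len(beta) - 1
    x = (K - K_min) / (K_max - K_min)
    second_diffs = [
        beta[k + 2] - 2.0 * beta[k + 1] + beta[k] for k in range(N - 1)
    ]
    # Bernstein basis of degree N - 2 evaluated at x.
    basis = [
        comb(N - 2, k) * x**k * (1.0 - x) ** (N - 2 - k) for k in range(N - 1)
    ]
    d2_dx2 = N * (N - 1) * sum(c * b for c, b in zip(second_diffs, basis))
    # Each derivative in K contributes a factor 1 / (K_max - K_min).
    return math.exp(r * tau) * d2_dx2 / (K_max - K_min) ** 2
```

Since the basis values are non-negative, the returned density is non-negative whenever all second differences of the coefficients are non-negative.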

Table 1 shows the comparative results in terms of the root mean square error (RMSE), the mean absolute error (MAE), and the percentage relative mean error (PRME) for the estimation of the call price function and the implied volatilities with respect to the mid price of the bid-ask quotes. It is apparent that both of the methods have an almost similar error behavior in terms of RMSE, MAE, and PRME measures.

Table 2 reports the computational details, i.e., the number of constraints, the number of iterations, and the average computational time taken by the optimization routine for the proposed Bernstein polynomial method (BP) and the cubic spline smoothing method (CS). The average is taken over 100 runs of the optimization routine starting from the second run. The number of constraints in the cubic spline method is $$2n + 1$$, where n is the number of observations, while our proposed method requires only $$N_{\mathrm{opt}}+4$$ constraints, where $$N_{\mathrm{opt}}$$ is the optimal order of the Bernstein polynomial basis and, in general, $$N_{\mathrm{opt}} \ll n$$. Although both methods take the same number of iterations, our proposed method consumes less computational time on average.

## 6 Conclusion

In this article, we propose an easy to use method for the construction of an arbitrage-free call option price function using a Bernstein polynomial basis. One of the most fundamental advantages here is that the estimation problem can be reduced to a quadratic programming problem with only a few linear constraints. The empirical results demonstrate that the proposed method is accurate and computationally efficient. For a limited number of bid-ask quotes, the method produces smooth estimates that satisfy all the inequality constraints imposed by no-arbitrage conditions on the entire domain. However, the spline-based model seems to provide a closer fit to the data, which may lead to overfitting and poor out-of-sample performance. These findings should motivate further research in the use of Bernstein polynomials for option pricing.

## References

1. Jackwerth, JC, Rubinstein, M: Recovering probability distributions from option prices. J. Finance 51(5), 1611-1631 (1996)

2. Aït-Sahalia, Y, Duarte, J: Nonparametric option pricing under shape restrictions. J. Econom. 116(1), 9-47 (2003)

3. Birke, M, Pilz, KF: Nonparametric option pricing with no-arbitrage constraints. J. Financ. Econom. 7(2), 53-76 (2009)

4. Fan, J, Mancini, L: Option pricing with model-guided nonparametric methods. J. Am. Stat. Assoc. 104(488), 1351-1372 (2009)

5. Fengler, MR: Arbitrage-free smoothing of the implied volatility surface. Quant. Finance 9(4), 417-428 (2009)

6. Bliss, RR, Panigirtzoglou, N: Testing the stability of implied probability density functions. J. Bank. Finance 26(2), 381-422 (2002)

7. Roper, M: Arbitrage free implied volatility surfaces. Preprint (2010)

8. Bates, DS: The crash of ’87: was it expected? The evidence from options markets. J. Finance 46(3), 1009-1044 (1991)

9. Kahalé, N: An arbitrage-free interpolation of volatilities. Risk 17(5), 102-106 (2004)

10. Laurini, MP: Imposing no-arbitrage conditions in implied volatilities using constrained smoothing splines. Appl. Stoch. Models Bus. Ind. 27(6), 649-659 (2011)

11. Orosi, G: Empirical performance of a spline-based implied volatility surface. J. Deriv. Hedge Funds 18(4), 361-376 (2012)

12. Orosi, G: Arbitrage-free call option surface construction using regression splines. Appl. Stoch. Models Bus. Ind. 31(4), 515-527 (2014)

13. Fengler, MR, Hin, L-Y: Semi-nonparametric estimation of the call-option price surface under strike and time-to-expiry no-arbitrage constraints. J. Econom. 184(2), 242-261 (2015)

14. Wang, X, Li, F: Isotonic smoothing spline regression. J. Comput. Graph. Stat. 17(1), 21-37 (2008)

15. Farouki, RT: The Bernstein polynomial basis: a centennial retrospective. Comput. Aided Geom. Des. 29(6), 379-419 (2012)

16. Lorentz, GG: Bernstein Polynomials. AMS Chelsea Publishing (2012)

17. Chak, PM, Madras, N, Smith, B: Semi-nonparametric estimation with Bernstein polynomials. Econ. Lett. 89(2), 153-156 (2005)

18. Chang, I, Hsiung, CA, Wu, Y-J, Yang, C-C, et al.: Bayesian survival analysis using Bernstein polynomials. Scand. J. Stat. 32(3), 447-466 (2005)

19. Chang, I, Chien, L-C, Hsiung, CA, Wen, C-C, Wu, Y-J, et al.: Shape restricted regression with random Bernstein polynomials. In: Complex Datasets and Inverse Problems: Tomography, Networks and Beyond, pp. 187-202 (2007)

20. Wang, J, Ghosh, SK: Shape restricted nonparametric regression with Bernstein polynomials. Comput. Stat. Data Anal. 56(9), 2729-2741 (2012)

21. Ding, J, Zhang, Z: Convex analysis in the semiparametric model with Bernstein polynomials. J. Korean Stat. Soc. 44(1), 58-67 (2015)

22. Turnbull, BC, Ghosh, SK: Unimodal density estimation using Bernstein polynomials. Comput. Stat. Data Anal. 72, 13-29 (2014)

23. Black, F, Scholes, M: The pricing of options and corporate liabilities. J. Polit. Econ. 81(3), 637-654 (1973)

24. Merton, RC: Theory of rational option pricing. Bell J. Econ. Manag. Sci. 4(1), 141-183 (1973)

25. Breeden, DT, Litzenberger, RH: Prices of state-contingent claims implicit in option prices. J. Bus. 51(4), 621-651 (1978)

26. Floater, MS: On the convergence of derivatives of Bernstein approximation. J. Approx. Theory 134(1), 130-135 (2005)

27. Floudas, CA, Visweswaran, V: Quadratic optimization. In: Handbook of Global Optimization, pp. 217-269 (1995)

28. Hentschel, L: Errors in implied volatility estimation. J. Financ. Quant. Anal. 38(4), 779-810 (2003)

29. Glaser, J, Heider, P: Arbitrage-free approximation of call price surfaces and input data risk. Quant. Finance 12(1), 61-73 (2012)

30. Constantinides, GM, Jackwerth, JC, Savov, A: The puzzle of index option returns. Rev. Asset Pricing Stud. 3(2), 229-257 (2013)

## Acknowledgements

We thank anonymous referees for their helpful suggestions. The first author is grateful to the Ministry of Human Resource and Development, India for the financial support to carry out this work.

## Author information

### Corresponding author

Correspondence to Sumit Kumar.

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

All the authors contributed equally and significantly in writing this paper. All authors read and approved the final manuscript.
