Optimal transmission of messages in computer networks – an optimal control problem involving control-dependent time-delayed arguments

Abstract

In this paper, we find the optimal transmission of messages in computer networks. This problem has been formulated as a nondelayed optimal control problem in several recent papers on TCP (transmission control protocol). Since the actual transmission of messages from origins to destinations should consist of both forward transmission delays of the buffers and latency of the links, we remodel the problem as a time-delayed optimal control problem consisting of both control-dependent time-delayed arguments and discrete time-delayed arguments. We then develop a modified control parameterization method for solving this time-delayed optimal control problem. The gradients of the new objective function and constraint functions generated by this modified control parametrization method are derived. A numerical example is solved by using the time-delayed version of the problem that we formulate, as well as the nondelayed version of the problem in the literature. Numerical results clearly illustrate the efficiency of the modified control parameterization method for solving both versions of this optimal transmission problem. Comparison of results of the two versions concerning the optimal transmission rates at the origins, the optimal output flow rates at the destination, and the queue sizes at the buffers are obtained. These comparison results clearly reflect how the optimal transmission of messages in computer networks in real life can be affected by both the forward transmission delays of the buffers and the latency of the links.

1 Introduction

In computer networks, queues build up in the buffer of the links when the input rates are larger than the available bandwidth. Long queues can cause congestion leading to packet losses and delays in transmission. Various versions of TCP (transmission control protocol) have been developed. The existing congestion control techniques aim to adjust the transmission rates of the competing users in such a way that the network resources are efficiently shared.

Since the pioneering works [1] and [2], considerable effort ([3–14]) has been devoted to the modeling and design of internet congestion control. These research works involve the establishment of a fluid model for finding the proper resource allocation of the network. Different resource allocation algorithms, such as primal, dual, and primal-dual algorithms, have been developed, which enable the user to find the optimal transmission rates asymptotically using local feedback from the network.

Despite the progress in the analysis and synthesis of internet congestion control, an important modeling issue has been neglected for simplicity. Specifically, most existing fluid models of congestion control assume that the flow rate of a TCP flow at every link in its path is equal to the source rate at the first link. However, in practice, due to the queuing effect caused by more than one source using the same link, the rate of the TCP flow from each source in an intermediate link is usually (but not always) less than that of its preceding links. Thus the rate of a TCP flow should be decreasing most of the time as it travels from link to link on its way from source to destination. In view of this shortcoming of the existing models, [12] developed a closed-loop form for the buffer dynamics to address the instantaneous queuing rate of each TCP flow at the router of every link it passes through; consequently, the instantaneous flow rate of each TCP flow at a particular link can be obtained by subtracting all the queuing rates preceding this link from the input source rate. This gives a more accurate model of the behavior of the network under the congestion controller. The stability of the congestion control algorithms was then established by imposing several simple conditions.

In this paper, we consider a computer network that has two origins, N sources, three links, and one destination such that each of the TCP flows from the sources uses two links to reach the destination. We then modify the model of [12] further by imposing the time-delay arguments into the buffer equation (i.e., the queuing equation at the buffer of a link) and the conservation of the flow equation (i.e., the equation governing the flow rate at one link and its successive link).

As mentioned in [4], there always exists a time difference between the time that a packet first arrives at the router of a link and the time that the router serializes it onto the link. This is called the forward transmission delay, which is equal to packet size (bits)/transmission rate (bits per second). Furthermore, there also exists a time difference between the time that the packet leaves the router of one link (i.e., immediately after the router has serialized the packet onto the link) and the time that it arrives at the receiving end of the link (i.e., the time that the packet reaches the router of the next link during its trip from source to destination). This is called the latency of the link, which is a constant for each link (independent of the packet size and transmission rate). Thus by inserting the forward transmission delay arguments into the buffer equations and the latency into the conservation of flow equations we obtain an even more accurate description of the behavior of the network under the congestion controller than that of [12].

Since each link in this “optimal transmitting of messages in computer networks” model may be shared by more than one source, competition for flow rate exists at some or all the given links. Thus the objective of this problem should involve allocating the flow rate of each link to all the competing sources in such a way that the weighted sum of all the messages sent by the sources to the destination in each time interval is maximized, subject to the conditions that both the buffer capacity and the link capacity (which is usually called the bandwidth of the link) are not violated.

Using the above objective function, we first formulate this problem as a time-delayed optimal control problem consisting of both control-dependent time-delayed arguments and discrete-time-delayed arguments. The control-dependent time-delayed arguments arise from the transmission delays of the buffers and the discrete-time-delayed arguments arise from the latency of the links. We then develop a modified control parameterization method for solving this optimal control problem. The gradients of the new objective functions and the constraint functions generated by this modified control parameterization method are derived for the first time in the literature. Finally, a numerical example is solved by using both the nondelayed version and the time-delayed version of the above optimal transmission problem.

Numerical results clearly illustrate the efficiency of this modified control parameterization method for solving time-delayed optimal control problems with control-dependent arguments.

Comparison of results of the two versions concerning the optimal transmission flow rates at the origin, the output flow rates at the destination, and the queue sizes at each buffer are obtained. These comparison results clearly reflect how the optimal transmission of messages in computer networks in real life can be affected by both the forward transmission delays of the buffers and the latency of the links.

The classical control parameterization method has been efficiently used for solving many optimization problems and optimal control problems, such as constrained optimization problems [15], nondelayed optimal control problems with lumped parameter systems [16–21], nondelayed optimal control problems with distributed parameter systems [22], optimal control problems with constant time delays ([17–19, 23–25]), optimal control problems with given time-varying delays [26], and optimal control problems with unknown time-varying delays [27]. However, the extension of this method to solving optimal control problems with control-dependent time-delayed arguments is much more complicated for the reasons given in the next two paragraphs.

To solve nondelayed optimal control problems, or optimal control problems involving constant time delays \(h_{1},\ldots,h_{n}\), or given time-varying delays \(\alpha _{1}(t),\ldots,\alpha _{n}(t)\) ([16–26]) by the classical control parametrization method, we simply need to express each control function \(u(t)\) as

$$ u(t) = \sum_{i = 1}^{p} \sigma ^{i} \chi _{i}(t),\quad t \in [0,T], $$
(1)

where \(\sigma ^{1},\ldots,\sigma ^{p}\) are the control parameters, p is the number of subintervals in the partition of the time interval \([0, T]\), and \(\chi _{i}(t)\) is the characteristic function of the ith subinterval. Then we can easily calculate the gradients of the objective function and the constraint functions with respect to the control parameters \(\sigma ^{1},\ldots,\sigma ^{p}\), as well as solve the state equations and the costate equations by any numerical scheme, such as the fourth-order Runge–Kutta method. Although solving optimal control problems involving unknown time-varying delays \(\alpha _{1}(t),\ldots,\alpha _{n}(t)\) [27] is slightly more difficult, we can still employ the same technique to solve the problem by the classical control parameterization, except that we need to add additional control functions \(v_{1}(t),\ldots,v_{n}(t)\) for the n unknown time delays via

$$ v_{j}(t) = \sum_{i = 1}^{p} \sigma _{j}^{i} \chi _{i}(t),\quad j = 1, \ldots,n, t \in [0,T], $$
(2)

where \(\sigma _{j}^{i}\) (\(i = 1,\ldots,p, j = 1,\ldots,n\)) are the control parameters.
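The parameterization in (1) can be sketched in a few lines. The following is a minimal illustration only, assuming that \([0, T]\) is split into p subintervals of equal length (the text does not fix the partition); the function name `piecewise_constant_control` and the sample parameter values are hypothetical.

```python
import numpy as np

def piecewise_constant_control(sigma, T):
    """Return u(t) = sum_i sigma[i] * chi_i(t) on p equal subintervals of [0, T].

    Assumption: the partition is uniform; chi_i is the indicator of the
    i-th subinterval, so exactly one term of the sum is active at each t.
    """
    sigma = np.asarray(sigma, dtype=float)
    p = len(sigma)

    def u(t):
        # Index of the subinterval containing t; t = T is assigned to the last one.
        i = min(int(np.floor(t / T * p)), p - 1)
        return sigma[i]

    return u

# Hypothetical control parameters sigma^1, ..., sigma^4 on [0, 4].
u = piecewise_constant_control([1.0, 3.0, 2.0, 0.5], T=4.0)
values = [float(u(t)) for t in (0.0, 0.99, 1.0, 2.5, 4.0)]
print(values)
```

With this representation the optimization variables are the finitely many numbers in `sigma`, which is what allows gradients with respect to the control parameters to be computed.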

However, the optimal control problem in this paper involves a control-delayed vector of the form \((u_{1}(t - \alpha _{1}(u_{1})(t)),\ldots,u_{m}(t - \alpha _{m}(u_{m})(t)))^{T} \). Thus, to efficiently solve the problem, besides expressing the control vector as a vector-valued function of the control parameters \(\sigma ^{1},\ldots,\sigma ^{p}\), we also need to express each component of the control-delayed vector \((u_{1}(t - \alpha _{1}(u_{1})(t)),\ldots,u_{m}(t - \alpha _{m}(u_{m})(t)))^{T}\) as a function of the control parameters \(\sigma ^{1},\ldots,\sigma ^{p}\), which is a very difficult task. (\(\alpha _{i}(u_{i})(t)\), \(i = 1,\ldots, m\), are given functions of \(u_{i}(t)\).) Since each ith component of the control-delayed vector is a convolution of the function \(u_{i}(t) \), finding the gradient of the objective function or constraint functions of this time-delayed optimal control problem with control-dependent arguments is much harder than for the optimal control problems with constant time delays or time-varying delays given in [16–27]. Thus, we need to devise a concrete method for solving this difficult optimal control problem with control-dependent time-delayed arguments.

Thus the contribution of this paper is twofold. From the practical point of view, this is a pioneering paper, which provides an open-loop control for the time-delayed optimal transmission problem of the computer network; in this way, we can allocate the optimal flow rate to each link of the real-life computer network so that the maximum amount of messages can be sent from the N sources to one destination in a given period. From the mathematical point of view, it develops a modified control parameterization method for solving time-delayed problems with both control-dependent time-delayed arguments and discrete time-delayed arguments, and thereby provides a concrete method for solving the time-delayed optimal transmission problem of the computer network. Numerical results clearly illustrate the efficiency of this modified control parameterization method for solving real-life optimal control problems with control-dependent time-delayed arguments.

In Sect. 2, we provide a formulation of the time-delayed optimal transmission problem of the computer network. In Sect. 3, we convert the time-delayed optimal control problem into a canonical form so that it can be solved by a modified control parameterization method. In Sect. 4, we describe the modified control parameterization method for solving the time-delayed control problem discussed in Sects. 2 and 3. In Sect. 5, we derive the gradient formulae for solving the time-delayed optimal control problem by the modified control parameterization method. In Sect. 6, we solve a numerical example consisting of four sources, three links, two origins, and one destination; each source uses two links to travel from its origin to its destination. We solve this example using both the nondelayed version and the time-delayed version of the optimal transmission problem. The concluding remarks and suggestions for further study are given in Sect. 7.

2 Formulation of the time-delayed optimal transmission problem

2.1 Description of the computer network

We consider a computer network that has two origins, N sources, three links, and one destination, such that sources \(s_{1},\ldots,s_{K}\) use links \(l_{1}\) and \(l_{3}\) to reach the destination and sources \(s_{K + 1},\ldots,s_{N}\) use links \(l_{2}\) and \(l_{3}\) to reach the destination. (See Fig. 1 for details.)

Figure 1: The structure of a computer network that has 2 origins, N sources, 3 links, and 1 destination

Let \(\tilde{u}_{1}(t),\ldots,\tilde{u}_{K}(t)\) be the input rates (bits per second) associated with sources \(s_{1},\ldots,s_{K}\) at the buffer of link \(l_{1}\) at time t. Let \(\tilde{u}_{K + 1}(t),\ldots,\tilde{u}_{N}(t)\) be the input rates (bits per second) associated with sources \(s_{K + 1},\ldots,s_{N}\) at the buffer of link \(l_{2}\) at time t. Let \(\hat{u}_{1}(t),\ldots,\hat{u}_{N}(t)\) be the input rates (bits per second) associated with sources \(s_{1},\ldots,s_{N}\) at the buffer of link \(l_{3}\) at time t. Then \(\tilde{u}_{1}(t),\ldots,\tilde{u}_{N}(t)\) and \(\hat{u}_{1}(t),\ldots,\hat{u}_{N}(t)\) are controls of this problem. (In fact, \(\hat{u}_{1}(t),\ldots,\hat{u}_{N}(t)\) are artificial controls only, because their values are completely determined as long as the input rates \(\tilde{u}_{1}(t),\ldots,\tilde{u}_{N}(t)\) and the queue sizes (to be defined later) at links \(l_{1}\) and \(l_{2}\) are given.)

For each \(i =1,\ldots, K\), let \(b_{i,1}(t)\) be the queue length at the buffer of link \(l_{1}\) associated with source \(s_{i}\) at time t. For each \(i = K+1,\ldots, N\), let \(b_{i,2}(t)\) be the queue length at the buffer of link \(l_{2}\) associated with source \(s_{i}\) at time t. For each \(i = 1,\dots ,N\), let \(b_{i,3}(t)\) be the queue length at the buffer of link \(l_{3}\) associated with source \(s_{i}\) at time t. Let \(b_{1}(t) = \sum_{i = 1}^{K} b_{i,1} (t)\), \(b_{2}(t) = \sum_{i = K + 1}^{N} b_{i,2} (t)\), and \(b_{3}(t) = \sum_{i = 1}^{N} b_{i,3} (t)\) be, respectively, the queue sizes associated with all the sources at links \(l_{1}\), \(l_{2}\), and \(l_{3}\) at time t. Let \(c_{j}\) (\(j = 1,2,3\)) (bits per second) be the link capacity (bandwidth) of link \(l_{j}\).

2.2 Preliminary formulation of the buffer (state) equation and the conservation of flow equation without considering the time-delayed arguments

In view of the computer network described in the previous section, we have the following buffer equation from [12]:

$$\begin{aligned}& \dot{b}_{1}(t) = \Biggl( \sum_{i = 1}^{K} \tilde{u}_{i}(t) - c_{1} \Biggr)_{b_{1}(t)}^{ +}, \end{aligned}$$
(3)
$$\begin{aligned}& \dot{b}_{2}(t) = \Biggl( \sum_{i = K + 1}^{N} \tilde{u}_{i}(t) - c_{2} \Biggr)_{b_{2}(t)}^{ +}, \end{aligned}$$
(4)
$$\begin{aligned}& \dot{b}_{3}(t) = \Biggl( \sum_{i = 1}^{N} \hat{u}_{i}(t) - c_{3} \Biggr)_{b_{3}(t)}^{ +}, \end{aligned}$$
(5)

with the initial condition

$$ b_{j}(0) = 0,\quad j = 1, 2, 3, $$
(6)

where

$$ ( x )_{y}^{ +} = \textstyle\begin{cases} x& \text{if } y > 0, \\ \max (x,0)& \text{if } y \le 0. \end{cases} $$
(7)

Remark 2.1

The explanation of Eqs. (3)–(5) is as follows: If the queue length (backlog) at time t is positive, then the instantaneous rate of change of the queue length is equal to the total incoming source rate minus the capacity of the link; if the queue length (backlog) at time t is zero, the rate of change should be equal to the maximum between “total incoming source rate minus capacity” and zero, because in real life the queue length cannot be negative.
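The behavior described in Remark 2.1 can be checked numerically. The sketch below integrates a single buffer equation of the form (3) with the forward Euler method; the integrator, the step size, the inflow profile, and the names `plus` and `simulate_buffer` are illustrative assumptions, not part of the original formulation.

```python
def plus(x, y):
    """(x)^+_y from Eq. (7): x if y > 0, otherwise max(x, 0)."""
    return x if y > 0 else max(x, 0.0)

def simulate_buffer(inflow, capacity, T, steps):
    """Forward-Euler integration of b'(t) = (inflow(t) - capacity)^+_{b(t)}, b(0) = 0."""
    dt = T / steps
    b = 0.0
    history = [b]
    for k in range(steps):
        rate = plus(inflow(k * dt) - capacity, b)
        # Clip at zero so a finite Euler step cannot push the queue negative.
        b = max(b + dt * rate, 0.0)
        history.append(b)
    return history

# Hypothetical inflow: 12 units/s during the first second, then 5 units/s;
# link capacity 10 units/s.
queue = simulate_buffer(lambda t: 12.0 if t < 1.0 else 5.0,
                        capacity=10.0, T=2.0, steps=200)
print(max(queue), queue[-1])
```

In this run the queue grows while the inflow exceeds the capacity, drains once the inflow drops below it, and then stays at zero, which is the role of the \(\max (x,0)\) branch in (7).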

Since the buffer of any link is shared by more than one source (i.e., the buffers of links \(l_{1}\), \(l_{2}\), and \(l_{3}\) are shared by sources \(s_{1},\ldots,s_{K}\), sources \(s_{K + 1},\ldots,s_{N}\), and all the sources \(s_{1},\ldots,s_{N}\), respectively), we wish to specify at which rates the different flows leave the buffer. Let \(\theta _{i,j}(t)\) be the ratio of the queue rate due to source \(s_{i}\) at the buffer of link \(l_{j}\) to the total queue rate at the buffer of link \(l_{j}\). Then the equations for \(b_{i,j}(t)\) are as follows:

$$\begin{aligned}& \dot{b}_{i,1}(t) = \theta _{i,1}(t)\dot{b}_{1}(t), \quad i = 1,\ldots,K, t \in [0, T], \end{aligned}$$
(8)
$$\begin{aligned}& \dot{b}_{i,2}(t) = \theta _{i,2}(t)\dot{b}_{2}(t), \quad i = K + 1,\ldots,N, t \in [0, T], \end{aligned}$$
(9)
$$\begin{aligned}& \dot{b}_{i,3}(t) = \theta _{i,3}(t)\dot{b}_{3}(t), \quad i = 1,\ldots,N, t \in [0, T], \end{aligned}$$
(10)

with the initial conditions

$$\begin{aligned}& b_{i,1}(0) = 0,\quad i = 1,\ldots,K, \end{aligned}$$
(11)
$$\begin{aligned}& b_{i,2}(0) = 0,\quad i = K + 1,\ldots,N, \end{aligned}$$
(12)
$$\begin{aligned}& b_{i,3}(0) = 0,\quad i = 1,\ldots,N, \end{aligned}$$
(13)

where

$$\begin{aligned}& \theta _{i,1}(t) = \frac{\tilde{u}_{i}(t)}{\sum_{\bar{i} = 1}^{K} \tilde{u}_{\bar{i}}(t)},\quad i = 1, \ldots,K, \end{aligned}$$
(14)
$$\begin{aligned}& \theta _{i,2}(t) = \frac{\tilde{u}_{i}(t)}{\sum_{\bar{i} = K + 1}^{N} \tilde{u}_{\bar{i}}(t)},\quad i = K + 1, \ldots,N, \end{aligned}$$
(15)
$$\begin{aligned}& \theta _{i,3}(t) = \frac{\hat{u}_{i}(t)}{\sum_{\bar{i} = 1}^{N} \hat{u}_{\bar{i}}(t)}, \quad i = 1, \ldots,N \end{aligned}$$
(16)

are the ratios assigned in accordance with the WFQ service discipline described in [12].
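As a small numerical illustration of (8) and (14), the sketch below computes the shares \(\theta _{i,1}(t)\) and the resulting per-source queue rates at link \(l_{1}\) for one time instant. The rates, the capacity, and the function name `wfq_shares` are hypothetical values chosen for the example, and a positive backlog is assumed so that (3) reduces to inflow minus capacity.

```python
import numpy as np

def wfq_shares(rates):
    """theta_i = rate_i / sum(rates), as in Eqs. (14)-(16)."""
    rates = np.asarray(rates, dtype=float)
    return rates / rates.sum()

# Hypothetical link l_1 with K = 3 sources; all rates in the same units.
u_tilde = [6.0, 4.0, 2.0]
c1 = 10.0

theta = wfq_shares(u_tilde)
b1_dot = sum(u_tilde) - c1    # total queue rate, Eq. (3) with b_1(t) > 0
b_i1_dot = theta * b1_dot     # per-source queue rates, Eq. (8)
print(theta, b_i1_dot)
```

Because the shares sum to one, the per-source queue rates always add up to the total queue rate of the link.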

Moreover, we need to impose the following control bounds:

$$ \bar{u} \ge \tilde{u}_{i}(t),\qquad \hat{u}_{i}(t) \ge \underline{u},\quad i = 1,\ldots,N, t \in [0, T]. $$
(17)

Due to the conservation of TCP flow from the buffer of one link to that of the next link, we have

$$ \hat{u}_{i}(t) = \textstyle\begin{cases} \tilde{u}_{i}(t) - \dot{b}_{i,1} (t),& i = 1,\ldots,K, \\ \tilde{u}_{i}(t) - \dot{b}_{i,2} (t),& i = K + 1,\ldots,N. \end{cases} $$
(18)

Equation (18) is called the conservation of flow equation.

Remark 2.2

By Remark 2.1, it is possible that the input rate associated with source \(s_{i}\) at the buffer of one link is larger than that at the buffer of the previous link. (In other words, \(\hat{u}_{i}(t) \) can be larger than \(\tilde{u}_{i}(t)\).)
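A worked instance makes this concrete. The numbers here are hypothetical and chosen only for illustration: take \(K = 2\), \(c_{1} = 10\), \(\tilde{u}_{1}(t) = 3\), \(\tilde{u}_{2}(t) = 2\) (all in the same units), and a positive backlog \(b_{1}(t) > 0\). Then (3), (14), and (8) give

$$ \dot{b}_{1}(t) = (3 + 2) - 10 = - 5,\qquad \theta _{1,1}(t) = \frac{3}{5},\qquad \theta _{2,1}(t) = \frac{2}{5},\qquad \dot{b}_{1,1}(t) = - 3,\qquad \dot{b}_{2,1}(t) = - 2, $$

and the conservation of flow equation (18) gives

$$ \hat{u}_{1}(t) = 3 - ( - 3) = 6,\qquad \hat{u}_{2}(t) = 2 - ( - 2) = 4. $$

Here \(\hat{u}_{1}(t) > \tilde{u}_{1}(t)\), and the total rate leaving link \(l_{1}\) is \(6 + 4 = 10 = c_{1}\): while the queue is draining, the link forwards traffic at its full capacity.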

2.3 Formulation of the time-delayed optimal control problem when forward transmission delays and latency of the links are taken into consideration

As mentioned in the introduction, the buffer equations and the conservation of flow equation described in Sect. 2.2 ignore the existence of time delays during the transmission from the routers to the links. Moreover, the conservation of the TCP flow equations ignores the latency of the link, which is equal to the difference between the time that a packet leaves the router of one link and the time that it arrives at the receiving end of the link, i.e., the time that the packet arrives at the router of the next link during its trip from source to destination. Thus (3)–(5) do not reflect the evolution of the buffer size accurately, and (18) does not reflect the conservation of TCP flow accurately. Because of this weakness, we follow the approach of [4] to change the buffer equations and the conservation of the TCP flow equations so that they do not ignore the time-delayed arguments. The situation representing the forward transmission delays and the latency is depicted in Fig. 2.

Figure 2: The situation representing the forward transmission delays and the latency of a packet from source \(s_{1}\)

In Fig. 2, we use the following notations:

\(t_{0}\): The time that the packet from source \(s_{1}\) is transmitted from its origin

\(t_{1}\): The time that the router of link \(l_{1}\) serializes the packet onto link \(l_{1}\)

\(t_{2}\): The time that the packet arrives at the receiving end of link \(l_{1}\) (i.e., the time at which the packet arrives at the router of link \(l_{3}\), waiting to be serialized onto link \(l_{3}\))

\(t_{3}\): The time that the router of link \(l_{3}\) serializes the packet onto link \(l_{3}\)

\(t_{4}\): The time that the packet arrives at the receiving end of link \(l_{3}\) (i.e., the time at which the packet arrives at the destination)

\(\tilde{u}_{1}\): The transmission rate from source \(s_{1}\) at the time it reaches the router of link \(l_{1}\)

\(\hat{u}_{1}\): The transmission rate from source \(s_{1}\) at the time it reaches the router of link \(l_{3}\)

\(w/\tilde{u}_{1}(t)\): Forward transmission delay of a packet with size w using the transmission rate \(\tilde{u}_{1}(t)\).

(\(w/\hat{u}_{1}(t)\) can be defined in a similar way.)

\(d_{1}\): Latency of link \(l_{1}\), which is the difference between the time that the packet leaves the router of link \(l_{1}\) and the time that it reaches the receiving end of link \(l_{1}\).

(\(d_{3}\) can be defined in a similar way.)

Suppose, on average, the router transmits a packet of constant size \(w_{1}\) (bits) at any time t to link \(l_{1}\) using the transmitting rate \(\tilde{u}_{1}(t)\) (bits per second). Then the forward transmission delay is given by \(w_{1}/\tilde{u}_{1}(t)\) (seconds) (i.e., it is directly proportional to the size of the packet but inversely proportional to the packet transmission rate.) In other words, the packet has to queue at the router for \(w_{1}/\tilde{u}_{1}(t)\) (seconds) before it can be serialized into link \(l_{1}\). For simplicity, we assume that the packet size inputted to each of the links \(l_{1}\), \(l_{2}\), and \(l_{3}\) at any time t is w (bits).
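For a sense of scale, consider hypothetical values that are not taken from the paper: a packet of size \(w = 12{,}000\) bits (1500 bytes) and a transmission rate \(\tilde{u}_{1}(t) = 10^{6}\) bits per second. Then

$$ \frac{w}{\tilde{u}_{1}(t)} = \frac{12{,}000}{10^{6}} = 0.012\ \text{seconds}, $$

whereas at the rate \(\tilde{u}_{1}(t) = 4 \times 10^{6}\) bits per second the delay drops to \(0.003\) seconds. Since the delay changes whenever the transmission rate changes, the delayed time \(t - w/\tilde{u}_{1}(t)\) depends on the control itself.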

The phenomenon of the forward transmission delay and latency in a computer network can be explained by the following analogy. Suppose that we wish to send goods from point A to point B. There is a large conveyor belt for transferring goods from point A to point B, and any item placed on the belt moves at a constant speed of v meters per second. The distance between A and B is y meters. Using both hands, we can transfer goods from a carton holding w items to the belt at a constant rate of u items per second. Then the forward transmission delay for the carton is \(w/u\) seconds, and the latency is \(d = y/v\) seconds. Thus in a computer network the forward transmission delay is the time difference between the time that the packet first arrives at the router of a link (waiting for the router to put it onto the link) and the time that the router serializes it onto the link.

Thus by inserting the transmission delay into the buffer Eqs. (3)–(6) we modify the buffer equations as follows:

$$\begin{aligned}& \dot{b}_{1}(t) = \Biggl( \sum_{i = 1}^{K} \tilde{u}_{i} \bigl( t - w/\tilde{u}_{i}(t) \bigr) - c_{1} \Biggr)_{b_{1}(t)}^{ +},\quad t \in [0, T], \end{aligned}$$
(19)
$$\begin{aligned}& \dot{b}_{2}(t) = \Biggl( \sum_{i = K + 1}^{N} \tilde{u}_{i} \bigl( t - w/\tilde{u}_{i}(t) \bigr) - c_{2} \Biggr)_{b_{2}(t)}^{ +},\quad t \in [0, T], \end{aligned}$$
(20)
$$\begin{aligned}& \dot{b}_{3}(t) = \Biggl( \sum_{i = 1}^{N} \hat{u}_{i} \bigl( t - w/\hat{u}_{i}(t) \bigr) - c_{3} \Biggr)_{b_{3}(t)}^{ +}, \quad t \in [0, T], \end{aligned}$$
(21)

with the following initial conditions on the queue size and the input transmission rates:

$$\begin{aligned}& b_{j}(0) = 0,\quad j = 1, 2, 3, \end{aligned}$$
(22)
$$\begin{aligned}& \tilde{u}_{i} \bigl( t - w/\tilde{u}_{i}(t) \bigr) = 0 \quad \text{whenever } t - w/\tilde{u}_{i}(t) < 0, i = 1,\ldots,N, t \in [0, T], \end{aligned}$$
(23)
$$\begin{aligned}& \hat{u}_{i} \bigl( t - w/\hat{u}_{i}(t) \bigr) = 0\quad \text{whenever } t - w/\hat{u}_{i}(t) < 0, i = 1,\ldots,N, t \in [0, T], \end{aligned}$$
(24)

and

$$\begin{aligned}& \dot{b}_{i,1}(t) = \frac{\tilde{u}_{i} ( t - w/\tilde{u}_{i}(t) )}{\sum_{j = 1}^{K} \tilde{u}_{j} ( t - w/\tilde{u}_{j}(t) ) + \varepsilon} \dot{b}_{1}(t), \quad i = 1,\ldots,K, t \in [0, T], \end{aligned}$$
(25)
$$\begin{aligned}& \dot{b}_{i,2}(t) = \frac{\tilde{u}_{i} ( t - w/\tilde{u}_{i}(t) )}{\sum_{j = K + 1}^{N} \tilde{u}_{j} ( t - w/\tilde{u}_{j}(t) ) + \varepsilon} \dot{b}_{2}(t), \quad i = K + 1,\ldots,N, t \in [0, T], \end{aligned}$$
(26)
$$\begin{aligned}& \dot{b}_{i,3}(t) = \frac{\hat{u}_{i} ( t - w/\hat{u}_{i}(t) )}{\sum_{j = 1}^{N} \hat{u}_{j} ( t - w/\hat{u}_{j}(t) ) + \varepsilon} \dot{b}_{3}(t), \quad i = 1,\ldots,N, t \in [0, T], \end{aligned}$$
(27)
$$\begin{aligned}& b_{i,1}(0) = 0,\quad i = 1,\ldots,K, \end{aligned}$$
(28)
$$\begin{aligned}& b_{i,2}(0) = 0,\quad i = K + 1,\ldots,N, \end{aligned}$$
(29)
$$\begin{aligned}& b_{i,3}(0) = 0,\quad i = 1,\ldots,N, \end{aligned}$$
(30)

together with the initial conditions on the input transmission rates given by (23)–(24), where ε is a very small positive number, and the notation \(( x )_{y}^{ +} \) is defined in (7).

To ensure that the state Eqs. (19)–(21) and (25)–(27) are well defined, we need to impose the bounds on the input transmission rates \(\tilde{u}_{i}(t)\), \(\hat{u}_{i}(t)\):

$$ \bar{u} \ge \tilde{u}_{i}(t), \qquad \hat{u}_{i}(t) \ge \underline{u},\quad i = 1,\ldots,N, t \in [0, T]. $$
(31)

Moreover, we need to impose the buffer capacity constraints:

$$ 0 \le b_{j}(t) \le B_{j},\quad j = 1, 2, 3. $$
(32)

Remark 2.3

Eqs. (19)–(21) state that the change of queue size at the buffer of each link occurs at the moment that the packet is being serialized onto the link, instead of the time that it first arrives at the router of the link. Similarly, Eqs. (25)–(27) state that the allocation of queue size at the buffer of each link actually occurs when the packet is being serialized onto the link, instead of the time that it first arrives at the router of the link. The reason for adding ε to the denominators of (25)–(27) is to ensure that these denominators are nonzero even when all of the terms \(\tilde{u}_{j} ( t - w/\tilde{u}_{j}(t) )\) or \(\hat{u}_{j} ( t - w/\hat{u}_{j}(t) )\) are zero.
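To make the control-dependent delay concrete, the following sketch evaluates \(\tilde{u}_{i} ( t - w/\tilde{u}_{i}(t) )\) for piecewise-constant controls, applies the zero pre-history of (23), and forms the ε-regularized ratios that multiply \(\dot{b}_{1}(t)\) in (25). It is an illustration under stated assumptions only: the uniform partition, the control values, the packet size, and the names `make_control`, `delayed`, and `shares` are hypothetical.

```python
import numpy as np

def make_control(sigma, T):
    """Piecewise-constant control on len(sigma) equal subintervals of [0, T]."""
    p = len(sigma)

    def u(t):
        return float(sigma[min(int(t / T * p), p - 1)])

    return u

def delayed(u, t, w):
    """u(t - w/u(t)); zero when the delayed time is negative, as in Eq. (23).

    Assumes u(t) > 0, which the lower control bound in Eq. (31) provides.
    """
    s = t - w / u(t)
    return u(s) if s >= 0.0 else 0.0

def shares(controls, t, w, eps=1e-8):
    """Ratios multiplying the total queue rate in Eqs. (25)-(27)."""
    d = np.array([delayed(u, t, w) for u in controls])
    return d / (d.sum() + eps)  # eps keeps the denominator nonzero

# Two hypothetical sources on [0, 4] with packet size w = 1 (consistent units).
u1 = make_control([2.0, 4.0, 4.0, 2.0], T=4.0)
u2 = make_control([1.0, 1.0, 2.0, 2.0], T=4.0)
early = shares([u1, u2], t=0.2, w=1.0)   # both delayed times are negative
later = shares([u1, u2], t=1.5, w=1.0)
print(early, later, delayed(u1, 1.1, 1.0))
```

At the early time every delayed term is zero, so without ε the ratio would be 0/0; with ε it is simply zero. The last printed value shows that the delayed control at \(t = 1.1\) still takes the value from the first subinterval, even though the control itself has already switched.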

Moreover, as mentioned in [4], there also exists a time difference between the time that the packet leaves the router of one link (i.e., immediately after the router has serialized the packet onto the link) and the time that it arrives at the receiving end of the link. This is called the latency of the link, which is time-independent. For simplicity, we assume that the latency of each link is the same. Let d (seconds) be the latency of the link. Thus by inserting the latency of the links into the conservation of the TCP flow equations we modify the conservation of TCP flow Eq. (18) as follows:

$$ \hat{u}_{i}(t) = \textstyle\begin{cases} \tilde{u}_{i}(t - d) - \dot{b}_{i,1} (t - d),& i = 1,\ldots,K, t \in [0, T], \\ \tilde{u}_{i}(t - d) - \dot{b}_{i,2} (t - d),& i = K + 1,\ldots,N, t \in [0, T], \end{cases} $$
(33)

where

$$ \tilde{u}_{i}(t - d) = 0\quad \text{whenever } t - d \le 0, i = 1, \ldots, N, t \in [0, T]. $$
(34)

Remark 2.4

Eq. (33) states that the input transmission rate at the router of link 3 at time t due to any source is equal to the output transmission rate at the router of the previous link (i.e., either link 1 or link 2) due to this source at time \(t - d\).

Thus, in view of (33), and (25)–(27), we can express the input transmission rate at the router of link 3 as a function of the delayed input transmission rates due to the forward transmission delay at the buffer of the previous link (i.e., either link 1 or link 2) and the latency of the previous link (i.e., either link 1 or link 2) as follows:

$$\begin{aligned}& \hat{u}_{i} ( t ) = \textstyle\begin{cases} \tilde{u}_{i} ( t - d ) - ( \frac{\tilde{u}_{i} ( t - d - w/\tilde{u}_{i}(t) )}{\sum_{j = 1}^{K} \tilde{u}_{j} ( t - d - w/\tilde{u}_{j}(t) ) + \varepsilon} )\\ \qquad {} \times ( \sum_{j = 1}^{K} \tilde{u}_{j} ( t - d - w/\tilde{u}_{j}(t) ) - c_{1} )_{b_{1}(t)}^{ +}, \\ \quad i = 1,\ldots,K, t \in [0,T], \\ \tilde{u}_{i} ( t - d ) - ( \frac{\tilde{u}_{i} ( t - d - w/\tilde{u}_{i}(t) )}{\sum_{j = K + 1}^{N} \tilde{u}_{j} ( t - d - w/\tilde{u}_{j}(t) ) + \varepsilon} )\\ \qquad {} \times ( \sum_{j = K + 1}^{N} \tilde{u}_{j} ( t - d - w/\tilde{u}_{j}(t) ) - c_{2} )_{b_{2}(t)}^{ +},\\ \quad i = K + 1,\ldots,N, t \in [0,T], \end{cases}\displaystyle \end{aligned}$$
(35)

where

$$\begin{aligned}& \tilde{u}_{i} ( t - d ) = 0\quad \text{whenever } t - d \le 0, i = 1,\ldots, N, t \in [0, T], \end{aligned}$$
(36)
$$\begin{aligned}& \tilde{u}_{i} \bigl( t - d - w/\tilde{u}_{i}(t) \bigr) = 0\quad \text{whenever } t - d - w/\tilde{u}_{i}(t) \le 0, i = 1, \ldots, N, t \in [0, T]. \end{aligned}$$
(37)

Let \(U(u)\) be the utility function of our problem, which represents the benefit of the entire network when the source \(s_{i}\), \(i = 1,\ldots,N\), is transmitting data at the rate \(\tilde{u}_{i}(t)\). Now the output flow rate of source \(s_{i}\), \(i = 1,\ldots,N\), at the receiving end of link \(l_{3}\) is given by \(\hat{u}_{i}(t - d) - \dot{b}_{i,3}(t - d)\). Thus by maximizing the weighted source output in accordance with the relative importance of the message sent by each source we obtain

$$\begin{aligned} \max U(u) ={}& \sum_{i = 1}^{N} M_{i} \int _{0}^{T} \bigl[ \hat{u}_{i}(t - d) - \dot{b}_{i,3}(t - d) \bigr]\,dt \\ ={}& \sum_{i = 1}^{N} M_{i} \int _{0}^{T} \Biggl[ \hat{u}_{i}(t - d) - \biggl( \frac{\hat{u}_{i} ( t - d - w/\hat{u}_{i}(t) )}{\sum_{j = 1}^{N} \hat{u}_{j} ( t - d - w/\hat{u}_{j}(t) ) + \varepsilon} \biggr) \\ &{} \times \Biggl( \sum_{j = 1}^{N} \hat{u}_{j} \bigl( t - d - w/\hat{u}_{j}(t) \bigr) - c_{3} \Biggr)_{b_{3}(t)}^{ +} \Biggr]\,dt, \end{aligned}$$
(38)

where

$$\begin{aligned}& \hat{u}_{i} ( t - d ) = 0\quad \text{whenever } t - d \le 0, i = 1, \ldots, N, t \in [0, T], \end{aligned}$$
(39)
$$\begin{aligned}& \hat{u}_{i} \bigl( t - d - w/\hat{u}_{i}(t) \bigr) = 0 \quad \text{whenever } t - d - w/\hat{u}_{i}(t) \le 0, i = 1,\ldots, N, t \in [0, T], \end{aligned}$$
(40)

and \(M_{i}\) represents the relative importance of the messages sent from source \(s_{i}\) to the destination.

The time-delayed optimal control problem can be stated as follows.

Problem (P1)

Subject to the buffer (state) Eqs. (19)–(21) with initial conditions (22)–(24), the conservation of TCP flow Eq. (35) with initial conditions (34), (36), and (37), the bounds on the input transmission rate (31) and the buffer capacity constraints (32), we want to find piecewise continuous controls \(\tilde{u}_{i}(t)\), \(i = 1,\ldots,N\), that maximize the objective function (38).

3 Obtaining a new time-delayed optimal control problem with smooth functions and smooth canonical constraints

The time-delayed optimal control problem described in Sect. 2 involves nonsmooth functions and nonsmooth constraints, which cannot be easily solved by computational methods. In this section, we describe a method for converting all the nonsmooth functions and nonsmooth constraints into smooth functions and smooth canonical constraints. The details are given in the next subsections.

3.1 Converting the nonsmooth functions in the buffer equations and the objective function into smooth functions

We first need to use an approximation method to convert the nonsmooth function in the right-hand side of the buffer Eqs. (19)–(21), the conservation of the TCP flow Eq. (35), and the objective function (38) into sufficiently smooth functions.

Noting that all these nonsmooth functions have the form \(( z(t) )_{y(t)}^{ +}\), we can approximate \(( z(t) )_{y(t)}^{ +} \) by a sufficiently smooth function \(\xi _{{\delta}} ( y(t), z(t) ) \in C^{1}(R^{2})\) as follows:

$$ \xi _{{\delta}} \bigl( y(t), z(t) \bigr) = \textstyle\begin{cases} z(t)& \text{if } y(t) > 0, \\ I_{\delta} (y(t)) z(t) + (1 - I_{\delta} (y(t)))\max_{{\delta}} ( z(t) )& \text{if } {-} \delta \le y(t) \le 0, \\ \max_{\delta} (z(t))& \text{if } y(t) < - \delta , \end{cases} $$
(41)

where \(\max_{\delta} (z(t))\) is the function used for smoothing \(\max \{ z(t), 0 \}\) defined by

$$ \max_{\delta} \bigl(z(t)\bigr) = \textstyle\begin{cases} 0& \text{if } z(t) \le - \delta , \\ \frac{ ( z(t) + \delta )^{2}}{4\delta} & \text{if } {-} \delta \le z(t) \le \delta , \\ z(t)& \text{if } z(t) > \delta , \end{cases} $$
(42)

and

$$ I_{\delta} \bigl(y(t)\bigr) = - 2 \biggl( \frac{y(t)}{\delta} \biggr)^{3} - 3 \biggl( \frac{y(t)}{\delta} \biggr)^{2} + 1\quad \text{if } {-} \delta \le y(t) \le 0 $$
(43)

is a real number between 0 and 1, and δ is a small number.
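The smoothing in (41)–(43) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `max_delta`, `blend_weight`, and `xi_delta` are hypothetical, and the blending weight is assumed to be the cubic that equals 1 at \(y = 0\) and 0 at \(y = -\delta\) with zero slope at both ends, so that the three branches of (41) join smoothly.

```python
def max_delta(z, delta):
    """Smooth approximation of max{z, 0}, following the three cases of (42)."""
    if z <= -delta:
        return 0.0
    if z <= delta:
        return (z + delta) ** 2 / (4.0 * delta)
    return z


def blend_weight(y, delta):
    """Cubic weight on [-delta, 0]: 0 at y = -delta, 1 at y = 0 (assumed form of (43))."""
    s = y / delta
    return -2.0 * s ** 3 - 3.0 * s ** 2 + 1.0


def xi_delta(y, z, delta):
    """Smooth approximation of (z)^+_y, following the three cases of (41)."""
    if y > 0:
        return z
    if y >= -delta:
        weight = blend_weight(y, delta)
        return weight * z + (1.0 - weight) * max_delta(z, delta)
    return max_delta(z, delta)
```

With a nonempty buffer (\(y > 0\)) the net inflow \(z\) passes through unchanged, whereas for \(y < -\delta\) only the smoothed positive part of \(z\) remains.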

Using this approximation method, from (19)–(21) we obtain the new state equations

$$\begin{aligned}& \dot{b}_{1}(t) = \xi _{{\delta}} \Biggl( b_{1}(t), \sum_{j = 1}^{K} \tilde{u}_{j} \bigl( t - w/\tilde{u}_{j}(t) \bigr) - c_{1} \Biggr), \quad t \in [0, T], \end{aligned}$$
(44)
$$\begin{aligned}& \dot{b}_{2}(t) = \xi _{{\delta}} \Biggl( b_{2}(t), \sum_{j = K + 1}^{N} \tilde{u}_{j} \bigl( t - w/\tilde{u}_{j}(t) \bigr) - c_{2} \Biggr), \quad t \in [0, T], \end{aligned}$$
(45)
$$\begin{aligned}& \dot{b}_{3}(t) = \xi _{{\delta}} \Biggl( b_{3}(t), \sum_{j = 1}^{N} \hat{u}_{j} \bigl( t - w/\hat{u}_{j}(t) \bigr) - c_{3} \Biggr),\quad t \in [0, T]. \end{aligned}$$
(46)

Similarly, by using this approximation method from (38) we obtain the new objective function

$$\begin{aligned} \max U(u) =& \sum_{i = 1}^{N} M_{i} \int _{0}^{T} \Biggl[ \hat{u}_{i}(t - d) - \biggl( \frac{\hat{u}_{i} ( t - d - w/\hat{u}_{i}(t) )}{\sum_{j = 1}^{N} \hat{u}_{j} ( t - d - w/\hat{u}_{j}(t) ) + \varepsilon} \biggr) \\ &{}\times \xi _{\delta} \Biggl( b_{3}(t),\sum _{j = 1}^{N} \hat{u}_{j} \bigl( t - d - w/\hat{u}_{j}(t) \bigr) - c_{3} \Biggr) \Biggr]\,dt. \end{aligned}$$
(47)

3.2 Converting the nonsmooth constraint functions into smooth canonical functions

Similarly, by using the above approximation method together with the constraint approximation method described in [19, 21], we convert the conservation of the TCP flow Eq. (35) into the smooth canonical constraints

$$\begin{aligned}& \int _{0}^{T} \bigl[ G_{i} \bigl( b_{1}(t), \hat{u}_{i}(t), \tilde{u}_{i}(t - d), \tilde{u}_{1} \bigl( t - d - w/\tilde{u}_{1}(t) \bigr),\ldots, \tilde{u}_{K} \bigl( t - d - w/\tilde{u}_{K}(t) \bigr) \bigr) \bigr]^{2} \,dt \\& \quad= 0\quad \text{for } i = 1,\ldots,K, \end{aligned}$$
(48)
$$\begin{aligned}& \int _{0}^{T} \bigl[ G_{i} \bigl( b_{2}(t), \hat{u}_{i}(t), \tilde{u}_{i}(t - d), \tilde{u}_{K + 1} \bigl( t - d - w/\tilde{u}_{K + 1}(t) \bigr),\ldots, \tilde{u}_{N} \bigl( t - d - w/\tilde{u}_{N}(t) \bigr) \bigr) \bigr]^{2}\,dt \\& \quad = 0\quad \text{for } i = K + 1,\ldots,N, \end{aligned}$$
(49)

where

$$\begin{aligned}& G_{i} \bigl( b_{1}(t), \hat{u}_{i}(t), \tilde{u}_{i}(t - d), \tilde{u}_{1} \bigl( t - d - w/ \tilde{u}_{1}(t) \bigr),\ldots, \tilde{u}_{K} \bigl( t - d - w/\tilde{u}_{K}(t) \bigr) \bigr) \\& \quad = \hat{u}_{i} ( t ) - \tilde{u}_{i} ( t - d ) + \biggl( \frac{\tilde{u}_{i} ( t - d - w/\tilde{u}_{i}(t) )}{\sum_{j = 1}^{K} \tilde{u}_{j} ( t - d - w/\tilde{u}_{j}(t) ) + \varepsilon} \biggr) \\& \qquad {}\times \xi _{\delta} \Biggl( b_{1}(t), \sum _{j = 1}^{K} \tilde{u}_{j} \bigl( t - d - w/\tilde{u}_{j}(t) \bigr) - c_{1} \Biggr),\quad i = 1,\ldots,K, \end{aligned}$$
(50)
$$\begin{aligned}& G_{i} \bigl( b_{2}(t), \hat{u}_{i}(t), \tilde{u}_{i}(t - d), \tilde{u}_{K + 1} \bigl( t - d - w/\tilde{u}_{K + 1}(t) \bigr),\ldots, \tilde{u}_{N} \bigl( t - d - w/\tilde{u}_{N}(t) \bigr) \bigr) \\& \quad = \hat{u}_{i} ( t ) - \tilde{u}_{i} ( t - d ) + \biggl( \frac{\tilde{u}_{i} ( t - d - w/\tilde{u}_{i}(t) )}{\sum_{j = K + 1}^{N} \tilde{u}_{j} ( t - d - w/\tilde{u}_{j}(t) ) + \varepsilon} \biggr) \\& \qquad {} \times \xi _{\delta} \Biggl( b_{2}(t), \sum _{j = K + 1}^{N} \tilde{u}_{j} \bigl( t - d - w/\tilde{u}_{j}(t) \bigr) - c_{2} \Biggr), \quad i = K + 1,\ldots,N. \end{aligned}$$
(51)

Furthermore, by using the constraint approximation method described in [19, 21], we convert the buffer capacity constraints (32) into the smooth canonical constraints

$$\begin{aligned}& \int _{0}^{T} \bar{G}_{i}\bigl(b(t) \bigr)\,dt = 0,\quad i = 1, 2, 3, 4, 5, 6, \end{aligned}$$
(52)

where

$$\begin{aligned}& \bar{G}_{i}\bigl(b(t)\bigr) = \textstyle\begin{cases} \hat{G}_{i}(b(t)),& \text{if } \hat{G}_{i}(b(t)) \le - \varepsilon , \\ - \frac{ ( \hat{G}_{i}(b(t)) - \varepsilon )^{2}}{4\varepsilon} & \text{if } {-} \varepsilon \le \hat{G}_{i}(b(t)) \le \varepsilon , \\ 0,& \text{if } \hat{G}_{i}(b(t)) \ge \varepsilon , \end{cases}\displaystyle \end{aligned}$$
(53)
$$\begin{aligned}& \hat{G}\bigl(b(t)\bigr) = \bigl( b_{1}(t), b_{2}(t), b_{3}(t),B_{1} - b_{1}(t), B_{2} - b_{2}(t), B_{3} - b_{3}(t) \bigr)^{T}. \end{aligned}$$
(54)
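As a rough illustration of how a canonical constraint of the form (52) can be evaluated numerically, the following Python sketch applies the smoothing (53) to sampled values of one component of \(\hat{G}\) and integrates the result with the trapezoidal rule. The names `g_bar` and `canonical_constraint`, the sampling grid, and the choice of quadrature rule are assumptions made for illustration only.

```python
def g_bar(g, eps):
    """Smooth approximation of min{g, 0}, following the three cases of (53)."""
    if g <= -eps:
        return g
    if g <= eps:
        return -((g - eps) ** 2) / (4.0 * eps)
    return 0.0


def canonical_constraint(times, values, eps):
    """Trapezoidal estimate of the integral in (52) for one constraint component.

    times  -- increasing sample times in [0, T]
    values -- sampled values of one component of G-hat, e.g. b_1(t) or B_1 - b_1(t)
    """
    total = 0.0
    for k in range(len(times) - 1):
        left = g_bar(values[k], eps)
        right = g_bar(values[k + 1], eps)
        total += 0.5 * (left + right) * (times[k + 1] - times[k])
    return total
```

Because the smoothed integrand is never positive, the estimate is zero only when every sampled value is at least \(\varepsilon\), and it becomes negative as soon as the constraint is violated or nearly active.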

3.3 Obtaining a smooth time-delayed optimal control problem

With the preparation work given in the previous subsections, we are able to convert problem (P1) into a new smooth time-delayed optimal control problem (P2), which can be solved by a modified control parameterization method given in the next section. Problem (P2) is as follows.

Problem (P2)

Subject to the buffer Eqs. (44)–(46) with initial conditions (22)–(24), the conservation of the TCP flow constraints in canonical form (48)–(49) with initial conditions (34), (36), and (37), the bounds on the input transmission rate (31), and the buffer capacity constraints in canonical form (52), we want to find piecewise continuous controls \(\tilde{u}_{i}(t)\), \(i = 1,\ldots,N\), which maximize the objective function (47).

4 Solving the optimal control problem with control and state-dependent time delays by the control parametrization technique

In this section, we use the control parameterization technique to solve problem (P2). In the past, the control parameterization method was used to solve optimal control problems without any time-delayed arguments or time-delayed optimal control problems with discrete time-delayed arguments only. In this paper, we extend the control parameterization method to solve a more difficult optimal control problem, namely, an optimal control problem with control-dependent time-delayed arguments. We first create an approximated problem for problem (P2) in such a way that it does not contain any control-dependent time-delayed argument. The details are as follows.

Define

$$ U = \bigl\{ v = ( v_{1},\ldots,v_{2N} )^{T}: \underline{u} \le v_{i} \le \bar{u}, i = 1,\ldots,2N \bigr\} . $$
(55)

A piecewise continuous function \(u(t) = (\tilde{u}_{1}(t),\ldots,\tilde{u}_{N}(t),\hat{u}_{1}(t),\ldots,\hat{u}_{N}(t) )^{T}\) from \([0, T]\) into \(R^{2N}\) is said to be an admissible control of problem (P2) if \(u(t) \in U\) for all \(t \in [0, T]\). Let \(\mathcal{U}\) be the set consisting of all such admissible controls.

We now partition the time horizon \([0,T]\) into p equal subintervals and restrict each component of the admissible control \(u(t)\) to be constant over each subinterval. Let \(I^{p}\) be the partition of \([0,T]\) defined by

$$ I^{p} = \bigl\{ [ ( k - 1 )\Delta , k\Delta ): k = 1,\ldots, p \bigr\} , $$
(56)

where \(\Delta = T/p\) is the length of each subinterval.

Let \(\mathcal{U}^{p}\) be the subset of \(\mathcal{U}\) consisting of all the piecewise constant controls consistent with the partition \(I^{p}\). Let \(u(t) \in \mathcal{U}^{p}\). Then each component of \(u(t)\) assumes the form

$$\begin{aligned}& \tilde{u}_{i}(t) = \textstyle\begin{cases} \sum_{k = 1}^{p} \tilde{\sigma}_{i}^{k} \chi _{I_{k}^{p}}(t), &i = 1,\ldots,N, t \in [0, T], \\ 0,& i = 1,\ldots,N, t \le 0, \end{cases}\displaystyle \end{aligned}$$
(57)
$$\begin{aligned}& \hat{u}_{i}(t) = \textstyle\begin{cases} \sum_{k = 1}^{p} \hat{\sigma}_{i}^{k} \chi _{I_{k}^{p}}(t),& i = 1,\ldots,N, t \in [0, T], \\ 0,& i = 1,\ldots,N, t \le 0, \end{cases}\displaystyle \end{aligned}$$
(58)

where \(\chi _{I}\) denotes the indicator function of I defined by

$$ \chi _{I}(t) = \textstyle\begin{cases} 1& \text{if } t \in I, \\ 0& \text{otherwise}. \end{cases} $$
(59)

Let

$$ \sigma ^{k} = \bigl[ \tilde{\sigma}_{1}^{k}, \ldots,\tilde{\sigma}_{N}^{k},\hat{\sigma}_{1}^{k}, \ldots,\hat{\sigma}_{N}^{k} \bigr]^{T} $$
(60)

and

$$ \sigma = \bigl[ \sigma ^{1},\ldots,\sigma ^{p} \bigr]^{T} \in R^{2Np}. $$
(61)
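The stacking in (60)–(61) fixes where each parameter sits in the vector σ, which matters later when partial derivatives are taken with respect to individual components. A minimal Python sketch of this bookkeeping, assuming the block ordering of (60)–(61) and using hypothetical helper names, is:

```python
def flatten_sigma(sigma_tilde, sigma_hat):
    """Stack the blocks [tilde_1..tilde_N, hat_1..hat_N] for k = 1..p, as in (60)-(61).

    sigma_tilde[i][k] and sigma_hat[i][k] hold the value for source i+1 on
    subinterval k+1 (zero-based Python indices).
    """
    n_sources = len(sigma_tilde)
    p = len(sigma_tilde[0])
    flat = []
    for k in range(p):
        flat.extend(sigma_tilde[i][k] for i in range(n_sources))
        flat.extend(sigma_hat[i][k] for i in range(n_sources))
    return flat


def position_of_tilde(i, k, n_sources):
    """One-based position of tilde-sigma_i^k in the stacked vector."""
    return 2 * n_sources * (k - 1) + i


def position_of_hat(i, k, n_sources):
    """One-based position of hat-sigma_i^k in the stacked vector."""
    return 2 * n_sources * (k - 1) + n_sources + i
```

For \(N\) sources and \(p\) subintervals the stacked vector has \(2Np\) entries, in agreement with (61).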

Restricted to \(\mathcal{U}^{p}\), the control constraints (31) (i.e., the bounds on the input transmission rates) become

$$ \underline{u} \le \tilde{\sigma}_{i}^{k},\qquad \hat{ \sigma}_{i}^{k} \le \bar{u},\quad i = 1,\ldots,N, k = 1, \ldots,p. $$
(62)

Let \(\Xi ^{p}\) be the set of all vectors σ that satisfy constraint (62). Then each control \(u \in \mathcal{U}^{p}\) can be uniquely determined by a control parameter vector \(\sigma \in \Xi ^{p}\) and vice versa. Let \(\hat{u} ( t|\sigma ) \in \mathcal{U}^{p}\) be the control completely specified by the vector \(\sigma \in \Xi ^{p}\).
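The correspondence between a parameter vector and a piecewise constant control can be illustrated as follows. This Python sketch evaluates one component of (57)/(58) and checks the bounds (62). The function names are hypothetical, and two conventions are assumptions rather than part of the formulation: the control is taken to be zero only for \(t < 0\), and the last value is held at \(t = T\).

```python
def control_value(t, sigma_i, T):
    """Value of one piecewise constant control component at time t.

    sigma_i[k-1] is the value on the k-th subinterval [(k-1)T/p, kT/p).
    """
    p = len(sigma_i)
    if t < 0.0:
        return 0.0  # initial condition: no transmission before t = 0
    k = min(int(t * p / T), p - 1)  # zero-based subinterval index
    return sigma_i[k]


def satisfies_bounds(sigma_i, lower, upper):
    """Check the box constraints (62) for one control component."""
    return all(lower <= s <= upper for s in sigma_i)
```

For instance, with five subintervals on \([0, 3]\), `control_value(1.0, [0.2, 0.4, 0.6, 0.8, 1.0], 3.0)` picks the value on the second subinterval \([0.6, 1.2)\).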

Suppose that the length of each partition \(T/p\) is sufficiently large so that the lower bound \(\underline{u}\) of the control satisfies the inequality

$$ 0 \le \frac{w}{\underline{u}} + d < \frac{T}{p}. $$
(63)

In other words, the duration from one switching time to the next switching time of the control \(u \in \mathcal{U}^{p}\) should be greater than its largest time delay (i.e., the forward transmission delay plus the latency) at any time \(t \in [0, T]\). (By (31) the maximum possible value of “forward” transmission delay plus latency is \(\frac{w}{\underline{u}} + d\).) In view of (63) and the initial control conditions (23)–(24), by identifying the control \(\hat{u} \in \mathcal{U}^{p}\) with parameter \(\sigma \in \Xi ^{p}\), we can convert the control-dependent time-delayed term \(\tilde{u}_{i} ( t - w/\tilde{u}_{i}(t) )\) and \(\hat{u}_{i} ( t - w/\hat{u}_{i}(t) ), i = 1,\ldots,N\), in (44)–(46) into \(\tilde{\tilde{u}}_{i} ( t|\sigma )\) and \(\hat{\hat{u}}_{i} ( t|\sigma )\), respectively, where

$$\begin{aligned} \tilde{\tilde{u}}_{i} ( t|\sigma ) =& \tilde{u}_{i} \bigl( t - \bigl(w/\tilde{u}_{i}(t)\bigr) \bigr) \\ =& \textstyle\begin{cases} 0& \text{if } t \in [0, w/\tilde{\sigma}_{i}^{1}), \\ \tilde{\sigma}_{i}^{k}& \text{if } t \in [w/\tilde{\sigma}_{i}^{k} + (k - 1)T/ p, w/\tilde{\sigma}_{i}^{k + 1} + kT/ p), k = 1,\ldots, p - 1, \\ \tilde{\sigma}_{i}^{p}& \text{if } t \in [w/\tilde{\sigma}_{i}^{p} + (p - 1) T/ p, T), \end{cases}\displaystyle \end{aligned}$$
(64)
$$\begin{aligned} \hat{\hat{u}}_{i} ( t|\sigma ) =& \hat{u}_{i} \bigl( t - \bigl(w/\hat{u}_{i}(t)\bigr) \bigr) \\ =& \textstyle\begin{cases} 0& \text{if } t \in [0, w/\hat{\sigma}_{i}^{1}), \\ \hat{\sigma}_{i}^{k}& \text{if } t \in [w/\hat{\sigma}_{i}^{k} + (k - 1)T/ p, w/\hat{\sigma}_{i}^{k + 1} + kT/ p), k = 1,\ldots, p - 1, \\ \hat{\sigma}_{i}^{p}& \text{if } t \in [w/\hat{\sigma}_{i}^{p} + (p - 1) T/ p, T). \end{cases}\displaystyle \end{aligned}$$
(65)

Similarly, in view of (63) and the initial conditions (36)–(37), by identifying the control \(\hat{u} \in \mathcal{U}^{p}\) with parameter \(\sigma \in \Xi ^{p}\) we can convert the control-dependent time-delayed term \(\tilde{u}_{i} ( t - d - (w/\tilde{u}_{i}(t)) )\) and \(\hat{u}_{i} ( t - d - (w/\hat{u}_{i}(t)) ), i = 1,\ldots,N\), in (48)–(49) into \(\tilde{\tilde{\tilde{u}}}_{i} ( t|\sigma )\) and \(\hat{\hat{\hat{u}}}_{i} ( t|\sigma )\), respectively, where

$$\begin{aligned} \tilde{\tilde{\tilde{u}}}_{i} ( t|\sigma ) =& \tilde{u}_{i} \bigl( t - d - \bigl(w/\tilde{u}_{i}(t)\bigr) \bigr) \\ =& \textstyle\begin{cases} 0& \text{if } t \in [0, w/\tilde{\sigma}_{i}^{1} + d), \\ \tilde{\sigma}_{i}^{k}& \text{if } t \in [w/\tilde{\sigma}_{i}^{k} + d + (k - 1)T/p, w/\tilde{\sigma}_{i}^{k + 1} + d + kT/p),\\ & k = 1,\ldots, p - 1, \\ \tilde{\sigma}_{i}^{p}& \text{if } t \in [w/\tilde{\sigma}_{i}^{p} + d + (p - 1)T/p, T), \end{cases}\displaystyle \end{aligned}$$
(66)
$$\begin{aligned} \hat{\hat{\hat{u}}}_{i} ( t|\sigma ) =& \hat{u}_{i} \bigl( t - d - \bigl(w/\hat{u}_{i}(t)\bigr) \bigr) \\ =& \textstyle\begin{cases} 0& \text{if } t \in [0, w/\hat{\sigma}_{i}^{1} + d), \\ \hat{\sigma}_{i}^{k}& \text{if } t \in [w/\hat{\sigma}_{i}^{k} + d + (k - 1)T/p, w/\hat{\sigma}_{i}^{k + 1} + d + kT/p),\\ & k = 1,\ldots, p - 1, \\ \hat{\sigma}_{i}^{p}& \text{if } t \in [w/\hat{\sigma}_{i}^{p} + d + (p - 1)T/p, T). \end{cases}\displaystyle \end{aligned}$$
(67)
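The piecewise definitions (64)–(67) can be read as a lookup: each parameter \(\sigma_{i}^{k}\) takes effect at the shifted switching time \(w/\sigma_{i}^{k} + d + (k - 1)T/p\). The following Python sketch implements this lookup for a single source, with \(d = 0\) corresponding to (64)/(65) and \(d > 0\) to (66)/(67). The function names and the numerical values in the usage note are hypothetical, and the sketch assumes that condition (63) holds so that the shifted switching times are increasing.

```python
def delay_condition_holds(w, d, u_lower, T, p):
    """Check condition (63): 0 <= w/u_lower + d < T/p."""
    largest_delay = w / u_lower + d
    return 0.0 <= largest_delay < T / p


def delayed_control(t, sigma_i, T, w, d=0.0):
    """Evaluate (64) (d = 0) or (66) (d > 0) for one source.

    sigma_i[k-1] becomes active at w/sigma_i[k-1] + d + (k-1)T/p and stays
    active until the next shifted switching time (or T for the last piece).
    """
    p = len(sigma_i)
    starts = [w / sigma_i[k] + d + k * T / p for k in range(p)]
    value = 0.0  # nothing has arrived before the first shifted switching time
    for k in range(p):
        if t >= starts[k]:
            value = sigma_i[k]
    return value
```

With the illustrative values \(T = 3\), \(p = 3\), \(w = 0.1\), and parameters \((0.5, 1.0, 0.25)\), the shifted switching times for \(d = 0\) are 0.2, 1.1, and 2.4, so `delayed_control(1.05, [0.5, 1.0, 0.25], 3.0, 0.1)` still returns the first parameter.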

Let \(b ( t|\sigma ) \in R^{3}\) be the solution of the state (buffer) Eqs. (44)–(46) and the initial conditions (22)–(24) corresponding to \(\sigma \in \Xi ^{p}\). When we identify the control \(u(t) \in \mathcal{U}^{p}\) with the parameter \(\sigma \in \Xi ^{p}\), we obtain the new state equations as follows:

$$\begin{aligned}& \dot{b}_{1} ( t|\sigma ) = F_{1} \bigl( b_{1} ( t|\sigma ), \tilde{\tilde{u}}_{1} ( t|\sigma ), \ldots,\tilde{\tilde{u}}_{K} ( t|\sigma ) \bigr),\quad t \in [0, T], \end{aligned}$$
(68)
$$\begin{aligned}& \dot{b}_{2} ( t|\sigma ) = F_{2} \bigl( b_{2} ( t|\sigma ), \tilde{\tilde{u}}_{K + 1} ( t|\sigma ), \ldots,\tilde{\tilde{u}}_{N} ( t|\sigma ) \bigr),\quad t \in [0, T], \end{aligned}$$
(69)
$$\begin{aligned}& \dot{b}_{3} ( t|\sigma ) = F_{3} \bigl( b_{3} ( t|\sigma ), \hat{\hat{u}}_{1} ( t|\sigma ), \ldots,\hat{\hat{u}}_{N} ( t|\sigma ) \bigr), \quad t \in [0, T], \end{aligned}$$
(70)
$$\begin{aligned}& b_{i} ( 0|\sigma ) = 0, \quad i = 1, 2, 3, \end{aligned}$$
(71)

where \(F_{1} ( b_{1} ( t|\sigma ), \tilde{\tilde{u}}_{1} ( t|\sigma ),\ldots,\tilde{\tilde{u}}_{K} ( t|\sigma ) )\), \(F_{2} ( b_{2} ( t|\sigma ), \tilde{\tilde{u}}_{K + 1} ( t|\sigma ),\ldots,\tilde{\tilde{u}}_{N} ( t|\sigma ) )\), and \(F_{3} ( b_{3} ( t|\sigma ), \hat{\hat{u}}_{1} ( t|\sigma ),\ldots,\hat{\hat{u}}_{N} ( t|\sigma ) )\) can be obtained from the functions on the right-hand sides of (44), (45), and (46) in an obvious manner.
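Once the delayed inflow is a known function of time, a state equation such as (68) is an ordinary differential equation that can be integrated numerically. The Python sketch below uses forward Euler for a single buffer. It is an illustration under stated assumptions, not the solver used for the numerical results: the names `smooth_plus` and `simulate_buffer` are hypothetical, the total delayed inflow is passed in as a user-supplied function `inflow`, and the smoothing is a compact form of (41)–(43) with the blending weight assumed to be the cubic that equals 1 at \(y = 0\) and 0 at \(y = -\delta\).

```python
def smooth_plus(y, z, delta):
    """Compact form of the smoothing (41)-(43) of (z)^+_y."""
    if z <= -delta:
        m = 0.0
    elif z <= delta:
        m = (z + delta) ** 2 / (4.0 * delta)
    else:
        m = z
    if y > 0:
        return z
    if y < -delta:
        return m
    s = y / delta
    weight = -2.0 * s ** 3 - 3.0 * s ** 2 + 1.0  # 1 at y = 0, 0 at y = -delta
    return weight * z + (1.0 - weight) * m


def simulate_buffer(inflow, capacity, T, steps, delta):
    """Forward-Euler integration of db/dt = xi_delta(b, inflow(t) - capacity), b(0) = 0."""
    h = T / steps
    b = 0.0
    path = [b]
    for n in range(steps):
        t = n * h
        b += h * smooth_plus(b, inflow(t) - capacity, delta)
        path.append(b)
    return path
```

For a constant inflow of 1.5 against a link capacity of 1.0, the queue in this sketch grows linearly at rate 0.5, as expected from (44) when the buffer is nonempty.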

Similarly, we obtain from (66) and (67) the new canonical constraints concerning the conservation of flow equation as follows:

$$\begin{aligned}& \int _{0}^{T} \bar{\bar{G}}_{i} \bigl( b_{1}(t|\sigma ), \hat{u}_{i}(t|\sigma ), \tilde{u}_{i}(t - d|\sigma ), \tilde{\tilde{\tilde{u}}}_{1} ( t|\sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{K} ( t|\sigma ) \bigr)\,dt = 0,\quad i = 1,\ldots,K, \end{aligned}$$
(72)
$$\begin{aligned}& \int _{0}^{T} \bar{\bar{G}}_{i} \bigl( b_{2}(t|\sigma ), \hat{u}_{i}(t|\sigma ), \tilde{u}_{i}(t - d|\sigma ), \tilde{\tilde{\tilde{u}}}_{K + 1} ( t|\sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{N} ( t|\sigma ) \bigr)\,dt = 0, \\& \quad i = K + 1,\ldots,N, \end{aligned}$$
(73)

where \(\bar{\bar{G}}_{i} ( b_{1}(t|\sigma ), \hat{u}_{i}(t|\sigma ), \tilde{u}_{i}(t - d|\sigma ), \tilde{\tilde{\tilde{u}}}_{1} ( t|\sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{K} ( t|\sigma ) )\) and \(\bar{\bar{G}}_{i} ( b_{2}(t|\sigma ), \hat{u}_{i}(t|\sigma ), \tilde{u}_{i}(t - d|\sigma ), \tilde{\tilde{\tilde{u}}}_{K + 1} ( t|\sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{N} ( t|\sigma ) )\) can be obtained from the functions on the left-hand sides of (48) and (49), respectively, in an obvious manner.

Similarly, we obtain from (52) the new canonical state constraints

$$ \int _{0}^{T} \hat{\hat{G}}_{i} \bigl( b ( t|\sigma ) \bigr)\,dt = 0,\quad i = 1, 2, 3, 4, 5, 6. $$
(74)

Similarly, from (67) we obtain the new objective function

$$\begin{aligned}& \max U(u) \\& \quad = \sum_{i = 1}^{N} \int _{0}^{T} J \bigl( b_{3}(t), \hat{u}_{i}(t - d|\sigma ), \hat{\hat{\hat{u}}}_{1}(t| \sigma ),\ldots,\hat{\hat{\hat{u}}}_{N}(t|\sigma ) \bigr)\,dt, \end{aligned}$$
(75)

where \(J ( b_{3}(t), \hat{u}_{i}(t - d|\sigma ), \hat{\hat{\hat{u}}}_{1}(t|\sigma ),\ldots,\hat{\hat{\hat{u}}}_{N}(t|\sigma ) )\) can be obtained from the right-hand side of the objective function (47) in an obvious manner.

Thus the new optimal control problem, problem (P3), can be stated as follows.

Problem (P3)

Subject to the new state (buffer) Eqs. (68)–(70) with initial condition (71), the new conservation of TCP flow constraints in canonical form (72)–(73), the bounds on the input transmission rate (31) in the parameterized form (62), and the buffer capacity constraints in canonical form (74), we wish to find \(\sigma \in \Xi ^{p}\) that maximizes the new objective function (75).

Remark 4.1

Using the modified control parameterization method, we have translated problem (P2) into the new problem (P3), which contains only discrete time-delayed arguments (i.e., does not contain any control-dependent time-delayed arguments) with no explicit initial condition on the control functions when the time t is nonpositive. (The discrete time-delayed arguments are \(\tilde{u}_{i}(t - d|\sigma )\) and \(\hat{u}_{i}(t - d|\sigma )\) in (72), (73), and (75), respectively.) Thus problem (P3) can be solved more easily by any optimization method than problem (P2).

5 Gradient formulae for solving the control-dependent time-delayed optimal control problem by the control parametrization method

To solve problem (P3) as a mathematical programming problem, we require the gradient formulae for the objective function and all constraints. We derive the gradient formula only for the canonical constraints (72), because the gradient formulae for the objective function (75) and the canonical continuous state constraints (74) can be obtained in a similar manner. For simplicity, we take the index \(i = 1\) in (72) in the derivation of the gradient formula. Thus we consider

$$ g_{1}(\sigma ) = \int _{0}^{T} \bar{\bar{G}}_{1} \bigl( b_{1}(t|\sigma ), \hat{u}_{1}(t|\sigma ), \tilde{u}_{1}(t - d|\sigma ), \tilde{\tilde{\tilde{u}}}_{1} ( t|\sigma ),\ldots,\tilde{\tilde{\tilde{u}}}_{K} ( t|\sigma ) \bigr)\,dt $$
(76)

subject to

$$ \frac{d ( b_{1} ( t|\sigma ) )}{dt} = F_{1} \bigl( b_{1} ( t|\sigma ), \tilde{\tilde{u}}_{1} ( t|\sigma ),\ldots,\tilde{ \tilde{u}}_{K} ( t|\sigma ) \bigr). $$
(77)

Before we derive the gradient formulae for \(g_{1}(\sigma )\), we need the following analytic considerations.

Remark 5.1

From the control bounds (31) and the initial conditions of controls (23) and (36), we observe that \(\tilde{u}_{i} ( t - w/\tilde{u}_{i}(t) )\) and \(\tilde{u}_{i} ( t - d - w/\tilde{u}_{i}(t) )\) are continuously differentiable with respect to \(\tilde{u}_{i} ( t )\) for all \(u \in \mathcal{U}\). Thus from (64), (66), (48), (50), and the fact that \(\xi _{{\delta}} ( b_{1}(t), \sum_{i = 1}^{K} \tilde{u}_{i} ( t - w/\tilde{u}_{i}(t) ) - c_{1} ) \) and \(\xi _{{\delta}} ( b_{1}(t), \sum_{i = 1}^{K} \tilde{u}_{i} ( t - d - w/\tilde{u}_{i}(t) ) - c_{1} ) \) are continuously differentiable with respect to each of their arguments, it is clear that both \(F_{1} ( b_{1} ( t|\sigma ), \tilde{\tilde{u}}_{1} ( t|\sigma ),\ldots,\tilde{\tilde{u}}_{K} ( t|\sigma ) )\) in (68) and \(\bar{\bar{G}}_{1} ( b_{1}(t|\sigma ), \hat{u}_{1}(t|\sigma ), \tilde{u}_{1} ( t - d|\sigma ), \tilde{\tilde{\tilde{u}}}_{1} ( t|\sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{K} ( t|\sigma ) )\) in (72) are continuously differentiable with respect to each of their arguments. However, by (64) \(\tilde{\tilde{u}}_{i} ( t|\sigma )\) is not continuous with respect to σ at the discrete times \(t = \frac{w}{\tilde{\sigma}_{i}^{k}} + \frac{(k - 1)T}{p}\), \(k = 1,\ldots,p\), and by (66) \(\tilde{\tilde{\tilde{u}}}_{i} ( t|\sigma )\) is not continuous with respect to σ at the discrete times \(t = \frac{w}{\tilde{\sigma}_{i}^{k}} + d + \frac{(k - 1)T}{p}\), \(k = 1,\ldots,p\). Thus \(F_{1} ( b_{1} ( t|\sigma ), \tilde{\tilde{u}}_{1} ( t|\sigma ),\ldots,\tilde{\tilde{u}}_{K} ( t|\sigma ) )\) and \(\bar{\bar{G}}_{1} ( b_{1}(t|\sigma ), \hat{u}_{1}(t|\sigma ), \tilde{u}_{1} ( t - d|\sigma ), \tilde{\tilde{\tilde{u}}}_{1} ( t|\sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{K} ( t|\sigma ) )\) are both continuously differentiable with respect to σ, except at the discrete times mentioned above.

Let \(\sigma = [ \tilde{\sigma}_{1}^{1},\ldots,\tilde{\sigma}_{N}^{1},\hat{\sigma}_{1}^{1},\ldots,\hat{\sigma}_{N}^{1},\ldots,\tilde{\sigma}_{1}^{p},\ldots,\tilde{\sigma}_{N}^{p},\hat{\sigma}_{1}^{p},\ldots,\hat{\sigma}_{N}^{p} ]^{T} \in R^{2Np}\). Then from (76) and (77) for all \(k = 1,\ldots,p\), we have

$$ \frac{\partial g_{1}(\sigma )}{\partial \tilde{\sigma}_{i}^{k}} = 0,\quad i = K + 1,\ldots,N, $$
(78)

and

$$ \frac{\partial g_{1}(\sigma )}{\partial \hat{\sigma}_{i}^{k}} = 0,\quad i = 2,\ldots,N. $$
(79)

Moreover, the methods of finding the gradients of \(g_{1} ( \sigma )\) with respect to \(\tilde{\sigma}_{i}^{k}\), \(i = 1,\ldots,K\), and with respect to \(\hat{\sigma}_{1}^{k}\) are the same. Thus we only consider the gradients of \(g_{1} ( \sigma )\) with respect to \(\tilde{\sigma}_{i}^{k}\), \(i = 1,\ldots,K\). To derive them, we introduce the following adjoint system on \([0,T]\):

$$\begin{aligned}& \bigl( \dot{\lambda}_{1} ( t|\sigma ) \bigr)^{T} \\& \quad = - \frac{\partial H ( b_{1} ( t|\sigma ), \stackrel{\frown}{b}_{1} ( t|\sigma ), \hat{u}_{1} ( t|\sigma ), \stackrel{\frown}{u}_{1} ( t|\sigma ), \bar{u}_{1} ( t|\sigma ), \tilde{u}_{1} ( t|\sigma ), \tilde{\tilde{u}}_{1} ( t|\sigma ), \ldots,\tilde{\tilde{u}}_{{K}} ( t|\sigma ), \tilde{\tilde{\tilde{u}}}_{{{1}}}(t|\sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{K}(t|\sigma ), \lambda _{1}(t|\sigma ) )}{\partial ( b_{1} ( t|\sigma ) )}, \\& \qquad t \in [0, T], \end{aligned}$$
(80)
$$\begin{aligned}& \bigl( \lambda _{1} ( T|\sigma ) \bigr)^{T} = 0, \end{aligned}$$
(81)

where

$$\begin{aligned}& \bar{u}_{1}(t|\sigma ) = \tilde{u}_{1}(t - d|\sigma ), \end{aligned}$$
(82)
$$\begin{aligned}& \stackrel{\frown}{u}_{1}(t|\sigma ) = \hat{u}_{1}(t + d|\sigma ), \end{aligned}$$
(83)
$$\begin{aligned}& \stackrel{\frown}{b}_{1} ( t|\sigma ) = b_{1} ( t + d|\sigma ), \end{aligned}$$
(84)

and \(H:R^{2K + 7} \to R\) is the Hamiltonian function defined by

$$\begin{aligned}& H ( b_{1}, \stackrel{\frown}{b}_{1}, \hat{u}_{{1}}, \stackrel{\frown}{u}_{{1}}, \bar{u}_{1}, \tilde{u}_{1}, \tilde{\tilde{u}}_{{{1}}}, \ldots, \tilde{\tilde{u}}_{K}, \tilde{\tilde{\tilde{u}}}_{{{1}}}, \ldots, \tilde{\tilde{\tilde{u}}}_{{K}}, \lambda _{1} ) \\& \quad = \bar{\bar{G}}_{1} ( b_{1}, \hat{u}_{1}, \bar{u}_{1}, \tilde{\tilde{\tilde{u}}}_{{{1}}},\ldots, \tilde{\tilde{\tilde{u}}}_{{K}} ) + \bar{\bar{G}}_{1} ( \stackrel{\frown}{b}_{1}, \stackrel{\frown}{u}_{{1}}, \tilde{u}_{1}, \tilde{\tilde{u}}_{1},\ldots, \tilde{ \tilde{u}}_{K} )e ( T - d - t ) \\& \qquad {} + \lambda _{1}F_{1} ( b_{1}, \tilde{\tilde{u}}_{{{1}}},\ldots,\tilde{\tilde{u}}_{K} ), \end{aligned}$$
(85)

where \(e ( t )\) is the unit step function defined by

$$ e ( t ) = \textstyle\begin{cases} 1& \text{if } t > 0, \\ 0& \text{otherwise}. \end{cases} $$
(86)

Furthermore, we set

$$ \bar{u}_{1} ( t|\sigma ) = 0,\quad t \in ( T - d, T ]. $$
(87)

For simplicity, we replace the notation \(t|\sigma \) by t in the discussion of the remainder of this section, except for the statement of Theorem 5.1 and the last part of its proof. Moreover, we set

$$\begin{aligned}& \tilde{F}_{1} = F_{1} \bigl( b_{1} ( t| \sigma ), \tilde{\tilde{u}}_{{{1}}} ( t|\sigma ),\ldots,\tilde{ \tilde{u}}_{K} ( t|\sigma ) \bigr), \end{aligned}$$
(88)
$$\begin{aligned}& \tilde{G}_{1} = \bar{\bar{G}}_{1} \bigl( b_{1}(t|\sigma ), \hat{u}_{1}(t|\sigma ), \bar{u}_{1}(t|\sigma ), \tilde{\tilde{\tilde{u}}}_{1}(t| \sigma ),\ldots, \tilde{\tilde{\tilde{u}}}_{{K}}(t|\sigma ) \bigr), \end{aligned}$$
(89)
$$\begin{aligned}& \tilde{\tilde{G}}_{1} = \bar{\bar{G}}_{1} \bigl( b_{1}(t + d|\sigma ), \hat{u}_{1}(t + d|\sigma ), \tilde{u}_{1}(t|\sigma ), \tilde{\tilde{u}}_{1}(t|\sigma ),\ldots, \tilde{\tilde{u}}_{{K}}(t|\sigma ) \bigr), \end{aligned}$$
(90)

and

$$\begin{aligned} \tilde{H} ={}& H \bigl( b_{1} ( t|\sigma ), \stackrel{\frown}{b}_{1} ( t|\sigma ), \hat{u}_{1} ( t|\sigma ), \stackrel{\frown}{u}_{1} ( t|\sigma ), \bar{u}_{1} ( t| \sigma ), \tilde{u}_{1} ( t|\sigma ), \tilde{\tilde{u}}_{1} ( t|\sigma ), \ldots, \\ &{}\tilde{\tilde{u}}_{{K}} ( t|\sigma ), \tilde{\tilde{ \tilde{u}}}_{{{1}}}(t|\sigma ),\ldots, \tilde{\tilde{ \tilde{u}}}_{K}(t|\sigma ), \lambda _{1}(t|\sigma ) \bigr). \end{aligned}$$
(91)

In view of Remark 5.1, we have the following theorem. (Note that the technique used in proving the following theorem is modified from those used in the proofs of Theorems 5.3.1 and 12.3.1 of [19].)

Theorem 5.1

Consider problem (P3). The gradient of the functional \(g_{1} ( \sigma )\) in (76) with respect to the component \(\tilde{\sigma}_{i}^{k}, i = 1,\ldots,K\), \(k = 1,\ldots,p\), is given by

$$\begin{aligned}& \frac{\partial g_{1}(\sigma )}{\partial \tilde{\sigma}_{i}^{k}} \\& \quad = \int _{0}^{T} \bigl[ \bar{\bar{G}}_{1} \bigl( b_{1} ( t + 0 ), \hat{u}_{1} ( t + 0 ), \bar{u}_{1} ( t + 0 ),\tilde{\tilde{\tilde{u}}}_{1} ( t + 0 ),\ldots,\tilde{\tilde{\tilde{u}}}_{K} ( t + 0 ) \bigr) \\& \qquad {}- \bar{\bar{G}}_{1} \bigl( b_{1} ( t - 0 ), \hat{u}_{1} ( t - 0 ), \bar{u}_{1} ( t - 0 ),\tilde{ \tilde{\tilde{u}}}_{1} ( t - 0 ),\ldots,\tilde{\tilde{ \tilde{u}}}_{K} ( t - 0 ) \bigr) \bigr] \\& \qquad {}\times \delta \biggl( t - \frac{w}{\tilde{\sigma}_{i}^{k}} - \frac{(k - 1)T}{p} - d \biggr) \times \frac{w}{ ( \tilde{\sigma}_{i}^{k} )^{2}}\,dt \\& \qquad {}+ \int _{0}^{T} \biggl[ \lambda _{1}(t) \times \bigl[ F_{1} \bigl( b_{1}(t + 0), \tilde{\tilde{u}}_{1} ( t + 0 ),\ldots,\tilde{\tilde{u}}_{K}(t + 0) \bigr) \\& \qquad {}- F_{1} \bigl( b_{1}(t - 0), \tilde{ \tilde{u}}_{1} ( t - 0 ),\ldots,\tilde{\tilde{u}}_{K}(t - 0) \bigr) \bigr] \\& \qquad {}\times \delta \biggl( t - \frac{w}{\tilde{\sigma}_{i}^{k}} - \frac{(k - 1)T}{p} \biggr) \times \frac{w}{ ( \tilde{\sigma}_{i}^{k} )^{2}} \biggr]\,dt + \int _{0}^{T} \bar{f}_{i}^{k} (t)\,dt, \end{aligned}$$
(92)

where

$$ \bar{f}_{i}^{k}(t) = \textstyle\begin{cases} \frac{\partial \tilde{H}}{\partial \tilde{u}_{1}} \times \chi _{[(k - 1)T/ p, kT/ p)}(t) + \frac{\partial \tilde{H}}{\partial \tilde{\tilde{u}}_{1}} \times \chi _{[w/\tilde{\sigma}_{1}^{k} + (k - 1)T/ p, w/\tilde{\sigma}_{1}^{k + 1} + kT/ p)}(t),\\ \quad \textit{for } i = 1, k = 2,\ldots,p - 1, \\ \frac{\partial \tilde{H}}{\partial \tilde{u}_{1}} \times \chi _{[(k - 1)T/ p, kT/ p)}(t) + \frac{\partial \tilde{H}}{\partial \tilde{\tilde{u}}_{1}} \times \chi _{[w/\tilde{\sigma}_{1}^{k} + (k - 1)T/ p, T)}(t)\quad \textit{for } i = 1, k = p, \\ \frac{\partial \tilde{H}}{\partial \tilde{\tilde{u}}_{i}} \times \chi _{[w/\tilde{\sigma}_{i}^{k} + (k - 1)T/ p, w/\tilde{\sigma}_{i}^{k + 1} + kT/ p)}(t)\quad \textit{for } i = 2, \ldots,K, k = 1,\ldots,p - 1, \\ \frac{\partial \tilde{H}}{\partial \tilde{\tilde{u}}_{i}} \times \chi _{[w/\tilde{\sigma}_{i}^{k} + (k - 1)T/ p, T)}(t)\quad \textit{for } i = 2, \ldots,K, k = p. \end{cases} $$
(93)

and for any analytic function \(f(t)\), we have

$$ f ( t + 0 ) = \lim_{\tau \downarrow t}f(\tau ), $$
(94)

and

$$ f ( t - 0 ) = \lim_{\tau \uparrow t}f(\tau ). $$
(95)

Proof

Let \(\sigma = [ \tilde{\sigma}_{1}^{1},\ldots,\tilde{\sigma}_{N}^{1},\hat{\sigma}_{1}^{1},\ldots,\hat{\sigma}_{N}^{1},\ldots,\tilde{\sigma}_{1}^{p},\ldots,\tilde{\sigma}_{N}^{p},\hat{\sigma}_{1}^{p},\ldots,\hat{\sigma}_{N}^{p} ]^{T} \in R^{2Np}\) be given. Let us perturb each component \(\tilde{\sigma}_{i}^{k}\), \(i = 1,\ldots,K\), \(k = 1,\ldots,p\) by an arbitrary small number ε. Let

$$ \sigma (\varepsilon ) = \sigma + e \bigl( \tilde{\rho}_{i}^{k} \bigr)\varepsilon , $$
(96)

where \(e ( \tilde{\rho}_{i}^{k} ) \in R^{2Np}\) is a column vector whose \([ 2N(k - 1) + i ]\)th component is 1 and whose other components are 0. Thus the first-order variation of \(g_{1} ( \sigma )\) can be written as

$$ \Delta g_{1} ( \sigma ) = \frac{d g_{1}(\sigma (\varepsilon ))}{d\varepsilon} \biggm|_{\varepsilon = 0} = \sum_{i = 1}^{K} \sum _{k = 1}^{p} \frac{\partial g_{1}(\sigma )}{\partial \tilde{\sigma}_{i}^{k}}\tilde{ \rho}_{i}^{k}. $$
(97)

Let \(\Delta b_{1}(t), \Delta \bar{u}_{1}(t), \Delta \tilde{u}_{1}(t), \Delta \tilde{\tilde{u}}_{1}(t),\ldots,\Delta \tilde{\tilde{u}}_{K}(t), \Delta \tilde{\tilde{\tilde{u}}}_{1}(t),\ldots,\Delta \tilde{\tilde{\tilde{u}}}_{K}(t)\) be the first-order variations of \(b_{1}(t), \bar{u}_{1}(t), \tilde{u}_{1}(t), \tilde{\tilde{u}}_{1}(t),\ldots,\tilde{\tilde{u}}_{K}(t), \tilde{\tilde{\tilde{u}}}_{1}(t),\ldots,\tilde{\tilde{\tilde{u}}}_{K}(t)\), respectively, when σ changes to \(\sigma (\varepsilon )\). Thus by using an approach similar to that used in the proof of Theorem 5.3.1 of [19] we get

$$\begin{aligned}& \Delta g_{1}(\sigma ) \\& \quad = \int _{0}^{T} \sum_{i = 1}^{K} \sum_{k = 1}^{p} \bigl[ \bar{ \bar{G}}_{1} \bigl( b_{1} ( t + 0 ), \hat{u}_{1} ( t + 0 ), \bar{u}_{1} ( t + 0 ), \tilde{ \tilde{\tilde{u}}}_{1} ( t + 0 ),\ldots,\tilde{\tilde{ \tilde{u}}}_{K} ( t + 0 ) \bigr) \\& \qquad {}- \bar{\bar{G}}_{1} \bigl( b_{1} ( t - 0 ), \hat{u}_{1} ( t - 0 ), \bar{u}_{1} ( t - 0 ), \tilde{\tilde{\tilde{u}}}_{1} ( t - 0 ),\ldots,\tilde{\tilde{ \tilde{u}}}_{K} ( t - 0 ) \bigr) \bigr]\\& \qquad {} \times \delta \biggl( t - \frac{w}{\tilde{\sigma}_{i}^{k}} - \frac{(k - 1)T}{p} - d \biggr) \times \frac{w}{ ( \tilde{\sigma}_{i}^{k} )^{2}} \times \tilde{\rho}_{{i}}^{k}\,dt \\& \qquad {}+ \int _{0}^{T} \Biggl[ \frac{\partial \tilde{G}_{1}}{\partial b_{1}} \Delta b_{1}(t) + \frac{\partial \tilde{G}_{1}}{\partial \bar{u}_{1}}\Delta \bar{u}_{1}(t) + \sum_{i = 1}^{K} \frac{\partial \tilde{G}_{1}}{\partial \tilde{\tilde{\tilde{u}}}_{i}} \Delta \tilde{\tilde{\tilde{u}}}_{i}(t) \Biggr]\,dt. \end{aligned}$$

Then the proof of this theorem can be obtained by using a standard method used in obtaining the gradient of an analytical function in optimal control theory, such as that given in Theorem 12.3.1 of [19]. □

6 Numerical example

Example 6.1

In this numerical example, we consider a particular problem that consists of four sources, three links, two origins, and a single destination, such that sources \(s_{1}\) and \(s_{2}\) use links \(l_{{1}}\) and \(l_{3}\) to reach the destination, whereas sources \(s_{3}\) and \(s_{4}\) use links \(l_{{2}}\) and \(l_{3}\) to reach the destination.

The structure of the computer network is the same as that shown in Fig. 1.

The parameters of this example are as follows:

The final time of the problem (second): 3.

The capacity of the three links, i.e., the bandwidth of the three links (bits per second): \(c_{1} = 1\), \(c_{2} = 0.5\), \(c_{3} = 0.4\).

The packet size transmitted by the router of the three links (bits): \(w_{1} = 0.1\), \(w_{2} = 0.15\), \(w_{3} = 0.11\).

The capacity of the three routers (bits per second): \(B_{1} = 1\), \(B_{2} = 1.8\), \(B_{3} = 2.7\).

The relative importance of the messages sent from the four sources to the destination: \(M_{1} = 1.5\), \(M_{2} = 2.5\), \(M_{3} = 1.7\), \(M_{4} = 2.7\).

The upper bound of the input source rate from each of the sources \(s_{i}\) (\(i = 1, 2, 3, 4\)) (bits per second) at the origin: (1.05, 1.00, 0.98, 1.00).

The lower bound of the input source rate from each of the sources \(s_{i}\) (\(i = 1, 2, 3, 4\)) (bits per second) at the origin: (0.05, 0.06, 0.07, 0.05).

The latency of each link \(d_{i}\) (\(i=1, 2, 3\)) = 0.3, 0.5, 0.5.

Problem (P3) has been formulated and solved by the modified control parameterization method described in Sect. 5 using the number of partitions of the control in the time interval \([0, 3]\) equal to 5. For comparison, the nondelayed version of problem (P3) has also been solved using the same number of partitions in the time interval \([0,3]\), where the nondelayed version of problem (P3) is obtained by deleting both the forward transmission delays and the latencies in the buffer equations and the conservation of flow equation of problem (P3).

The graphical results for the nondelayed version of the problem are plotted in Figs. 3, 4, and 5. The graphical results for the delayed version of the problem are plotted in Figs. 6, 7, and 8.

Figure 3

Input Transmission Rate vs. Time for the Nondelayed Version of the problem

Figure 4

Output Flow Rate vs. Time for the Nondelayed Version of the problem

Figure 5

Queue Size at the Buffer of the Links vs. Time for the Nondelayed Version of the problem

Figure 6

Input Transmission Rate vs. Time for the Delayed Version of the problem

Figure 7

Output Flow Rate vs. Time for the Delayed Version of the problem

Figure 8

Queue Size at the Buffer of the Links vs. Time for the Delayed Version of the problem

Comparison of the optimal input transmission rates of the two versions of the problem

From Fig. 3, we observe that at the optimal solution of the nondelayed version of the problem, source \(s_{4}\), whose input transmission rate lies at the upper bound (i.e., 1.0) of its control constraint region for all \(t \in [0, 3]\), has the largest average input transmission rate of 1.0; source \(s_{2}\) has the second largest average input transmission rate of about 0.784; source \(s_{3}\), whose input transmission rate lies at the lower bound (i.e., 0.07) of its constraint region for all \(t \in [0, 3]\), has the third largest average input transmission rate of 0.07; and source \(s_{1}\), whose input transmission rate lies at the lower bound (i.e., 0.05) of its constraint region for all \(t \in [0, 3]\), has the smallest average input transmission rate of 0.05.

From Fig. 6, we observe that at the optimal solution of the delayed version of the problem, source \(s_{3}\) has the largest average input transmission rate of about 0.572; source \(s_{4}\) has the second largest average input transmission rate of about 0.540; source \(s_{1}\) has the third largest average input transmission rate of about 0.538; and source \(s_{2}\) has the smallest average input transmission rate of about 0.492.

The average input transmission rates of sources \(s_{1}\), \(s_{2}\), \(s_{3}\), and \(s_{4}\) of the delayed version differ from those of the nondelayed version by about +976.00%, −37.24%, +717.14%, and −46.00%, respectively. The overall input transmission rate of all four sources of the delayed version is higher than that of the nondelayed version by 12.50%. However, if we also take into consideration the relative importance of the messages sent from each of the four sources to the destination, then the overall weighted input transmission rate of all four sources of the delayed version is lower than that of the nondelayed version by 7.96%.
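The percentages in this comparison follow directly from the average rates quoted above and the weights \(M_{1}, \ldots, M_{4}\). The short Python sketch below recomputes them; the helper `pct_change` and the variable names are hypothetical, and the inputs are the rounded averages read from the text rather than the underlying solver output.

```python
# Average input transmission rates quoted in the text (sources s1..s4).
nondelayed = [0.05, 0.784, 0.07, 1.0]
delayed = [0.538, 0.492, 0.572, 0.540]
weights = [1.5, 2.5, 1.7, 2.7]   # relative importance M_1..M_4


def pct_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


# Per-source change of the delayed version relative to the nondelayed one.
per_source = [round(pct_change(d, n), 2) for d, n in zip(delayed, nondelayed)]

# Change of the unweighted total over all four sources.
overall = round(pct_change(sum(delayed), sum(nondelayed)), 2)

# Change of the total weighted by the relative importance of each source.
weighted = round(
    pct_change(
        sum(w * d for w, d in zip(weights, delayed)),
        sum(w * n for w, n in zip(weights, nondelayed)),
    ),
    2,
)

print(per_source, overall, weighted)
```

The weighted total falls even though the unweighted total rises because the delayed solution shifts rate away from \(s_{2}\) and \(s_{4}\), the two sources with the largest weights.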

Comparison of the output flow rates of the two versions of the problem

From Fig. 4 we observe that at the optimal solution of the nondelayed version of the problem, source \(s_{2}\) has the largest average output flow rate of 0.7775; source \(s_{4}\), whose output flow rate is equal to 0.48 for all \(t \in [0, 3]\), has the second largest average output flow rate of 0.48; source \(s_{1}\), whose output flow rate is equal to 0.06 for all \(t \in [0, 3]\), has the third largest average output flow rate of 0.06; and source \(s_{3}\), whose output flow rate is equal to 0.04 for all \(t \in [0, 3]\), has the smallest average output flow rate of 0.04.

From Fig. 7 we first observe that the output flow rates of all the sources of the delayed version of the problem are equal to zero during short periods at the beginning of their transmission. This phenomenon confirms the fact that, due to the forward transmission delays and the latencies, all the sources require a certain time to send their messages from the origin to the destination. From Fig. 7 we also observe that at the optimal solution of the delayed version of problem (P3), source \(s_{1}\), whose output flow rate is equal to zero for all \(t \in [0, 0.4]\), has the largest average output flow rate of 0.5777; source \(s_{2}\), whose output flow rate is equal to zero for all \(t \in [0, 0.5]\), has the second largest average output flow rate of 0.4870; source \(s_{3}\), whose output flow rate is equal to zero for all \(t \in [0, 0.75]\), has the third largest average output flow rate of 0.1780; and source \(s_{4}\), whose output flow rate is equal to zero for all \(t \in [0, 0.75]\), has the smallest average output flow rate of 0.1692.

Thus the average output flow rates of sources \(s_{1}\), \(s_{2}\), \(s_{3}\), and \(s_{4}\) of the delayed version differ from those of the nondelayed version by +862.83%, −37.36%, +345.00%, and −64.75%, respectively. The overall output flow rate of all four sources of the delayed version is higher than that of the nondelayed version by 4.02%. However, if we also take into consideration the relative importance of the messages sent from each of the four sources to the destination, then the overall weighted output flow rate of all four sources of the delayed version is lower than that of the nondelayed version by 16.31%.

Comparison of the queue sizes of the two different versions of the problem

From Fig. 5, we observe that at the optimal solution of the nondelayed version of the problem, the queue size at the buffer of link \(l_{1}\) is zero for all \(t \in [0, 3]\); the queue size at the buffer of link \(l_{2}\) increases with a constant speed of about 0.53; and the queue size at the buffer of link \(l_{3}\) increases with an average speed of about 0.9. Thus the buffer capacity constraints of links \(l_{1}\) and \(l_{2}\) are both nonbinding for all \(t \in [0, 3]\), but the buffer capacity constraint of link \(l_{3}\) is binding at the final time \(t = 3\) (i.e., the queue size at the buffer of link \(l_{3}\) reaches the buffer capacity \(B_{3} = 2.7\) at \(t = 3\)). This means that the buffer of link \(l_{3}\) cannot accommodate any further queue growth at the final time \(t = 3\); otherwise, some messages will be lost.
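As a rough check of these statements, write \(q_{i}(t)\) for the queue size at the buffer of link \(l_{i}\) (a notation introduced here only for this illustration), and assume that each buffer is empty at \(t = 0\) and that the quoted speeds are averages over \([0, 3]\). The queue sizes at the final time are then approximately

\[
q_{2}(3) \approx 0.53 \times 3 = 1.59 < B_{2} = 1.8, \qquad q_{3}(3) \approx 0.9 \times 3 = 2.7 = B_{3},
\]

which agrees with the constraint of link \(l_{2}\) being nonbinding and that of link \(l_{3}\) being binding at \(t = 3\).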

Similar to the behavior of the output flow rates of the delayed version of the problem, we observe from Fig. 8 that the queue sizes at all the buffers of the delayed version are equal to zero during short periods at the beginning of the sources’ transmission. This phenomenon confirms the fact that due to the forward transmission delays and/or the latency, all the routers at the beginning of the sources’ transmission must wait for certain times before they can serialize the first packet of messages into the links. From Fig. 8 we observe that at the optimal solution of the delayed version of the problem, the queue size at the buffer of link \(l_{1}\) is equal to zero for all \(t \in [0, 0.2]\), but increases with a constant speed of about 0.52 for all \(t \in [0.2, 0.9]\) and decreases with an almost constant speed of about 0.17 for all \(t \in [0.9, 3.0]\); the queue size at the buffer of link \(l_{2}\) is also equal to zero for all \(t \in [0, 0.2]\), but increases with a speed of about 1.4 for all \(t \in [0.2, 0.92]\) and increases with a speed of about 0.74 for all \(t \in [0.92, 3]\); the queue size at the buffer of link \(l_{3}\) is equal to zero for all \(t \in [0, 0.6]\), but increases with an average speed of about 0.92 for all \(t \in [0.6, 3]\). Thus the buffer capacity constraint of each of the links \(l_{1}\), \(l_{2}\), and \(l_{3}\) is nonbinding for all \(t \in [0, 3]\).

Thus, unlike the queue sizes of the optimal solution of the nondelayed version of the problem, the queue size of one of the links (i.e., link \(l_{1}\)) of the optimal solution of the delayed version shows both increasing and decreasing trends. Moreover, the average queue sizes at the buffers of links \(l_{1}\), \(l_{2}\), and \(l_{3}\) of the delayed version are higher than those of the nondelayed version by an infinite percentage (because the queue size at the buffer of link \(l_{1}\) of the nondelayed version is zero), 51.57%, and −2.208%, respectively. (The main reason that the average queue size at the buffer of link \(l_{3}\) of the delayed version is slightly less than that of the nondelayed version is that, at the beginning of the transmission, the router of link \(l_{3}\) of the delayed version has to wait for 0.6 units of time before it can serialize the first packet into the link.)

Comparison of the objective function values of the two versions of the problem

The optimal objective function values for the nondelayed and time-delayed versions of the problem are 10.0313 and 7.4805, respectively. Thus the objective function value of the delayed version is less than that of the nondelayed version by 38.415.

From the above discussions, we can see that the amount of weighted sources’ input flows at the origins and the amount of the weighted sources’ output flows at the destination of the delayed version of the problem are both less than those of the nondelayed version of the problem by margins of about 8% and 16%, respectively, but the queue sizes at the buffers of the delayed version of the problem are very much greater than those of the nondelayed version of the problem. Thus this example clearly reflects how the optimal transmission of messages in computer networks in real life can be affected by both the forward transmission delays in the buffers and the latencies in the links.

7 Conclusion and suggestions for further study

In this paper, we first formulate the optimal transmission of messages in a computer network as a time-delayed optimal control problem. We then develop a modified control parameterization method for solving this time-delayed optimal control problem. The gradients of the new objective function and the new constraint functions generated by this modified control parameterization method are derived. In this way, we can provide an open-loop control for the time-delayed optimal transmission problem of the computer network; by allocating the optimal flow rate to each link of a real-life computer network, we maximize the number of messages sent from N origins to one destination in a given period.

A numerical example involving four sources, three links, two origins, and one destination has been solved by using both the nondelayed and the time-delayed versions of the above optimal transmission problem. The results show that the amount of weighted sources’ input flows at the origins and the amount of weighted sources’ output flows at the destination of the delayed version of the transmission problem are both less than those of the nondelayed version of the problem by margins of about 8% and 16%, respectively, but the queue sizes at the buffers of the delayed version of the problem are much greater than those of the nondelayed version of the problem. Thus this example clearly reflects how the optimal transmission of messages in computer networks in real life can be affected by both the forward transmission delays in the buffers and the latencies in the links.

As a direction for further research, we would like to extend the modified control parameterization method to more difficult optimal control problems that involve both state-dependent and control-dependent time-delayed arguments.

Availability of data and materials

There is no data set used in this paper.

References

  1. Kelly, F.P.: Charging and rate control for elastic traffic. Eur. Trans. Telecommun. 8(1), 33–37 (1997)

    Article  Google Scholar 

  2. Kelly, F.P., Maullo, A., Tan, D.: Rate control in communication networks: shadow prices, proportional fairness and stability. J. Oper. Res. Soc. 49, 237–252 (1998)

    Article  Google Scholar 

  3. Abolfazli, Shah-Mansouri, V.: Dynamic adjustment of queue levels in TCP legas-based networks. Electron. Lett. 52, 361–363 (2016)

    Article  Google Scholar 

  4. Lai, K., Baker, M.: Measuring link bandwidths using a deterministic model of packet delay. Comput. Commun. Rev. 30, 283–294 (2000)

    Article  Google Scholar 

  5. Lavaei, J., Doyle, J.C., Low, S.H.: Utility functionals associated with available congestion control algorithms. In: Proceeding of IEEE International Conference Computation Communication, pp. 1–9 (2010)

    Google Scholar 

  6. Lestas, I., Vinnicombe, G.: How good are deterministic models for analyzing congestion control in delayed stochastic networks? In: Proceedings IEEE Conference Decision Control, pp. 4984–4989 (2004)

    Google Scholar 

  7. Mistra, V., Gong, W.B., Towsley, D.: Fluid-based analysis of a network of AQM routers supporting TCP flows with an application to RED. In: Proceedings of ACM SIGCOMM, pp. 151–160 (2000)

    Google Scholar 

  8. Mo, J., Walrand, J.: Fair end-to-end window-based congestion control. IEEE/ACM Trans. Netw. 8(5), 556–567 (2000)

    Article  Google Scholar 

  9. Raina, G., Manjunath, S., Prasad, Giridhar, K.: Stability and performance analysis of compound TCP with rem and drop-tail queue management. IEEE/ACM Trans. Netw. 24, 1961–1973 (2016)

    Article  Google Scholar 

  10. Shakkottai, S., Srikant, R.: Network optimization and control. Found. Trends Netw. 2(3), 271–379 (2008)

    Article  Google Scholar 

  11. Sirisena, H., Hassan, M., Haider, A.: Optimal TCP Congestion Control, Computer Communication (2002)

    Google Scholar 

  12. Sojoudi, S., Low, S.H.: Buffering Dynamics and Stability of Internet Congestion Controllers. IEEE/ACM Transactions on Networking 22(6) (2014)

  13. Sojoudi, S., Low, S.H., Doyle, J.C.: Effect on buffers on stability of Internet congestion controllers. In: Proceeding of IEEE International Conference Computer Communication Mini-Conference, 471–475 (2011)

    Google Scholar 

  14. Zhu, J., Luo, T., Xie, Y.W., Dullerud, G.E.: An average queue-length-difference-based detection algorithm in TCP/AQM network. Int. J. Adapt. Control Signal Process. 32, 742–752 (2018)

    Article  MathSciNet  Google Scholar 

  15. Yu, C., Teo, K.L., Zhang, L., Bai, Y.: A new exact penalty function method for continuous inequality constrained optimization problem. J. Ind. Manag. Optim. 6, 895–910 (2010)

    Article  MathSciNet  Google Scholar 

  16. Li, B., Wang, Y., Zhang, K., Duan, G.R.: Constrained feedback control for spacecraft reorientation with an optimal gain. IEEE Trans. Aerosp. Electron. Syst. 57, 3916–3926 (2021)

    Article  Google Scholar 

  17. Li, B., Zhang, J., Dai, L., Teo, K.L., Wang, S.: A hybrid offline optimization method for reconfiguration of multi-UAV formations. IEEE Trans. Aerosp. Electron. Syst. 57, 506–520 (2021)

    Article  Google Scholar 

  18. Lin, Q., Loxton, R., Teo, K.L.: The control parametrization method for nonlinear optimal control: a survey. J. Ind. Manag. Optim. 10, 275–309 (2014)

    Article  MathSciNet  Google Scholar 

  19. Teo, K.L., Goh, C.J., Wong, K.H.: A Unified Computational Approach to Optimal Control Problems. Pitman Monographs and Surveys in Pure and Applied Mathematics. London (1990)

    MATH  Google Scholar 

  20. Teo, K.L., Li, B., Yu, C., Rehbock, V.: Applied and Computational Optimal Control: A Control Parametrization Approach. Springer, Berlin (2021)

    Book  Google Scholar 

  21. Yang, F., Teo, K.L., Loxton, R., Rehbock, V., Li, B., Yu, C., Jennings, L.: VISUAL MISER: an efficient user-friendly visual program for solving optimal control problems. J. Ind. Manag. Optim. 12, 781–810 (2016)

    MathSciNet  MATH  Google Scholar 

  22. Wong, K.H., Lee, H.W.J., Chan, C.K., Myburgh, C.: Control parametrization and finite element method for controlling multi-species reactive transport in an underground channel. J. Optim. Theory Appl. 151, 168–187 (2013)

    Article  MathSciNet  Google Scholar 

  23. Liu, C., Loxton, R., Lin, Q., Teo, K.L.: Dynamic optimization for switched time-delay systems with state-dependent switching conditions. SIAM J. Control Optim. 56, 3499–3523 (2018)

    Article  MathSciNet  Google Scholar 

  24. Wu, D., Bai, Y., Yu, C.: A new computational approach for optimal control problems with multiple time-delay. Automatica 101, 388–395 (2019)

    Article  MathSciNet  Google Scholar 

  25. Yu, C., Lin, Q., Loxton, R., Teo, K.L., Wang, G.: A hybrid time-scaling transformation for time-delay optimal control problems. J. Optim. Theory Appl. 169, 876–901 (2016)

    Article  MathSciNet  Google Scholar 

  26. Wu, D., Bai, Y., Xie, F.: Time-scaling transformation for optimal control problem with time-varying delay. Discrete Contin. Dyn. Syst. 13(6), 1683–1695 (2020)

    MathSciNet  MATH  Google Scholar 

  27. Liu, C., Loxton, R., Teo, K.L., Wang, S.: Optimal State-Delay Control in Nonlinear Dynamic Systems. Automatica 135 (2022). https://doi.org/10.1016/j.automatica.2021.109981

Download references

Acknowledgements

The authors would like to thank the anonymous referees for the valuable comments and suggestions.

Funding

No funding is acquired for this research work.

Author information

Contributions

Each author contributes equally to the production of this manuscript. KH focuses on the theory of the research; both KH and YCE work on the modeling of the problem and focus on the computational work, and HW focuses on the background of the posed problem. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yu-chung Eugene Lee.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wong, Kh., Lee, Yc.E. & Lee, Hw.J. Optimal transmission of messages in computer networks – an optimal control problem involving control-dependent time-delayed arguments. J Inequal Appl 2022, 89 (2022). https://doi.org/10.1186/s13660-022-02823-y

  • DOI: https://doi.org/10.1186/s13660-022-02823-y

Keywords