-
The service-oriented computing (SOC) paradigm uses services to support the development of rapid, low-cost, interoperable, evolvable, and massively distributed applications. Services are considered autonomous, platform-independent entities that can be described, published, discovered, and loosely coupled in novel ways[1]. Service composition, i.e., the ability to generate new, more useful services from existing ones, is an active field of research in the SOC area and has been actively investigated for over two decades.
Particularly interesting in this context is the so-called Roman Model[2−4] where services are conversational, i.e., have an internal state and are procedurally described as finite transition systems (TS), where at each state the service offers a certain set of actions, and each action changes the state of the service in some way. The designer is interested in generating a new service, called target, which is described as the other service; however, it is virtual in the sense that no code is associated with its actions. So, for executing the target, one has to delegate each of its actions to some of the available services by suitably orchestrating the services, considering the current state of the target and the current states of the available services. Service composition amounts to synthesizing a controller that can suitably orchestrate the executions of the available services to guarantee that the target actions are always delegated to some service that can actually execute them in its current state.
Recently, a renewed interest in service composition à la Roman Model has emerged from its potential applications in smart manufacturing. In particular, through digital twins technology, manufacturing devices can export their behaviour as transition systems, and hence be orchestrated in very much the same way as services did back in the early 2000s[5−7].
These technological trends align with the vision of smart manufacturing, which advocates the need for flexible, modular, and interoperable system architectures. Modern manufacturing systems often consist of heterogeneous components, ranging from robotic workstations to cyber-physical subsystems, that communicate through standardized APIs. In such settings, each component can be naturally abstracted as a service. Consequently, adopting the Service-Oriented Architectures (SOAs) approach enables the dynamic composition and coordination of such services, leveraging their exposed behaviors while maintaining loose coupling and scalability.
In this context, service composition à la Roman Model provides a useful abstraction for orchestrating device behavior toward achieving temporally extended goals. By treating each device or software module as a service with internal dynamics, and expressing high-level objectives via Linear Temporal Logic on finite traces (ʟᴛʟf), our framework bridges declarative specification with executable orchestration.
Interestingly, these new applications are also pointing out several variations that are not typically considered in earlier literature on services. First, they advocate for considering a stochastic behaviour of services, such as those studied in Yadav et al.[8] and Brafman et al.[9]. Unlike the classical model, in which the target specification can either be satisfied or not with no middle ground, in the stochastic setting, it is possible to define a notion of 'approximate solution' in case an exact one does not exist. Second, the notion of goal-oriented target specification is increasingly championed [5,6,10]. That is, instead of having the target specified as a transition system, it is specified as a (possibly temporally extended) goal that the composition has to fulfill. Of particular interest are specifications in ʟᴛʟf [11], which are at the base of declarative process specification in Business Process Management (BPM) through the so-called declare paradigm[12−14]. Third, apart from satisfying the target, it is of interest to also minimize the cost coming from the service utilization[15−17]. This concern, together with the satisfaction of the target, calls for resorting to Multi-Objective Optimization for computing a solution.
A first attempt to do so may resort to Multi-Objective MDPs (MOMDPs)[18]. One common solution is to reduce the multi-objective reward/cost optimization to a single reward optimization via a linear weighting of different sources of rewards/costs. However, this means that the two objectives, namely the maximization of target rewards and the minimization of cost uses, are blurred into one scalar value, which hides precious information from the agent. Instead, the maximization of the target objective has the highest priority. Among those strategies that maximize the first objective, we aim to find those strategies that achieve the minimum utilization cost. In the literature of multi-objective optimization, the setting in which there is a strict preference order among objectives is called lexicographic multi-objective optimization[19−22]. It is known that, in general, single Markovian rewards cannot capture certain multi-objective tasks, such as ones with lexicographic preferences[23]; hence, such problems cannot be easily reduced to standard techniques on MDPs[24].
Among related works, one of the earliest attempts at combining stochastic planning models with service composition is a study by Gao et al.[25]. There are works based on Markov-HTN Planning[26], multi-objective optimization[27,28], and lexicographic optimization[29], helpful to model the stochastic behavior as well as complex QoS preferences. However, in all cases, either there are no stateful services, no high-level declarative specification of the desired solution, or no strict preference among objectives.
1.1. Contributions
-
In this paper, we address the three above requirements and study goal-oriented stochastic service composition, where, as goals, we adopt arbitrary ʟᴛʟf specifications, under process-traces semantics, where only a single action is executed at a time, as commonly done in reasoning about business processes in the declare framework[12]. Specifically, we are given a goal specification, and we want to synthesize an orchestrator that, on the one hand, proactively chooses actions to form a sequence that satisfies the goal and, on the other hand, delegates each action to an available service in such a way that at the end of the sequence, all services are in their final states. The key contribution of this work extends the findings of De Giacomo et al.[30], including both reachability and safety tasks, introducing a novel reduction technique for safety specification. The composition problem consists of maximizing the satisfaction probability of the ʟᴛʟf objective, and conditioned on this, minimizing the expected cost of utilization of the services.
We consider two variants of the ʟᴛʟf stochastic service composition problem. In the first variant of the problem, as the first objective, we aim to maximize the probability of satisfying at least once an ʟᴛʟf formula
. This is a reachability property: there exists a finite prefix$ \varphi $ of an infinite trace$ t_{<k} $ such that$ t $ . This is the classical use ofʟᴛʟf to specify synthesis tasks[31]. As the second objective, we consider the minimization of the expected cost of utilization of services, conditioned on the optimal satisfaction of the first objective. In the second variant, as the first objective, we aim to maximize the probability of not violating an ʟᴛʟf formula$ t_{<k} \models \varphi $ . This is a safety property: every finite prefix$ \varphi $ of an infinite trace$ t_{<k} $ is such that$ t $ . This is the classical use of ʟᴛʟf to specify environment behaviors[32]. As the second objective, we consider minimizing the expected mean cost of utilization of services, conditioned on the optimal satisfaction of the first objective.$ t_{<k} \models \varphi $ Observe that our reachability and safety ʟᴛʟf properties are conceptually similar to those formalized in studies by Aminof et al.[33,34], where they use
to denote reachability and$ \exists \varphi $ to denote safety, where$ \forall \varphi $ is an arbitrary ʟᴛʟf formula. However, the difference in our reachability and safety tasks within our composition framework lies in the fact that, in the satisfaction condition of these tasks, we also impose constraints on whether the services are in an accepting state; thus, we consider both the satisfaction of the ʟᴛʟf task and the configurations of the service community.$ \varphi $ The solution technique for both variants of the composition problem relies on solving a bi-objective lexicographic optimization[35] over a special MDP, allowing for minimizing the services' utilization costs while guaranteeing maximum probability of goal satisfaction. We point out that although this paper has mainly a foundational nature, it also has a significant applicative interest since it gives the foundations and solution techniques of goal-oriented compositions, which are indeed envisioned in the current literature on smart manufacturing where the notion of goal-oriented target specification is increasingly championed[5−7,10].
This paper presents an approach for stochastic service composition to find orchestrators that maximize the probability of satisfying the reachability and safety ʟᴛʟf tasks and, among such orchestrators, find those who also minimize a cost utilization function. We extended our previous work[30] by (i) generalizing the framework so as to encompass both variants of stochastic service composition, one for reachability and the other for safety ʟᴛʟf tasks; (ii) including a new section regarding the composition under ʟᴛʟf safety tasks; (iii) discussed in more detail a case study as a conceptual validation of our framework.
1.2. Outline
-
The rest of the paper is structured as follows. Section 2 explains the theoretical concepts on which the paper is based. Section 3 introduces the service composition framework in stochastic settings. Section 4 studies the stochastic composition problem in the case of the reachability task and provides a solution by reducing the problem to a bi-objective lexicographic optimization on MDPs[35]. Section 5 studies the stochastic composition problem in the case of a safety task and provides reduction to bi-objective lexicographic optimization on MDPs[35]; the reduction is analogous to the reachability case, but with several technical differences that require a deeper analysis. Section 6 shows the conceptual validation of the formal framework to an industrial case study of an electric motor assembly process, describing in detail the manufacturing goal and the manufacturing actors. Finally, Section 7 concludes the paper with final remarks and future work.
-
ʟᴛʟf is a variant of Linear Temporal Logic (ʟᴛʟ) interpreted over finite traces, instead of infinite ones[11]. Given a set
of atomic propositions, ʟᴛʟf formulas$ {\cal{P}} $ are defined by$ \varphi $ $ \varphi :: = {tt} \mid p \mid \lnot \varphi \mid \varphi \wedge \varphi \mid \bigcirc\varphi \mid \varphi \;{\cal{U}}\; \varphi $ where
represents the true formula,$ {tt} $ denotes an atomic proposition in$ p $ ,$ {\cal{P}} $ is the next operator, and$ \bigcirc$ is the until operator. We use abbreviations for other Boolean connectives, as well as the following: eventually as$ \;{\cal{U}}\; $ ; always as$ \Diamond \varphi \equiv true \;{\cal{U}}\; \varphi $ ; weak next as$ \square \varphi \equiv \lnot \Diamond \lnot \varphi $ (note that, on finite traces,$ {\LARGE \bullet } \varphi \equiv \lnot {\bigcirc} \lnot\varphi $ is not equivalent to$ \lnot {\bigcirc} \varphi $ ); and weak until as$ {\bigcirc} \lnot \varphi $ ($ \varphi_1 \;{\cal{W}}\; \varphi_2 \equiv (\varphi_1 \;{\cal{U}}\; \varphi_2 \lor \Box \varphi_1) $ holds until$ \varphi_1 $ or forever). In the original proposal, ʟᴛʟf formulas were interpreted on finite traces. For simplicity, here we consider the semantics on simple (possibly empty) finite traces, also known as process traces[36] (while the choice between finite-trace and process-trace semantics does not affect the generality of our framework, we prefer process-trace semantics as it aligns more naturally with our context. In this case, the traces result from the services' actions, which are more effectively represented as atomic propositions rather than in a factorized form). A process trace over propositions$ \varphi_2 $ is a sequence$ {\cal{P}} $ where$ a = a_0 \dots a_{l-1} $ , and$ a_i\in {\cal{P}} $ is the length of the trace. The empty trace is denoted with$ l $ .$ \epsilon $ Given a finite (possibly empty) trace
, we define when an ʟᴛʟf formula$ a $ is true in$ \varphi $ at the instant$ a $ , denoted by$ i\in\{0, \dots, l-1\} $ , by inductively considering subformulas as follows[37]:$ a,i\models\varphi $ ●
;$ a,i\models {tt} $ ●
iff$ a,i \models p $ ;$ a_i = p $ ●
iff$ a,i\models\lnot\varphi $ ;$ a, i \not\models \varphi $ ●
iff$ a, i \models \varphi_1 \wedge\varphi_2 $ and$ a, i \models \varphi_1 $ ;$ a, i \models \varphi_2 $ ●
iff$ a, i \models {\bigcirc} \varphi $ and$ i<l-1 $ ;$ a, i + 1 \models \varphi $ ●
iff there exists$ a, i \models \varphi_1 \;{\cal{U}}\; \varphi_2 $ with$ j $ such that$ i \ge j \lt l $ , and for all$ a, j \models \varphi_2 $ with$ k $ we have that$ i \le k \lt j $ .$ a, k \models \varphi_1 $ Whenever
holds, we say that$ a,0\models\varphi $ is a model of$ a $ , or that$ \varphi $ satisfies$ a $ . We also write$ \varphi $ as an abbreviation for$ a\models\varphi $ .$ a,0\models\varphi $ A deterministic finite automaton (dfa) is a tuple
where: (i)$ {\cal{A}} = \langle {\cal{P}}, Q, q_0, F, \delta\rangle $ is the alphabet, (ii)$ {\cal{P}} $ is a finite set of states, (iii)$ Q $ is the initial state, (iv)$ q_0 $ is the set of accepting states, and (v)$ F\subseteq Q $ is a total transition function. Note that the DFA alphabet is the same as the set of traces that satisfy the formula$ \delta:Q\times {\cal{P}} \to Q $ . A trace$ \varphi $ is accepted by a$ a $ $ {\rm{DFA}}$ iff a sequence of states$ {\cal{A}} $ exists in$ q_0,\dots,q_l $ such that: (i)$ Q $ for$ q_{i+1}\in\delta(q_i,a_i) $ ; and (ii)$ i = 0,\dots,l-1 $ . We denote with$ q_l\in F $ the set of traces accepted by$ {\cal{L}}( {\cal{A}}) $ . An$ {\cal{A}} $ formula can be transformed into an equivalent DFA${\rm{LTL}} _f $ in at most 2EXPTIME[11].$ {\cal{A}}_\varphi $ 2.2. Markov Decision Process (MDP)
-
A Markov Decision Process (MDP) is a tuple M = (S, A, P, s0), where: (i) S is a finite set of states, (ii) s0 is the initial state, (iii) A is a finite set of actions, and (iv)
is the transition probability function, i.e. a mapping from state-action pairs to probability distributions over S. With Supp(d) , we denote the support of a probability distribution$ P : (S \times A) \to \Delta(S) $ . An infinite path$ d $ is a sequence$ \rho \in (S \times A)^\omega $ , where$ \rho = s_0a_1s_1a_2 \dots $ ,$ s_i \in S $ , and$ a_{i+1} \in A $ , for all$ s_{i+1}\in {\text{Supp}}(P(s_i,a_{i+1})) $ . Similarly, a finite path$ i \in {\mathbb{N}} $ is a finite sequence$ \rho \in (S \times A)^* \times S $ . For any path$ \rho = s_0a_1s_1a_2 \dots a_{m}s_m $ of length at least$ \rho $ and any$ j $ , we let$ i \leq j $ denote the subsequence$ \rho[i: j] $ . Moreover, let$ s_ia_{i+1}s_{i+1}a_{i+2} \dots a_{j}s_j $ to denote the last state visited by the finite path$ {\text{last}}(\rho) = s_m $ . We use$ \rho $ and$ {\text{Paths}}_ {\cal{M}}^\omega = (S \times A)^\omega $ to denote the set of infinite and finite paths, respectively. A policy$ {\text{Paths}}_ {\cal{M}} = (S \times A)^* \times S $ maps a finite path$ \pi : {\text{Paths}}_ {\cal{M}} \rightarrow A $ to an action$ \rho \in {\text{Paths}}_ {\cal{M}} $ . We denote with$ a\in A $ the set of finite paths over$ {\text{Paths}}_{ {\cal{M}}_\pi} $ whose actions are compatible with$ {\cal{M}} $ . Given a finite path$ \pi $ , the cylinder of$ \rho = s_0a_1 \dots a_ms_m $ , denoted by$ \rho $ , is the set of all infinite paths starting with prefix$ {\text{Paths}}_{ {\cal{M}}}^\omega(\rho) $ . The$ \rho $ -algebra associated with MDP$\sigma$ and a fixed policy$ {\cal{M}} $ is the smallest$ \pi $ -algebra that contains the cylinder sets$\sigma$ for all$ {\text{Paths}}_{ {\cal{M}}_\pi}^\omega(\rho)$ . For a state$\rho \in {\text{Paths}}_{ {\cal{M}}_\pi}$ in$s$ , a measure is defined for the cylinder sets as$S$ if$ {\mathbb{P}}_{ {\cal{M}}_\pi,s}( {\text{Paths}}^\omega_{ {\cal{M}}_\pi}(s_0 a_1 s_1 \dots a_m s_m)) = \prod_{k = 0}^{m-1} P(s_{k+1}|s_k,a_{k+1}) $ and for all$ s_0 = s $ ,$ k $ , otherwise$ a_{k+1} = \pi(\rho[0: k]) $ . We also have$ 0 $ and$ {\mathbb{P}}_{ {\cal{M}}_\pi,s}( {\text{Paths}}^\omega_{ {\cal{M}}_\pi}(s)) = 1$ for$ {\mathbb{P}}_{ {\cal{M}}_\pi,s}( {\text{Paths}}^\omega_{ {\cal{M}}_\pi}(s')) = 0$ . This can be extended to a unique probability measure$s'\neq s$ on the aforementioned$ {\mathbb{P}}_{ {\cal{M}}_\pi,s}$ -algebra. In particular, if$\sigma$ is a set of finite paths forming pairwise disjoint cylinder sets, then$ {\cal{R}} \subseteq {\text{Paths}}_{ {\cal{M}}_\pi}$ . We denote with$ {\mathbb{P}}_{ {\cal{M}}_\pi,s}(\bigcup_{\rho \in {\cal{R}}} {\text{Paths}}^\omega_{ {\cal{M}}_\pi}(\rho)) = \sum_{\rho \in {\cal{R}}} {\mathbb{P}}_{ {\cal{M}}_\pi,s}( {\text{Paths}}^\omega_{ {\cal{M}}_\pi}(\rho)) $ the expected value of a random variable$ {{\mathbb{E}}}_{ {\cal{M}}_{\pi},s}[X] $ with respect to the distribution$ X $ .$ {\mathbb{P}}_{ {\cal{M}}_{\pi},s} $ -
In this section, we present our service composition framework in stochastic settings.
A (stochastic) service is a tuple
, where: (i)$ {\cal{S}} = \langle \Sigma, A, \sigma_{0}, F, P, C\rangle $ is the finite set of service states, (ii)$ \Sigma $ is the finite set of services' actions, (iii)$ A $ is the initial state, (iv)$ \sigma_0\in \Sigma $ is the set of final states (i.e., states in which the computation may stop but does not necessarily have to), (v)$ F\subseteq \Sigma $ is the transition function that returns for every state$ P: \Sigma \times A \to \Delta(\Sigma) $ and action$ \sigma $ a distribution over next states, and (vi)$ a $ is the cost function that assigns a (strictly positive) cost to each state-action pair. Without loss of generality, we assume that for any$ C: \Sigma \times A \to {\mathbb{R}}^+ $ , there exists at least one$ \sigma\in\Sigma $ such that$ a\in A $ . A (stochastic) service community is a collection of stochastic services$ | {\text{Supp}}(\sigma, a)|>0 $ . A trace of$ {\cal{C}} = \{ {\cal{S}}_1,\dots, {\cal{S}}_n \} $ is an infinite alternating sequence of the form$ {\cal{C}} $ , where$ t = (\sigma_{10}\dots\sigma_{n0}),(a_1,o_1),\dots $ is the initial state of every service$ \sigma_{i0} $ and, for every$ {\cal{S}}_i $ , we have (i)$ k\ge 1 $ for all$ \sigma_{ik} \in \Sigma_i $ , (ii)$ i \in \{1,\dots,n\} $ is the service index chosen by the orchestrator at step$ o_k \in \{1,\dots, n\} $ , (iii)$ k $ , and (iv) for all$ a_k \in A $ ,$ i $ if$ \sigma_{ik}\in {\text{Supp}}(P_i(\sigma_{i,k-1},a_{ik})) $ , and$ o_k = i $ otherwise. A history of$ \sigma_{ik} = \sigma_{i,k-1} $ is a finite prefix of a trace of$ {\cal{C}} $ . We denote the length of$ {\cal{C}} $ with$ h $ , and we denote the service state configuration at the last step of$ |h| = m $ , i.e.,$ h $ , with$ (\sigma_{1m}\dots\sigma_{nm}) $ . Given a trace$ {\text{last}}(h) $ , we call$ t $ sequence of states of$ {\text{states}}(t) $ , i.e.$ t $ $ {\text{states}}(t) = (\sigma_{10}\dots\sigma_{n0}), $ . The choices of a trace$ (\sigma_{11}\dots\sigma_{n1}),\cdots $ , denoted with$ t $ , is the sequence of actions in$ {\text{choices}}(t) $ , i.e.$ t $ . Note that, due to nondeterminism, there might be many traces of$ {\text{choices}}(t) = (a_1,o_1),(a_m,o_m), \dots $ associated with the same run. Moreover, we define the action run of a trace$ {\cal{C}} $ , denoted with$ t $ , the projection of$ {\text{actions}}(t) $ only to the components in$ {\text{choices}}(t) $ . The predicates$ A $ ,$ {\text{states}} $ and$ {\text{choices}} $ are defined also on history$ {\text{actions}} $ , in a similar way. Note that both$ h $ and$ {\text{choices}}(h) $ are empty sequences if$ {\text{actions}}(h) $ .$ h = (\sigma_{10}\dots\sigma_{n0}) $ An orchestrator is a function
that, given a sequence of states$ {\gamma}: (\Sigma_1\times\cdots\times\Sigma_n)^* \to A\times \{1\dots n\} $ , returns the action to perform$ (\sigma_{10}\dots\sigma_{n0})\dots(\sigma_{1m}\dots\sigma_{nm}) $ , and the service (actually the service index) that will perform it. Given a trace$ a \in A $ , with$ t $ , we denote the set of prefixes of the trace$ {\text{prefixes}}(t) $ that ends with a services state configuration. A trace$ t $ is an execution of an orchestrator$ t $ over$ {\gamma} $ if for all$ {\cal{C}} $ , we have$ k\ge 0 $ . Let$ (a_{k+1},o_{k+1}) = {\gamma}((\sigma_{10}\dots\sigma_{n0})\dots(\sigma_{1k}\dots\sigma_{nk})) $ be the set of such executions. Note that due to the services' nondeterminism, we can have many executions for the same orchestrator despite the orchestrator being a deterministic function. If$ {\cal{T}}_{ {\gamma}, {\cal{C}}} $ for some (infinite) execution$ h\in {\text{prefixes}}(t) $ , we call$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ a finite execution, or a history, of$ h $ over$ {\gamma} $ , and we define$ {\cal{C}} $ to be the set of finite executions of$ {\cal{H}}_{ {\gamma}, {\cal{C}}} $ over$ {\gamma} $ . Analogously to what has been done for MDPs, for a finite execution$ {\cal{C}} $ of$ h $ over$ {\gamma} $ , we use$ {\cal{C}} $ to denote the set of all (infinite) executions$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ such that$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ . Moreover, the$ h\in {\text{prefixes}}(t) $ -algebra associated with the stochastic behaviour of the orchestrator$ \sigma $ over the stochastic community$ {\gamma} $ is the smallest$ {\cal{C}} $ -algebra that contains the trace sets$ \sigma $ , for all finite executions$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ , with the unique probability measure over it defined as:$ h $ $ {\mathbb{P}}_{ {\gamma}, {\cal{C}}}(h) = \prod\limits_{k = 1}^{|h|} P_{o_k}(\sigma_{o_k,k} \mid \sigma_{o_k,k-1}, a_{k}) $ (1) where
. In particular, note that$ a_k = {\gamma}((\sigma_{10}\dots\sigma_{n0}),\dots(\sigma_{1k}\dots\sigma_{nk})) $ .$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots\sigma_{n0})\rangle)) = 1 $ Example 1 This example is inspired by the 'garden bots system' scenario[8]. In this environment, the garden bots can:
the garden by picking fallen leaves and removing dirt,$ {clean} $ the plants, and$ {water} $ the ripe fruits and flowers. We assume there are three available garden bots,$ {pluck} $ , each with different capabilities and action rewards. In Fig. 1, the three service specifications are shown. Transitions labels are of the form$ {\cal{B}}_1, {\cal{B}}_2, {\cal{B}}_3 $ .$ \langle {action}, {prob}, {reward}\rangle $ We consider an orchestrator
such that (i) always picks$ \gamma $ , and in state$ {\cal{B}}_1 $ chooses the unique available action$ a_0 $ while in state$ {clean} $ chooses the unique available action$ a_1 $ . The probability of the finite execution$ {empty} $ to occur under$ h_1 = (a_0,b_0,c_0),({clean}, {\cal{B}}_1), (a_0,b_0,c_0) $ is$ {\gamma} $ , meaning that the transition$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}}(h_1) = 0.8 $ is taken. Instead, the probability of the finite execution$ a_0\xrightarrow[]{{clean}}a_0 $ to occur is$ h_1 = (a_0,b_0,c_0),({clean}, {\cal{B}}_1), (a_1,b_0,c_0) $ , meaning that the transition$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}}(h_1) = 0.2 $ is taken.$ a_0\xrightarrow[]{{clean}}a_1 $ -
In this section, we present our service composition framework in stochastic settings where the goal is to maximize the probability of satisfying the ʟᴛʟf reachability task of
, where$ \varphi $ is an ʟᴛʟf formula. Intuitively, we are interested in orchestrators that maximize the probability of satisfaction of the ʟᴛʟf specification, even when the specification cannot be surely satisfied (e.g., due to stochastic misbehavior of some service). Moreover, while guaranteeing the optimal probability of satisfaction, we aim to find those orchestrators that minimize the expected utilization cost of the services, conditioned on the achievement of the task.$ \varphi $ Formally, we consider a ʟᴛʟf formula
expressed over the set of actions$ \varphi $ , and consider a community of$ A $ services$ n $ , where each set of actions$ {\cal{C}} = \{ {\cal{S}}_1, \dots, {\cal{S}}_n\} $ . We say that some finite execution$ A_i\subseteq A $ is successful (for$ h $ and$ {\cal{C}} $ ), denoted with$ \varphi $ , if there exist some execution$ {\text{successful}}_{ {\cal{C}},\varphi}(h) $ such that the following two conditions hold: (1)$ h'\in {\text{prefixes}}(h) $ and (2) all service states$ {\text{actions}}(h')\models\varphi $ are such that$ \sigma_i \in {\text{last}}( {\text{states}}(h')) $ . If for execution$ \sigma_i\in F_i $ there exist a finite prefix history$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ such that$ h\in {\text{prefixes}}(t) $ for$ {\text{successful}}_{ {\cal{C}},\varphi}(h) $ , we say that$ \varphi $ is successful for$ t $ . Finally, we say that an orchestrator$ \varphi $ realizes the$ {\gamma} $ reachability task${\rm{LTL}} _f $ with$ \varphi $ if, for all traces$ {\cal{C}} $ ,$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ is successful with respect to$ t $ . When$ \varphi $ and$ {\cal{C}} $ are clear from the context, we omit the suffix in$ \varphi $ and only write$ {\text{successful}}_{ {\cal{C}},\varphi} $ .$ {\text{successful}} $ Example 2 Continuing example 1, we consider the following sequential task:
the garden by picking fallen leaves and removing dirt,$ {clean} $ the plants, and$ {water} $ the ripe fruits and flowers. The action$ {pluck} $ must be performed at least once, followed by$ {clean} $ and$ {water} $ in any order. In$ {pluck} $ , the task can be expressed as${\rm{LTL}} _f $ $ \varphi = {clean} \wedge {\bigcirc}({clean} \;{\cal{U}}\;(({water}\land {\bigcirc}{pluck})\lor({pluck}\land {\bigcirc}{water}))) $ A successful execution is
$ \begin{array}{l} h_1 = \langle (a_0,b_0,c_0), ({clean}, {\cal{B}}_1), (a_0, b_0, c_0), ({water}, {\cal{B}}_2), (a_0, b_0, c_0),\\ \;\;\;\;\;\;\;\; ({pluck}, {\cal{B}}_2), (a_0, b_1, c_0),({empty}, {\cal{B}}_2), (a_0, b_0, c_0)\rangle \end{array} $ Note that, before the action
is taken, the ʟᴛʟf task$ ''{empty}'' $ is already satisfied; however, the garden bot$ \varphi $ is not in an accepting state. The last action in the execution,$ {\cal{B}}_2 $ , is necessary to move the bot$ ''{empty}'' $ to its accepting state. Another successful execution is$ {\cal{B}}_2 $ $ \begin{array}{l} h_2 = \langle (a_0,b_0,c_0), ({clean}, {\cal{B}}_1), (a_0, b_0, c_0), ({pluck}, {\cal{B}}_3), (a_0, b_0, c_1), \\ \;\;\;\;\;\;\;\;\;({empty}, {\cal{B}}_3), (a_0, b_0, c_0), ({water}, {\cal{B}}_2), (a_0, b_0, c_0)\rangle \end{array} $ On the other hand, any extension of the execution
$ h_3 = \langle(a_0,b_0,c_0), ({clean}, {\cal{B}}_1), (a_0, b_0, c_0), ({pluck}, {\cal{B}}_3), (a_0, b_0, c_2)\rangle $ will not be successful since service
can never reach one of its accepting states from$ {\cal{B}}_3 $ .$ c_2 $ The satisfaction probability of the reachability task
under orchestrator$ \varphi $ and community$ {\gamma} $ is given by:$ {\cal{C}} $ $ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) = {\mathbb{P}}_{ {\gamma}, {\cal{C}}}\bigg(\big\{t \in {\cal{T}}_{ {\gamma}, {\cal{C}}} \,\big\vert\, {\text{successful}}(t)\big\}\bigg) $ (2) Intuitively,
is the probability measure (under$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) $ ) of the set of successful executions generated by executing the orchestrator$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}} $ over the service community$ {\gamma} $ .$ {\cal{C}} $ Moreover, we define the (conditioned) expected utilization cost of services as the expected cost an orchestrator incurs in its successful executions, i.e.:
$ {\cal{J}}^{ {{\text{cost}}}}_{ {\cal{C}},\varphi} ( {\gamma}) = {\mathbb{E}}_{h\sim {\mathbb{P}}_{ {\gamma}, {\cal{C}}}}\Bigg[ \sum\limits_{k = 1}^{|h|} C_{o_k}(\sigma_{o_k,k-1},a_{k}) \bigg\vert\ {\text{successful}}(h) \Bigg] $ (3) Let
be the set of orchestrators for the community$ \Gamma( {\cal{C}}) $ . Let$ {\cal{C}} $ be an objective function. We say an orchestrator$ f: \Gamma( {\cal{C}}) \to {\mathbb{R}} $ is$ {\gamma}\in\Gamma( {\cal{C}}) $ -optimal if$ f $ , and write$ f( {\gamma}) = \sup_{\tau\in \Gamma( {\cal{C}})} f(\tau) $ for the set of$ \Gamma_f $ -optimal orchestrators.$ f $ Finally, we define our optimization problem. We want to compute an orchestrator
such that the following holds:$ {\gamma} $ $ {\gamma}\in\Gamma_{ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}} \text{ and } {\cal{J}}^{ {{\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}) = \inf\limits_{ {\gamma}'\in\Gamma_{ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}}} {\cal{J}}^{ {{\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}') $ (4) Intuitively, we fix a lexicographic order on the objective functions
and$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}} $ , meaning that we aim to minimize the expected utilization cost to satisfy the specification, conditioned to the satisfaction of the specification, while guaranteeing the optimal probability of satisfying it. Interestingly, in case the specification is exactly realizable, the notion of optimal orchestrator according to Eq. (4) coincides with the notion of realizability, as shown in the following results.$ {\cal{J}}^{ {{\text{cost}}}}_{ {\cal{C}},\varphi} $ First, let
be the set of finite executions$ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ of orchestrator$ h $ on service community$ {\gamma} $ that start from$ {\cal{C}} $ such that (i)$ \sigma_{10}\dots\sigma_{n0} $ holds, and (ii) there is no prefix history$ {\text{successful}}(h) $ such that$ h'\in {\text{prefixes}}(h) $ holds:$ {\text{successful}}(h') $ $\begin{split} {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} =\;& \{ h\in {\cal{H}}_{ {\gamma}, {\cal{C}}}: {\text{successful}}(h) \wedge \not\exists h'\in {\text{prefixes}}(h) \text{ s.t. } h'\\\neq\;& h \wedge {\text{successful}}(h) \}\end{split} $ (5) Intuitively, such a set only contains the finite executions
such that they are successful for$ h $ for the first time. In fact, by definition of$ \varphi $ , if$ {\cal{H}}^\varphi_{ {\gamma}, {\cal{C}}} $ then$ h\in {\cal{H}}^\varphi_{ {\gamma}, {\cal{C}}} $ holds.$ {\text{successful}}(h) $ Lemma 1 The following equality holds:
$ \big\{t \in {\cal{T}}_{ {\gamma}, {\cal{C}}} \,\big\vert\, {\text{successful}}(t)\big\} = \bigcup\limits_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ Proof We prove the two set inclusions (i)
and (ii)$ \subseteq $ separately.$ \supseteq $ (i) if
and$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ , then there exist a prefix$ {\text{successful}}(t) $ such that$ h\in {\text{prefixes}}(t) $ is successful. Let$ h $ be the shortest of such prefixes: this means that no prefix$ h' $ of$ h" $ is successful. From this, by definition of$ h' $ , it follows that$ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ . Then, given that$ h'\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ is an execution of$ t $ over$ {\gamma} $ with prefix$ {\cal{C}} $ , by definition of$ h' $ , we have that$ {\cal{T}}_{ {\gamma}, {\cal{C}}} $ , and therefore the thesis holds.$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(h') $ (ii) let
. Let$ t\in\bigcup_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ be the history such that$ h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ , which must exist by assumption. By definition of$ t\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ , we have$ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ , and by definition of$ {\text{successful}}(h) $ , since$ {\text{successful}} $ , we also have$ h\in {\text{prefixes}}(t) $ . Moreover,$ {\text{successful}}(t) $ , which concludes the proof.$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)\subseteq {\cal{T}}_{ {\gamma}, {\cal{C}}} $ □ It is crucial to observe that since by definition of
there is no pair$ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ such that$ h',h"\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}(h) $ , all trace sets$ h'\in {\text{prefixes}}(h") $ for$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ are pairwise disjoint sets, which means that$ h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ is a well-defined probability.$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) $ Lemma 2 If
realizes the reachability task$ {\gamma} $ over$ \varphi $ , then$ {\cal{C}} $ .$ \bigcup_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) = {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots\sigma_{n0})\rangle) $ Proof We prove (i)
and (ii)$ \bigcup_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \subseteq {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots\sigma_{n0})\rangle) $ separately. Proposition (i) is immediate: every execution belongs to the set of executions starting from$ \bigcup_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \supseteq {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots\sigma_{n0})\rangle) $ . To prove proposition (ii), we start by observing that for all$ \sigma_{10}\dots\sigma_{n0} $ , by definition of realizing orchestrator, they have a prefix$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots\sigma_{n0})\rangle) $ that is successful. In particular, if$ h'\in {\text{prefixes}}(h) $ is the shortest prefix of$ h" $ that is successful, then$ h $ and$ h"\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ . This implies that$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(h") $ .$ t\in\bigcup_{h"\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h") $ □ Theorem 1 Let
be a community of stochastic services, and$ {\cal{C}} $ be a reachability task. The orchestrator$ \varphi $ realizes$ {\gamma} $ with community$ \varphi $ iff$ {\cal{C}} $ .$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) = 1 $ Proof (
) If an orchestrator$ \Rightarrow $ realizes$ {\gamma} $ , then all infinite executions$ \varphi $ are successful. By Lemma 1, this implies that$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ . By Lemma 2, this set is equal to$ t\in \bigcup_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ . Since by definition$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots \sigma_{n0})\rangle) $ $ {\text{Supp}}( {\mathbb{P}}_{ {\gamma}, {\cal{C}}}) \subseteq $ , we have that$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots\sigma_{n0})\rangle) $ .$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) = {\mathbb{P}}_{ {\gamma}, {\cal{C}}}(\bigcup_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) = {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots \sigma_{n0})\rangle)) = 1 $ (
) Assume an orchestrator$ \Leftarrow $ is such that$ {\gamma} $ . This implies that for all infinite executions$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) = 1 $ $ t\in {\text{Supp}}( {\mathbb{P}}_{ {\gamma}, {\cal{C}}}) \subseteq $ of orchestrator${\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle(\sigma_{10}\dots\sigma_{n0})\rangle) $ , there is a prefix$ \gamma $ such that$ h\in {\text{prefixes}}(t) $ and$ h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ . This means that t is successful, and therefore, all executions are successful, i.e., the definition of realizability.$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ □ Theorem 2 Assume the reachability task
is realizable. If an orchestrator$ \varphi $ satisfies Eq. (4), then it realizes the reachability task$ {\gamma} $ .$ \varphi $ Proof Since by assumption the reachability task
is realizable, then there exists an orchestrator$ \varphi $ that realizes it. By Theorem 1, we can deduce that the optimal value of$ {\gamma}' $ is$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}') $ . Moreover, by assumption and by Eq. (4), it follows that$ 1 $ , i.e.$ {\gamma} \in \Gamma_{ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}} $ , by the arguments above. Finally, again by Theorem 1, we get that$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) = 1 $ realizes$ {\gamma} $ .$ \varphi $ □ Finally, we formally state the stochastic version of our problem:
Problem 1 (Stochastic Composition for ʟᴛʟf Reachability Tasks). Given the pair
, where$ ( {\cal{C}}, \varphi) $ is an ʟᴛʟf reachability task over the set of actions$ \varphi $ , and$ A $ is a community of$ {\cal{C}} $ stochastic services$ n $ , compute an orchestrator that is optimal according to Eq. (4).$ {\cal{C}} = \{ {\cal{S}}_1,\dots, {\cal{S}}_n\} $ Interestingly, Theorems 1 and 2 show that one can find an orchestrator even in a non-stochastic setting by considering arbitrary services' probability distributions for
, for any pair$ P_i(\sigma_i,a) $ and$ \sigma_i $ , whose support is compatible with$ a $ , and then check whether$ \delta_i $ .$ \max_ {\gamma} {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) = 1 $ Example 3 Continuing example 2, we aim to find a composition of the available bots
so as to maximize the probability of the satisfaction of the reachability task$ {\cal{B}}_1, {\cal{B}}_2, {\cal{B}}_3 $ , which also considers rewards/costs. The$ \varphi $ action can only be delegated to$ {clean} $ , and the optimal solution must take into account its stochastic behaviour to correctly compute the expected cost. Regarding the$ {\cal{B}}_1 $ action, both$ {pluck} $ and$ {\cal{B}}_2 $ can perform it. However, the optimal orchestrator will not dispatch it to$ {\cal{B}}_3 $ because, despite the cost being smaller than the one in$ {\cal{B}}_3 $ , choosing$ {\cal{B}}_2 $ will lead to a probability of$ {\cal{B}}_3 $ of not reaching the final state configuration since the state$ 0.1 $ is a failure state while choosing$ c_2 $ does not compromise the optimal probability of goal satisfaction.$ {\cal{B}}_3 $ 4.1. Solution technique
-
The solution technique is based on finding an optimal policy for a bi-objective lexicographic optimization on a specifically built MDP. In particular, we consider a variant of the framework introduced in study by Busatto-Gaston et al.[35]: while as the second objective they considered the expected number of steps to a target, here we consider the expected cost. Given an instance of the reachability composition problem
, our technique breaks down into the following steps: (1) first, we translate the$ ( {\cal{C}},\varphi) $ formula into the equivalent DFA${\rm{LTL}} _f $ ; then (2) we combine$ {\cal{A}}_\varphi $ with the services, computing a product of$ {\cal{A}}_\varphi $ with the stochastic services in$ {\cal{A}}_{\varphi} $ , obtaining a new MDP,$ {\cal{C}} $ , that we call the 'composition MDP'; (3) we solve the resulting MDP, finding a policy$ {\cal{M}}' $ for$ \pi $ that is optimal w.r.t. the bi-objective lexicographic function, as in study by Busatto-Gaston et al.[35], and then (4) we derive an orchestrator$ {\cal{M}}' $ from$ {\gamma} $ that is optimal w.r.t. Eq. (4). We now detail each step.$ \pi $ Step 1. The dfa of an ʟᴛʟf formula can be computed by exploiting a well-known correspondence between ʟᴛʟf formulas and automata on finite words[11]. That is, we can compute a dfa
that is equivalent to the specification$ {\cal{A}}_\varphi = (A, Q, q_0, F, \delta) $ , which can be double-exponentially larger than the size of the formula.$ \varphi $ Step 2. Next, we define the Composition MDP
as follows:$ {\cal{M}}' = (S', A', P', s'_0) $ ●
;$ S' = Q\times\Sigma_1\times\cdots\times \Sigma_n $ ●
;$ A' = A\times \{1\dots n\} $ ●
;$ s'_0 = (q_0,\sigma_{10}\dots\sigma_{n0}) $ ●
if$ P'(q',\sigma'_1\dots\sigma'_i\dots\sigma'_n|q,\sigma_1\dots \sigma_i\dots \sigma_n, (a,i)) = P_i(\sigma'_i|\sigma_i, a) $ and$ \delta(q,a) = q' $ for all$ \sigma'_{j} = \sigma_j $ , otherwise$ j\neq i $ .$ 0 $ Moreover, let the composition cost function
be defined as$ C': S'\times A' \to {\mathbb{R}}^+ $ .$ C'((q,\sigma_1\dots\sigma_i\dots\sigma_n), (a,i)) = C_i(\sigma_i, a) $ We are interested in computing optimal policies for
, where the optimality is defined as follows. As the set of target states$ {\cal{M}}' $ , we consider:$ T $ $ T = \{(q,\sigma_1,\dots,\sigma_n) \mid (q,\sigma_1,\dots,\sigma_n)\in S' \wedge q\in F \wedge \forall i = 1,\dots,n\,: \sigma_i \in F_i\} $ Intuitively,
is the set of states of$ T\subseteq S' $ where the state component from$ {\cal{M}}' $ ,$ {\cal{A}} $ , is accepting, and the state components from the services$ q $ ,$ {\cal{S}}_i $ , are accepting in$ \sigma_i $ . We consider the bi-objective lexicographic optimization over$ {\cal{S}}_i $ , similarly to what has been done in study by Busatto-Gaston et al.[35]. In particular, we first consider the probability of reaching a set of target states$ {\cal{M}}' $ from$ T $ , following a policy$ s\in S' $ over the MDP$ \pi $ , denoted with$ {\cal{M}}' $ ; with$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s}(\lozenge T) $ , we denote the set of policies with the maximum probability of reaching$ \Pi_{ {\cal{M}}',s}(\lozenge T) $ , i.e.$ T $ . Then, we consider the cost of the shortest prefix of$ \arg\max_\pi {\mathbb{P}}_{ {\cal{M}}'_\pi,s}(\lozenge T) $ that reaches one of the target states in$ \rho $ , i.e.$ T $ if$ {\text{cost}}_{T}(\rho) = \sum_{k = 0}^{i} C'(s'_k, a'_k) $ and for all$ \rho[i]\in T $ ,$ j \lt i $ . An optimal solution for$ \rho[j] \not\in T $ is a policy$ {\cal{M}}' $ that minimizes the conditional expected cost of reaching a target state$ \pi $ among the policies in$ {\mathbb{E}}_{\rho\sim {\cal{M}}'_{\pi'},s'_0}[ {\text{cost}}_{T}(\rho) | \lozenge T] $ , that is, the policies which maximize$ \Pi_{ {\cal{M}}',s'_0}(\lozenge T) $ , i.e.:$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s'_0}(\lozenge T) $ $ \pi\in \Pi_{ {\cal{M}}',s_0}(\lozenge T) \text{ and } \pi\in\arg\min\limits_{\pi'} {\mathbb{E}}_{\rho\sim {\cal{M}}'_\pi,s'_0}\big[ {\text{cost}}_T(\rho) | \lozenge T\big] $ (6) Step 3. The solution technique we will use is based on the work[35], where the authors propose a two-stage technique to find an optimal policy for a bi-objective lexicographic function in the form of Eq. (6). First, we compute the set of policies (in the form of a set of optimal actions for each state) that maximize the probability of reaching the target states; however, this set of policies also contains the 'deferral' policies, i.e. policies that defer the actual reaching of the target states indefinitely, but in such a way that the target can still be reached with maximum probability at any moment. Then, we consider a 'pruned MDP' in which (i) only optimal action can be taken, and (ii) only states from which the target can be reached are kept. The new MDP is used to find policies that minimize the expected cost of reaching the target. By construction, the optimal policy of the pruned MDP guarantees the target is always reached since any deferral policy will incur an infinite cost. The difference between our scenario and Busatto-Gaston et al.[35] is that they consider the length of the path, rather than its cost, as the second objective function. Nevertheless, it is easy to see that their approach works if, instead of considering the expected length of successful paths, we consider their expected total costs (i.e., minimizing path length can be seen as minimizing costs with each transition having unitary cost). Note that the techniques used to find the solutions are standard: the first stage requires solving the maximal reachability probability problem[38] on the composition MDP with the accepting end components as the set of states
. The second stage requires solving a stochastic shortest path problem[24] on the pruned MDP. The two subproblems can be solved efficiently using standard planning algorithms, e.g., Value Iteration or Linear Programming.$ T $ Step 4. Once an optimal policy is found, we can obtain its equivalent
as follows: for any finite prefix of a run$ {\gamma} $ , we set$ \rho = (q_0,\sigma_{10}\dots\sigma_{n0}), (a_1,o_1),\dots(a_m,o_m), (q_m,\sigma_{1m}\dots\sigma_{nm}),\dots $ , where$ {\gamma}((\sigma_{10}\dots\sigma_{n0}) \dots(\sigma_{1m}\dots\sigma_{nm})) = (a_{m+1},o_{m+1}) $ .$ \pi(\rho) = (a_{m+1}, o_{m+1}) $ Now we aim to establish a relationship between optimal orchestrators according to Eq. (4), and optimal policies for
according to Eq. (6). Given an infinite run$ {\cal{M}}' $ , we define the trace$ \rho = (q_0,\sigma_{10}\dots\sigma_{n0}),(a_1,o_1)\dots $ . The definition easily applies to finite prefixes of$ t = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) = (\sigma_{10}\dots\sigma_{n0}),(a_1,o_1),\dots $ but this time mapping into histories of$ \rho $ .$ t $ In the following, we are going to prove a sequence of lemmata. Lemma 3 shows that, once fixed a policy
over$ \pi $ , there is a one-to-one correspondence between paths$ {\cal{M}}' $ in$ \rho $ following$ {\cal{M}}' $ and the executions$ \pi $ of the equivalent orchestrator of$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ ,$ \pi $ ; Lemma 4 shows that the probabilities of finite paths and histories are the same; Lemma 5 shows that paths that end with states in$ {\gamma} $ correspond to successful histories; and Lemma 6 shows a correspondence between paths in$ T $ and$ {\text{Paths}}_{T, {\cal{M}}'_\pi} $ .$ {\cal{H}}_{ {\gamma}, {\cal{C}}}^\varphi $ Lemma 3 Let
be a policy for$ \pi $ and let$ {\cal{M}}' $ be its equivalent orchestrator. Moreover, let$ {\gamma} $ and$ \rho\in {\text{Paths}}_{{ {\cal{M}}'}_\pi}^\omega $ be a trace such that$ t $ . Then,$ t = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ iff$ \rho\in {\text{Paths}}_{ {\cal{M}}'_\pi}^\omega(\langle s'_0 \rangle) $ .$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle (\sigma_{10}\dots\sigma_{n0})\rangle) $ Proof Let the infinite path
, and the infinite trace$ \rho\in {\text{Paths}}_{ {\cal{M}}'_{\pi}}^\omega(\langle s'_0 \rangle) $ . We prove the claim by induction on the position of the run/trace.$ t = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho)) $ Base case: we have the claim holds for position
because$ 0 $ , and$ \rho[0] = (q_0,\sigma_{10}\dots\sigma_{n0}) $ . Therefore,$ h[0] = \sigma_{10}\dots\sigma_{n0} $ satisfies the conditions of the definitions of history and execution of$ \langle h[0]\rangle $ iff$ {\gamma} $ .$ s'_0\in S' $ Inductive case: assume the claim holds up to position
.$ k\ge 0 $ (
) Consider the$ \Rightarrow $ -th action according to$ (k+1) $ , i.e.$ \pi $ , and its successor state$ \pi(\rho[0: k]) = (a_{k+1}, o_{k+1}) $ . By inductive hypothesis, we have that$ \rho[k+1] = (q_{k+1}, \sigma_{1,k+1}, \dots, \sigma_{n,k+1}) $ is a valid execution under orchestrator$ h = t[0:k] $ , where$ {\gamma} $ . We need to show that$ t = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ is a valid extension of$ h' = t[0:k],(a_{k+1}, o_{k+1}), (\sigma_{1,k+1},\dots,\sigma_{n,k+1}) $ under orchestrator$ h $ . By construction of$ {\gamma} $ , we have that (i)$ {\cal{M}}' $ for all services, (ii)$ \sigma_{i,k+1}\in \Sigma_i $ , (iii)$ o_{k+1}\in\{1\dots n\} $ . We also aim to show that (iv) for all$ a_{k+1}\in A $ ,$ j = 1,\dots,n $ if$ \sigma_{j,k+1}\in {\text{Supp}}(P_{j}(\sigma_{j,k},a_{k+1})) $ ; otherwise$ j = o_{k+1} $ . Proposition (iv) holds because of the definition of$ \sigma_{j,k+1} = \sigma_{j,k} $ and by the hypothesis of$ P' $ being a path of$ \rho $ (the support of the transition probability function of all the states along a path must be non-empty). The propositions (i), (ii), (iii), (iv) are precisely the conditions that each step of a valid execution must satisfy. Moreover, by construction of$ {\cal{M}}' $ ,$ {\gamma} $ $ (a_{k+1}, o_{k+1}) = $ , hence${\gamma}( {\text{states}}(t[0: k])) $ is a proper (finite) execution under orchestrator$ h' = t[0: k], (a_{k+1},o_{k+1}),(\sigma_{1,k+1}\dots\sigma_{n,k+1}) $ .$ {\gamma} $ (
) Now, consider a finite execution under orchestrator$ \Leftarrow $ ,$ {\gamma} $ , and its extension$ h = t[0:k] $ . We need to show that the finite path prefix$ h' = t[0:k],(a_{k+1},o_{k+1}),(\sigma_{1,k+1}\dots\sigma_{n,k+1}) $ is a valid finite path in$ \rho[0:k+1] $ under policy$ {\cal{M}}' $ . Since by inductive hypothesis the claim holds up to the$ \pi $ -th step, we only need to show that the$ k $ -th step defined as$ (k+1) $ , for some$ (a_{k+1}, o_{k+1}),(q_{k+1}, \sigma_{1,k+1}\dots\sigma_{n,k+1}) $ , is a valid extension of$ q_{k+1} $ under policy$ \rho[0:k] $ . By construction, we have that$ \pi $ . Regarding the successor state, we have to show that$ \pi(\rho[0:k]) = {\gamma}( {\text{states}}(h')) = (a_{k+1}, o_{k+1}) $ . But this follows from our choice of$ (q_{k+1}, \sigma_{1,k+1}\dots\sigma_{n,k+1})\in {\text{Supp}}(P') $ and by the condition (iv) of a valid execution step.$ q_{k+1} = \delta(q_k,a_{k+1}) $ By induction, the claim also holds for any arbitrary position, and therefore
iff$ \rho\in {\text{Paths}}_{ {\cal{M}}'_\pi}^\omega(\langle s'_0 \rangle) $ .$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(\langle (\sigma_{10}\dots\sigma_{n0})\rangle) $ □ Lemma 4 Let
a policy on$ \pi $ ,$ {\cal{M}}' $ be its equivalent orchestrator,$ {\gamma} $ be a finite path on$ \rho = s'_0a_1\dots s'_m\in {\text{Paths}}_{ {\cal{M}}'_\pi}(s'_0) $ , and$ {\cal{M}}' $ be its associated history. Then,$ h = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ .$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s'_0}( {\text{Paths}}_{ {\cal{M}}'_\pi}^\omega(\rho)) = {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) $ Proof
$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s'_0}( {\text{Paths}}_{ {\cal{M}}'_\pi}^\omega(\rho)) = \prod\limits_{k = 1}^{m} P'(s'_{k} \mid s'_{k-1},(a_k,o_k)) $ (7) $ = \prod\limits_{k = 1}^m P_{o_k}(\sigma_{o_k,k} \mid \sigma_{o_k,k-1}, (a_k,o_k)) $ (8) $ = {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) $ (9) where step 7 is by definition of the probability of a cylinder set, step 8 by definition of
in$ P' $ , and step 9 by Eq. (1).$ {\cal{M}}' $ □ Lemma 5 Let
be a finite path on$ \rho = s_0a_1\dots s_m\in {\text{Paths}}_{ {\cal{M}}'_\pi} $ for some policy$ {\cal{M}}'_\pi $ , and let$ \pi $ be its associated history. Then,$ h = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ iff there exists$ {\text{successful}}(h) $ such that$ k\in [0,m] $ .$ s_k\in T $ Proof Let
be an integer between$ k $ and$ 0 $ . By construction of$ m $ , we have that$ {\cal{M}}' $ iff$ s_k = (q_k,\sigma_{1k}\dots\sigma_{nk})\in T $ $ (a) $ and$ q_k\in F $ for all$ (b) $ we have$ i = 1,\dots,n $ . The proposition (a) means that the sequence of states$ \sigma_{ik}\in F_i $ is an accepting run over$ r = q_0,\dots,q_k $ for trace$ {\cal{A}}_\varphi $ , where$ {\text{actions}}(h') = a_1,\dots a_{k} $ . By definition of acceptance, the above holds iff$ h' = h[0:k] $ . By the correctness of the construction of$ {\text{actions}}(h')\in {\cal{L}}( {\cal{A}}_\varphi) $ , this is true iff$ {\cal{A}}_\varphi $ . Finally, since$ {\text{actions}}(h')\models \varphi $ , we have$ h'\in {\text{prefixes}}(h) $ and (b) both imply that$ {\text{actions}}(h')\models\varphi $ holds.$ {\text{successful}}(h) $ Conversely, assume
. Then by definition there exists a prefix$ {\text{successful}}(h) $ such that$ h' = h[0:k] $ and such that (c)$ {\text{actions}}(h')\models\varphi $ for all$ \sigma_{ik}\in F_i $ . By the correctness of the construction of$ i = 1,\dots,n $ ,$ {\cal{A}}_\varphi $ iff$ {\text{actions}}(h')\models\varphi $ , meaning that there exists an accepting run$ {\text{actions}}(h')\in {\cal{L}}( {\cal{A}}_\varphi) $ over$ r = q_0,\dots,q_k $ , where (d)$ {\cal{A}}_\varphi $ . Conditions (c) and (d) imply that$ q_k\in F $ , by definition of the set$ \rho[k] = s_k = (q_k,\sigma_{1k}\dots\sigma_{nk})\in T $ .$ T $ □ Let
be the set of finite paths following$ {\text{Paths}}_{T, {\cal{M}}'_\pi}(s'_0) $ on$ \pi $ such that they start with$ {\cal{M}}' $ and enter in a state in$ s'_0 $ only at the end of the path and for the first time, i.e.$ T $ .$ {\text{Paths}}_{T, {\cal{M}}'_\pi}(s'_0) = ((S' \setminus T ) \times A)^*T \cap {\text{Paths}}_{ {\cal{M}}'_\pi}(s'_0) $ Lemma 6
iff$ \rho\in {\text{Paths}}_{T, {\cal{M}}'_\pi}(s'_0) $ $ {\tilde \tau}_{\varphi, {\cal{C}}}(\rho)\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ Proof By Lemma 5,
iff$ \rho\in {\text{Paths}}_{T, {\cal{M}}'_\pi} $ is successful, since$ h = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ ends in a state in$ \rho $ . Moreover, since$ T $ , by Lemma 3, we have that$ {\text{Paths}}_{T, {\cal{M}}'_\pi}\subseteq {\text{Paths}}_{ {\cal{M}}'_\pi} $ iff$ \rho\in {\text{Paths}}_{T, {\cal{M}}'_\pi} $ is an execution of$ h $ . Furthermore, by assumption, any finite prefix$ {\gamma} $ , say of length$ \rho' $ , of$ m $ , is such that$ \rho $ . Then, again by Lemma 5, this holds iff$ \rho'[m] \not\in T $ is not successful, meaning that does not exist a prefix$ h' = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho') $ with$ h'\in {\text{prefixes}}(h) $ such that$ h'\neq h $ is successful. But this is precisely the membership condition for$ h' $ .$ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ □ This result shows the correctness of our technique:
Theorem 3 Let
be an instance of Problem 1, and let$ ( {\cal{C}}, \varphi) $ be the composition MDP for$ {\cal{M}}' $ and$ {\cal{C}} $ . We have that$ \varphi $ is optimal (w.r.t. Eq. [6]) if its equivalent orchestrator$ \pi $ is optimal (w.r.t. Eq. [4]).$ {\gamma} $ Proof First, we show that
iff$ \pi = \arg\max_{\pi'} {\mathbb{P}}_{ {\cal{M}}_\pi',s'_0}(\lozenge T) $ . For any pair$ {\gamma} = \arg\max_{ {\gamma}'} {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}') $ and its equivalent$ \pi $ , we have:$ {\gamma} $ $ {\mathbb{P}}_{\pi,s'_0}(\lozenge T) = \sum\limits_{\rho_T\in {\text{Paths}}_{T,\pi}(s'_0)} {\mathbb{P}}_{ {\cal{M}}'_{\pi},s'_0}( {\text{Paths}}_{ {\cal{M}}'_\pi}^\omega(\rho_T)) $ (10) $ = \sum\limits_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) $ (11) $ = {\mathbb{P}}_{ {\gamma}, {\cal{C}}}\bigg(\bigcup_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \bigg) $ (12) $ = {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}) $ (13) where step 10 is by definition of probabilistic reachability, step 11 is by Lemma 4 and 6, step 12 is by disjointness of all
for$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ , and step 13 is by Lemma 1 and Eq. (2). From this, we obtain that$ h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}} $ iff$ \pi^* = \arg\max_{\pi'} {\mathbb{P}}_{ {\cal{M}}_{\pi'},s'_0}(\lozenge T) $ .$ {\gamma}^* = \arg\max_{ {\gamma}'} {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{reach}}( {\gamma}') $ It remains to prove that
is cost-optimal iff$ \pi $ is cost-optimal. We have:$ {\gamma} $ $ \begin{array}{l} {\mathbb{E}}_{\rho\sim {\cal{M}}'_\pi,s'_0} [ {\text{cost}}_T(\rho) \mid \lozenge T] = \\ \;\;\;\;\;\;\;\;\;\; = \sum\limits_{\rho_T\in {\text{Paths}}_{T,\pi}(s'_0)} {\mathbb{P}}_{ {\cal{M}}'_{\pi'},s'_0}( {\text{Paths}}_{ {\cal{M}}'_\pi}^\omega(\rho_T)) \cdot \sum\limits_{k = 0}^{|\rho_T|} C'(s'_k,a'_{k+1}) \end{array} $ (14) $ = \sum\limits_{h\in {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) \cdot \sum\limits_{k = 0}^{|h|} C_{o_{k+1}}(\sigma_{o_k,k},a_{k+1}) $ (15) $ = {\mathbb{E}}_{h\sim {\mathbb{P}}_{ {\gamma}, {\cal{C}}}}\Bigg[ \sum\limits_{k = 1}^{|h|} C_{o_k}(\sigma_{o_k,k-1},a_{k}) \bigg\vert\ {\text{successful}}(h) \Bigg] $ (16) $ = {\cal{J}}^{ {{\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}) $ (17) where step 14 by definition of total expected cost conditioned on reaching of target states
, step 15 by construction of$ T $ and by Lemma 4 and 6, step 16 by definition of total expected cost on successful executions of$ {\cal{M}}' $ , and step 17 by Eq. (3). Therefore, if$ {\gamma} $ then$ \pi\in \arg\min_{\pi'} {\mathbb{E}}_{\rho\sim {\cal{M}}'_\pi,s'_0}[ {\text{cost}}_T(\rho) \mid \lozenge T] $ . Combining both results, we get the thesis.$ {\gamma}\in \arg\min_{ {\gamma}'} {\cal{J}}^{ {{\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}') $ □ Computational cost. Theorem 3 guarantees that we can reduce Problem 1 to the problem of finding an optimal policy for the lexicographic bi-objective optimization problem (Eq. [6]) over a composition MDP
. As explained above, the two-stage technique requires solving a planning problem over MDPs. Since it is known that both steps require polynomial time complexity in the number of states and actions of the MDP[24] and that our composition MDP has a state space that is double-exponentially larger in the size of the goal specification, we get this result:$ {\cal{M}}' $ Theorem 4 Problem 1 can be solved in at most double-exponential time in the size of the formula, in at most exponential time in the number of services, and in polynomial time in the size of the services.
This is in line with the classical setting of ʟᴛʟ/ʟᴛʟf synthesis on probabilistic systems[39,40], and analogous to our solution method for the non-stochastic case[41].
-
In this section, we consider the composition problem where the first objective is to maximize the probability of not violating a safety ʟᴛʟf specification, and as second objective, we maximize the expected conditional cost. This means that the notion of successful execution must be revised in order to consider a safety goal rather than a reachability goal.
We say that some infinite execution
is legal (for$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ and$ {\cal{C}} $ ), denoted with$ \varphi $ , if for all finite prefix histories$ {\text{legal}}_{ {\cal{C}},\varphi}(h) $ we have that (1)$ h\in {\text{prefixes}}(t) $ , i.e. the safety condition is not violated at any time on the trace, and (2) for all positions$ {\text{actions}}(h)\models\varphi $ and all services$ k $ , it holds that$ i $ , i.e. every service is in an accepting state. We use$ \sigma_{i,k}\in F_i $ also for finite executions. We denote with$ {\text{legal}}_{ {\cal{C}},\varphi}(h) $ the set of finite and legal execution of orchestrator$ {{\cal{H}}}^{\varphi,{\text{safe}}}_{ {\gamma}, {\cal{C}}} $ on service community$ {\gamma} $ . Finally, we say that an orchestrator$ {\cal{C}} $ realizes the ʟᴛʟf specification$ {\gamma} $ with$ \varphi $ if, for all traces$ {\cal{C}} $ ,$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ is legal. When$ t $ and$ {\cal{C}} $ are clear from the context, we omit the suffix in$ \varphi $ and only write$ {\text{legal}}_{ {\cal{C}},\varphi} $ .$ {\text{legal}} $ Example 4 Continuing with example 1, we consider the following
safety task:${\rm{LTL}} _f $ , i.e. never$ \Box\lnot{pluck} $ the ripe fruits and flowers. For this safety task, differently from Fig. 1, we consider states$ {pluck} $ ,$ a_1 $ , and$ b_1 $ to be 'accepting', meaning that they are states of their respective service in which the execution of the composition safety task can enter into. An orchestrator that maximizes the probability of satisfying this goal will only use$ c_2 $ and$ {\cal{B}}_1 $ , since$ {\cal{B}}_2 $ can only perform$ {\cal{B}}_3 $ from state$ {pluck} $ , whose execution would make the safety task to fail.$ c_0 $ The satisfaction probability of the safety task
under orchestrator$ \varphi $ and community$ {\gamma} $ is given by:$ {\cal{C}} $ $ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) = {\mathbb{P}}_{ {\gamma}, {\cal{C}}}\bigg(\big\{t \in {\cal{T}}_{ {\gamma}, {\cal{C}}} \,\big\vert\, {\text{legal}}(t)\big\}\bigg) $ (18) Intuitively,
is the probability measure (under$ {\cal{P}}_{ {\cal{C}},\varphi}^ {\text{safe}}( {\gamma}) $ ) of the set of legal executions generated by executing the orchestrator$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}} $ over the service community$ {\gamma} $ .$ {\cal{C}} $ To formally define the second objective, we first define the total cost of horizon
for an execution$ m $ as$ t = (\sigma_{10}\dots\sigma_{n0}),(a_1,o_1),\dots $ . Then, for an orchestrator$ {{\text{Cost}}^{{\rm{C}}}_m}(t) = \sum^{m-1}_{i = 0} C_{o_i}(\sigma_{o_i}, a_{i+1}) $ and a state$ {\gamma} $ , the expected mean-cost is defined as:$ \sigma = (\sigma_1,\dots,\sigma_n) $ $ {\mathbb{E}}_{ {\gamma},\sigma}[MC] = \limsup\limits_{m\to\infty} \frac{1}{m} {\mathbb{E}}_{h\sim {\gamma},\sigma}[ {{\text{Cost}}^{{\rm{C}}}_m}] $ (19) The optimal expected average cost starting from a state
is defined over all orchestrators$ \sigma $ for$ {\gamma} $ as$ {\cal{C}} $ . Moreover, we use$ \inf_ {\gamma}{\mathbb{E}}_{ {\gamma}, \sigma}[MC] $ to denote the expected conditional finite horizon cost, where the condition is on whether$ {\mathbb{E}}_{h\sim\mathbb{P}_{ {\gamma}, {\cal{C}}}}[ {{\text{Cost}}^{{\rm{C}}}_m} \mid {\text{legal}}(h)] $ is legal. Finally, we define our second objective, the expected conditional mean-cost, as:$ h $ $ {\cal{J}}^{ {{\text{avg}\text-\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}) = \limsup\limits_{m\to\infty} \frac{1}{m}{\mathbb{E}}_{h\sim\mathbb{P}_{ {\gamma}, {\cal{C}}}}\Bigg[ {{\text{Cost}}^{{\rm{C}}}_m} \mid {\text{legal}}(h)\Bigg] $ (20) Intuitively,
takes into account the costs that the orchestrator incurs in executing its policy by picking the chosen services, but averaged across the entire length of the execution.$ {\cal{J}}^{ {{\text{avg}\text{-}\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}) $ Finally, we define our optimization problem. We want to compute an orchestrator
such that the following holds:$ {\gamma} $ $ {\gamma}\in\Gamma_{ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}} \text{ and } {\cal{J}}^{ {{\text{avg}\text-\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}) = \inf\limits_{ {\gamma}'\in\Gamma_{ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}}} {\cal{J}}^{ {{\text{avg}\text-\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}') $ (21) Intuitively, we fix a lexicographic order on the objective functions
and$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi} $ , meaning that we aim to minimize the expected utilization mean-cost to satisfy the specification, conditioned to the satisfaction of the specification, while guaranteeing the optimal probability of satisfying it.$ {\cal{J}}^{ {{\text{avg}\text{-}\text{cost}}}}_{ {\cal{C}},\varphi} $ Now we aim to find a connection between the satisfaction probability and the notion of realizability.
Let us consider the set of finite executions of
with the community$ {\gamma} $ that becomes illegal for the first time, denote with$ {\cal{C}} $ . Formally:$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ $ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} = \{ h\in {\cal{H}}_{ {\gamma}, {\cal{C}}} \mid \lnot {\text{legal}}(h) \wedge \forall h'\in( {\text{prefixes}}(h)\setminus\{h\}): {\text{legal}}(h') \} $ (22) Lemma 7 Let
be an infinite execution of$ t $ over$ {\gamma} $ .$ {\cal{C}} $ is legal iff$ t $ $ t\not\in \left(\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \right) $ Proof
by contrapositive, assume$ (\Rightarrow) $ . Then, there exist some$ t\in (\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) $ such that$ h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ . By definition of$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ , we have that$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ . Since$ \lnot {\text{legal}}(h) $ by definition of$ h\in {\text{prefixes}}(t) $ , we have that, by definition of$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ , and from the above arguments, it follows that$ {\text{legal}} $ , i.e., there exist a finite prefix of$ \lnot {\text{legal}}(t) $ that is not legal, and therefore its infinite extension$ t $ is also not legal.$ t $ by contrapositive, assume$ (\Leftarrow) $ is not legal. This means that there exists a prefix$ t $ such that$ h\in {\text{prefixes}}(t) $ . Let$ \lnot {\text{legal}}(h) $ be the shortest of such prefixes, i.e., the finite execution that violates the safety conditions for the first time. By definition of$ h' $ , it holds that$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ . Finally, since$ h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ , we also have$ h'\in {\text{prefixes}}(t) $ . Therefore, the claim holds.$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(h') $ □ Observe that, from Lemma 7, the following equality holds:
$ \big\{t \in {\cal{T}}_{ {\gamma}, {\cal{C}}} \,\big\vert\, {\text{legal}}(t)\big\} = {\cal{T}}_{ {\gamma}, {\cal{C}}} \setminus \bigg(\bigcup\limits_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \bigg) $ (23) And in turn, from Eq. (23), we have:
$ \begin{split} {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) =\;&{\mathbb{P}}_{ {\gamma}, {\cal{C}}}\bigg(\big\{t \in {\cal{T}}_{ {\gamma}, {\cal{C}}}) \,\big\vert\, {\text{legal}}(t)\big\}\bigg)\\ =\;& {\mathbb{P}}_{ {\gamma}, {\cal{C}}}\Bigg( {\cal{T}}_{ {\gamma}, {\cal{C}}} \setminus \bigg(\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \bigg)\Bigg)\\ =\;& {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}) - {\mathbb{P}}_{ {\gamma}, {\cal{C}}}\Bigg( {\cal{T}}_{ {\gamma}, {\cal{C}}} \cap \bigg(\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \bigg)\Bigg)\\ =\;& 1 - {\mathbb{P}}_{ {\gamma}, {\cal{C}}}\bigg(\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \bigg) \end{split} $ (24) The intuitive meaning of the expression in Eq. (24) for
is that we can define the probability of satisfaction of the safety specification as the probability of avoiding finite executions$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi} $ such that$ h $ .$ \lnot {\text{legal}}(h) $ In case the specification is exactly realizable, the notion of optimal orchestrator according to Eq. (21) coincides with the notion of 'safe realizability', as shown in the following results.
Lemma 8 Orchestrator
realizes the safety specification$ {\gamma} $ over$ \varphi $ iff$ {\cal{C}} $ .$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} = \emptyset $ Proof (
) Assume$ \Rightarrow $ . By assumption, there exist one finite execution$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} \neq \emptyset $ in the set$ h $ . By Lemma 8, this implies that all infinite executions$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ that are extensions of$ t $ , i.e.,$ h $ , are not legal. But this means that there exists at least one execution of$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ that does satisfy the safety specification$ {\gamma} $ ; hence,$ \varphi $ does not realize the safety specification$ {\gamma} $ .$ \varphi $ (
) Assume that$ \Leftarrow $ does not realize the safety specification$ {\gamma} $ . This means that there exist a trace$ \varphi $ such that$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}} $ . By Lemma 7, this implies that$ \lnot {\text{legal}}(t) $ , for some$ t\in {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ . Therefore,$ h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ .$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}\neq\emptyset $ □ Theorem 5 Let
be a community of stochastic services, and$ {\cal{C}} $ be a$ \varphi $ safety specification. The orchestrator${\rm{LTL}} _f $ realizes$ {\gamma} $ with community$ \varphi $ iff$ {\cal{C}} $ .$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) = 1 $ Proof (
) if$ \Rightarrow $ realizes$ {\gamma} $ , then by Lemma 8,$ \varphi $ , and therefore$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} = \emptyset $ . Finally, by the definition of$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}}(\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) = 0 $ (Eq. [18]) and by Eq. (24), we have that$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi} $ $ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) = 1 - {\mathbb{P}}_{ {\gamma}, {\cal{C}}}(\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) = 1 $ (
) If$ \Leftarrow $ , then by definition of$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) = 1 $ (Eq. [18]) and by Eq. (24) we have that$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi} $ . Since the left-hand side is a probability, the equation holds only if the argument of$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}}(\bigcup_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) = 0 $ is empty, which in turn implies that$ {\mathbb{P}}_{ {\gamma}, {\cal{C}}} $ . Finally, the claim follows by applying Lemma 8.$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} = \emptyset $ □ Theorem 6 Assume ʟᴛʟf safety specification
is realizable. If an orchestrator$ \varphi $ satisfies Eq. (21), then it realizes the specification$ {\gamma} $ .$ \varphi $ Proof Since by assumption
is realizable, then there exists an orchestrator$ \varphi $ that realizes it. By Theorem 5, we can deduce that the optimal value of$ {\gamma}' $ is$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}') $ . Moreover, by assumption and by Eq. (21), it follows that$ 1 $ , i.e.$ {\gamma} \in \Gamma_{ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}} $ , by the arguments above. Finally, again by Theorem 5, we get that$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) = 1 $ realizes$ {\gamma} $ .$ \varphi $ Finally, we formally state the stochastic version of our problem:
Problem 2 (Stochastic Composition for ltlf Safety Tasks). Given the pair
, where$ ( {\cal{C}}, \varphi) $ is an ʟᴛʟf safety specification over the set of actions$ \varphi $ , and$ A $ is a community of$ {\cal{C}} $ stochastic services$ n $ , compute an orchestrator that is optimal according to Eq. (21).$ {\cal{C}} = \{ {\cal{S}}_1,\dots, {\cal{S}}_n\} $ As in the reachability case of Section 4, Theorems 5 and 6 show that one can find an orchestrator even in a non-stochastic setting by considering arbitrary services' probability distributions for
, for any pair$ P_i(\sigma_i,a) $ and$ \sigma_i $ , whose support is compatible with the service's transition function (even non-deterministic, as in study by De Giacomo et al.[41]), and then check whether$ a $ .$ \max_ {\gamma} {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) = 1 $ 5.1. Solution technique
-
Similarly to the previous case, the solution technique for the safety variant of the composition problem relies on finding an optimal policy for a bi-objective lexicographic optimization on a specifically built MDP. In particular, also for this problem, we consider a variant of the framework introduced in study by Busatto-Gaston et al.[35]; while the second objective they considered was the maximization of the expected conditional mean payoff, here we consider the minimization of the expected conditional mean costs. In general, the technique has a significant overlap with the reachability variant. However, the main difference lies in how we build the automata-based construction: in the reachability case, we are interested in an orchestrator such that, for any (infinite) execution
, a finite prefix of$ t $ satisfies the task$ t $ ; in the safety case, we are interested in an orchestrator such that, for all prefixes, the safety task$ \varphi $ is satisfied. The definition of the composition MDP must take into account this different requirement.$ \varphi $ Our technique breaks down into the following steps: (1) first, we translate the
formula into the equivalent${\rm{LTL}} _f $ $ {\rm{DFA}}$ ; then (2) we combine$ {\cal{A}}_\varphi $ with the services, computing a product of$ {\cal{A}}_\varphi $ with the stochastic services in$ {\cal{A}}_{\varphi} $ , obtaining a new MDP,$ {\cal{C}} $ , that we call the 'composition MDP'; (3) we solve the resulting MDP, finding a policy$ {\cal{M}}' $ for$ \pi $ that is optimal w.r.t. the (safety) bi-objective lexicographic function, as in study by Busatto-Gaston et al.[35], and then (4) we derive an orchestrator$ {\cal{M}}' $ from$ {\gamma} $ that is optimal w.r.t. Eq. (4).$ \pi $ Step 1. This step is the same as the previous case: starting from the
specification${\rm{LTL}} _f $ , we compute the$ \varphi $ $ {\rm{DFA}}$ that is equivalent to the specification$ {\cal{A}}_\varphi = (A, Q, q_0, F, \delta) $ [11].$ \varphi $ Step 2. Then, we define the composition MDP
with the associated cost function$ {\cal{M}}' = (S', A', P', s'_0) $ , as in step 2 for the reachability case. We are interested in computing optimal policies for$ C' $ , where the optimality is defined as follows. Consider the set of bad states:$ {\cal{M}}' $ $ {\text{Bad}} = \{(q,\sigma_1,\dots,\sigma_n) \mid (q,\sigma_1,\dots,\sigma_n)\in S' \wedge (q\not\in F \vee \exists i\in [1,n]\,: \sigma_i \not\in F_i)\} $ In other words,
is the set of states such that their$ {\text{Bad}}$ -component is not an accepting state or the current state of some service is not accepting. We consider the safety bi-objective lexicographic optimization over$ Q $ , similarly to what has been done in study by Busatto-Gaston et al.[35]. In particular, we first consider the probability of avoiding a set of bad states$ {\cal{M}}' $ from$ {\text{Bad}} $ , following a policy$ s\in S' $ over the MDP$ \pi $ , denoted with$ {\cal{M}}' $ ; with$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s}(\Box \lnot {\text{Bad}}) $ , we denote the set of policies with the maximum probability of avoiding states in$ \Pi_{ {\cal{M}}',s}(\Box \lnot {\text{Bad}}) $ from state$ {\text{Bad}} $ , i.e.$ s $ . Then, as the second objective, we consider the expected conditional mean-cost$ \arg\max_\pi {\mathbb{P}}_{ {\cal{M}}'_\pi,s}(\Box\lnot {\text{Bad}}) $ . An optimal solution for$ {\mathbb{E}}_{ {\cal{M}}'_\pi,s'_0}[MC \mid \Box\lnot {\text{Bad}}] $ is a policy$ {\cal{M}}' $ that minimizes the expected conditional mean-cost$ \pi $ , while avoiding the set of bad states, but only among the policies in$ \limsup_{m\to\infty} \frac{1}{n} {\mathbb{E}}_{ {\cal{M}}'_\pi,s'_0}[ {{\text{Cost}}_m} \mid \Box\lnot {\text{Bad}}] $ , i.e., the policies that maximize$ \Pi_{ {\cal{M}}',s'_0}(\Box\lnot {\text{Bad}}) $ :$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s'_0}(\Box\lnot {\text{Bad}}) $ $ \pi\in \Pi_{ {\cal{M}}',s'_0}(\Box\lnot {\text{Bad}}) \text{ and } \pi\in\arg\limits_{\pi'}\inf {\mathbb{E}}_{\rho\sim {\cal{M}}'_{\pi'},s_0}[MC \mid \Box\lnot {\text{Bad}}] $ (25) Step 3. The solution technique we will use is, once again, based on the work[35], where the authors propose a two-stage technique to find an optimal policy for a bi-objective lexicographic function in the form of Eq. (25). First, we compute the set of policies (in the form of a set of optimal actions for each state) that maximize the probability of staying far from the bad set of states
; however, this set of policies also contains policies that do not take into account the utilization costs of the services. Then, we consider a 'pruned MDP' in which (i) only optimal actions can be taken, and (ii) only 'good' states are kept. The new MDP is used to find policies that minimize the expected conditional mean-cost while maximizing the probability of staying in good states. Crucially, observe that, by construction, avoiding states in Bad is equivalent to not violating the safety specification$ {\text{Bad}} $ , since bad states have their$ \varphi $ -component to be states of$ Q $ . Moreover, by construction, the optimal policy of the pruned MDP guarantees that no better probability of staying safe can be achieved, since sub-optimal actions for the first objective are pruned. In this case, the difference between our scenario and Busatto-Gaston et al.[35] is that they consider reward maximization, rather than cost minimization, as the second objective function. Nevertheless, it is easy to see that we can minimize costs by maximizing negative rewards. Note that the two subproblems can be solved efficiently using standard planning algorithms, e.g., value iteration or linear programming.$ {\cal{A}}_\varphi $ Step 4. Once an optimal policy is found, we can obtain its equivalent
as in step 4 of the reachability case.$ {\gamma} $ Now we are going to establish a relationship between optimal orchestrators according to Eq. (21), and optimal policies for
according to Eq. (25). Note that Lemma 3 and Lemma 4 also hold in this context since the composition MDP construction$ {\cal{M}}' $ is the same as for the reachability case.$ {\cal{M}}' $ Let
be the set of infinite paths following$ {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) $ on$ \pi $ such that they start with$ {\cal{M}}' $ and never enter in bad state in$ s'_0 $ , i.e.$ {\text{Bad}}$ . Let$ {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) = ((S'\setminus {\text{Bad}}) \times A)^\omega\cap {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(s'_0) $ be the 'dual' set, i.e. the set of finite paths following$ {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) $ on$ \pi $ such that they start with$ {\cal{M}}' $ and enter in bad state in$ s'_0 $ only at the end of the path and for the first time, i.e.$ {\text{Bad}}$ . Let also$ {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) = ((S'\setminus {\text{Bad}}) \times A)^* {\text{Bad}}\cap {\text{Paths}}_{ {\cal{M}}'_\pi}(s'_0) $ , i.e. the set of infinite paths such that at least once they visited a bad state, and$ {\text{Paths}}^\omega_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) = \bigcup_{\rho'\in {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0)} {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho') $ the set of finite paths that never visit a$ {\text{Paths}}_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s_0') $ state. It is easy to see that the two sets$ {\text{Bad}} $ and$ {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) $ are disjoint and complete. Intuitively, regarding disjointness, a finite path$ {\text{Paths}}^\omega_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) $ cannot be the prefix of any infinite path$ \rho'\in {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) $ , since$ \rho\in {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) $ and$ {\text{last}}(\rho')\in {\text{Bad}} $ does not visit any bad state. Regarding completeness, there are no other cases in between, i.e. a infinite path either belongs to the first set or to the second set:$ \rho $ Lemma 9 The following two propositions hold:
1.
;$ {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) \cap {\text{Paths}}^\omega_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) = \emptyset $ 2.
.$ {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) \cup {\text{Paths}}^\omega_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) = {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(s'_0) $ Another crucial observation is that
is a set of finite paths forming pairwise disjoint cylinder sets. To see this, it is enough to observe that each of these paths has only the last state in Bad, while the previous states are all$ {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0) $ ; this means that they cannot share a common prefix.$ \lnot {\text{Bad}} $ Lemma 10 Let
be a finite path on$ \rho = s_0a_1\dots s_m\in {\text{Paths}}_{ {\cal{M}}'_\pi} $ for some policy$ {\cal{M}}'_\pi $ , and let$ \pi $ be its associated history. Then,$ h = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ iff for all$ {\text{legal}}(h) $ we have$ k\in [0,m] $ .$ s_k\not\in {\text{Bad}} $ Proof We prove that
iff$ \lnot {\text{legal}}(h) $ . By definition of$ \exists k. s_k\in {\text{Bad}} $ , if a state$ {\text{Bad}} $ is in Bad, it means that (a)$ s = (q,\sigma_1,\dots,\sigma_n) $ or (b) there exists$ q\not\in F $ such that$ i $ . Let us consider a generic time step$ \sigma_i\not\in F_i $ and the associated state$ k $ in$ s_k $ . For$ \rho $ , Proposition (a) holds iff the sequence of$ s_k $ -components of the states occurring in$ Q $ , i.e.$ \rho[0:k] $ , is a non-accepting run over$ r = q_0,\dots,q_k $ for trace$ {\cal{A}}_\varphi $ . By definition of acceptance of a trace, the above holds iff$ a_1\dots a_{k} $ . By the correctness of the construction of$ a_1\dots a_{k}\in {\cal{L}}( {\cal{A}}_\varphi) $ , this is true iff$ {\cal{A}}_\varphi $ . By definition of$ a_1\dots a_{k}\not\models \varphi $ , we can rewrite the above expression as (c)$ {\tilde \tau} $ , where$ {\text{actions}}(h')\not\models\varphi $ . Finally, by definition of$ h' = h[0:k] $ , we have that for some$ {\text{legal}} $ (c) or (b) holds iff$ k $ $ \lnot {\text{legal}}(h) $ □ Lemma 11 Let
be a policy for$ \pi $ and let$ {\cal{M}}' $ be its equivalent orchestrator. Moreover, let$ {\gamma} $ be a finite path on$ \rho = s_0a_1s_1\cdots s_m\in {\text{Paths}}_{ {\cal{M}}'_\pi} $ and$ {\cal{M}}'_\pi $ . Then,$ h = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ iff$ \rho\in {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi} $ .$ h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ Proof First, note that by Lemma 3,
iff$ \rho\in {\text{Paths}}_{ {\cal{M}}'_\pi} $ is an execution of$ h = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ .$ {\gamma} $ By Lemma 10, since$ (\Rightarrow) $ , we have$ \rho[m]\in {\text{Bad}} $ . Moreover, for all$ \lnot {\text{legal}}(h) $ , all states in$ j = 0,\dots,m-1 $ are not in$ \rho[0:j] $ , and therefore again by Lemma 10, for each prefix$ {\text{Bad}} $ ,$ h[0:j]\in {\text{prefixes}}(h) $ holds. By definition of$ {\text{legal}}(h[0:j]) $ , this implies that$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ .$ h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ If$ (\Leftarrow) $ , then by definition of$ h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ we have that (1)$ \overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ and that (2) for all$ \lnot {\text{legal}}(h) $ such that$ h'\in {\text{prefixes}}(h) $ , we have$ h'\neq h $ . By Lemma 10, proposition (2) implies that for all$ {\text{legal}}(h') $ , all prefixes$ j = 0,\dots,m-1 $ are such that their states are not in$ \rho[0:j] $ . Moreover, from (1), and again by Lemma 10, it follows that there exists$ {\text{Bad}} $ such that$ k\in[0,m] $ , but from the previous argument,$ \rho[k]\in {\text{Bad}} $ must be equal to$ k $ since states at positions smaller than$ m $ in$ m $ are not in$ \rho $ . In other words, the last state in$ {\text{Bad}} $ is in$ \rho $ , while all the other states along$ {\text{Bad}} $ are not. Therefore, by definition of$ \rho $ , we have$ {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi} $ .$ \rho\in {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi} $ □ Lemma 12 Let
be a policy for$ \pi $ and let$ {\cal{M}}' $ be its equivalent orchestrator. Moreover, let$ {\gamma} $ be a finite path on$ \rho = s_0a_1s_1\cdots s_m\in {\text{Paths}}_{ {\cal{M}}'_\pi} $ and$ {\cal{M}}'_\pi $ . Then,$ h = {\tilde \tau}_{\varphi, {\cal{C}}}(\rho) $ if$ \rho\in {\text{Paths}}_{\lnot {\text{Bad}}, {\cal{M}}'_\pi} $ .$ h\in {{\cal{H}}^{\varphi,{\text{safe}}}}_{ {\gamma}, {\cal{C}}} $ Proof By definition of
,$ {{\cal{H}}^{\varphi,{\text{safe}}}}_{ {\gamma}, {\cal{C}}} $ iff$ h\in {{\cal{H}}^{\varphi,{\text{safe}}}}_{ {\gamma}, {\cal{C}}} $ . By Lemma 10,$ {\text{legal}}(h) $ iff for all$ {\text{legal}}(h) $ ,$ k = 0\dots m $ . By definition of$ \rho[k]\not\in {\text{Bad}} $ , this holds iff$ {\text{Paths}}_{\lnot {\text{Bad}}, {\cal{M}}'_\pi} $ .$ \rho\in {\text{Paths}}_{\lnot {\text{Bad}}, {\cal{M}}'_\pi} $ □ Theorem 7 Let
be an instance of Problem 2, and let$ ( {\cal{C}}, \varphi) $ be the composition MDP for$ {\cal{M}}' $ and$ {\cal{C}} $ . We have that$ \varphi $ is optimal (w.r.t. Eq. [25]) iff its equivalent orchestrator$ \pi $ is optimal (w.r.t. Eq. [21]).$ {\gamma} $ Proof First, we show that
iff$ \pi = \arg\max_{\pi'} {\mathbb{P}}_{ {\cal{M}}'_\pi,s'_0}(\Box \lnot {\text{Bad}}) $ . For any pair$ {\gamma} = \arg\max_{ {\gamma}'} {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}') $ and its equivalent$ \pi $ , we have:$ {\gamma} $ $ {\mathbb{P}}_{ {\cal{M}}'_\pi,s'_0}(\Box \lnot {\text{Bad}}) = {\mathbb{P}}_{ {\cal{M}}'_{\pi},s'_0}( {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s'_0)) $ (26) $ = 1 - {\mathbb{P}}_{ {\cal{M}}'_{\pi},s'_0}( {\text{Paths}}^\omega_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0)) $ (27) $ = 1 - \sum\limits_{\rho'\in {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi}(s'_0)} {\mathbb{P}}_{ {\cal{M}}'_{\pi},s'_0}( {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho')) $ (28) $ = 1 - \sum\limits_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\mathbb{P}}_{ {\gamma}, {\cal{C}}}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(h)) $ (29) $ = 1 - {\mathbb{P}}_{ {\gamma}, {\cal{C}}}\bigg(\bigcup\limits_{h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}}} {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) \bigg) $ (30) $ = {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) $ (31) where step 26 is by definition of probabilistic safety, step 27 is by Lemma 9, step 28 is by disjointness of all finite paths
that end in a$ \rho'\in {\text{Paths}}_{ {\text{Bad}}, {\cal{M}}'_\pi} $ state, step 29 is by Lemma 4 and 11, step 30 is by disjointness of all$ {\text{Bad}}$ for$ {\cal{T}}_{ {\gamma}, {\cal{C}}}(h) $ , and step 31 is by definition of$ h\in\overline{ {\cal{H}}^{\varphi}_{ {\gamma}, {\cal{C}}}} $ (Eq. [18]) and by Eq. (24). From this, we obtain that$ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi} $ iff$ \pi^* = \arg\max_{\pi'} {\mathbb{P}}_{ {\cal{M}}'_\pi,s'_0}(\Box \lnot {\text{Bad}}) $ .$ {\gamma}^* = \arg\max_{ {\gamma}'} {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}') $ It remains to prove that
is cost-optimal iff$ \pi $ is cost-optimal. In the following, we use$ {\gamma} $ to denote paths over$ {\text{Paths}}^{(m)} $ of length$ {\cal{M}}' $ . By definition of the expected conditional finite horizon cost$ m $ , using the law of total expectation, we get:$ {\mathbb{E}}_{ {\cal{M}}'_\pi,s'_0}[ {{\text{Cost}}_m} \mid \Box\lnot {\text{Bad}}] $ $\begin{split} {\mathbb{E}}_{ {\cal{M}}'_\pi,s'_0}( {{\text{Cost}}_m} \mid \Box\lnot {\text{Bad}}) =\;& \sum\limits_{\rho\in {\text{Paths}}^{(m)}_{ {\cal{M}}'_\pi}} {\mathbb{E}}\left[ {{\text{Cost}}_m} \mid {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho), \Box\lnot {\text{Bad}}\right]\\& {\mathbb{P}}_{ {\cal{M}}'_\pi}\left( {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho) \mid \Box\lnot {\text{Bad}}\right)\end{split} $ Intuitively, we are averaging the total costs of all paths of length
that never visit$ m $ by the probability of those paths being visited under policy$ {\text{Bad}} $ in$ \pi $ . Since$ {\cal{M}} $ depends only on the first$ {{\text{Cost}}_m} $ steps, it is constant on each cylinder; hence,$ m $ . Moreover, by applying the definition of conditional probability ($ {\mathbb{E}}[ {{\text{Cost}}_m}\mid {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho), \Box\lnot {\text{Bad}}] = {{\text{Cost}}_m} = \sum_{k = 0}^{m-1} C'(s_k',a_{k+1}) $ ), and by observing that paths that do not start with$ P(A\mid B) = \frac{P(A\cap B)}{P(B)} $ have probability$ s_0' $ , we get:$ 0 $ $ \sum\limits_{\rho\in {\text{Paths}}^{(m)}_{ {\cal{M}}'_\pi}(s_0')} \Bigg(\sum\limits_{k = 0}^{m-1} C'(s_k',a_{k+1})\Bigg) \frac{ {\mathbb{P}}_{ {\cal{M}}'_\pi}\left( {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho) \cap \Box\lnot {\text{Bad}}\right)} { {\mathbb{P}}_{ {\cal{M}}'_\pi}\left(\Box\lnot {\text{Bad}}\right)} $ Note that the denominator
is a normalization factor due to the condition$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s_0'}(\Box\lnot {\text{Bad}} ) $ in the conditional probability. Moreover, note that$ \Box\lnot {\text{Bad}} $ ; hence, the above expression can be rewritten as:$ {\mathbb{P}}_{M'_\pi,s_0'}(\Box\lnot {\text{Bad}}) = {\mathbb{P}}( {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s_0')) $ $ \sum\limits_{\rho\in {\text{Paths}}^{(m)}_{ {\cal{M}}'_\pi}(s_0')} \Bigg(\sum\limits_{k = 0}^{m-1} C'(s_k',a_{k+1})\Bigg) \frac{ {\mathbb{P}}_{ {\cal{M}}'_\pi,s_0'}\left( {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho) \cap {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s_0')\right)}{ {\mathbb{P}}_{ {\cal{M}}'_\pi,s_0'}( {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s_0'))} $ Note that whenever
, the corresponding term in the sum is zero. Therefore, we can rewrite the summation condition as:$ \rho\in( {\text{Paths}}^{(m)}_{ {\cal{M}}'_\pi}(s_0')\setminus {\text{Paths}}_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}) $ $ \sum\limits_{\rho\in {\text{Paths}}^{(m)}_{\lnot {\text{Bad}}, {\cal{M}}'_\pi(s_0')}} \Bigg(\sum\limits_{k = 0}^{m-1} C'(s_k',a_{k+1})\Bigg) \frac{ {\mathbb{P}}_{ {\cal{M}}'_\pi,s_0'}( {\text{Paths}}^\omega_{ {\cal{M}}'_\pi}(\rho))}{ {\mathbb{P}}_{ {\cal{M}}'_\pi,s_0'}( {\text{Paths}}^\omega_{\lnot {\text{Bad}}, {\cal{M}}'_\pi}(s_0'))} $ By Lemma 12, there is a bijection between paths in
and histories in$ \rho\in {\text{Paths}}^{(m)}_{ {\cal{M}}'_\pi}(s_0') $ with$ h\in {{\cal{H}}^{\varphi,{\text{safe}}}}_{ {\gamma}, {\cal{C}}} $ . By Lemma 4, the probability of the cylinder sets of$ |h| = m $ and$ \rho $ is the same. Moreover, by construction of$ h $ , for all$ {\cal{M}}' $ , the terms of the form$ k = 0,\dots,m-1 $ in the sum of the costs can be rewritten as$ C'(s_k',a_{k+1}) $ , i.e., the cost of executing action$ C_{o_{k+1}}(\sigma_{o_{k+1},k}, a_{k+1}) $ with the$ a_{k+1} $ -th service in its current state. Finally, by the first part of the proof (steps 26–31), we have that$ o_{k+1} $ and that the probability of a path with no bad states in$ {\mathbb{P}}_{ {\cal{M}}'_\pi,s_0'}(\Box\lnot {\text{Bad}}) = {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma}) $ is the same as the probability of the corresponding trace$ {\cal{M}} $ being legal in$ t $ . Therefore, we can rewrite the expression as follows:$ {\cal{C}} $ $ \begin{split}& \sum\limits_{h\in {{\cal{H}}^{\varphi,{\text{safe}}}}_{ {\gamma}, {\cal{C}}}: |h| = m} \Bigg(\sum\limits_{k = 0}^{m-1} C_{o_{k+1}}(\sigma_{o_{k+1},k},a_{k+1})\Bigg) \frac{ {\mathbb{P}}_{ {\cal{M}}'_\pi,s_0'}( {\cal{T}}_{ {\gamma}, {\cal{C}}}(h))}{ {\cal{P}}^{ {\text{safe}}}_{ {\cal{C}},\varphi}( {\gamma})}\\ =\;& {\mathbb{E}}_{h\sim {\mathbb{P}}_{ {\gamma}, {\cal{C}}}}\Bigg[ {{\text{Cost}}^{{\rm{C}}}_m} \vert\ {\text{legal}}(t) \Bigg] \end{split} $ From the above, it follows that
$\begin{split} {\cal{J}}^{ {{\text{avg}\text-\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}) =\;& \limsup\limits_{m\to\infty} \frac{1}{m}{\mathbb{E}}_{h\sim\mathbb{P}_{ {\gamma}, {\cal{C}}}}\Bigg[ {{\text{Cost}}^{{\rm{C}}}_m} \bigg\vert {\text{legal}}(h)\Bigg] \\=\;& \limsup\limits_{m\to\infty} \frac{1}{m} {\mathbb{E}}_{\rho\sim {\cal{M}}'_{\pi},s'_0}(MC \mid \Box\lnot {\text{Bad}}), \end{split}$ since we are taking the limit of equal terms. Therefore,
if$ \pi\in\arg\inf_{\pi'} \limsup\limits_{m\to\infty} \frac{1}{m} {\mathbb{E}}_{\rho\sim {\cal{M}}'_{\pi'},s'_0}(MC \mid \Box\lnot {\text{Bad}}) $ . Combining both results, we get the thesis.$ {\gamma}\in \arg\inf_{ {\gamma}'} {\cal{J}}^{ {{\text{avg}\text{-}\text{cost}}}}_{ {\cal{C}},\varphi}( {\gamma}') $ □ Computational cost. Theorem 7 guarantees that we can reduce Problem 2 to the problem of finding an optimal policy for the lexicographic bi-objective optimization problem (Eq. [6]) over the composition MDP
, where this time the first objective is to maximize the probability of avoiding a set of states in an MDP, and as a second objective, we minimize the expected conditional mean-cost.$ {\cal{M}}' $ As explained above, the two-stage technique requires solving a planning problem over MDPs. Since it is known that both steps require polynomial time complexity in the number of states and actions of the MDP[24] and that our Composition MDP has a state space that is a single-exponential in the size of the goal specification, we get this result:
Theorem 8: Problem 2 can be solved in at most double-exponential time in the size of the formula, in at most exponential time in the number of services, and in polynomial time in the size of the services.
-
To demonstrate how the proposed approach can be instantiated and applied in a smart manufacturing scenario, we consider a representative production process: the assembly of an electric motor a widely used component in various applications such as industrial machinery, electric vehicles, household appliances, and many others[17]. To function properly, electric motors require certain materials that possess specific electrical and magnetic properties. Therefore, before the manufacturing processes start, the raw materials (i.e., copper, steel, aluminium, magnets, insulation materials, bearings) must be extracted and refined to obtain essential metals and polymers for electric motor parts manufacturing. When the materials are in the manufacturing facility, the effective manufacturing process can start. For the sake of brevity, in the following, we focus on the main aspects of the manufacturing process, skipping the provisioning, but the formalization can be easily extended to cover more details.
Goal: Fig. 2 depicts the declare formalization[12] of the electric motor manufacturing process. The main components of an electric motor are the stator, the rotor, and, in the case of alternate current motors with direct current power (e.g., in the case of electric cars). These three components are built or retrieved in any order (no precedence declare constraints between these tasks) and then eventually assembled to build a motor (alternate succession constraint between Build/Retrieve tasks and the Assemble Motor task). After the motor is assembled, a running-in test must be performed (alternate succession constraint between the Assemble Motor task and the Running In task), and at most one (not coexistence constraint) between an electric test and a full static test (the latter comprises the former). In addition, the motor can be painted optionally. The Painting, Electric Test and Static Test tasks optionally follow the Assemble Motor task (alternate precedence constraints). The process depicts the manufacturing tasks involved in producing a single motor as indicated by the existence constraints. Machines and/or human operators can perform all these operations. Each declare pattern can be transformed into an ʟᴛʟf formula[42]. Therefore, the entire process can be encoded into an ʟᴛʟf formula
, which we omit. Then, we will consider the reachability task$ \varphi $ as the goal of the service composition.$ \varphi $ Services: The behaviour of each process actor can be described as a stochastic service, i.e., a state machine with a probabilistic behaviour used to model two types of actors involved in the manufacturing process, namely machines and human operators, shown in Fig. 3. Each transition edge has a label which indicates the operation, the probability of transition and the associated cost. Figure 3a depicts a simple stochastic service of human workers. Such services have an initial accepting state in which they are READY and accept operations, and a sink failure state from where no action can be taken. The transition triggered by the
action has a probability of$ [OP] $ of remaining in the same ready state (success), but a$ p^s $ probability of failing. The transition is associated with a certain cost$ 1-p^s $ to perform the action (preceded by a$ c_{i}^{[Op]} $ , i.e., the cost is thought of as a negative value, using the reward-based representation). Figure 3b depicts a generic stochastic service of machines. The machine is initially in the READY state, which is also the unique accepting state, where it can receive the CONFIG[DEV] command (the reader, in the following, can imagine the [DEV] trailer to be replaced with the name of the specific manufacturing actor). This action takes the service to the CONFIGURATION state where the actor is set up or warmed up. At the end of this phase, the CHECKED[DEV] action is performed. If the configuration is unsuccessful, with a probability$ - $ representing the possibility of finding the actor unemployable, the actor goes into the BROKEN state. If the configuration is successful, with a probability$ p_i^u $ , the actor goes to the EXECUTING state, where a family of operations, denoted with [OP] in the picture, can be executed. For the sake of compactness, we only show a single operation, but the service can be easily generalized to the case where a single actor can perform multiple operations. The action [OP] represents one of those operations defined in Fig. 2. Executing an operation implies a certain cost$ 1 - p_i^u $ . In some cases, the execution of [OP] may take the actor$ c_i^{[Op]} $ to the BROKEN state with probability$ i $ and also, in this case, the operation implies a high cost$ p_i^b $ . To take the actor back to the READY state, a RESTORE[DEV] task must be executed on the actor, which has a repair cost$ c_i^{Bad[Op]} $ depending on the actual conditions of the actor, and that takes the actor to the REPAIRING state. When the actor is repaired, a REPAIRED[DEV] event is received, making the actor available again for manufacturing. Noteworthy, the CONFIG[DEV], CHECKED[DEV], RESTORE[DEV], REPAIRED[DEV] operations do not leave any trace on the target process. Noticeably, we can imagine that some of these operations are triggered, in reality, as exogenous events, i.e., they should be reflected in the controller, but the actor will wait for these events instead of autonomously enacting them.$ c_i^r $
Figure 3.
The two types of service we consider for the electric motor case study. (a) 'Human operator' service. (b) 'Machine' service.
We are interested in the problem of maximising the probability that the smart factory succeeds in producing electric motors at a minimum utilization cost. A two-stage approach can achieve this: in the first stage, we aim to find the maximally permissive strategy that (i) determines the equally optimal sequences of actions to satisfy the goal specification and (ii) the equally optimal dispatching strategy that decides which services should perform the operation. The optimization must also consider configuration/checking/repairing action to bring the service back to a final configuration. This might require limiting the use of services with certain probability of leading to a failing configuration. In the second stage, we select, among the available strategies, those that also minimize the utilization cost. Crucially, the optimal solution might vary depending on the service available and their capabilities, as well as the probabilities
,$ p_i^s $ ,$ p_i^u $ , and costs$ p_i^b $ ,$ c_i^[{OP}] $ , and$ c_{i}^{Bad[Op]} $ . Given the high degrees of freedom, it is paramount to use a technique, such as the one proposed in this work, that can automatically handle such a complex scenario.$ c^r_i $ -
This paper proposes a novel stochastic composition framework in which we aim to maximize the satisfaction probability of a goal specification, expressed as a high-level logic formalism such as ʟᴛʟf, and conditioned on this, minimize the utilization costs of the available services. We formalized two variants of the problem, one for reachability tasks and the other for safety tasks, and proposed a solution based on a reduction to a bi-objective optimization over MDPs, proving the correctness. Finally, we highlighted the relevance of our contribution by providing an industrial case study considered in the literature. In future work, we would like to study the process-oriented variant of our framework, namely, to maximize the probability of realizing all traces that are compatible with the specification, and conditioned on this, maximize as much as possible the average expected reward coming from the utilization of the services. This would allow us to consider a hierarchy of target specifications (either goal-oriented or process-oriented), hence delivering a rich framework suitable for several applications. The same kind of generalization can be considered for the service reward/cost function, where we can consider more than one reward regarding service utilization. Moreover, we would like to implement our approach using state-of-the-art probabilistic model checkers such as PRISM[43] and Storm[44]. Another interesting direction for future work is incorporating internal-action hiding and parallel composition, which are standard in compositional modeling frameworks.
This work is supported in part by the ERC Advanced Grant WhiteMech (Grant No. 834228), the PRIN project RIPER (Grant No. 20203FFYLK), the PNRR MUR project FAIR (Grant No. PE0000013), and the UKRI Erlangen AI Hub on Mathematical and Computational Foundations of AI (Grant No. EP/Y028872/1). This work has been carried out while Luciana Silo was enrolled in the Italian National Doctorate on Artificial Intelligence run by Sapienza University of Rome.
-
The authors confirm contributions to the paper as follows: study conception and design: De Giacomo G, Favorito M, Silo L; analysis and interpretation of results: De Giacomo G, Favorito M, Silo L; draft manuscript preparation: De Giacomo G, Favorito M, Silo L. All authors reviewed the results and approved the final version of the manuscript.
-
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
-
The authors declare that they have no conflict of interest.
- Copyright: © 2026 by the author(s). Published by Maximum Academic Press, Fayetteville, GA. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.
-
About this article
Cite this article
De Giacomo G, Favorito M, Silo L. 2026. Stochastic service composition for ʟᴛʟf reachability and safety tasks. The Knowledge Engineering Review 41: e003 doi: 10.48130/ker-0026-0003
Stochastic service composition for ʟᴛʟf reachability and safety tasks
- Received: 06 January 2025
- Revised: 08 November 2025
- Accepted: 04 February 2026
- Published online: 30 April 2026
Abstract: Service composition à la Roman model consists of realizing a virtual service by suitably orchestrating a set of already available services. In this paper, we consider a variant where available services are stochastic systems, and the target specification is goal-oriented and specified in Linear Temporal Logic on finite traces (ʟᴛʟf). In this setting, we are interested in synthesizing a controller (policy) that maximizes the probability of satisfying the temporal logic objective (either reachability or safety) while minimizing the expected cost of using the available services. To do so, we combine techniques from ʟᴛʟf synthesis, service composition à la Roman Model, and bi-objective lexicographic optimization on Markov Decision Processes (MDPs). This framework has several interesting applications, including Smart Manufacturing and Digital Twins.
-
Key words:
- Service composition /
- Linear temporal logic on finite traces /
- MDPs





