Note, that while no coffee breaks are formally scheduled, coffee and other refreshments will be available outside the room throughout the day. There are also regular short breaks between sessions, which are intended to allow people to go in and out of the room.
Monday, Sep 16, 2024
Starting with McNaughton’s pioneering technical report of 1965 in which he initiated the automata theoretic study of infinite games, we pursue his view that plays of an infinite game should terminate in finite time with the correct declaration of the winner. Such a reduction of regular infinite games to reachability games or safety games is well-known, for example, in the study of parity games. We focus on this reduction for Muller games (reporting on work of D. Neider, R. Rabinovich, M. Zimmermann, Aachen) which offers a so far unexploited player-dependent approach for solving Muller games, based on a new kind of memory structure and avoiding the step through parity games. We conclude with a more general discussion of determinacy proofs for infinite games, motivated by Büchi’s difficult last paper that has not received attention since it appeared in 1983.
A word equation is a formal equation using words (also called strings), variables (representing words) and concatenation as the only allowed operation, i.e. they are of the form u = v, where u, v consists of letters and variables. A solution substitutes variables with words so that this formal equality is turned into true equality of strings. Often we also allow usage of additional constraints, say we require that a substitution for a variable is from a certain regular language or that lengths of substitutions satisfy some linear inequality, etc.
In this talk I will present state of the art, some recent results and directions on word equations:
- what is known about satisfiability and major techniques that are used, this will include some restricted classes of equations, including some recent upper bounds of NP for subclasses of quadratic equations.
- what is known about the solution sets, this includes recent result bounding the number of solutions of equations using one variable only.
- detail a new trend in application of word equations in verification, usually referred as string solving, this will also include some gentle introduction to fragments of logics over word equations
- and some smaller, particular results.
A two-dimensional configuration is a coloring of the infinite grid Z^2 using a finite number of colors. For a finite subset D of Z^2, the D-patterns of a configuration are the patterns of shape D that appear in the configuration. The number of distinct D-patterns of a configuration is a natural measure of its complexity. We consider low-complexity configurations where the number of distinct D-patterns is at most |D|, the size of the shape. We use algebraic tools to study periodicity of such configurations [1]. We show, for an arbitrary shape D, that a low-complexity configuration must be periodic if it comes from the well-known Ledrappier subshift, or from a wide family of other similar algebraic subshifts [2]. We also discuss connections to the well-known Nivat's conjecture: In the case D is a rectangle - or in fact any convex shape - we establish that a uniformly recurrent configuration that has low-complexity with respect to shape D must be periodic [3]. This implies an algorithm to determine if a given collection of mn rectangular patterns of size mxn admit a configuration containing only these patterns. Without the complexity bound the question is the well-known undecidable domino problem.
References
[1] J. Kari, M. Szabados. An Algebraic Geometric Approach to Nivat’s Conjecture. Information and Computation 271, pp. 104481 (2020).
[2] J. Kari, E. Moutot. Nivat’s conjecture and pattern complexity in algebraic subshifts. Theoretical Computer Science 777, pp. 379–386 (2019).
[3] J. Kari, E. Moutot. Decidability and Periodicity of Low Complexity Tilings. Theory of Computing Systems 67, pp- 125-148 (2023).
Up-To Techniques are coinductive proof techniques that provide small witnesses for lower bounds for greatest fixpoints (respectively upper bounds for least fixpoints). They have been introduced for bisimilarity checks, but can also be fruitfully be used for language equivalence or behavioural metrics, typically by exploiting an algebraic structure on the state space.
This talk will start by introducing the lattice-theoretic foundations of up-to techniques and review results by Bonchi and Pous on checking language equivalence for NFAs. We will then generalize these results to a coalgebraic setting and to behavioural metrics, where the use of up-to techniques sometimes allows to resort to finite witnesses (rather than infinite ones).
This is joint work with Filippo Bonchi, Daniela Petrisan, Keri D'Angelo, Sebastian Gurke, Johanna Maria Kirss, Matina Najafi, Wojciech Rozowski and Paul Wild.
A set of configurations H is a home-space for a set of configurations X of a Petri net if every configuration reachable from (any configuration in) X can reach (some configuration in) H. The semilinear home-space problem for Petri nets asks, given a Petri net and semilinear sets of configurations X, H, if H is a home-space for X. In 1989, David de Frutos Escrig and Colette Johnen proved that the problem is decidable when X is a singleton and H is a finite union of linear sets with the same periods. In this presentation, we show that the general (semilinear) problem is decidable. This result is obtained by proving a duality between the reachability problem and the non-home-space problem. In particular, we prove that for any Petri net and any semilinear set of configurations H we can effectively compute a semilinear set C of configurations, called a non-reachability core for H, such that for every set X the set H is not a home-space for X if, and only if, C is reachable from X. We show that the established relation to the reachability problem yields the Ackermann-completeness of the (semilinear) home-space problem. For this we also show that, given a Petri net with an initial marking, the set of minimal reachable markings can be constructed in Ackermannian time.
Zielonka's construction of asynchronous automata for regular, commutation-closed languages is a prime example of distributed synthesis. In this talk we present the latest construction, as well as simpler variants for restricted cases, and we discuss potential applications of this seminal result to distributed monitoring of concurrent programs.
Tuesday, Sep 17, 2024
We define Büchi automata over concrete domains, where transitions are labelled with constraints. We then give an overview over the decidability status of the nonemptiness problem for such automata over typical concrete domains, and we explain the symbolic approach as a tool to prove decidability of the nonemptiness problem.
A weighted finite automaton (WFA) computes a function that maps each input word to an output in an underlying semiring of weights. Two WFA are equivalent if they compute the same function. While every boolean (i.e., unweighted) finite automaton is equivalent to a deterministic one, this is no longer true for weighted automata. In general, there are strict inclusions between the classes of functions computable by deterministic (=sequential), unambiguous, finitely ambiguous, polynomial ambiguous, and exponentially ambiguous WFA, giving rise to a natural ambiguity hierarchy.
While it is classically known to be decidable whether a given automaton is deterministic, unambiguous, etc., it is much harder to decide whether a WFA is equivalent to a deterministic, unambiguous, etc. WFA. In particular, deciding determinizability for WFA is a classical problem, typically considered over tropical (min/max-plus) semirings or over fields. The tropical case is still not fully resolved.
This talk is about the field case. I will discuss joint work with Jason Bell (Waterloo), which shows that equivalence with deterministic, respectively, unambiguous WFA is decidable. The approach is based on a characterization of the semantics, i.e., the functions computed by such WFA, using arithmetic properties and a theorem from Diophantine number theory. I will also discuss current work together with Antoni Puch (Warsaw). Restricting to the case in which all transition matrices are invertible, we obtain a complete correspondence between the ambiguity hierarchy and arithmetical properties. This shows that also equivalence with finitely ambiguous and polynomial ambiguous WFA is decidable in the invertible case.
The complexity of solving parity games has been a long standing open problem for decades. Progress has been very slow until a breakthrough came in 2017, improving the state of the art from mildly subexponential to quasipolynomial. Since then, many more quasipolynomial time algorithms have been constructed, with distinct flavours and algorithmic paradigms, with many exciting insights into the structure of parity games. The flow of new algorithms is slowing down, and one can now ask: how close are we to the ultimate breakthrough - a polynomial time algorithm -, or possibly to impossibility / complexity hardness results? What are the lessons learned since 2017 on quasipolynomial time algorithms?
Given a class of regular languages, the C-membership problem asks whether a given regular language belongs to C. While most questions about automata are well understood today, the C-membership problem remains open for several significant classes, in particular for most levels of the quantifier alternation hierarchy in first-order logic. This talk will present approaches to solving this problem, aimed at simultaneously capturing several variants of a specific level.
Regular language membership is a key primitive in myriads of applications, from web scraping to bioinformatics. The quick growth of data volume and the presence of noise in the applications demand efficient and robust methods for membership testing, and this talk will provide a survey of such methods and open questions.
It is a classic phenomenon in formal language theory that inclusion between languages is hard to decide: For context-free languages, it is undecidable, and already for NFAs, the problem is PSPACE-complete. However, recent work by various authors has shown that under some restrictions on the input languages, deciding inclusion is surprisingly tractable and admits interesting new techniques. The talk will survey such results (and open problems), with a focus on non-regular languages.
Wednesday, Sep 18, 2024
The polyregular functions are string-to-string functions that have polynomial size outputs, and which can be computed by finite-state devices. This class of functions can be described using many different models that pop up in different areas, and its seems to be a plausible (unique?) candidate for the notion of regularity for functions that have polynomial outputs size. I will report on some recent progress on this topic, regarding questions about the expressive power, as well as some algorithmic problems.
Session 1: Formal Languages
This paper investigates the state complexities of subword-closed and superword-closed languages, comparing them to regular languages. We focus on the square root operator and the substitution operator. We establish an exponential lower bound for superword-closed languages for the k-th root. For subword-closed languages we analyze in detail a specific instance of the square root problem for which a quadratic complexity is proven. For the substitution operator, we show an exponential lower bound for the general substitution. We then find some conditions for which we prove a quadratic upper bound.
Transducers are a formalism based on finite state automata, where transitions are labelled with words to output alongside the letter to read. Deterministic transducers allow to characterise functions on words: they assign an output word to each word they recognise. The regular functions are the one computed by a two-way deterministic transducer [4]. A transducer is reversible if it is both deterministic and co-deterministic. Intuitively, it means that we can rewind the computation on a given word. This allows for a polynomial composition of reversible two-way transducers. The class of reversible two-way transducers was shown to be as expressive as that of deterministic two-way transducers [2]. In this talk, we present recent results on reversibility for infinite words. More precisely, we show that, over infinite words as well, the class of two-way reversible transducers is as expressive as that of two-way deterministic transducers, which characterise regular functions on infinite words [1]. We also show that two way reversible transducers over infinite words without accepting conditions are as expressive as deterministic two-way transducers with a Büchi condition. This work is joint with L. Dartois, P. Gastin, R. Govind, and S. Krishna. It is under reviewing, and is available at [3]. References: [1] Olivier Carton et al. “Deterministic Regular Functions of Infinite Words”. In: ICALP. 2023. doi: 10.4230/lipics.icalp.2023.121. [2] Luc Dartois et al. “On Reversible Transducers”. In: ICALP. 2017. doi: 10.4230/lipics.icalp.2017.113. [3] Luc Dartois et al. Reversible Transducers over Infinite Words. 2024. doi: 10.48550/arXiv.2406.11488. [4] Joost Engelfriet and Hendrik Jan Hoogeboom. “MSO definable string trans- ductions and two-way finite-state transducers”. In: ACM Trans. Comput. Log. 2.2 (2001). doi: 10.1145/371316.371512.
A transducer is finite-valued if for some bound k, it maps any given input to at most k outputs. For classical, one-way transducers, it is known since the 80s that finite valuedness entails decidability of the equivalence problem. This decidability result is in contrast to the general case, which makes finite-valued transducers very attractive. For classical transducers, it is also known that finite valuedness is decidable and that any k-valued finite transducer can be decomposed as a union of k single-valued finite transducers. In this paper, we extend the above results to copyless streaming string transducers (SSTs), answering questions raised by Alur and Deshmukh in 2011. SSTs strictly extend the expressiveness of one-way transducers via additional variables that store partial outputs. We prove that any k-valued SST can be effectively decomposed as a union of k (single-valued) deterministic SSTs. As a corollary, we obtain equivalence of SSTs and two-way transducers in the finite-valued case (those two models are incomparable in general). Another corollary is an elementary upper bound for checking equivalence of finite-valued SSTs. The latter problem was already known to be decidable, but the proof complexity was unknown (it relied on Ehrenfeucht's conjecture). Finally, our main result is that finite valuedness of SSTs is decidable. The complexity is PSpace, and even PTime when the number of variables is fixed.
The conjugacy problem asks if a given pair of elements in a finitely presented monoid (typically infinite) is conjugate, i.e., a cyclic shift of one another. Conjugacy problem over free monoids is solvable in polynomial time. We consider a generalisation of the problem to a finitely-presented possibly infinite set of pairs (or relation). A relation over the free monoid is conjugate if each pair of words in the relation is conjugate. We address the conjugacy problem of a relation. It is particularly interesting when the relation is automata-definable.
We show that conjugacy of rational relations, those accepted by a finite state transducer, is decidable. Towards our decision procedure, we establish a new result that is of independent interest to word combinatorics. We identify a necessary and sufficient condition for the set of pairs given by (u_0,v_0)G_1*(u_1,v_1) ... G_k*(u_k,v_k), k > 0 to be conjugate, where G_1, ..., G_k are arbitrary sets of pairs of words, and (u_0,v_0),...,(u_k,v_k) are arbitrary pairs of words. This is similar to, and a nontrivial generalisation of, a characterisation given by Lyndon and Schützenberger in 1962 for the conjugacy of a pair of words. Finally, we discuss towards conjugacy of regular relations, those accepted by a two-way finite state transducer.
This is a joint work with C. Aiswarya (CMI) and Amaldev Manuel (IIT Goa). It is based on a paper accepted in DLT 2024. The full version of the paper can be found in arXiv ( https://arxiv.org/abs/2307.06777 ).
Indexed languages are a classical notion in formal language theory. As the language equivalent of second-order pushdown automata, they have received considerable attention in higher-order model checking. Unfortunately, counting properties are notoriously difficult to decide for indexed languages: So far, all results about non-regular counting properties show undecidability. In this paper, we initiate the study of slice closures of (Parikh images of) indexed languages. A slice is a set of vectors of natural numbers such that membership of u,u+v,u+w implies membership of u+v+w. Our main result is that given an indexed language L, one can compute a semilinear representation of the smallest slice containing L's Parikh image. We present two applications. First, one can compute the set of all affine relations satisfied by the Parikh image of an indexed language. In particular, this answers affirmatively a question by Kobayashi: Is it decidable whether in a given indexed language, every word has the same number of a's as b's. As a second application, we show decidability of (systems of) word equations with rational constraints and a class of counting constraints: These allow us to look for solutions where a counting function (defined by an automaton) is not zero. For example, one can decide whether a word equation with rational constraints has a solution where the number of occurrences of a differs between variables X and Y.
The talk will present a paper accepted for LICS 2024, which is joint work with Laura Ciobanu. The paper is available at https://arxiv.org/abs/2405.07911
A language L is said to be regular measurable if there exists an infinite sequence of regular languages that “converges” to L. This notion was introduced by the speaker in 2021 and used for classifying non-regular languages by using regular languages. In this talk, we describe why this notion was introduced, and give a brief overview of decidability results relating to the measurability.
Well-structured transition systems (WSTS) are an abstract family of systems that encompasses a vast landscape of infinite-state systems. By requiring a well-quasi-ordering (wqo) on the set of states, a WSTS enables generic algorithms for classic verification tasks such as coverability and termination. However, even for systems that are WSTS like vector addition systems (VAS), the framework is notoriously ill-equipped to analyse reachability (as opposed to coverability). Moreover, some important types of infinite-state systems fall out of WSTS' scope entirely, such as pushdown systems (PDS).
Inspired by recent algorithmic techniques on VAS, we propose an abstract notion of systems where the set of runs is equipped with a wqo and supports amalgamation of runs. We show that it subsumes a large class of infinite-state systems, including (reachability languages of) VAS and PDS, and even all systems from the abstract framework of valence systems, except for those already known to be Turing-complete.
Moreover, this abstract setting enables simple and general algorithmic solutions to unboundedness problems, which have received much attention in recent years. We present algorithms for the (i) simultaneous unboundedness problem (which implies computability of downward closures and decidability of separability by piecewise testable languages), (ii) computing priority downward closures, (iii) deciding whether a language is bounded, meaning included in $w_1^\cdots w_k^$ for some words $w_1,\ldots,w_k$, and (iv) effective regularity of unary languages. This leads to either drastically simpler proofs or new decidability results for a rich variety of systems.
Session 2: MSO Logic
We prove that the bisimilar-invariant fragment of MSO over finite transition systems is equivalent to mu-calculus. This is a joint word with Amina Doumane and Denis Kuperberg.
We provide an algorithm to decide whether a class of finite graphs that has bounded linear clique width is well-quasi-ordered by the induced subgraph relation in the presence of a labelling of the vertices, where the class is given by an $\mathsf{MSO}$-transduction from finite words. This study leverages tools from automata theory, and the proof scheme allows to derive a weak version of the Pouzet conjecture for classes of bounded linear clique-width. We also provide an automata based characterization of which classes of $\mathsf{NLC}$ graphs are well-quasi-ordered by the induced subgraph relation, where we recover the results of Daligault Rao and Thomass\'e [10.1007/s11083-010-9174-0, Theorem 3] by encoding the models into trees with the gap embedding relation of Dershowitz and Tzameret.
This work is available on arxiv: https://arxiv.org/abs/2405.10894.
We investigate the decidability of the monadic second-order (MSO) theory of the structure $\langle \mathbb{N};<,P_1, \ldots,P_k \rangle$, for various unary predicates $P_1,\ldots,P_k \subseteq \mathbb{N}$. We focus in particular on `arithmetic' predicates arising in the study of linear recurrence sequences, such as fixed-base powers $\mathsf{Pow}_k = \{k^n : n \in \mathbb{N}\}$, $k$-th powers $\mathsf{N}_k = \{n^k : n \in \mathbb{N}\}$, and the set of terms of the Fibonacci sequence $\mathsf{Fib} = \{0,1,2,3,5,8,13,\ldots\}$ (and similarly for other linear recurrence sequences having a single, non-repeated, dominant characteristic root). We obtain several new unconditional and conditional decidability results, a select sample of which are the following:
- The MSO theory of $\langle \mathbb{N};<,\mathsf{Pow}_2, \mathsf{Fib} \rangle$ is decidable;- The MSO theory of $\langle \mathbb{N};<, \mathsf{Pow}_2, \mathsf{Pow}_3, \mathsf{Pow}_6 \rangle$ is decidable
- The MSO theory of $\langle \mathbb{N};<, \mathsf{Pow}_2, \mathsf{Pow}_3, \mathsf{Pow}_5 \rangle$ is decidable assuming Schanuel's conjecture
- The MSO theory of $\langle \mathbb{N};<, \mathsf{Pow}_4, \mathsf{N}_2 \rangle$ is decidable
- The MSO theory of $\langle \mathbb{N};<, \mathsf{Pow}_2, \mathsf{N}_2 \rangle$ is Turing-equivalent to the MSO theory of $\langle \mathbb{N};<,S \rangle$, where $S$ is the predicate corresponding to the binary expansion of $\sqrt{2}$. (As the binary expansion of $\sqrt{2}$ is widely believed to be normal, the corresponding MSO theory is in turn expected to be decidable.) These results are obtained by exploiting and combining techniques from dynamical systems, number theory, and automata theory.
This is joint work with Valérie Berthé, Toghrul Karimov, Joël Ouaknine, Mihir Vahanwala, and James Worrell.
We focus on formulae $ \exists X. \varphi (\vec{Y}, X) $ of monadic second-order logic over the full binary tree, such that the witness $ X$ is a well-founded set. The ordinal rank $\mathit{rank}(X) < \omega_1$ of such a set $X$ measures its depth and branching structure. We search for the least upper bound for these ranks, and discover the following dichotomy depending on the formula $ \varphi$. Let $ \eta_{\varphi }$ be the minimal ordinal such that, whenever an instance $\vec{Y} $ satisfies the formula, there is a witness $ X$ with $ \mathit{rank} (X) \leq \eta_{\varphi }$. Then $ \eta_{\varphi }$ is either strictly smaller than $ \omega^2$ or it reaches the maximal possible value $\omega_1$. Moreover, it is decidable which of the cases holds. The result has potential for applications in a variety of ordinal-related problems, in particular it entails a result about the closure ordinal of a fixed-point formula.
A function f uniformizes a relation R(X,Y) if R(X,f(X)) holds for every X in the domain of R. The uniformization problem for a logic L and a structure M asks whether for every L-definable relation there is an L-definable function that uniformizes it. A logic L has the uniformization property (in a structure M) if every relation definable by L formula has L definable uniformizer.
Gurevich and Shelah (1983) proved that no Monadic Second-Order (MSO) definable function uniformizes relation ``Y is a one element subset of X'' in the full binary tree. In other words, there is no MSO definable choice function in the full binary tree, and the uniformization property fails for MSO. We provide sufficient conditions for an MSO definable relation to have an MSO definable uniformizer.
Descriptive complexity grew up as a theory drawing ideas and problems from several different sources. We list four of these sources: (1) finite model theory, (2) automata theory, (3) relational database theory, and (4) computational complexity theory. The leading thread of this overview talk is to present five open problems in descriptive complexity theory coming from these areas. These are problems that I consider particularly important or beautiful and none appears to be impossible to solve. At least two of these problems ask to give a new proof of a known fact just using different methods. However, I hope that their solution may be useful to eventually achieve the promise that descriptive complexity has aspired to from its early days: to make new progress on the fundamental questions of computational complexity theory by using the methods of mathematical logic.
Session 3: Reachability, Invariants, Infinite-State Systems
We study the reachability problem for one-counter automata in which transitions can carry disequality tests. A disequality test is a guard that prohibits a specified counter value. This reachability problem has been known to be NP-hard and in PSPACE, and characterising its computational complexity has been left as an open question by Almagor, Cohen, Pérez, Shirmohammadi, and Worrell. This problem has proven over the years to be very challenging. We present some new results on this topic, and in particular a new complexity upper bound at the third level of the polynomial hierarchy. This is a joint work with Dmitry Chistikov, Jerome Leroux and Henry Sinclair-Banks, to be published at CONCUR'24 under the name "Invariants for One-Counter Automata with Disequality Tests".
We investigate the model of Petri nets with data, where tokens carry values from some fixed data domain, and executability of transitions is conditioned by relations between data values involved. We study Petri nets with equality data, i.e., an infinite data domain with equality as the only relation. The most fundamental decision problem for Petri nets, the reachability problem, asks, given a net together with source and target configurations, if there is a run from source to target. It is well known that the reachability problem is undecidable for Petri nets with ordered data, while the decidability status of this problem for equality data still remains an intriguing open question. On the other hand, the coverability problem (where we ask if there is a run from source to a configuration that possibly extends target by some extra tokens) is decidable for both equality and ordered data. As widely known, coverability easily reduces to reachability. Furthermore, the reachability problem is decidable, also for equality and ordered data, in the special case of reversible Petri nets (where transitions are closed under reverse), as recently shown by Ghosh and Lasota. We study a relevant decision problem, sandwiched between reachability and the two latter decidable ones: the bi-reachability problem (called also reversible reachability problem or mutual reachability problem). It asks, for a net and two its configurations, if each of the configurations is reachable from the other one. In other words, the problem asks if two given configurations are in the same bi-reachability equivalence class. As our main result we prove decidability of this problem for equality data domain. This pushes further the decidability border for Petri nets with equality data, subsuming decidability of coverability, and of reachability in reversible Petri nets with equality data. The decision procedure for bi-reachability is inspired by the classical decomposition approach used to decide reachability in plain Petri nets.
A linear dynamical system is a pair:
- a point $\mathbf x \in\mathbb Q^d$,
- a $d\times d$ matrix $M$,
where the positive integer $d$ is called the ambient dimension. The orbit of the system is defined as the infinite sequence of points in $\mathbb Q^d$ $$ \mathbf x, M\mathbf x, M^2 \mathbf x, \ldots $$ that we obtain by repeatedly applying the linear function which is given in the form of the matrix $M$. We are interested in designing algorithms for deciding whether the orbit of a given system intersects some given target. The latter may be a hyperplane (i.e. the set of points that is a solution to a linear equation), a halfspace (i.e. the set of points that is a solution to a linear inequality), or a more general set of points given via polynomial (in)equalities. It is still unknown whether there exist a procedure for deciding even hyperplane reachability.
The short talk will be based on a recent paper, joint work with Toghrul Karimov, Joel Ouaknine, and James Worrell.
In that paper we considered the slightly more general question of multiple reachability, namely not merely reaching the target, but reaching it at least $k$ times; where $k$ is given as input. This problem is surprisingly difficult. We prove that it is undecidable in ambient dimension $d=10$, even when $k$ is fixed to $k=9$. The main theorem however is about solving this problem on the plane, $d=2$, which seems to be the only reasonable subclass to treat. For this we use some new techniques based on intersections of varieties with all algebraic subgroups of certain dimension. Notably, the main tool of this area, Baker's lower bounds on linear combinations of logarithms, does not seem to work.
In the talk I will present the problem and sketch the solution in case $d=2$, using those new techniques. The problem and solution lend themselves to a nice visual illustration which is appropriate for a short talk.
We study existence and computability of finite bases for ideals of polynomials over infinitely many variables. In our setting, variables come from a countable logical structure A, and embeddings from A to A act on polynomials by renaming variables. We give a sufficient and necessary condition for A to guarantee the following generalisation of Hilbert’s Basis Theorem: every polynomial ideal which is equivariant, i.e. invariant under renaming of variables, is finitely generated.
Co-authors: Philipp Czerner, Javier Esparza, Christoph Welzel-Mohr
Regular transition systems (RTS) are a popular formalism for modeling infinite-state systems in general, and parameterised systems in particular. In a CONCUR 22 paper, Esparza et al. introduce a novel approach to the verification of RTS, based on inductive invariants. The approach computes the intersection of all inductive invariants of a given RTS that can be expressed as CNF formulas with a bounded number of clauses, and uses it to construct an automaton recognising an overapproximation of the reachable configurations. The paper shows that the problem of deciding if the language of this automaton intersects a given regular set of unsafe configurations is in EXPSPACE and PSPACE-hard.
We introduce regular abstraction frameworks, a generalisation of the approach of Esparza et al., very similar to the regular abstractions of Hong and Lin. A framework consists of a regular language of constraints, and a transducer, called the interpretation, that assigns to each constraint the set of configurations of the RTS satisfying it. Examples of regular abstraction frameworks include the formulas of Esparza et al., octagons, bounded difference matrices, and views. We show that the generalisation of the decision problem above to regular abstraction frameworks remains in EXPSPACE, and prove a matching (non-trivial) EXPSPACE-hardness bound.
In a seminal paper, Alon et al. showed that one can test membership in a regular language using a constant number of queries, more precisely with $O(\log^3 (1/\epsilon)/\epsilon)$ queries, where $\epsilon > 0$ is a precision parameter. More recently, Bathie and Starikovskaya showed that $O(\log (1/\epsilon)/\epsilon)$ queries suffice and that there exists a language that requires that many queries. However, some easier languages can be tested with $O(1/\epsilon)$ queries, and some ``trivial'' languages require only $O(1)$ queries. This leads to the following questions:
- Does every regular language have one of the above complexities?
- Is there a characterization of the regular languages in each complexity class?
We answer both questions, showing that these are the only three classes, and give a combinatorial characterization based on the set of minimum blocking factors of the language.
Based on a joint work with Nathanaël Fijalkow and Corto Mascle.
Session 4: Tree Automata
Over words, nondeterministic Büchi automata are as expressive as parity automata with any number of priorities. Over trees, the Büchi acceptance condition is strictly weaker and the more priorities we allow, the more languages parity automata can recognise. We say that on words, the parity-index hierarchy of nondeterministic automata collapses to the Büchi level, while it is infinite over trees. Here, we ask when is Büchi enough?, that is, exactly on which classes C of trees can any parity automaton be turned into a Büchi automaton that recognises the same language, when restricted to trees in C. We work in the setting of unordered trees, that is, trees in which there is no order among the successors of nodes. We find that both for alternating and nondeterministic automata, the parity-index hierarchy collapses to the Büchi level for any class of trees of bounded Cantor-Bendixson rank, a topological measure of tree complexity. We show that this characterisation is exact for nondeterministic automata, that is, there is a language that is not recognised by any nondeterministic Büchi automaton on any class of trees of unbounded Cantor-Bendixson rank.
This is joint unpublished work with Nathan Lhote.
The dynamic membership problem for a formal language L asks the following: given some input data D, check whether D belongs to the language L, and then receive updates to apply to D and maintain after every update the information of whether the current version of D belongs to L or not.
Among other contexts, dynamic membership has been studied for regular languages over words (in the RAM model with logarithmic word size), for substitution updates that change one letter of the word each time. Specifically, our ICALP'21 article with Louis Jachiet and Charles Paperman has identified three main complexity regimes: languages for which dynamic membership can be solved in constant time per update; languages with a complexity of Theta(log n / log log n) on words of length n; and languages with an intermediate complexity of O(log log n) and (conditionally) no O(1) algorithm. However, this works leaves open the complexity of the dynamic membership problem over different kinds of data.
One interesting setting is that of trees. In particular, dynamic membership for trees generalizes the existential marked ancestor problem that has already received some attention: we have a tree whose nodes can be marked or unmarked by updates, and we receive queries that ask whether a specific node has a marked ancestor. It is known that updates and queries must take time at least Omega(log n / log log n) for this problem, which gives a similar bound for dynamic membership for some specific languages. What is striking is that the word languages corresponding to existential marked ancestor are very tame: over words, the analogous language would be aperiodic and enjoy a O(log log n) algorithm. Hence, the complexity landscape for trees is different than for words: it cannot be understood simply by looking at the horizontal and vertical monoids in isolation, and requires new algebraic tools.
In this work-in-progress talk, we will present our ongoing efforts to understand the complexity of the dynamic membership problem for fixed regular languages on trees (rooted, ordered, and unranked). We focus on the simple case of relabeling updates (the tree structure is fixed), and on languages featuring a neutral letter (corresponding to tree nodes that can be replaced by the ordered list of their children). In this setting, we aim at understanding, for each fixed regular tree language, what is the best achievable worst-case complexity to maintain membership to the language after each relabeling. We will present our current results giving a conditional characterization of languages maintainable in constant time, and our ongoing efforts to understand which languages can be maintained in sublogarithmic time. The talk will in particular present some of the tantalizing open problems and specific tree languages that prevent us from achieving a complete classification.
This work is a collaboration which has involved (at various degrees and at various times) Corentin Barloy, Paweł Gawrychowski, Louis Jachiet, and Charles Paperman.
Similar to the case of finite string automata, there is a long tradition of adding counting mechanisms to finite tree automata in order to increase their expressiveness, resulting in two interesting models (among others): Parikh tree automata (PTA) increment a number of counters when computing a tree and eventually test their membership in a semilinear set. On the other hand, one-counter tree automata (OCTA) use only one counter that can be incremented, decremented and tested for zero during a computation. The calculation principles for the counting mechanisms work orthogonally in both approaches – while OCTA split their computations at each node and execute them pathwise, PTA allow a global view: their counters are incremented over the whole input tree. When comparing the expressive power of the two models, one will find differences that stem not only from the specific storage types used, but also from the global or non-global view of the input.
This talk will report on two recent works, where we exchanged the computation mode of both tree automata models:
We introduce global one-counter tree automata (GOCTA) deviating from OCTA by working on one counter which is passed through the tree in lexicographical order, rather than duplicating the counter at every branching position. When comparing the capabilities of GOCTA and OCTA, we obtain that their classes of recognizable tree languages are incomparable. Moreover, we show that the non-emptiness problem of GOCTA is undecidable while, in stark contrast, their membership problem is in P.
In a similar spirit, we define a non-global version of Parikh tree automata. Here, we copy and distribute the current configuration at each node to all its children, incrementing the counters pathwise, and check the arithmetic constraint at each leaf. Again we obtain that, in contrast to the original definition of PTA, non-emptiness becomes undecidable if we allow the automaton to use at least three counters. However, we can prove decidability of non-emptiness for a (still powerful) restriction of the model.
This talk is based on joint work with Richard Mörbitz [1] and Johannes Osterholzer [2].
[1] L. Herrmann, R. Mörbitz: Global One-Counter Tree Automata. Proceedings of CIAA 2024, to appear [2] L. Herrmann, J. Osterholzer: Non-Global Parikh Tree Automata. Proceedings of NCMA 2024, to appear
The Mostowski index problem for tree languages is a long-standing problem: given a tree language, what is the minimal number of priorities required to recognize it with a non-deterministic parity automaton? In their 2008 paper, Colcombet and Löding reduced this problem to a boundedness question on distance-parity automata. This proof used complex machinery which makes it less accessible than ideal, and uses the less standard model of distance parity automata. In this joint work with Karoliina Lehtinen, we revisit this result, and propose a simplified proof. We combined tools developed by Lehtinen to solve parity games in quasi-polynomial time, and hierarchical counters as used by Colcombet and Löding. We hope to use this result to relate the Mostowski index problem with the theory of universal trees, as developed to solve parity games.
Joint work with Daniela Petrişan. This paper provides a unifying category-theoretic framework for minimization and learning algorithms for bottom-up tree automata with effects. Our aim is two-fold: encompass existing algorithms for various forms of tree automata – deterministic bottom-up tree automata, residual finite tree automata, tree automata weighted over a field – and instantiate the abstract framework in order to obtain new results – tree automata weighted over prinicipal ideal domains (PIDs).
We study a model of computation where executions are represented as trees, with branches corresponding to local executions of processes and branching indicating the spawning of new threads. Processes communicate by acquiring and releasing locks, as well as by reading from and writing to shared variables.
We will discuss the restrictions that enable us to achieve a decidable verification problem against regular tree specifications. For instance, we can require that processes acquire and release locks in a nested fashion, similar to how items are managed in a stack, or limit the number of times the process writing to the variables switches. The problem is decidable when processes follow this nested lock policy and when the number of writer switches is bounded. However, the problem becomes undecidable if either of these restrictions is lifted.
Session 5: Protocols & Agents
Population protocols are a model of distributed computation intended for the study of networks of independent computing agents with dynamic communication structure. Each agent has a finite number of states, and communication opportunities occur nondeterministically, allowing the agents involved to change their states based on each other's states. Population protocols are often studied in terms of reaching a consensus on whether the input configuration satisfied some predicate.
A desirable property of a computation model is modularity, the ability to combine existing simpler computations in a straightforward way. In the present paper we present a more general notion of functionality implemented by a population protocol in terms of multisets of inputs and outputs. This notion allows to design multiphase protocols as combinations of independently defined phases. The additional generality also increases the range of behaviours that can be captured in applications (e.g. maintaining the role distribution in a fleet of servers).
We show that composition of protocols can be performed in a uniform mechanical way, and that the expressive power is essentially semilinear, similar to the predicate expressive power in the original population protocol setting.
Population protocols are a well-studied model of distributed computation in which a crowd of anonymous finite-state agents communicate via pairwise interactions. Together they decide whether their initial configuration, i.e., the initial distribution of agents in each state, satisfies a property. Population protocols with unordered data (PPUD) is an extension introduced by Blondin and Ladouceur (ICALP'23) to express properties of multisets over an infinite data domain. In PPUD, each agent carries a fixed data value, and the interactions between agents depend on the equality of their data. Blondin and Ladouceur also identified the interesting subclass of immediate observation PPUD (IOPPUD), where in every transition one of the two agents remains passive and does not move, and they characterised its expressive power. We consider the decidability and complexity of formally verifying these protocols. We show that checking if a PPUD is well-specified, i.e., whether it correctly computes some function, is undecidable in general. By contrast, for IOPPUD, we exhibit a large yet natural class of problems, which includes well-specification among other classic problems, and establish that these problems are decidable in EXPSPACE. We also provide a lower complexity bound, namely coNEXPTIME-hardness.
This is joint work with S. van Bergerem, R. Guttenberg, S. Kiefer, C. Mascle, and N. Waldburger. It is the subject of an article accepted at ICALP 2024.
In this talk I would like to present the results of my recent paper ``Learning Broadcast Protocols" published in AAAI’24, which provides several results regarding the learnability of fine broadcast protocols.
The abstract of the AAAI'24 follows:
The problem of learning a computational model from examples has been receiving growing attention. For the particularly challenging problem of learning models of distributed systems, existing results are restricted to models with a fixed number of interacting processes. In this work we look for the first time (to the best of our knowledge) at the problem of learning a distributed system with an arbitrary number of processes, assuming only that there exists a cutoff, i.e., a number of processes that is sufficient to produce all observable behaviors. Specifically, we consider fine broadcast protocols, these are broadcast protocols (BPs) with a finite cutoff and no hidden states. We provide a learning algorithm that can infer a correct BP from a sample that is consistent with a fine BP, and a minimal equivalent BP if the sample is sufficiently complete.
On the negative side we show that (a) characteristic sets of exponential size are unavoidable, (b) the consistency problem for fine BPs is NP hard, and (c) that fine BPs are not polynomially predictable.
The paper can be accessed via : https://ojs.aaai.org/index.php/AAAI/article/view/29089
In the modeling of fair exchange protocols between several agents, the intended behavior of each agent should yield an execution where everyone obtains all the messages (e.g. signatures on a contract). Malicious agents may however try to prevent others to reach their goals, as long as they themselves do achieve their objective. Furthermore, nothing prevents several agents to team up together to the detriment of another. In this presentation, we argue that a "safe" protocol can be adequately be modeled using the concept of Strong Secure Equilibria (SSE): Strong because it is immune to a coalition of agents, and Secure because agents would harm others even if they get no benefit from it, as long as themselves are not harmed. Under the assumption that agents are rational and would not discard their benefit to harm others, no one should deviate from the intended behavior.
We study SSE in the context of games on graphs when the objective of each player is given by an $\omega$-regular language. We are concerned with two (groups of) problems: checking whether a given set of strategy profiles only contains SSEs, and deciding the existence of an SSE profile. Several variants are studied depending on how objectives and strategy profiles are provided. We provide tight complexity bounds and provide complexities for a fixed number of players.
This presentation is based on submitted joint work with Léonard Brice (ULB), Jean-François Raskin (ULB), Guillaume Scerri (ENS Paris-Saclay), and Marie Van Den Bogaard (UGE).
Full-information protocols are an idealistic representation of distributed systems in which one process gets access to the entire observation history of another process whenever they communicate. However, communication may be intermittent; with gaps of increasing length, an unbounded amount of information can accumulate before being revealed. In the talk, we outline a game-theoretic model of full-information protocols for synchronous distributed systems and sketch a solution for the basic strategy synthesis problem together with a non-elementary lower bound. Further, we relate the question of deciding whether a full-information protocol can be implemented with bounded shared memory to the monadic decomposability problem for regular relations.
This is joint work with Laurent Doyen and Thomas Soullard, a preliminary presentation on the synthesis results are available at https://arxiv.org/abs/2307.01063.
Parc des Sports Saint-Michel in Bordeaux
Thursday, Sep 19, 2024
I present three case studies from my collaborative research that highlight the essential role of logic in enhancing our understanding of modern machine learning architectures. The first two examples focus on the expressive capabilities of two prominent architectures: Transformers, which have revolutionized NLP applications, and Graph Neural Networks, a leading approach for classifying graph-structured data. We employ temporal logic techniques to analyze the properties that Transformers can recognize, and modal logics to examine the properties discernible by Graph Neural Networks. The third example addresses the pursuit of explainable AI, demonstrating how first-order logic can be used to design languages that declare, evaluate, and compute explanations for decisions made by machine learning models.
Session 6: Automata
Given an NFA, we can assing to each word $w$ the zero-one matrix describing the action of $w$ on the states of this NFA. The rank of $w$ is then defined as the Boolean rank of its matrix. In particular, for DFAs, the rank of a word $w$ is just the size of the image of the whole set of states under the action of $w$, a natural notion in the theory of synchronising automata. Unambiguous NFAs can be seen as a generalisation of DFAs that preserves many of their nice properties. I will discuss the space and time complexity of finding the minimum rank of a word in a given unambiguous NFAs. This is a joint work with Stefan Kiefer.
Given a non-deterministic finite automaton (NFA) A with m states, and a natural number n (presented in unary), the #NFA problem asks to determine the size of the set L(A_n) of words of length n accepted by A. While the corresponding decision problem of checking the emptiness of L(A_n) is solvable in polynomial time, the #NFA problem is known to be #P-hard. Recently, the long-standing open question -- whether there is an FPRAS (fully polynomial time randomized approximation scheme) for #NFA -- was resolved in \cite{ACJR19}. The FPRAS due to \cite{ACJR19} relies on the interreducibility of counting and sampling, and computes, for each pair of state q and natural number $i \leq n$, a set of $O(\frac{m^7 n^7}{epsilon^7})$ many uniformly chosen samples from the set of words of length i that have a run ending at q (\epsilon is the error tolerance parameter of the FPRAS). This informative measure -- the number of samples maintained per state and length -- also affects the overall time complexity with a quadratic dependence. Given the prohibitively high time complexity, in terms of each of the input parameters, of the FPRAS due to \cite{ACJR19}, and considering the widespread application of approximate counting (and sampling) in various tasks in Computer Science, a natural question arises: Is there a faster FPRAS for #NFA that can pave the way for the practical implementation of approximate #NFA tools? In this work, we demonstrate that significant improvements in time complexity are achievable. Specifically, we have reduced the number of samples required for each state to be independent of m, with significantly less dependence on n and ϵ, maintaining only $\widetilde{O}(\frac{n^4}{epsilon^2})$ samples per state.
This is joint work with Kuldeep S. Meel (University of Toronto) and Sourav Chakraborty (ISI Kolkata) and will appear in PODS 2024.
In this presentation, we introduce a novel method for active learning of a minimal-sized Visibly one-counter automaton ($VOCA$) using SAT solver. We follow Angluin's $L^*$ algorithm. The algorithm uses only a polynomial number of membership or equivalence queries. All the existing literature, including the algorithm presented by Neider and Löding for learning $VOCA$, requires exponentially many queries relative to the size of a minimal $VOCA$ recognising the given language. Our approach eliminates the necessity to observe the $VOCA$'s behaviour up to exponentially large counter values. This sets it apart from existing techniques that rely on identifying repetitive patterns in behaviour graphs of exponential size.
Michaliszyn and Otop observed that the problem of finding the minimal $VOCA$ is ${NP}$-Complete. If a polynomial time algorithm for learning the minimal $VOCA$ exists, it could be employed to minimise $VOCA$ in polynomial time. Consequently, a learning algorithm for $VOCA$ that achieves minimal representation in polynomial time does not exist.
In the latter part of the work, we looked at learning Deterministic real-time one-counter automata ($DROCA$). However, we make use of an additional query, called counter-value query - the student gives the teacher a word, and the teacher returns the counter-value reached on traversing the word in the $DROCA$. We show that learning $DROCA$ can done using polynomially many queries within this framework. This improves the existing result by Bruy`ere et al. that needed an exponential number of queries, space and time for learning $DROCA$.
This is a joint work with Dr. Vincent Penelle (Univ. Bordeaux) and Dr. Sreejith A.V (IIT Goa).
We talk about the class of data languages recognised by deterministic finite-memory automata (DMA). We provide an RPNI-based passive learning algorithm that learns DMAs from positive and negative examples. Additionally, we provide a polynomial-sized characteristic sample that learns canonical representation for any DMA-recognizable language. This is a joint work with Emmanuel Filiot and Raffaella Gentilini.
Parikh’s Theorem is a fundamental result in automata theory with numerous applications in computer science. These include software verification (e.g. infinite-state verification, string constraints, and theory of arrays), verification of cryptographic protocols (e.g. using Horn clauses modulo equational theories) and database querying (e.g. evaluating path-queries in graph databases), among others. Parikh’s Theorem states that the letter-counting abstraction of a language recognized by finite automata or context-free grammars is definable in Linear Integer Arithmetic (a.k.a. Presburger Arithmetic). In fact, there is a linear-time algorithm computing existential Presburger formulas capturing such abstractions, which enables an efficient analysis via SMT-solvers. Unfortunately, real-world applications typically require large alphabets (e.g. Unicode, containing a\u2004\u200dmillion of characters) — which are well-known to be not amenable to explicit treatment of the alphabets — or even worse infinite alphabets.
Symbolic automata have proven in the last decade to be an effective algorithmic framework for handling large finite or even infinite alphabets. A symbolic automaton employs an effective boolean algebra, which offers a symbolic representation of character sets (i.e. in terms of predicates) and often lends itself to an exponentially more succinct representation of a language. Instead of letter-counting, Parikh’s Theorem for symbolic automata amounts to counting the number of times different predicates are satisfied by an input sequence. Unfortunately, naively applying Parikh’s Theorem from classical automata theory to symbolic automata yields existential Presburger formulas of exponential size. In this paper, we provide a new construction for Parikh’s Theorem for symbolic automata and grammars, which avoids this exponential blowup: our algorithm computes an existential formula in polynomial-time over (quantifier-free) Presburger and the base theory. In fact, our algorithm extends to the model of parametric symbolic grammars, which are one of the most expressive models of languages over infinite alphabets. We have implemented our algorithm and show it can be used to solve string constraints that are difficult to solve by existing solvers.
This is joint work with Anthony Widjaja Lin and Artur Jeż.
Automatic sequences are a family of sequences described by finite automata with output. The sequence (a_n) is automatic in base b, or b-automatic, if there is an automaton with output A such that A outputs a_n when given the base-b representation of n as an input. Such sequences were introduced by Büchi in 1960 and provide a notion of ease of definability with regards to base b, in the sense that properties related to base-b expansions correspond to b-automatic sequences. Additionally, the level of complexity of a sequence can be measured by the number of states of the smallest automaton that produces it.
Periodic sequences are another family of sequences that occupy a low level on the scale of complexity. For instance, they are exactly the sequences that have bounded factor complexity (a result by Morse and Hedlund; on the other hand, automatic sequences have at most linear factor complexity). These are linked to automatic sequences by Cobham's theorem (1972) that characterises sequences simultaneously automatic in multiple bases.
In this work, we further investigate the link between periodicity and description by automata, by characterising the sizes of smallest base-l automata that generate periodic sequences of period k. This enables a better understanding of sequences on the lower end of the complexity scale.
Joint work with Manon Stipulanti, Luca Prigioniero and Eric Rowland. See Magic Numbers in Periodic Sequences, in: Anna Frid, Robert Mercas (eds), Combiatorics on Words. WORDS 2023. LNCS vol.13899, Springer, Cham.
Higher-dimensional automata (HDAs) provide a general and automata-like model for non-interleaving concurrency. Some years ago we have set out to create a proper automata theory for HDAs. We now have a Kleene theorem, a Myhill-Nerode theorem, and a Büchi-Elgot-Trakhtenbrot theorem, providing a solid basis for a higher-dimensional automata theory. In my talk I will start with a gentle introduction to HDAs, then give an overview of these three results, and finally hint at applications and future work.
Session 7: Modal & Temporal Logic
We present our recent work in the intersection of two significant fields in graph theory: random graphs and model checking. We randomly generate a Kripke structure $\mathbb{K}{n, {p}} $ on $n$ states based on a tuple of probabilities $\boldsymbol{p}$ and investigate the probability that $\mathbb{K}{n, \boldsymbol{p}} \models \varphi$ as $n$ tends to infinity for any LTL or CTL formula $\varphi$. We show several important results:
-
For an LTL or a CTL formula $\varphi$, $\lim_{n \rightarrow \infty} {\bf P}(\mathbb{K}_{n, \boldsymbol{p}} \models \varphi)$ is either $0$ or $1$, known as a 0-1 law.
-
For an LTL formula $\varphi$, the asymptotic probability of $\mathbb{K}_{n, \boldsymbol{p}} \models \varphi$ is $1$ if and only if $\varphi$ is a tautology. The corresponding decision problem is PSPACE-complete.
-
We establish a polynomial-time algorithm to decide whether $\lim_{n \rightarrow \infty} {\bf P}(\mathbb{K}_{n, \boldsymbol{p}} \models \varphi)$ is $0$ or $1$ for any CTL formula $\varphi$.
Result 1 for CTL can be directly derived from a corresponding statement for FO(IFP) [1, Corollary 3.2]. Notably, the algorithm presented in [1, Corollary 4.2] operates in exponential time. In contrast, our Result 3 offers a substantial improvement over this by providing a more efficient solution.
Here are the practical implications of our work. One way to test a newly developed model checker is to feed it some randomly generated Kripke structures. These results suggest that using random structures as inputs for a model checker may not be necessary as almost all inputs yield identical outcomes. Consequently, employing random models to evaluate a newly developed model checker proves to be of limited utility.
[1] Andreas Blass, Yuri Gurevich, and Dexter Kozen. A zero-one law for logic with a fixed-point operator. Information and Control, 67(1–3):70–90, October–December 1985.
Modal Logic ML is a fragment of First-order logic known for its good model-theoretic and algorithmic properties. We will consider ML interpreted in ordinal models. These are structures with some unary predicates and a single binary relation which is a descending, well-founded, strict linear order > on the universe. Since every such order is isomorphic to an ordinal number, we call these models ordinal models. I will show that, assuming finiteness of the vocabulary, ML over ordinal models is compact. That is, if T is a set of ML formulae over finitely many symbols and every finite fragment of T has an ordinal model then so does the entire T. Both the result and the proof (using Higman's Lemma as its technical heart) are arguably surprising.
We consider the problem of learning temporal logic formulas from examples of system behavior. Learning temporal properties has crystallized as an effective mean to explain complex temporal behaviors. Several efficient algorithms have been designed for learning temporal formulas. However, the theoretical understanding of the complexity of the learning decision problems remains largely unexplored, although a recent publication on arXiv studies specifically different questions regarding the complexity of learning Linear Temporal Logic (LTL) formulas [1]. To address this, we study the complexity of the passive learning problems of three prominent temporal logics, Linear Temporal Logic cited above, Computation Tree Logic (CTL) and Alternating-time Temporal Logic (ATL) and several of their fragments. We show that learning formulas using any binary operator is $\mathsf{NP}$-complete for all of these logics. On the other hand, when investigating the complexity of learning formulas using only unary operators, we exhibit discrepancies between the complexity of learning LTL, CTL and ATL formulas (with a varying number of agents). This presentation is based on a work submitted to the conference Knowledge Representation 2024, and it is a joint work with Rajarshi Roy and Daniel Neider.
[1] Corto Mascle, Nathanaël Fijalkow, and Guillaume Lagarde. Learning temporal formulas from examples is hard. CoRR, abs/2312.16336, 2023.
In LTL learning we have positive and negative traces (finite words) and we want to find the shortest LTL formula that satisfies all the positive traces but none of the negatives ones. To help with LTL learning, we study the Boolean set cover problem: it is a generalization of the set cover problem where we allow union and intersection and try to obtain a set with all positives without the negatives. We use it to solve approximately but efficiently LTL learning instances. We first try to solve the instances by generating all LTL formulas of a bounded size, and then use the boolean set cover heuristics to find a short formula. We present different approaches and heuristics - principally a greedy one using D&C and a notion of 'score' and another using set covers - and compare their efficiency with Scarlet, a state-of-the-art tool for LTL learning. We also show how to generate a type of benchmarks called patterns (formulas, positive and negative traces) to test the limits of our algorithms.
This is based on joint work with Gabriel Bathie, Nathanaël Fijalkow, Théo Matricon and Pierre Vandenhove.
In this work we studied the complexity of classical modal logics and their extension with fixed-point operators, using translations to transfer results across logics. In particular, we showed several complexity results for multi-agent logics via translations to and from the mu-calculus and modal logic, which allow us to transfer known upper and lower bounds. We also use these translations to introduce a terminating tableau system for the logics we study, based on Kozen’s tableau for the μ-calculus, and the one of Fitting and Massacci for modal logic. In the talk I will focus on our method for tackling tableau-based derivations that may not terminate, for achieving decidability and upper complexity bounds, and discuss how we used this method to prove the decidability of certain modal logics that do not have a finite model property. This is joint work with Luca Aceto, Antonis Achilleos, Adrian Francalanza, and Anna Ingolfsdottir.
Session 8: Probabilistic Systems
Partially observable Markov decision processes (POMDPs) form a prominent model for uncertainty in sequential decision making. We are interested in constructing algorithms with theoretical guarantees to determine whether the agent has a strategy ensuring a given specification with probability 1. This well-studied problem is known to be undecidable already for very simple omega-regular objectives, because of the difficulty of reasoning on uncertain events. We introduce a revelation mechanism which restricts information loss by requiring that almost surely the agent has eventually full information of the current state. Our main technical results are to construct exact algorithms for two classes of POMDPs called weakly and strongly revealing. Importantly, the decidable cases reduce to the analysis of a finite belief-support Markov decision process (MDP). This yields a conceptually simple and exact algorithm for a large class of POMDPs.
Unobservable Markov decision processes (UMDPs) serve as a prominent mathematical framework for modeling sequential decision-making problems and they are equivalent to Probabilistic Finite Automata. A key aspect in computational analysis is decidability. In general, the computation of the exact and approximated values is undecidable for UMDPs, even with reachability objectives. We look into a decidable subclass for the long-run average objective.
Building on matrix product theory and ergodic properties, we introduce a novel subclass of UMDPs, termed ergodic UMDPs. In this class, as in general ergodic systems, the initial state of the dynamic is irrelevant after a long time. Our main result is that approximating the value within this subclass is decidable. Moreover, at the same time, the exact problem remains undecidable. Finally, we discuss the primary challenges of extending these results to partially observable Markov decision processes.
Co-authors: Krishnendu Chatterjee, David Lurie, Bruno Ziliotto.
Markov decision processes (MDPs) are extensively used to model and verify systems characterized by both probabilistic and nondeterministic behavior. Using probabilistic model checking for the automated verification of such models has shown to be successful for their analysis. However, the state-space explosion problem constitutes a significant obstacle when applying these methods in practice. In order to tackle this problem, multiple abstraction-refinement techniques have been established where the goal is to create a smaller abstract model by excluding irrelevant parts from the concrete system therefore simplifying its analysis. In this work, we generalize the abstraction-refinement framework introduced in [1] to MDPs with multiple reachability objectives. We use stochastic two-player games for the abstractions. The main idea is to keep a clear distinction between the nondeterminism of the original MDP and the nondeterminism introduced during the abstraction process, assigning each type to a separate player in the resulting game. This enables the computation of lower and upper bounds for the Pareto frontiers of the achievable values. Indeed, depending on the player handling the nondeterminism from the abstraction, we obtain different values if the players cooperate or are antagonistic. The difference between these bounds offers a numerical evaluation of the quality of the abstraction and serves as the foundation for the refinement method that we introduce.
This is based on joint work with Nicolas Lecomte, Mickael Randour and Petr Novotný. This work was supported by the Fonds de la Recherche Scientifique – FNRS under Grant n° T.0188.23 (PDR ControlleRS). Mickael Randour is an F.R.S.-FNRS Research Associate and a member of the TRAIL Institute.
[1] M. Kattenbelt, M. Kwiatkowska, G. Norman and D. Parker, A game-based abstraction-refinement framework for Markov decision processes, FMSD, 2010.
We study strategy requirements for multi-objective queries in Markov decision processes (MDPs). A multi-objective query is defined by finitely many real payoff functions over plays and a threshold for each function. A query asks whether there is a strategy such that, for all payoffs, the expectancy of the payoff is strictly greater than its respective threshold when playing according to the strategy.
Pure strategies, i.e., strategies that do not resort to randomisation, do not suffice for such queries in general. In this work, we analyse the randomisation requirements for multi-objective queries. We identify a strict subclass of randomised strategies that suffices to satisfy multi-objective queries. Given a query with $d$ payoffs, we show that it can be satisfied by mixing at most $d$ pure strategies, i.e., by randomly selecting one of $d$ pure strategies and committing to it for the whole play. We also provide sufficient conditions that extend the previous property to queries with non-strict inequalities.
This talk is based on ongoing joint work with Mickael Randour.
James C. A. Main is a Fonds de la Recherche Scientifique – FNRS Research Fellow and a member of the TRAIL institute. Mickael Randour is a Fonds de la Recherche Scientifique – FNRS Research Associate and a member of the TRAIL institute. This work has been supported by the Fonds de la Recherche Scientifique – FNRS under Grant n° T.0188.23 (PDR ControlleRS).
The population control problem asks whether an arbitrarily large population of finite state machines can be successfully controlled using a discrete time control where at every step, the chosen action is uniformly applied to all machines. The goal of the controller is to eventually put all machines in a final state.
This is typically what happens if you organize a conference and have to chose every day a unique program for the whole audience which is interesting enough so that all participants to attend all talks. The larger the audience, the harder it is.
The case where the machines are non-deterministic is called the population control problem. It has been tackled by Bertrand et al. in https://arxiv.org/abs/1807.00893.
The case where the machines are MDPs is called the random population control problem. It has been shown decidable by Colcombet et al. in https://arxiv.org/abs/1911.01195 using Dickson's Lemma, the min-cut max-flow duality and distance automata. The exact complexity is still opened. A lower bound is EXPTIME, as shown by Mascle at al. https://arxiv.org/abs/1909.06420.
We show that Corto's conjecture about the random case: an arbitrarily large random population can be controlled using solely 0,1, \omega's and small quadratic constants. This almost-surely closes the complexity gap.
Partially Observable Markov Decision Processes (POMDPs), which combine non-determinism, probability and partial observability, have gained popularity in various applications as a model of planning in an unsafe and only partially observable environment. Unfortunately, typical objectives of interest such as reachability or total reward already result in undecidable problems. Finding a strategy cannot be done algorithmically while guaranteeing optimality w.r.t. the objective. Consequently, heuristics to synthesize practically well-performing strategies became of significant interest.
We design a highly scalable postprocessing step, which improves the quality and the representation of the strategy. Our procedure learns a compact representation of the given strategy as a Mealy machine using automata-learning techniques. First, through learning the complete strategy, we get its automaton representation, which is fully equivalent and thus achieves the same value. Second, we provide heuristics learning small modifications of the strategy, specifically when the strategy is not defined, and also when it explicitly states that it is unsure about its choice (such as at the cut-off points).
The talk is based on the same-titled paper at TACAS, 2024. This is a joint work with Alexander Bork, Kush Grover, Jan Křetı́nský and Stefanie Mohr.
The classic halting problem asks if a given computer program, featuring expressive repetition constructs such as while loops or recursion, eventually terminates on some given input. For probabilistic programs, which involve randomness in the form of coin flipping or sampling numbers from probability distributions, various notions of termination exist.
In this talk, I focus on positive almost sure termination (PAST), i.e., halting in finite expected time. The PAST problem typically becomes decidable if restricted to programs that can run on a machine using finite or otherwise restricted memory, e.g. structured as an unbounded stack. Indeed, Esparza et al.'s probabilistic pushdown automata (pPDA) can model precisely the class of procedural probabilistic programs over finite data types with unbounded recursion depth.
I will present a novel technique for proving PAST of pPDA. It relies on synthesizing a certificate: an independently checkable proof of termination, not too dissimilar from a classic ranking function. Checking these certificates boils down to verifying that some given rational numbers satisfy a set of polynomial inequalities. This can be done in polynomial time (in the size of the pPDA and the certificate) and is moreover straightforward to implement. The numbers in the certificate may, however, require exponential precision. The main result is that these certificates are not only sound but also complete - they are guaranteed to exist for every pPDA that is PAST. The latter result is achieved by linking qualitative aspects of termination (e.g. expected finite or infinite running time) to algebraic properties of the automaton's underlying equation systems (e.g. whether a matrix is singular or not).
The presentation is based on two papers published at LICS '23 and TACAS '23 [1,2].
[1] https://doi.org/10.1109/LICS56636.2023.10175714
[2] https://doi.org/10.1007/978-3-031-30820-8_24
Session 9: Vector Addition Systems
Vector Addition Systems with States (VASS), equivalent to Petri nets, are a well-established model of concurrency. A d-VASS can be seen as directed graph whose edges are labelled by d-dimensional integer vectors. While following a path, the values of d nonnegative integer counters are updated according to the integer labels. The reachability problem asks whether there is a run from a given starting configuration to a given target configuration. When the input is encoded in binary, reachability is computationally intractable: even in dimension one, it is NP-hard.
This presentation with detail the tractability border of the problem when the input is encoded in unary. As a main result, we prove that reachability is NP-hard in unary encoded 3-VASS, even when structure is heavily restricted to be a Simple Linear Path Scheme (SLPS). The underlying graph structure of an SLPS is just a path with self-loops at each node. This lower bound improves upon a recent result of Czerwiński and Orlikowski [LICS 2022], in both the number of counters and expressiveness of the considered model. It also answers open questions of Englert, Lazić, and Totzke [LICS 2016] and Leroux [PETRI NETS 2021].
This presentation will also showcase an exceedingly weak model of computation that is SPLS with counter updates in {−1,0,+1}. Here, we show that reachability is NP-hard when the dimension is bounded by O(α(k)), where α is the inverse Ackermann function and k bounds the size of the SLPS. To further trace the tractability border, the presentation will also touch on a neighbouring upper bound. Extending upon the main result of Englert, Lazić, and Totzke [LICS 2016], we present a polynomial-time algorithm for reachability in unary 2-SLPS when the initial and target configurations are specified in binary.
This presentation concerns work with Dmitry Chistikov, Wojciech Czerwiński, Filip Mazowiecki, Łukasz Orlikowski, and Karol Węgrzycki that will appear in FOCS'24.
A pushdown vector addition system with states (PVASS) extends the model of vector addition systems with a pushdown stack. The algorithmic analysis of PVASS has applications such as static analysis of recursive programs manipulating integer variables. Unfortunately, reachability analysis, even for one-dimensional PVASS is not known to be decidable. We relax the model of one-dimensional PVASS to make the counter updates continuous and show that in this case reachability, coverability, and boundedness are decidable in polynomial time. In addition, for the extension of the model with lower-bound guards on the states, we show that coverability and reachability are in NP, and boundedness is in coNP.
Affine vector addition system with states (Affine VASS) is an expressive model of computation that finds applications in many areas of computer science. Affine VASS consists of finitely many control states and a set of counters which hold values over the natural numbers. These counters can be updated by means of affine operations, i.e., during a transition, the current vector of counter values can be incremented/decremented or it can be multiplied by means of a matrix. The expressivity of affine VASS comes at a cost, as all interesting verification problems are either undecidable or have very high complexity. Hence, this necessitates the study of tractable over-approximations of the reachability relation of affine VASS.
Here, we study one such over-approximation called affine continuous VASS in which counters are allowed to have non-negative rational values and whenever an increment/decrement is performed, it is first scaled by an arbitrary non-zero fraction. We investigate the tractability of affine continuous VASS with respect to the reachability, coverability and state-reachability problems for different classes of affine operations and we prove an almost-complete classification of the decidability of these problems. Namely, except for the coverability problem for a single family of classes of affine operations, we completely determine the decidability status of these problems for all classes. Furthermore, except for this single family, we also complement all of our decidability results with tight complexity-theoretic upper and lower bounds. \t This work has been accepted for publication at LICS 2024.
Recently there is a lot of research on unambiguous systems, but still Vector Addition Systems with States (VASSs) have not been fully understood in this context. For a given VASS, accepting by a set of states, it is easy to decide whether it is unambiguous (for each accepted word there is exactly one accepting run). However, up to now there were no techniques developed for proving that given language cannot be accepted by any unambiguous VASS. In this talk I would like to present results of my Master Thesis done under supervision of Wojciech Czerwiński. In this presentation I plan to talk about two main results of my Thesis and describe shortly crucial techniques which had to be developed.
The first main result can be formulated as follows: Given VASS $V$ it is undecidable whether language of this VASS is recognised by an unambiguous VASS.
In order to prove this result I had to develop techniques for showing, that given language is not recognised by any unambiguous VASS. These techniques are based on pumping techniques for unambiguous VASSs and can be used for showing, that languages such as $\{a^nba^mba^k | n \geq m \lor n \geq k \}$ or $a^nba^{\leq 2^n}$ are not recognised by any unambiguous VASS.
The second result can be defined as follows: Given unambiguous VASS $A$ and nondeterministic VASS $B$ it is undecidable whether $L(A)=L(B)$.
The motivation for studying this problem comes from the fact, that language equivalence, although undecidable in general, is decidable for unambiguous VASSs. Therefore one can hope, that language equivalence is decidable if one VASS is unambiguous and the second one is nondeterministic. The above result, unfortunately, states the opposite. The result can be actually generalised to a case in which we restrict one of the VASSs to be deterministic and the other one to be history-deterministic. Moreover one can generalise it further to any equivalence relation between language equivalence and two-sided simulation (that means VASSs $A$ and $B$ are in the relation if and only if VASS $A$ can simulate VASS $B$ and also VASS $B$ can simulate VASS $A$). The proof is actually a modification of the famous proof of Jančar.
Vector addition systems (VAS), also known as Petri nets, are a popular model of concurrent systems. Configurations are tuples in $\mathbb{N}^d$, and a step consists of adding a given vector in $\mathbb{Z}^d$. Many problems from many areas reduce to the reachability problem for VAS, which consists of deciding whether a target configuration of a VAS is reachable from a given initial configuration. Many interesting techniques have been developed for normal VAS, for example Leroux has shown that VAS admit semilinear inductive invariants, which leads to an algorithm for reachability based on two semi-algorithms: If the target is reachable, guess the path, otherwise guess the separating semilinear inductive invariant and verify it. Recently interest has emerged in trying to extend these properties to VAS with additional mechanisms, like VAS with a pushdown stack, (unordered) data or, as considered in this talk, with nested zero tests, called Priority VAS. We give a new characterization of reachability relations of Priority VAS as regular expressions over reachability relations of standard VAS and use it to reprove that also VAS with nested zero tests admit semilinear inductive invariants. Furthermore, we prove that every semilinear Priority VAS is flattable, i.e. intuitively that nested loops can be removed if the Priority VAS is semilinear. This result was unknown even for a single zero test. The corresponding paper was published in ICALP 2024 and is available at https://doi.org/10.4230/LIPIcs.ICALP.2024.141
Vector Addition Systems with States (VASS) are finite-state machines equipped with counters over nonnegative integers that can be incremented and decremented. The main decision problem associated with VASS is reachability: given a VASS and a target configuration (a state and a value for each counter), decide whether the VASS can reach the target configuration starting from its initial state with all counters set to zero. This problem has been known to be decidable for over forty years, and its complexity has recently been precisely characterized. However, reachability remains underexplored in several notable extensions of VASS. Among these extensions are Branching VASS (BVASS), which are VASS with special branching transitions that merge two configurations. Reachability in BVASS of dimension 1 is known to be PSPACE-complete (with counters encoded in binary), but its decidability status remains open in higher dimensions.
One of the earliest decidability results for reachability in VASS is due to Hopcroft and Pansiot. They showed that the reachability set of a two-dimensional VASS is semilinear and provided an algorithm for computing a semilinear representation of it. We have extended this algorithm to BVASS, demonstrating that two-dimensional BVASS also have semilinear and computable reachability sets. The main challenge in our work was proving the termination of the generalized algorithm.
This is joint work with Thibault Hilaire, Jérôme Leroux and Grégoire Sutre.
We consider the reachability problem for Vector Addition Systems with States (VASS) in dimension three. In the dimension two the problem is known to be PSpace-complete since 2015. However, already for three-dimensional VASS (3-VASS) there is a huge gap in understanding of the complexity and currently the problem is only known to be PSpace-hard and in Tower.
We show that the reachability problem in binary 3-VASS (a 3-VASS with numbers on transitions encoded in binary) can be decided in exponential space. This does not fix the complexity, but substantially improves the upper bound. Our main contribution states that if there is a path between two its configurations then there is also a path of at most doubly-exponential length, which immediately implies an ExpSpace algorithm.
It is challenging to obtain below-Tower complexity as there exists 3-VASS of finite, but almost Tower-big reachability sets (concretely speaking k-fold exponential, for any k ∈ N). We introduce a novel technique of sandwiching reachability sets between two semilinear sets, which have small representation and behave similarly. This intuitively allows us to deal with big, finite sets in time faster than their size. We also make use of recent results about the form of reachability paths in 2-VASS, sequential cones and other techniques.
The presentation is based on a joint work with Ismael Jecker, Sławomir Lasota and Łukasz Orlikowski.
Session 10: Games
This talk addresses complexity problems in rational verification and synthesis for multi-player games played on weighted graphs, where the objective of each player is to minimize the cost of reaching a specific set of target vertices. In these games, one player, referred to as the system, declares his strategy upfront. The other players, composing the environment, then rationally make their moves according to their objectives. The rational behavior of these responding players is captured through two models: they opt for strategies that either represent a Nash equilibrium or lead to a play with a Pareto-optimal cost tuple. This is joint work with Véronique Bruyère et Jean-François Raskin, part of a paper that will be published in CONCUR 2024.
We study multiplayer reachability games played on a finite graph. However, instead of studying the classical notion of strategy, we consider the concept of multi-strategy. A multi-strategy for Player i prescribes a set of possible actions when it is Player i's turn to play, instead of a single action. Thus, once a multi-strategy is fixed for each player, there are several paths in the game graph that are consistent with these multi-strategies from a given initial vertex. In this setting, we aim at synthesizing the most permissive multi-strategies. The permissiveness of multi-strategies may be compared in different ways. We here extend the concept of penalty of a multi-strategy, already defined in a two-player zero-sum setting in Bouyer at al. [1], to the multiplayer setting. This penalty depends on weights associated with edges not chosen by the multi-strategy, and we prefer a multi-strategy with a penalty as small as possible.
Once the notions of permissive Nash equilibrium and subgame perfect equilibrium are properly defined, our aim is to decide the existence of a permissive equilibrium that satisfies some constraints on the penalties (one upper-bound penalty is fixed per player). Since the existence of such permissive equilibrium does not provide any certainty about the satisfaction of the reachability objective of the players, we also study permissive equilibria with bounded penalties and which satisfy some properties on the set of players who satisfy their objective.
This is a joint work with Benjamin Monmege.
[1] Patricia Bouyer, Marie Duflot, Nicolas Markey, Gabriel Renault: Measuring Permissivity in Finite Games. CONCUR 2009
Sequential decision-making tasks often require satisfaction of multiple, partially-contradictory objectives. Existing approaches are monolithic, where a single policy (a function that chooses an action in each state) fulfills all objectives.
We present auction-based scheduling, a decentralized framework for multi-objective sequential decision making. Each objective is fulfilled using a separate and independent policy. Composition of policies is performed at runtime, where at each step, the policies simultaneously bid from pre-allocated budgets for the privilege of choosing the next action. The framework allows policies to be independently created, modified, and replaced.
We study path planning problems on finite graphs with two temporal objectives and present algorithms to synthesize policies together with bidding policies in a decentralized manner. \t\t We consider three categories of decentralized synthesis problems, parameterized by the assumptions that the policies make on each other. We identify a class of assumptions called assume-admissible for which synthesis is always possible for graphs whose every vertex has at most two outgoing edges. \t\t This is a joint work with Guy Avni and Kaushik Mallik, a version of which has been published at TACAS'24.
We consider two-player games over finite graphs in which both players are restricted by fairness constraints on their moves. Given a two player game graph $G=(V,E)$ and a set of fair moves $E_f\subseteq E$ a player is said to play fair in $G$ if they choose an edge $e\in E_f$ infinitely often whenever the source node of $e$ is visited infinitely often. Otherwise, they play unfair. We equip such games with two $\omega$-regular winning conditions $\alpha$ and $\beta$ deciding the winner of mutually fair and mutually unfair plays, respectively. Whenever one player plays fair and the other plays unfair, the fairly playing player wins the game. The resulting games are called fair $\alpha/\beta games. %Further, if $\alpha$ and $\beta$ are given by a parity condition over $G$ they are called fair parity/parity games.
We formalize fair $\alpha/\beta$ games and show that they are determined. For fair parity/parity games, i.e., fair $\alpha/\beta$ games where $\alpha$ and $\beta$ are given each by a parity condition over $G$, we provide a polynomial reduction to (normal) parity games via a gadget construction inspired by the reduction of stochastic parity games to parity games. We further give a direct symbolic fixpoint algorithm to solve fair parity/parity games. On a conceptual level, we illustrate the translation between the gadget-based reduction and the direct symbolic algorithm which uncovers the underlying similarities of solution algorithms for fair and stochastic parity games, as well as for the recently considered class of fair games in which only one player is restricted by fair moves.
This presentation is based on joint work with Daniel Hausmann, Nir Piterman and Anne-Kathrin Schmuck, published in FoSSaCS 2024: https://link.springer.com/chapter/10.1007/978-3-031-57228-9_2.
Strategy improvement is a natural and well-studied family of algorithms for solving various classes of stochastic and deterministic games. We present an improved upper bound of $O(n 2^n )$ on the number of iterations performed by the most natural, and most greedy, variant of the algorithm when applied to $n$-vertex Energy Games. We also obtain a similar upper bound on the expected number of iterations performed by Random-Edge, one of the most natural randomized variants of the algorithm. The analysis is achieved using the "layering technique", a novel analysis framework that applies for several algorithms for solving deterministic games.
Session 11: Verification, Monitoring, Synthesis
Runtime verification consists in checking whether a program satisfies a given specification by observing the trace it produces during its execution. In the regular setting, Hennessy-Milner logic with recursion (recHML), a variant of the modal $\mu$-calculus, provides a versatile back-end for expressing linear- and branching-time specifications. In this talk, I discuss an extension of this logic that allows to express properties over data values (i.e. values from an infinite domain) and examine which fragments can be verified at runtime. Data values are manipulated through equality tests in modalities and through first-order quantification outside of them. They can also be stored using parameterised recursion variables.
I then examine what kind of properties can be monitored at runtime, depending on the monitor model. A key aspect is that the logic has close links with register automata with non-deterministic reassignment, which yields a monitor synthesis algorithm, and allows to derive impossibility results. In particular, contrary to the regular case, restricting to deterministic monitors strictly reduces the set of monitorable properties.
I then conclude by examining whether we can delineate a maximally monitorable fragment for the logic, and the picture turns out to be quite intriguing.
This is joint work with the MoVeMnt team (Reykjavik University): Luca Aceto, Antonis Achilleos, Duncan Paul Attard, Adrian Francalanza, Karoliina Lehtinen.
We study the policy change problem that arises in the runtime verification of long-running systems. The online monitors typically used in this context are generally $treelike$, in that they maintain substates that monitor subformulae of the target policy. Each of these substates stores a compact representation of the event trace, preserving only the information needed to monitor the corresponding subformula. We consider when and how the policy can be changed while the monitored system is running by only exploiting the information stored in the monitor's state. This is relevant, for example, to account for new system functionality or changes in regulatory requirements.
We formally define the policy change problem in a general setting, independently of any specific (treelike) monitor implementation. We then show that deciding the possibility of policy change for past-time metric temporal logic with continuous semantics is decidable with non-primitive recursive complexity and EXPSPACE-hard, and that the same problem for discrete semantics is EXPSPACE-complete.
Joint work with François Hublet and Joshua Schneider
Event-driven programming is a popular paradigm where multiple handler programs are responsible for executing different tasks. The handlers contain mailboxes used to receive tasks to be executed which are communicated via messages. In addition there may also be shared-memory communication. A central problem in this setting is understanding whether an execution is consistent with the semantics of event driven programs under a particular memory model. Event driven semantics imposes the restriction that messages inside a particular handler cannot be interleaved. We explore the complexity of event driven consistency under the sequential consistency (SC) model for different datastructure semantics of the mailbox such as multiset, queue, stack etc. This is joint work with Parosh Aziz Abdulla, Mohamed Faouzi Atig and Govind Rajanbabu.
We consider the problem of hypersafety verification, i.e. of verifying k-safety properties of a program. While this can, in principle can be addressed by self composition, which reduces the k-safety verification task into a standard (1−) safety verification exercise, verifying self composed programs is not easy. The proofs often require that the functionality of every component program be captured fully, making invariant inference a challenge. Recently, a technique for property directed self composition (or, Pdsc) was proposed to tackle this problem. Pdsc tries to come up with a semantic self-composition function, together with the inductive invariant that is needed to verify the safety of the self-composed program. One of its crucial limitations, however, is that it relies on users to supply a set of predicates in which the composition and the invariant may be expressed. It is quite challenging even for a user to supply such a set of predicates – the set needs to be sufficiently expressive, so that the invariant can be expressed using those predicates (and their boolean combinations), but not overly expressive to increase the search space unnecessarily. This paper proposes a technique to automate Pdsc fully, by discovering new predicates whenever the given set is found to be insufficient. We present three different approaches for obtaining predicates – relying on syntax-guided synthesis, quantifier elimination, and interpolation – and discuss the strengths and limitations of these.
A loop invariant is a relation among variables that holds before and after every iteration of a program loop. Invariants provide inductive arguments that are key for formally verifying recursive programs. Automated generation of reasonable invariants is thus a much-desired step to proving program correctness. To understand what relations can be invariant for a loop with linear updates, we address the inverse problem--finding (synthesising) linear loops that satisfy given invariants. Loops synthesised modulo invariants are correct by design and no longer need to be verified.
In this line of work, we consider invariants specified by polynomial and, more specifically, quadratic equations with arbitrarily many variables. We show that already loops with linear updates (or linear dynamical systems) exhibit behaviours specified by arbitrary polynomial invariants from a broad class: e.g., quadratic equations or conjunctions of binomial equalities. We introduce algorithmic approaches that construct linear loops from invariants by generating linear recurrence sequences that have given algebraic relations amongst their terms. As an example, we provide a procedure that, given a quadratic equation, decides whether a loop satisfying this equation exists. If the answer is positive, the procedure synthesises a loop and ensures its variables achieve infinitely many different values.
This presentation is based on the results published in the proceedings of STACS 2024. This is joint work with S. Hitarth, George Kenison, and Laura Kovács.
Friday, Sep 20, 2024
In this talk, I will review some open problems for weighted automata over general semirings and discuss the reasons why the usual techniques that have been used to solve similar questions, have not been successfully used for those (yet(?)). The talk is aimed to a broad audience - especially to those who know nothing/not much about weighted automata!
Session 12: Timed Systems
In this talk, based on [1], we present an active learning algorithm for a general class of Mealy machines with timers (MMTs) in a black-box context. A Mealy machine is a finite state machine that outputs a sequence of symbols for every processed input word. We then augment it with timers that force certain transitions to occur after a certain amount of time has elapsed.
Our algorithm is an extension of the L# algorithm of Vaandrager et al. [2] to a timed setting, using a polynomial number of queries to infer an MMT from a teacher, when it is possible to figure out which of the preceding transitions caused a timeout. As shown in a previous work [3], this can be done efficiently for a subclass of MMTs that are "race-avoiding": if a timeout is caused by a preceding input then a slight change in the timing of this input will induce a corresponding change in the timing of the timeout. Experiments with a prototype implementation, written in Rust, show that our algorithm is able to efficiently learn realistic benchmarks.
This is a joint work with Véronique Bruyère (UMONS), Bharat Garhewal (Radboud Universiteit), Guillermo A. Pérez (UAntwerpen), and Frits W. Vaandrager (Radboud Universiteit). This work has been partially funded by the Belgian F.R.S-FNRS, the Belgian FWO and the Dutch NWO.
[1]: Véronique Bruyère, Bharat Garhewal, Guillermo A. Pérez, Gaëtan Staquet, and Frits W. Vaandrager. Active Learning of Mealy Machines with Timers. arXiv.
[2]: Frits W. Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann. A new approach for active automata learning based on apartness. TACAS 2022.
[3]: Véronique Bruyère, Guillermo A. Pérez, Gaëtan Staquet, and Frits W. Vaandrager. Automata with timers. FORMATS 2023.
In this work, we revisit the active learning of timed languages recognizable by event-recording automata (ERA). Our framework employs a method known as grey-box learning, which enables the learning of ERA with the minimum number of states. This approach avoids learning the region automaton associated with the language, contrasting with existing methods. However, to achieve the learning of a minimum-state automaton, we must solve an NP-hard optimization problem. To circumvent this, we apply a heuristic, where the requirement for minimality is loosened, that computes a candidate automaton in polynomial time using a greedy algorithm. In our experiments, this algorithm strives to maintain a small (often minimum) automaton size. This is a joint work with Sayan Mukherjee and Jean-François Raskin. This work has been accepted in ATVA 2024.
Learning models for real-time systems is an active area of research. Timed Automata (TA) is a popular model for real-time systems. Learning TA being a difficult problem, several works have been dedicated to learning different subclasses of TA. Event-recording Automata (ERA) is a well-studied, determinizable subclass of TA.
In this talk, we will illustrate a state-merging algorithm -- called LEAP -- designed to learn timed languages definable by ERA. We adapt the classical framework of RPNI -- a passive learning algorithm for inferring deterministic fininite-state automata for regular languages -- to the timed setting. As inputs, we consider a set of positive and a set of negative symbolic words. Symbolic words are a natural formalism for specifying requirements by a human user. The objective is to construct an ERA that accepts every timed word that satisfies a positive symbolic word and rejects every timed word that satisfies a negative symbolic word. We show that, during the execution of the state-merging algorithm, determining when merging of two states is allowed, is an NP-complete problem -- which we solve by encoding it as an SMT formula. We conclude by showing that for every timed language L definable by an ERA, there exists a set of positive and negative symbolic words -- generally referred to as the characteristic set -- such that, whenever these words are provided as input, LEAP constructs an ERA whose language is L.
This is a joint work with Anirban Majumdar and Jean-François Raskin.
Joint work with Benoit Barbot, Catalin Dima and Youssouf Oualhadj.
We consider the synthesis problem on timed automata with Büchi objectives, where delay choices made by a controller are perturbed by a small amount. Usually, the controller needs to avoid punctual guards, such as testing the equality of a clock to a constant. In this work, we extend existing results to a robustness setting that allows for punctual transitions in the automaton to be taken by controller with no perturbation. In order to characterize cycles that resist perturbations in our setting, we introduce a new structural requirement on the reachability relation along an accepting cycle of the automaton. This property is formulated on the region abstraction, and generalizes previous characterization of winning cycles in the absence of punctual guards. We show that the problem remains within PSPACE despite the presence of punctual guards.
Timing leaks in timed automata (TA) can occur whenever an attacker is able to deduce a secret by observing some timed behavior. In execution-time opacity, the attacker aims at deducing whether a private location was visited, by observing only the execution time. It can be decided whether a TA is opaque in this setting. In this work, we tackle control, and show that we are able to decide whether a TA can be controlled at runtime to ensure opacity. Our method is constructive, in the sense that we can exhibit such a controller. This is joint work with Etienne André, Marie Duflot and Engel Lefaucheux.
Model checking for real-time systems is a fundamental problem in formal verification. The goal here is to check whether a system (modelled as a network of automata) satisfies a specification (provided by a logical formula). In the timed setting, Timed Automata serve as a well-established model for real-time systems, with Metric Interval Temporal Logic (MITL) being the standard formalism for specifying properties. One of the central challenges here is an efficient logic-to-automata translation for Metric Interval Temporal Logic (MITL).
Recently, we proposed a new model, called Generalized Timed Automata (GTA), that unifies the features of various models such as timed automata, event-clock automata, and automata with timers. The model comes with several powerful additional features, and yet, the best known zone-based reachability algorithms for timed automata have been extended to the GTA model, with the same complexity for all the zone operations.
In this work, we first propose a logic-to-automata translation from MITL to GTA. Our modular translation leverages the powerful features of GTA to generate automata with fewer states, transitions, and clocks. Further, since the various formalisms used for modelling real-time systems are also captured by GTAs, thanks to this translation, MITL model checking reduces to checking liveness for GTAs. However, no liveness algorithm is known for GTAs. The liveness algorithms for timed automata crucially rely on the presence of a finite time-abstract bisimulation, while such a relation is known not to exist for GTAs. As our second contribution, we propose the first algorithm for checking Büchi non-emptiness in GTAs, which circumvents this fundamental challenge.
Joint work with S Akshay, Paul Gastin and B Srivathsan.
In 2009, Cassez showed that the timed opacity problem, where an attacker can observe some actions with their timestamps and attempts to deduce information, is undecidable for timed automata. Moreover, he showed that the undecidability holds even for subclasses such as event-recording automata. In this talk based on an article written by Étienne André, Sarah Dépernet and Engel Lefaucheux, we consider the same definition of opacity for several other subclasses of timed automata: with restrictions on the number of clocks, of actions, on the nature of time, on a new subclass called observable event-recording automata, or on the number of observations made by the attacker. We show that opacity can mostly be retrieved, except for the notable subclass of one-action timed automata, for which undecidability remains.
Session 13: Logic & Arithmetic
In a recent work, we considered a version of Presburger arithmetic enriched by special variables, called parameters, which can be multiplied as if they were constants, but cannot be quantified in the same way as normal variables. Emulating the notion of semi-linear sets from Presburger arithmetic, we defined parametric semi-linear sets, functions which, given a valuation of the parameters, produce a semi-linear set. Given two parametric semi-linear set, our main problem was to decide whether there exists parameter valuations such that the two produced sets are equal. We showed that, unfortunately, this test is undecidable in general, as the parameters can be used to encode any multivariate polynomial. Moreover this undecidability holds even with strong restrictions on the degree of the polynomials in the parameters involved in the formula. Finally, we showed that under the additional restriction of using a single parameter, one can compute the set of parameter values achieving equality of the two sets.
At Highlights, we intend to present this result through an application to the opacity of one-clock timed automata.
All known quantifier elimination procedures for Presburger arithmetic require doubly exponential time for eliminating a single block of existentially quantified variables. It has even been claimed in the literature that this upper bound is tight. We observe that this claim is incorrect and develop, as the main result of this paper, a quantifier elimination procedure eliminating a block of existentially quantified variables in singly exponential time. As corollaries, we can establish the precise complexity of numerous problems. Examples include deciding (I) monadic decomposability for existential formulas, (ii) whether an existential formula defines a well-quasi ordering or, more generally, (iii) certain formulas of Presburger arithmetic with Ramsey quantifiers. Moreover, despite the exponential blowup, our procedure shows that under mild assumptions, even $\mathsf{NP}$ upper bounds for decision problems about quantifier-free formulas can be transferred to existential formulas. The technical basis of our results is a kind of small model property for parametric integer programming that generalizes the seminal results by von zur Gathen and Sieveking on small integer points in convex polytopes.
Accepted in ICALP 2024,
Arxiv Link: https://arxiv.org/abs/2405.01183
Authors: Christoph Haase$^1$, Shankara Narayanan Krishna $^2$, Khushraj Madnani $^3$, Om Swostik Mishra $^4$, Georg Zetzsche $^3$
$^1$ Department of Computer Science, University of Oxford, UK
$^2$ Department of Computer Science & Engineering, IIT Bombay, India
$^3$ Max Planck Institute for Software Systems (MPI-SWS), Germany
$^4$ Department of Mathematics, IIT Bombay, India
We present a new angle on solving quantified linear integer arithmetic based on combining the automata-based approach, where numbers are understood as bitvectors, with ideas from (nowadays prevalent) algebraic approaches, which work directly with numbers. This combination is enabled by a fine-grained version of the duality between automata and arithmetic formulae. In particular, we employ a construction where states of automaton are obtained as derivatives of arithmetic formulae: then every state corresponds to a formula. Optimizations based on techniques and ideas transferred from the world of algebraic methods are used on thousands of automata states, which dramatically amplifies their effect. The merit of this combination of automata with algebraic methods is demonstrated by our prototype implementation being competitive to and even superior to state-of-the-art SMT solvers.
This is a joint work with Peter Habermehl, Vojtěch Havlena, Lukáš Holík, and Ondřej Lengál that appeared at CAV'24.
We demonstrate that the model-checking problem for first-order logic with modulo counting quantification (FO+Mod) is fixed-parameter tractable on monadically stable classes of graphs. These classes extend the concepts of nowhere dense and structurally nowhere dense classes. A significant contribution of our work is the characterization of FO+Mod, as well as the more general counting logic FO(P), through an Ehrenfeucht-Fraïssé game, which we believe holds independent interest. By carefully adapting the techniques developed by Dreier, Mählmann, and Siebertz (2023), we are working towards achieving the desired results.
This is ongoing work, joint with Peter Rossmanith.
In the logical framework introduced by Grohe and Turán (TOCS 2004) for Boolean classification problems, the instances to classify are tuples from a logical structure, and Boolean classifiers are described by parametric models based on logical formulas. This is a specific scenario for supervised passive learning, where classifiers should be learned based on labelled examples. Existing results in this scenario focus on Boolean classification. This paper presents learnability results beyond Boolean classification. We focus on multiclass classification problems where the task is to assign input tuples to arbitrary integers. To represent such integer-valued classifiers, we use aggregate queries specified by an extension of first-order logic with counting terms called FOC1.
Our main result shows the following: given a database of polylogarithmic degree, within quasi-linear time, we can build an index structure that makes it possible to learn FOC1-definable integer-valued classifiers in time polylogarithmic in the size of the database and polynomial in the number of training examples.
This is joint work with Nicole Schweikardt.
Session 14: Quantitative Systems
Cost register automata (CRAs), like weighted automata, define functions of type $\Sigma^* \to \mathbb{K}$ for some semiring $\mathbb{K} = (K, \oplus, \otimes)$. They can be thought of as finite automata with an additional finite set $X$ of write-only registers holding values from $\mathbb{K}$: register values can be combined and updated with the operations $\oplus$ and $\otimes$, but no value tests are permitted.
Compared to weighted automata, variants of CRAs draw a more nuanced undecidability frontier. In particular, linear CRAs (deterministic) are equally expressive as WAs (inherently nondeterministic), and many other natural classes of CRA are incomparable with known variants of weighted automata.
The class of copyless linear CRAs over the min–plus semiring is a severely restricted yet still remarkably expressive one. Decidability of some of its standard decision problems has been established, a notable exception being the boundedness problem: "is the function given by a CRA bounded from above?". We show it decidable for the class of two-register copyless linear CRAs.
In my presentation, following an introduction to variants of CRAs, I will provide an overview of the challenges associated with boundedness and present the main ideas of our decidability proof.
Authors: Andrei Draghici, Radosław Piórkowski, Andrew Ryzhikov
We study the determinisation and unambiguisation problems of weighted automata over the field of rationals: Given a weighted automaton, can we determine whether there exists an equivalent deterministic, respectively unambiguous, weighted automaton? Recent results by Bell and Smertnig show that the problem is decidable, however they do not provide any complexity bounds. We show that both problems are in PSPACE for polynomially-ambiguous weighted automata.
Joint work with Ismaël Jecker and Filip Mazowiecki.
Safety and liveness are fundamental concepts in computer-aided verification. The safety-liveness classification of Boolean properties characterizes whether a given property can be falsified by observing a finite prefix of an infinite computation trace (always for safety, never for liveness). In the quantitative setting, properties are arbitrary functions from infinite words to partially-ordered domains.
First, we define quantitative safety and liveness, and prove that our definitions induce conservative quantitative generalizations of both the safety-progress hierarchy and the safety-liveness decomposition of Boolean properties. Consequently, like their Boolean counterparts, quantitative properties can be written as the pointwise minimum of a quantitative safety property and a quantitative liveness property. This work has been published in FoSSaCS 2023 Proceedings.
Second, we instantiate our framework with the classes of quantitative properties expressed by automata. These quantitative automata contain finitely many states and rational-valued transition weights, and their common value functions Inf, Sup, LimInf, LimSup, LimInfAvg, LimSupAvg, and DSum map infinite words into the totally-ordered domain of real numbers. In this automata-theoretic setting, we establish a connection between quantitative safety and topological continuity and provide alternative characterizations of quantitative safety and liveness in terms of their boolean analogs. For all common value functions, we provide a procedure for deciding whether a given automaton is safe or live, we show how to construct its safety closure, and we present a decomposition into safe and live automata. This work has been published in CONCUR 2023 Proceedings.
Third, we provide a program that implements the above decide procedures and constructions for Inf, Sup, LimInf, LimSup, LimInfAvg, and LimSupAvg quantitative automata given in an input text file. The performance of our tool is based on a quantitative generalization of the omega-regular language inclusion checker FORKLIFT (published in CAV 2022 Proceedings). This tool is currently unpublished.
Ésik and Kuich extended the theory of weighted formal languages to infinite words in 2005. There, automata are matrices over a semiring and Büchi acceptance is an operation over those matrices. We are extending the theory to other acceptance conditions (Muller, generalized Büchi, etc.) and face several challenges. Mainly, the former operator allows for a disjunction of constraints but now we need conjunction. I will briefly introduce the theory and give an overview of the challenges. Joint work with Uli Fahrenberg.
[see the attached PDF for the same abstract + references]
We study weighted basic parallel processes (WBPP), a nonlinear recursive generalisation of Schutzenberger's weighted finite automata inspired from process algebra and Petri net theory. Formally, they can be defined as communication free Petri nets, which means that each transition can remove a token from exactly one place; tokens can be added without restrictions. Each transition is labelled with a weight from a field, say the rationals $\mathbb Q$. The weight of a run is the product of the weights of the transitions occurring in the run; the weight of an input word is the sum of the weights of the runs accepting the word. (Alternatively, a WBPP can be seen as a weighted context-free grammar where the nonterminal symbols commute with each other.) Like BPP recognise languages $\subseteq \Sigma^$, WBPP recognise series $\Sigma^ \to \mathbb Q$. The WBPP-recognisable series form an interesting class of series generalising the rational series (those recognised by weighted finite automata). For example, 1) they are effectively closed under many operations such as scalar product, sum, shuffle, shuffle-inverse, and derivation; 2) a version of the implicit functions theorem applies to them, which provides a very general way to construct WBPP series from previous ones; 3) the commutative subclass of WBPP series (i.e., those for which the output weight does not depend on the order of the input symbols) corresponds to multivariate constructible differentially finite power series, introduced by Bergeron, Reutenauer, and Sattler in the context of combinatorial enumeration.
Our main result is an algorithm of 2EXPSPACE complexity for the WBPP equivalence problem. While (unweighted) BPP language equivalence is undecidable, we can use this algorithm decide multiplicity equivalence of BPP and language equivalence of unambiguous BPP, with the same complexity. Decidability of the latter problems was not known before. The corresponding problems for weighted context-free grammars are open since a long time.
The complexity analysis of the equivalence algorithm is based on effective bounds from algebraic geometry. More precisely, we study the length of chains of polynomial ideals constructed by repeated application of finitely many, not necessarily commuting derivations of a multivariate polynomial ring. This is obtained by generalising a result of Novikov and Yakovenko in the case of a single derivation. The fact that we obtain an elementary bound is noteworthy, since generic bounds on ideal chains are only non-primitive recursive in general.
There are numerous algorithms which try to enumerate weighted context free grammars as fast as possible as it as many applications such as program synthesis. The weights can be seen as costs of using a specific derivation rule in the grammar. The goal is to enumerate trees described by the grammar in non decreasing order, that is from the least cost to the highest cost. Note that grammars describe infinite numbers of programs. Delay is the complexity of producing the next program between the $i$th and the $i$th$+1$, parameters of the grammar are considered constant. Previous algorithms currently run in delay $O(log(i))$. We will briefly present the problem and then show the fundamental property that enable enumeration with delay $O(1)$. Then we will give a brief overview of the enumeration algorithm and we will show the practical applications.
This is joint work with Nathanaël Fijalkow, Guillaume Lagarde.
In 1909 Borel defined normal bit sequences, i.e., infinite bit sequences where all $k$-bit substrings appear with the same limit frequency $2^{-k}$ (for all $k$). This was the first definition of ``randomness'' for an individual object (bit sequence). Later other definitions were suggested by Mises, Church and others. In 1960s the notion of randomness was systematically studied in the framework of algorithmic information theory (Kolmogorov, Solomonoff, Levin, Schnorr, Chaitin and others), and stronger notions of randomness (and the notion of algorithmic dimension) were discovered.
Independently different results about normal sequences were established, e.g., Wall's theorem (the set of reals with normal binary representations is closed with respect to multiplication by rational factors), characterization of normality in terms of gambling (Agafonov, Schnorr), relation to finite-state compressibility (Lempel, Ziv and others).
In recent years it became clear that these results are different facets of finite-state algorithmic information theory. It turned out that basic notions of algorithmic information theory (complexity, a priori probability, dimension, conditional complexity) have some finite-state counterparts, and the results about normality. as well as some new characterization of normal sequences and finite-state dimension (the latter notion was introduced by Dai, Lathrop, Lutz and Mayordomo) become easy byproducts of this framework.
In the talk I will advertise this framework and related results (see details and references in https://arxiv.org/pdf/1701.09060, https://arxiv.org/abs/2403.01534)
Session 15: Automata on Infinite Structures
We define the class of explorable automata on finite or infinite words. This is a generalization of History-Deterministic (HD) automata, where this time non-deterministic choices can be resolved by building finitely many simultaneous runs instead of just one. We show that recognizing HD parity automata of fixed index among explorable ones is in PTIME, thereby giving a strong link between the two notions. We then show that recognizing explorable automata is EXPTIME-complete, in the case of finite words or parity automata up to index [0,2]. Additionally, we define the notion of $\omega$-explorable automata on infinite words, where countably many runs can be used to resolve the non-deterministic choices. We show EXPTIME-completeness for $\omega$-explorability of automata on infinite words for the safety and co-B\"uchi acceptance conditions. We finally characterize the expressivity of ($\omega$-)explorable automata with respect to the parity index hierarchy. We leave open the decidability of explorability for $[1,3]$-automata and $\omega$-explorability for Büchi automata, both equivalent to the general case of arbitrary acceptance conditions.
Tight automata are useful in providing the shortest counterexample in LTL model checking and also in constructing a maximally satisfying strategy in LTL strategy synthesis. There exists a translation of LTL formulas to tight Büchi automata and several translations of Büchi automata to equivalent tight Büchi automata. This paper presents another translation of Büchi automata to equivalent tight Büchi automata. The translation is designed to produce smaller tight automata and it asymptotically improves the best-known upper bound on the size of a tight Büchi automaton equivalent to a given Büchi automaton. We also provide a lower bound, which is more precise than the previously known one. Further, we show that automata reduction methods based on quotienting preserve tightness. Our translation was implemented in a tool called Tightener. Experimental evaluation shows that Tightener usually produces smaller tight automata than the translation from LTL to tight automata known as CGH.
The talk is based on a paper presented at FoSSaCS 2024 and co-authored by Marek Jankola.
In the literature on automata over infinite words, the acceptance of a run is usually defined based on the set of states visited infinitely often. Another option is to define it in terms of the set of transitions visited infinitely often. In this talk, I will survey a number of results that show the striking difference between the two models, and provide arguments in favour of using transitions as the standard acceptance method. (Talk based on the last chapter of the author's thesis: https://theses.hal.science/tel-04314678v1).
We present a translation from linear temporal logic with past to deterministic Rabin automata. The translation is direct in the sense that it does not rely on intermediate non-deterministic automata, and asymptotically optimal, resulting in Rabin automata of doubly exponential size. It is based on two main notions. One is that it is possible to encode the history contained in the prefix of a word, as relevant for the formula under consideration, by performing simple rewrites of the formula itself. As a consequence, a formula involving past operators can (through such rewrites, which involve alternating between weak and strong versions of past operators in the formula’s syntax tree) be correctly evaluated at an arbitrary point in the future without requiring backtracking through the word. The other is that this allows us to generalize to linear temporal logic with past the result that the language of a pure-future formula can be decomposed into a Boolean combination of simpler languages, for which deterministic automata with simple acceptance conditions are easily constructed.
This is joint work with Shaun Azzopardi and Nir Piterman, to appear in MFCS'24. The full version of the conference paper is available at https://arxiv.org/abs/2405.01178
We give a passive learning algorithm for languages recognized by history-deterministic co-Buchi automata. The algorithm can learn every language in this class in the limit, works in polynomial time with respect to a given sample, and for every language in the class, there is a characteristic sample of size polynomial in the minimal size of an automaton for the language. This is the first algorithm of this kind for any class of omega-languages, except for safety languages, that does not impose some additional restrictions on the congruences of the language or on the shape of automata recognizing the languages.
A nondeterministic parity automaton is history-deterministic if the nondeterminism while reading a word can be resolved based on the prefix read so far. We show that 2-token games characterise history-determinism on parity automata whose priorities are in [0,2] (following the min-even parity condition). This extends the 2-token game characterisation of history-determinism for Büchi automata (Bagnol, Kuperberg; 2018) and coBüchi automata (Boker, Kuperberg, Lehtinen, Skrzypczak; 2020), and results in a polynomial time procedure to recognise [0,2]-parity automata. Based on ongoing work with Karoliina Lehtinen.
Session 16: Logic & Data
We study the problem of minimizing conjunctive regular path queries: given a query $\gamma$ and a natural number $k$, is there a query which is semantically equivalent to $\gamma$ with at most $k$ atoms? The number of possible candidates (CRPQs with at most $k$ atoms) is not finite, since the regular languages used in such a query can be arbitrarily complex, and so the decidability of the problem is not trivial. We show that the problem is in 2ExpSpace and is ExpSpace-hard.
Moreover, we study stability results: a class of queries $\mathcal{C}$ is said to be stable when for every query $\gamma$ in $\mathcal{C}$, if $\gamma$ is not minimal, then there is a strictly smaller query $\gamma'$ in $\mathcal{C}$ which is equivalent to $\gamma$. Amongst other, we show that DAG-shaped queries are stable.
This is joint work with Diego Figueira and Miguel Romero.
A query model for sequence data was introduced in Kleest-Meißner et al. (Proc. ICDT 2022) in the form of subsequence-queries with wildcards and gap-size constraints (swg-queries, for short). These queries consist of a pattern $s$ over an alphabet of variables and types, as well as a global window size $w$ and a number of local gap-size constraints $c$. Intuitively, such queries describe situations of interest (e.g. abnormal job execution) within a certain range in historic event data in form of a query string and different window size constraints.
We propose two new extensions of swg-queries which both enrich the expressive power of swg-queries in different ways: disjunctive subsequence-queries (dswg-queries, for short) and subsequence-queries with generalised gap-size constraints (swgg-queries, for short). This way, we close up on the set of supported operators in the field of Complex Event Recognition (CER), which usually are sequencing, conjunction and disjunction, Kleene closure, negation, different kinds of window constraints, and variables which may be bound to events in a stream.
For dswg-queries, the pattern $s$ may consist of finite sets of types, modelling disjunctions, while for swgg-queries, $s$ is a sequence of single types and variables. In return the local gap-size constraints are extended to a set of tuples of form $(c^-,c^+)_i^r$ for swgg-queries, where $(c^-,c^+)\in (\mathbb{N} \times \mathbb{N} \cup \{\infty\})$ and $1\leq i,r \in\mathbb{N}$. They define lower and upper bounds on the length of a number of corresponding gaps of the subsequence, where $i$ and $r$ denote the starting position and range of consecutive gaps they are ranging over. In the case of swg- and dswg-queries such constraints can only be defined between consecutive positions of $s$.
A query $q$ matches in a sequence of types $t$ if the variables in $s$ can uniformly be substituted by types and, in case of a dswg-query, each disjunction $\chi$ in $s$ can be mapped to a single type $\gamma\in\chi$, such that the resulting string occurs in $t$ as a subsequence that spans an area of length at most $w$, and for each gap-constraint in $c$, the gap of the corresponding subsequence (i.e., the distance between the $i^{\text{th}}$ and $(i{+}1)^{\text{th}}$ or $(i{+}r)^{\text{th}}$, respectively, position of the subsequence) satifies the lower and upper bound defined by the constraint.
We discuss a suitable characterisation of containment, a classical property considered in database theory. Additionally, we adapt results concerning the discovery of swg-queries to both, swgg-queries and dswg-queries. This allows for the discovery of more detailed queries thanks to the increased expressive power of the underlying query language.
The original research was published in Frochaux, Kleest-Meißner, ``Puzzling over Subsequence-Query Extensions: Disjunction and Generalised Gaps'', Alberto Mendelzon International Workshop on Foundations of Data Management (AMW), 2023.
We study Satisfiability Modulo Theories (SMT) enriched with the so-called Ramsey quantifiers, which assert the existence of cliques (complete graphs) in the graph induced by some formulas. The extended framework is known to have applications in proving program termination (in particular, whether a transitive binary predicate is well-founded), and monadic decomposability of SMT formulas. Our main result is a new algorithm for eliminating Ramsey quantifiers from three common SMT theories: Linear Integer Arithmetic (LIA), Linear Real Arithmetic (LRA), and Linear Integer Real Arithmetic (LIRA). In particular, if we work only with existentially quantified formulas, then our algorithm runs in polynomial time and produces a formula of linear size. One immediate consequence is that checking well-foundedness of a given formula in the aforementioned theory defining a transitive predicate can be straightforwardly handled by highly optimized SMT-solvers. We show also how this provides a uniform semi-algorithm for verifying termination and liveness with completeness guarantee (in fact, with an optimal computational complexity) for several well-known classes of infinite-state systems, which include succinct timed systems, one-counter systems, and monotonic counter systems. Another immediate consequence is a solution to an open problem on checking monadic decomposability of a given relation in quantifier-free fragments of LRA and LIRA, which is an important problem in automated reasoning and constraint databases. Our result immediately implies decidability of this problem with an optimal complexity (coNP-complete) and enables exploitation of SMT-solvers. It also provides a termination guarantee for the generic monadic decomposition algorithm of Veanes et al. for LIA, LRA, and LIRA. We report encouraging experimental results on a prototype implementation of our algorithms on micro-benchmarks. This is joint work with Moses Ganardi, Anthony W. Lin, and Georg Zetzsche.
The past decade has witnessed substantial developments in string solving. Motivated by the complexity of existing string solving strategies, we propose as a simple but effective and generic method for solving string constraints: regular constraint propagation. This method repeatedly computes pre- or post-images of regular languages under the string functions present in a string formula, inferring more and more knowledge about the possible values of string variables, until either a conflict is found or satisfiability of the string formula can be concluded. Such a propagation strategy is applicable to string constraints with multiple operations like concatenation, replace, and almost all flavors of string transductions. We demonstrate the generality and effectiveness of the method theoretically and experimentally. On the theoretical side, we show that the method is sound and complete for a large fragment of string constraints, subsuming both straight-line and chain-free constraints, which are the hitherto largest decidable fragments of string constraints supported by string solvers. One common reason for the unsatisiability of a string constraint is due to the interaction between string operations and the regular constraints. Consider, for example, the string constraint $\varphi := x = yy \wedge x \in a^+b^+ \wedge y \in (a^+|b^+)$, which is unsatisfiable because the right hand side of the equation admits only strings of the form $a^+$ or of the form $b^+$, which are not admitted by the left hand side. One simple yet effective method in proving such an unsatisfiability is regular constraint propagation. Essentially, this means propagating regular constraints between the left-hand side and right-hand side of string equations, which could contain various string functions including concatenation, replaceAll, etc. This idea has been formalized in a proof system that is implemented in the OSTRICH string solver, the 2023 winner of the SMT-COMP QF\_S category of pure string constraints. In addition to concatenation and regular constraints, OSTRICH has a good support of various complex string functions, including replaceAll, transducers, reverse, and string functions from JavaScript library related to real-world regular expressions that are not immediately available in most other string solvers.
In this presentation, we demonstrate that even simple strategies can solve a wide range of benchmarks in string solving. We build upon a sound proof system established in previous work, highlighting that the main challenge in such a system is the search for a proof. Joint work with Matthew Hague, Artur Jeż, Anthony W. Lin and Philipp Rümmer
Session 17: Neural Networks
The behavior of neural networks (NNs) on previously unseen types of data (out-of-distribution or OOD) is typically unpredictable. This can be dangerous if the network's output is used for decision making in a safety-critical system. Hence, detecting that an input is OOD is crucial for the safe application of the NN. Verification approaches do not scale to practical NNs, making runtime monitoring more appealing for practical use. While various monitors have been suggested recently, their optimization for a given problem, as well as comparison with each other and reproduction of results, remain challenging.
\t
We present a tool for users and developers of NN monitors. It allows for
(i) application of various types of monitors from the literature to a given input NN,
(ii) optimization of the monitor's hyperparameters, and
(iii) experimental evaluation and comparison to other approaches.
Besides, it facilitates the development of new monitoring approaches. We demonstrate the tool's usability on several use cases of different types of users as well as on a case study comparing different approaches from recent literature.
This joint work with Muqsit Azeem, Marta Grobelna, Sudeep Kanav, Jan Kretinsky, and Stefanie Mohr was accepted at CAV24 as a tool paper.
We lay the foundations for a database-inspired approach to interpreting and understanding neural network models by querying them using declarative languages. Towards this end we study different query languages, based on first-order logic, that mainly differ in their access to the neural network model. First-order logic over the reals naturally yields a language which views the network as a black box; only the input-output function defined by the network can be queried. This is essentially the approach of constraint query languages. On the other hand, a white-box language can be obtained by viewing the network as a weighted graph, and extending first-order logic with summation over weight terms. The latter approach is essentially an abstraction of SQL. In general, the two approaches are incomparable in expressive power, as we will show. Under natural circumstances, however, the white-box approach can subsume the black-box approach; this is our main result. We prove the result concretely for linear constraint queries over real functions definable by feedforward neural networks with a fixed number of hidden layers and piecewise linear activation functions.
This talk is based on joint work with Martin Grohe, Juno Steegmans and Jan Van den Bussche which will be presented at ICDT'25.
Distance invariant machine learning models have found wide applications in fields such as image recognition, molecular property prediction, and natural language processing in recent years. A crucial requirement for these models is their independence from the particular ordering of data points or their specific positions in the Euclidean space. Instead, they should rely on the relative information between data points, that is, the distances between them in the Euclidean space. In other words, these models should be invariant to isometries -- translations, rotations, and symmetries. Such models are often based on ($k$-dimensional) graph neural networks ($k$-GNNs) applied to the distance graph induced by a given point cloud. The expressive power of $k$-GNNs has been shown to be limited by the $k$-dimensional Weisfeiler-Leman ($k$-WL) test. Our focus is on the expressivity of these models, specifically addressing the following question for different correspondences of $k$ and $d$: ``Does $k$-WL recognise all point clouds up to isometry in $\mathbb R^d$?'' The question was answered negatively by Pozdnyakov and Ceriotti (Machine Learning: Science and Technology, 2022) for $k = 1$ and $d = 3$. Furthermore, Rose et al. (NeurIPS 2023) found that $d$-WL identifies all point clouds up to isometry in $\mathbb R^{d+1}$ (for any $d \geq 1$). We show that their result is asymptotically tight by finding two non-isometric point clouds in $\mathbb R^{O(d)}$ that are not distinguished by $d$-WL. In the talk, we present the current state-of-the-art, discuss the basic idea of our construction, and formulate open questions for future research.
Formal language theory has recently been successfully employed to unravel the power of transformer encoders. This setting is primarily applicable in Natural Languange Processing (NLP), as a token embedding function (where a bounded number of tokens is admitted) is first applied before feeding the input to the transformer. On certain kinds of data (e.g. time series), we want our transformers to be able to handle arbitrary input sequences of numbers (or tuples thereof) without a priori limiting the values of these numbers. Here, we initiate the study of the expressive power of transformer encoders on sequences of data (i.e. tuples of numbers). Our results indicate an increase in expressive power of hard attention transformers over data sequences, in stark contrast to the case of strings. In particular, we prove that Unique Hard Attention Transformers (UHAT) over inputs as data sequences no longer lie within the circuit complexity class AC$^0$ (even without positional encodings), unlike the case of string inputs, but are still within the complexity class TC$^0$ (even with positional encodings). Over strings, UHAT without positional encodings capture only regular languages. In contrast, we show that over data sequences UHAT can capture non-regular properties. Finally, we show that UHAT capture languages definable in an extension of linear temporal logic with unary numeric predicates and arithmetics.
This talk is based on joint work with Pascal Bergsträßer, Anthony Widjaja Lin, and Georg Zetzsche. A preprint of this work can be found on arXiv (ID: 2405.16166).
If you find any issues related to how the abstract of your talk is presented, please contact the webmaster. The abstract is expected to be valid markdown, and it is rendered as such. We allow for inline math which is rendered via mathjax (see here for detailed information on compatibility).