Julie Cailler, Philipp Rümmer. October 2023.
With the increasing prevalence and complexity of computer systems, their reliability has become a crucial concern; this is in particular the case for computer systems that implement safety- or security-critical functionality, for instance computer systems embedded in vehicles or airplanes. Formal methods offer a way to build trust in such systems. Program verification, in particular, enables us to demonstrate the absence of bugs in a system or its components. Through rigorous mathematical reasoning, we build a proof of the correctness of a system, affirming that the system performs precisely as intended, with no deviations.
Formal methods rely on techniques for computer-assisted reasoning about logical formulas, an area within Computer Science that has developed rapidly over the last years. Logical reasoning is commonly used in everyday life by individuals, whenever they engage in argumentation or deduction. Reasoning tasks can also be carried out by machines, enabling formal reasoning about problems that are far too complex to be tackled by humans alone. In order to formally verify a system, both the system itself and the desired properties of the system are translated (automatically) to logical formulas, which can then be processed by reasoning tools to establish or falsify the correct behavior of the system. To achieve this, two major approaches are commonly employed: Interactive Reasoning and Automated Reasoning.
Interactive reasoning tools act like assistants, guiding a human user towards a proof while ensuring that no mistakes are introduced in the constructed derivation. Interactive reasoning finds substantial use in mathematics, supporting mathematicians in proving theorems, as well as for verifying critical components in systems with human safety implications. For instance, the B-Method was employed to verify the functionality of Paris Metro Line 14, the seL4 operating system was verified using the Isabelle/HOL proof assistant, and the proof assistant HOL Light was a key tool in settling the long-standing Kepler conjecture in 2017. Latest developments in interactive reasoning were recently presented at the Interactions of Proof Assistants and Mathematics summer school organized jointly by FIDS and the Faculty of Mathematics.
Automated reasoning tools are able to reason fully or partly automatically about logical formulas. Such reasoning tools are also called theorem provers, and are used extensively in areas including program verification and testing, scheduling, and also to solve problems in mathematics. Interactive and automated reasoning tools can be used in combination, for example by outsourcing simpler parts of a proof that is constructed interactively to automated theorem provers. There is a wide range of automated reasoning techniques, but all of them start with a mathematical version of a problem and a given context. Subsequently, they develop the mathematical statement and explore the space of interpretations, trying to find a model (e.g., an interpretation that satisfies the formula) or concluding with its unsatisfiability otherwise.
We illustrate the reasoning applied to verify the correctness of a (drastically simplified) system. Consider the scenario of a crossroad with two traffic lights: one for the cars and one for the pedestrians. These traffic lights are equipped with specifications that describe the interactions between its components. The traffic light can be Green (G), Orange (O), or Red (R), while the pedestrian light has two modes: Walk (W) and Stop (S).
To reason about the system, we need to make certain assumptions (or axioms) about the possible system behaviors. In more real-world scenarios, such assumptions are automatically extracted from the system implementation:
Our objective is to establish that the crossroad is properly managed to avoid accidents. In this model, an interesting property to examine is whether it could ever happen that the traffic light is Green while the pedestrian light is in Walk mode: can G and W be true at the same time, or, in more logical terms, is the formula G ∧ W satisfiable given our assumptions. A theorem prover can evaluate this scenario and prove, step by step, that G ∧ W is an impossibility. Since this proof is machine-checked, it gives high confidence in the correctness of the traffic light implementation.
Virtually, a prover simulates all possible scenarios and demonstrates that the target situation is unreachable in any case due to contradictions. Multiple proofs can be available for the same property: for instance, as depicted in the higher block, if the traffic light is green, then the pedestrian light is in stop mode (G → S). Moreover, stop mode is incompatible with the walk mode (#{S,W} ≤ 1), contradicting the initial statement (¬W). A similar deduction can also be made for the walk mode, allowing us to conclude in two different ways that G ∧ W is unreachable. By reasoning like this on all the features of the crossroad, we can certify that each component works as expected, and thus that the whole system is verified.
Some systems can be easily modeled, whereas others involving complex mechanisms require a higher expressivity level. Luckily, there is not one single type of logic, but rather a multitude, including various different logical constructs (syntax) and different ways to assign meaning to formulas (semantics). Propositional logic, which is the relatively simple logic used in our traffic light example, has the advantage of being decidable; this means that we are able to effectively compute whether a problem has a solution or not. The satisfiability problem in propositional logic is NP-complete, however, which implies that all known algorithms have exponential complexity. Advances in the proof systems, the algorithms, and the data-structures used in implementations over the past years have led to a surprising increase in the performance of theorem provers for propositional logic. Such provers, commonly known as SAT solvers, can today easily reason about formulas encoding industrial systems with millions of propositional variables.
More complex logics, for instance first-order logic or higher-order logic, allow us to express more complex statements (for example, mathematical problems), but proofs are much harder to find. Again, a lot of research effort has been invested in developing efficient automated theorem provers for such logics, and today’s systems are able to handle even very complex problems. In our group in Regensburg, we have in particular worked on methods to parallelize theorem provers, i.e., to make use of modern parallel hardware architectures for speeding up provers.
More generally, new logics emerge or are proposed all the time, leading to research questions such as:
Many applications also require dealing with rules inherent to the domain, called axioms. In our example, there are only a few axioms, but real-world problems may entail thousands of them. To make this situation manageable, certain standard sets of axioms are arranged as theories, corresponding to the data-types or libraries used in programming. A theory encapsulates knowledge from a particular domain (e.g., different kinds of arithmetic, set theory, or arrays) and is governed by specific rules and reasoning mechanisms. The development of dedicated theorem provers that can handle theories, also known as Satisfiability Modulo Theory solvers or SMT solvers, has become a particularly successful branch of research in recent years, and has transformed the formal methods field by making powerful building blocks available from which higher-level tools can easily be constructed.
Our research group in Regensburg has itself developed several SMT solvers tailored to problems in program verification, focusing on some of the most challenging theories in the field: among others, solvers for different variations of the theory of strings, which are related both to the problem of solving word equations over free monoids and to automata theory; solvers for different kinds of arithmetic, including problems in non-linear integer arithmetic and bounded arithmetic; and theories of sequences and arrays. The diversity of theories and logical fragments that can be considered makes SMT a highly interesting area of research, both from a theoretical and practical point of view.
Leonardo Mendonça de Moura, Nikolaj S. Bjørner: Satisfiability modulo theories: introduction and applications. Commun. ACM 54(9): 69-77 (2011)
John Harrison: Handbook of Practical Logic and Automated Reasoning. Cambridge University Press 2009, ISBN 978-0-521-89957-4, pp. I-XIX, 1-681
Interactions of Proof Assistants and Mathematics. International Summer School, Regensburg, Germany, 2023. https://itp-school-2023.github.io/
Raymon M. Smullyan. What is the Name of this Book?: The Riddle of Dracula and Other Logical Puzzles. Prentice-Hall, 1978.
Automated reasoning is a field within the wider area of artificial intelligence. In contrast to the more recent statistical methods in AI, and machine learning in particular, automated reasoning is a symbolic method that can reason about logical or mathematical statements with absolute certainty. Automated reasoning and machine learning seemingly complement each other extremely well, which is why many research groups are today trying to integrate machine learning methods into automated reasoning, or vice versa.
Formal methods are today mainly applied in a niche area of highly critical computer systems. We believe, for several reasons, that it is time for formal methods to leave that niche and become a more mainstream tool in software development. Beyond the reasons given in the article, one further point in case is that more and more computer programs will in the future be machine-generated, leading to the new problem of ensuring that those generated programs indeed behave as expected. In the same way as we should not blindly trust a text generated by ChatGPT, also machine-generated computer programs require careful checking. In contrast to
the case of texts in natural language, however, humans tend to be rather bad at checking program code: machine-generated programs should be machine-checked!
“What Is the Name of This Book?” by Raymond M. Smullyan – A fun way to introduce logical reasoning, the basis of automated reasoning. Through puzzle solving, it gives an insight into how our tools work to deal with problems. This is probably the most recreational and entertaining logic book ever written, and I extend my recommendation to readers of all ages! In the words of Melvin Fitting: “I now introduce Professor Smullyan, who will prove to you that either he doesn’t exist or you don’t exist, but you won’t know which.”