Assurance-based Learning-Enabled CPS

Cyber-Physical Systems (CPS) increasingly use Learning-Enabled Components (LECs) to implement complex functions. An LEC is a component (typically, but not exclusively, implemented in software) that is realized with the help of data-driven techniques such as machine learning. For example, an LEC in an autonomous car can implement a lane-following function: a convolutional neural network is trained on a stream of road images as input and the observed steering actions of a human driver as output. The claim is that such an LEC, built via supervised learning, is easier to implement than a very complex, image-processing-driven control system that steers the car along the road. In other words, when straightforward design and engineering is too difficult, a neural network can do the job, given a sufficient amount of training.

For high-consequence systems the challenge is to prove that the resulting system is safe (it does no harm) and live (it accomplishes its goals). Safety is perhaps the foremost problem for autonomous vehicles, especially those that operate in a less regulated environment such as the public road network. The traditional technology for proving the safety of systems is based on extensively documented but often informal arguments, which are very hard to apply to CPS with LECs.

The team at Vanderbilt Institute for Software Integrated Systems is working on addressing these challenges by

  • Developing new formal verification techniques that can prove properties of the “learned” component,
  • Designing assurance-monitoring technology that indicates when an LEC is not performing well,
  • Formalizing the safety case argumentation process so that it can be dynamically evaluated at runtime,
  • Providing engineering tools to help the system designers and integrators build CPS with LEC with high assurance.

Overview

Research Areas

Assurance Monitors

Assurance monitoring develops mechanisms that help ascertain the confidence in, and suitability of, a Learning-Enabled Component while it is used within a CPS such as an autonomous car. Such mechanisms are required because online operating conditions may differ from the training distribution, so the low average error achieved at training time is not necessarily a good measure of the correctness of the learning-enabled controller in operation. Our team has developed a number of assurance monitors that can be chosen depending on the architecture of the LEC and the learning approach being used.
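To make the idea concrete, the following is a minimal, hypothetical sketch of an assurance monitor that flags inputs whose feature statistics fall outside the range seen during training. The class name, the per-feature z-score test, and the tolerance are illustrative assumptions, not the project's actual monitor designs (which depend on the LEC architecture and learning approach, as noted above).

```python
class StatisticalAssuranceMonitor:
    """Toy assurance monitor: flags inputs far from the training distribution.

    Illustrative sketch only -- real monitors for LECs are more sophisticated.
    """

    def __init__(self, training_inputs, tolerance=3.0):
        # Record per-feature mean and standard deviation from training data.
        n = len(training_inputs)
        dims = len(training_inputs[0])
        self.means = [sum(x[d] for x in training_inputs) / n for d in range(dims)]
        self.stds = []
        for d in range(dims):
            var = sum((x[d] - self.means[d]) ** 2 for x in training_inputs) / n
            self.stds.append(var ** 0.5 or 1e-9)  # avoid zero std
        self.tolerance = tolerance

    def check(self, x):
        # True if every feature lies within `tolerance` standard deviations
        # of the training mean, i.e. the LEC is (roughly) operating in the
        # conditions it was trained for; False signals low confidence.
        return all(
            abs(x[d] - self.means[d]) <= self.tolerance * self.stds[d]
            for d in range(len(x))
        )


monitor = StatisticalAssuranceMonitor([[0.0, 1.0], [0.2, 0.9], [0.1, 1.1]])
monitor.check([0.1, 1.0])  # near the training data -> True
monitor.check([5.0, 1.0])  # far outside the training range -> False
```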

Dynamic Assurance

Dynamic assurance develops mechanisms that estimate the risk of operating the CPS in a given environment and scenario. Dynamic assurance methods take as input the current system state, the outputs of the assurance monitors, the estimates of fault diagnosers in the system, and an estimate of the environmental state. The approach often builds on static, design-time assurance cases and provides a runtime analysis of whether the design-time guarantees remain valid in the current environmental context. Our work in this area has led to the development of a framework called ReSonAte.
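The following is a hedged, minimal sketch of this idea: a design-time hazard rate is adjusted at runtime by evidence from assurance monitors and fault diagnosers. The function, event names, and multipliers are hypothetical illustrations of the general scheme, not ReSonAte's actual model.

```python
def dynamic_risk(base_hazard_rate, evidence):
    """Scale a design-time hazard rate by runtime evidence.

    `evidence` maps a condition name to a pair
    (condition currently active?, risk multiplier when active).
    Illustrative sketch only.
    """
    rate = base_hazard_rate
    for active, multiplier in evidence.values():
        if active:
            rate *= multiplier
    return min(rate, 1.0)  # clamp: a probability cannot exceed 1


# Example: a collision hazard rate estimated at design time, increased
# because an assurance monitor reports the LEC is out of distribution,
# while the fault diagnoser reports no camera fault.
risk = dynamic_risk(
    base_hazard_rate=1e-4,
    evidence={
        "lec_out_of_distribution": (True, 50.0),   # assurance monitor alarm
        "camera_fault_detected":   (False, 20.0),  # fault diagnoser output
    },
)
# risk == 1e-4 * 50.0 == 5e-3
```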

Toolchain

The Assurance-based Learning-enabled Cyber-Physical Systems (ALC) toolchain is an integrated set of tools and corresponding workflows tailored for model-based development of CPSs that utilize learning-enabled components (LECs). Because machine learning infers relationships from data instead of deriving them from analytical models, many systems employing LECs rely almost entirely on testing results as their primary source of assurance evidence. However, test data alone is generally insufficient for the assurance of safety-critical systems, as testing cannot cover all possible edge cases. The toolchain supports the complete design of learning-enabled CPS using the other techniques developed by our team, including verification, assurance monitoring, and dynamic assurance estimation.

Verification

Verification is concerned with establishing the bounds of safe operation of Learning-Enabled Components. The approach requires theoretical analysis and numerical computation techniques that estimate polygonal over-approximations of the outputs of an LEC, given bounds on its inputs. These methods also contribute to runtime reachable-set computations that calculate the safe operating regions of the controlled autonomous vehicle over a given bounded time interval. Lastly, the methods can be used to estimate the robustness of the LEC, allowing analysis of the risk to the system if the inputs of the LEC are altered.
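A minimal sketch of the underlying idea, using interval bound propagation through a tiny ReLU network: given lower and upper bounds on each input, compute sound bounds that contain every output the LEC can produce. The toy weights are hypothetical, and the project's actual verification tools use more precise set representations than plain intervals; this only illustrates the over-approximation principle.

```python
def interval_affine(lows, highs, weights, biases):
    # Propagate input intervals through y = W x + b, picking the worst-case
    # endpoint of each input interval according to the sign of the weight.
    out_lo, out_hi = [], []
    for row, b in zip(weights, biases):
        lo = hi = b
        for w, l, h in zip(row, lows, highs):
            if w >= 0:
                lo += w * l
                hi += w * h
            else:
                lo += w * h
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_interval(lows, highs):
    # ReLU is monotone, so applying it to the endpoints is sound.
    return [max(0.0, l) for l in lows], [max(0.0, h) for h in highs]


# Toy 2-2-1 ReLU network with hypothetical weights; inputs range over
# [-1, 1] x [0, 0.5].
lo, hi = interval_affine([-1.0, 0.0], [1.0, 0.5],
                         [[1.0, -2.0], [0.5, 1.0]], [0.0, -0.25])
lo, hi = relu_interval(lo, hi)
lo, hi = interval_affine(lo, hi, [[1.0, 1.0]], [0.0])
# [lo[0], hi[0]] == [0.0, 1.75] now bounds every reachable network output.
```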

BlueROV Example

The BlueROV2 Standalone package is a complete, fault-tolerant autonomous underwater software system. It is based on the UUV Simulator and BlueROV2 ROS simulation base packages, extended extensively by the Vanderbilt University Institute for Software Integrated Systems in the DARPA Assured Autonomy project.

Selected Articles and Talks

ICCPS 2020 Presentation

Towards Assurance-based Learning-enabled Cyber-Physical System

SEAMS 2021 Presentation on ReSonAte

People

  • Gabor Karsai, Lead-PI, Vanderbilt University
  • Taylor Johnson, Co-PI, Vanderbilt University
  • Abhishek Dubey, Co-PI, Vanderbilt University
  • Xenofon Koutsoukos, Co-PI, Vanderbilt University
  • Janos Sztipanovits, Co-PI, Vanderbilt University
  • Theodore Bapty, Co-PI, Vanderbilt University
  • Nagabhushan Mahadevan, Lead Research Engineer, Vanderbilt University
  • Shreyas Ramakrishna, Graduate Research Assistant, Vanderbilt University
  • Patrick Musau, Graduate Research Assistant, Vanderbilt University
  • Daniel Stojcsics, Senior Research Engineer, Vanderbilt University
  • Yogesh Barve, Research Scientist, Vanderbilt University
  • Diego Manzanas Lopez, Graduate Research Assistant, Vanderbilt University