Learning and Control of Hamiltonian System Dynamics

We previously developed a port-Hamiltonian dynamics model, trained as a neural ordinary differential equation (ODE) network and subsequently used for energy-shaping control to achieve stabilization and tracking. We have recently extended this method in two major directions.
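
For reference, the generic input-state-output port-Hamiltonian form underlying these models (the standard textbook form; the papers' parameterizations differ in detail) is

$$
\dot{x} = \big(\mathcal{J}(x) - \mathcal{R}(x)\big)\,\nabla_x H(x) + g(x)\,u,
$$

where $H(x)$ is the total energy (Hamiltonian), $\mathcal{J}(x) = -\mathcal{J}(x)^\top$ is the interconnection matrix, $\mathcal{R}(x) \succeq 0$ models dissipation such as friction, and $g(x)\,u$ is the control input port.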

Figure 1. Quadrotor trajectory tracking using a learned port-Hamiltonian dynamics model.

In work published at ICRA 2024 (Altawaitan et al. 2024), we extended the method to learn robot dynamics directly from sensor observations instead of trajectory data, eliminating the need to estimate the robot states before recovering the robot dynamics. Our approach learns a Hamiltonian dynamics model directly from point-cloud data. We designed an observation-space error function that relates motion predictions from the dynamics model to motion predictions from point-cloud registration, and used it to train a Hamiltonian neural ODE. The learned Hamiltonian model enables the design of an energy-shaping, model-based tracking controller for rigid-body robots. We demonstrated dynamics learning and tracking control on a real nonholonomic wheeled robot by carefully designing the potential-energy shaping to account for the nonholonomic constraints.
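
The following minimal sketch illustrates this training signal under simplifying assumptions of ours (one Euler step on SE(3), a Frobenius pose error, and hypothetical names; the paper's error function and integrator may differ). Here `twist` stands in for the velocity predicted by the learned Hamiltonian model, and `T_reg` for the relative pose estimated by point-cloud registration:

```python
import torch

def hat(xi: torch.Tensor) -> torch.Tensor:
    """se(3) hat map: 6-vector twist (v, w) -> 4x4 matrix."""
    v, w = xi[:3], xi[3:]
    M = torch.zeros(4, 4)
    M[0, 1], M[0, 2] = -w[2], w[1]
    M[1, 0], M[1, 2] = w[2], -w[0]
    M[2, 0], M[2, 1] = -w[1], w[0]
    M[:3, 3] = v
    return M

def predict_pose(T0: torch.Tensor, twist: torch.Tensor, dt: float) -> torch.Tensor:
    """One Lie-group Euler step: T_pred = T0 @ expm(dt * hat(twist))."""
    return T0 @ torch.matrix_exp(dt * hat(twist))

def observation_space_error(T_pred: torch.Tensor, T_reg: torch.Tensor) -> torch.Tensor:
    """Frobenius error between the model-predicted pose and the pose
    recovered by registering consecutive point clouds."""
    return torch.sum((T_pred - T_reg) ** 2)

# Toy usage: in training, `twist` would come from the learned Hamiltonian
# model and `T_reg` from an ICP-style registration of consecutive scans.
T0 = torch.eye(4)
twist = torch.tensor([0.1, 0.0, 0.0, 0.0, 0.0, 0.05], requires_grad=True)
T_reg = torch.eye(4)
loss = observation_space_error(predict_pose(T0, twist, dt=0.1), T_reg)
loss.backward()  # gradients flow back to the dynamics-model parameters
```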

In work under review in the IEEE Transactions on Robotics (Duong et al. 2024), we extended the method to robot coordinates defined on a matrix Lie group and demonstrated dynamics learning and control on real quadrotor robots. The dynamics of many robots are described in terms of generalized coordinates on a matrix Lie group, e.g., on SE(3) for ground, aerial, and underwater vehicles, together with an associated generalized velocity. We developed a port-Hamiltonian formulation over a matrix Lie group and imposed its structure on a neural ODE network that approximates the robot dynamics. In contrast to a black-box ODE network, our formulation guarantees energy conservation and satisfaction of the Lie group constraints, and explicitly accounts for energy-dissipation effects, such as friction and drag forces, in the dynamics model. We developed energy-shaping and damping-injection control for the learned, potentially under-actuated Hamiltonian dynamics, enabling a unified approach to stabilization and trajectory tracking with various robot platforms. We demonstrated the techniques in experiments with real quadrotor robots.
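
As a rough illustration of the imposed structure, here is a Euclidean-state sketch of a port-Hamiltonian neural ODE (our simplification; the paper works on a matrix Lie group, and all names and sizes below are illustrative):

```python
import torch
import torch.nn as nn

class PortHamiltonianODE(nn.Module):
    """Euclidean-state sketch of a port-Hamiltonian neural ODE.
    Dynamics: xdot = (J - R) dH/dx + g u, with J skew-symmetric and
    R = L L^T positive semidefinite by construction."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        half = dim // 2
        self.H = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                               nn.Linear(hidden, 1))          # learned energy
        J = torch.zeros(dim, dim)                             # canonical
        J[:half, half:] = torch.eye(half)                     # symplectic
        J[half:, :half] = -torch.eye(half)                    # form
        self.register_buffer("J", J)
        self.L = nn.Parameter(1e-2 * torch.randn(dim, dim))   # dissipation factor
        self.g = nn.Parameter(torch.randn(dim, half))         # input gain

    def forward(self, x: torch.Tensor, u: torch.Tensor) -> torch.Tensor:
        if not x.requires_grad:
            x = x.requires_grad_(True)
        dH = torch.autograd.grad(self.H(x).sum(), x, create_graph=True)[0]
        R = self.L @ self.L.T                                 # PSD dissipation
        return dH @ (self.J - R).T + u @ self.g.T             # batched xdot
```

With u = 0 and R = 0, the energy rate is dH/dt = ∇Hᵀ J ∇H = 0 because J is skew-symmetric, which is the conservation guarantee a black-box ODE network lacks.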

Additionally, in Long et al. (ACC 2024), we focus on enforcing safety constraints for rigid-body mobile robots operating autonomously in dynamic environments. We introduce an analytic approach to compute the distance between a polygon and an ellipse, and employ it to construct a control barrier function (CBF) for safe control synthesis. Existing CBF design methods for mobile-robot obstacle avoidance usually assume point or circular robots, which limits their applicability to more realistic robot-body geometries; our work enables CBF designs that capture complex robot and obstacle shapes. We demonstrate the effectiveness of our approach in simulations of real-time obstacle avoidance in constrained and dynamic environments for both mobile robots and multi-joint robot arms.
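
Once the polygon-ellipse distance h(x) and its gradient are available, safe control is typically synthesized with a quadratic-program safety filter. A generic sketch follows (the standard CBF-QP for control-affine dynamics xdot = f(x) + g(x)u, not the paper's specific formulation; all names are illustrative):

```python
import cvxpy as cp
import numpy as np

def cbf_qp(u_nom, grad_h, f, g, h_val, alpha=1.0):
    """Minimally invasive safety filter:
    minimize ||u - u_nom||^2  s.t.  dh/dx (f + g u) + alpha * h >= 0."""
    u = cp.Variable(len(u_nom))
    safety = grad_h @ (f + g @ u) + alpha * h_val >= 0
    cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)), [safety]).solve()
    return u.value

# Toy single-integrator example (xdot = u) with an illustrative gradient;
# in the paper, h would be built from the polygon-ellipse distance.
u_safe = cbf_qp(u_nom=np.array([1.0, 0.0]),
                grad_h=np.array([0.5, 0.2]),
                f=np.zeros(2), g=np.eye(2), h_val=0.3)
```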

Finally, in Sebastián et al. (2023), published at the IEEE International Symposium on Multi-Robot and Multi-Agent Systems, we continued our work on multi-robot control and reinforcement learning. Robot teams interact through a communication network represented as a graph. When learning multi-robot interactions from demonstrations, the communication graph needs to be identified from the state/feature trajectories of its nodes. This problem is challenging because the behavior of each node is coupled to all the other nodes through the unknown interaction model. Current solutions rely on prior knowledge of the graph topology and the dynamic behavior of the nodes and hence generalize poorly to other network configurations. To address these issues, we developed a novel learning-based approach that combines (i) a strongly convex program that efficiently uncovers graph topologies with global convergence guarantees and (ii) a self-attention encoder that learns to embed the original state trajectories into a feature space and predicts appropriate regularizers for the optimization program. In contrast to other works, our approach can identify the graph topology of unseen networks with new configurations in terms of the number of nodes, connectivity, or state trajectories. We demonstrated the effectiveness of our approach in identifying graphs in multi-robot formation and flocking tasks.
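
A heavily simplified sketch of the convex topology-recovery step is given below (linear node dynamics and a fixed l1 regularizer are our assumptions; the paper's program is strongly convex and its regularizers are predicted by the self-attention encoder):

```python
import cvxpy as cp
import numpy as np

def recover_graph(X: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """Estimate an N x N interaction-weight matrix W from node-state
    trajectories X of shape (T, N) via a regularized one-step fit:
    minimize ||X[1:] - X[:-1] W^T||_F^2 + lam * sum_ij |W_ij|."""
    N = X.shape[1]
    W = cp.Variable((N, N))
    residual = X[1:] - X[:-1] @ W.T
    objective = cp.sum_squares(residual) + lam * cp.sum(cp.abs(W))
    cp.Problem(cp.Minimize(objective)).solve()
    return W.value

# Toy usage: trajectories of a 5-node network driven by a random graph.
rng = np.random.default_rng(0)
W_true = rng.random((5, 5)) * (rng.random((5, 5)) < 0.4)
X = np.zeros((50, 5)); X[0] = rng.standard_normal(5)
for t in range(49):
    X[t + 1] = 0.9 * X[t] + 0.1 * (X[t] @ W_true.T)
W_est = recover_graph(X)
```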

The networked nature of multi-robot systems also presents challenges in the context of multi-agent reinforcement learning. Centralized control policies do not scale with increasing numbers of robots, whereas independent control policies do not exploit the information provided by other robots and thus perform poorly in cooperative-competitive tasks.

In work currently under review in the IEEE Transactions on Robotics (Sebastián et al. 2024), we develop a physics-informed reinforcement learning approach that learns distributed multi-robot control policies that are scalable and exploit all the information available to each robot. Our approach has three key characteristics:

  1. It imposes a port-Hamiltonian structure on the policy representation, respecting the energy conservation properties of physical robot systems and the networked nature of robot team interactions.
  2. It uses self-attention to obtain a sparse policy representation able to handle the time-varying information available to each robot over the interaction graph (a minimal sketch follows this list).
  3. It uses a soft actor-critic reinforcement learning algorithm parameterized by our self-attention port-Hamiltonian control policy, which accounts for the correlation among robots during training while overcoming the need for value-function factorization.
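
As a rough illustration of the second characteristic, the sketch below shows a permutation-invariant self-attention aggregation over a variable-size neighbor set feeding a Gaussian SAC-style actor (an assumed minimal architecture; the paper's policy additionally imposes the port-Hamiltonian structure of item 1, omitted here for brevity):

```python
import torch
import torch.nn as nn

class AttentionPolicy(nn.Module):
    """Permutation-invariant policy head over a variable set of neighbors.
    Outputs a Gaussian action distribution suitable for soft actor-critic."""

    def __init__(self, state_dim: int, act_dim: int, embed: int = 32, heads: int = 2):
        super().__init__()
        self.embed = nn.Linear(state_dim, embed)
        self.attn = nn.MultiheadAttention(embed, heads, batch_first=True)
        self.mu = nn.Linear(2 * embed, act_dim)
        self.log_std = nn.Linear(2 * embed, act_dim)

    def forward(self, x_self, x_neighbors, pad_mask=None):
        # x_self: (B, state_dim); x_neighbors: (B, K, state_dim), where K
        # varies over time with the interaction graph; pad_mask marks padding.
        q = self.embed(x_self).unsqueeze(1)                    # (B, 1, E)
        kv = self.embed(x_neighbors)                           # (B, K, E)
        z, _ = self.attn(q, kv, kv, key_padding_mask=pad_mask)
        feat = torch.cat([q.squeeze(1), z.squeeze(1)], dim=-1)
        return self.mu(feat), self.log_std(feat).clamp(-5, 2)

# Toy usage: a batch of 8 robots, each observing 4 neighbors.
policy = AttentionPolicy(state_dim=6, act_dim=2)
mu, log_std = policy(torch.randn(8, 6), torch.randn(8, 4, 6))
```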

We demonstrate that our method surpasses previous multi-agent reinforcement learning solutions in scalability, while achieving similar or superior performance in different multi-robot problems (with average cumulative reward up to 2x greater than the state of the art, on robot teams 6x larger than those seen at training time).