Optimization problems containing trained neural networks remain challenging due to their nonconvexity. Deterministic global optimization relies on relaxations that should be tight, quickly convergent, and cheap to evaluate. While envelopes of common activation functions have been known for several years, the envelope of an entire neuron had not been established. Recently, Carrasco and Mu\~{n}oz (arXiv:2410.23362, 2024) proposed a method to compute the envelope of a single neuron with an S-shaped activation function. However, the computational effectiveness of this envelope in global optimization remains unknown. We therefore implemented this envelope in our open-source deterministic global solver MAiNGO and our machine-learning toolbox MeLOn, using the hyperbolic tangent and the Scaled Exponential Linear Unit as activation functions. We evaluate its benefit relative to combining the separate envelopes of the pre-activation and activation functions via factorable programming techniques, both on illustrative examples and on case studies from chemical engineering. The results suggest that the envelope of a neuron can provide tighter relaxations and reduce both the number of iterations and the computational time. However, when bound tightening based on subgradients is performed, the separate envelopes often perform better, as they allow tightening the bounds on the pre-activation output.
Deterministic global optimization with trained neural networks: What is the benefit of the envelope of single neurons?