Monitoring With Limited Information

We consider a system with an evolving state that can be stopped at any time by a decision maker (DM), yielding a state-dependent reward. The DM does not observe the state except at a limited number of monitoring times, which must be chosen, in conjunction with a suitable stopping policy, to maximize the reward. Stopping problems of this type, which arise in applications ranging from healthcare to finance, often require excessive amounts of calibration data and prohibitive computational resources. To overcome these challenges, we propose a robust optimization approach in which adaptive uncertainty sets capture the information acquired through monitoring. We consider two versions of the problem, static and dynamic, depending on how the monitoring times are chosen. We show that, under certain conditions, the same worst-case reward is achievable under either static or dynamic monitoring, which allows the optimal dynamic monitoring policy to be recovered by re-solving static versions of the problem. We discuss cases in which the static problem becomes tractable, and highlight conditions under which monitoring at equidistant times is optimal. Lastly, we showcase our framework in the context of a healthcare problem (monitoring heart transplant patients for Cardiac Allograft Vasculopathy), where we design optimal monitoring policies that substantially improve on the status quo treatment recommendations.
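To make the static version of the problem concrete, the following is a minimal sketch under simplifying assumptions of our own, not the paper's model: the state starts at 0 and deteriorates by 0 or 1 per period (the uncertainty set is the set of all such nondecreasing paths), the DM stops only at monitoring times, and the stopping policy is a simple trigger rule ("stop at the first check where the state reaches a level theta, otherwise at the last check"). Stopping at time t earns reward t if the state is still at or below a danger level D, and 0 otherwise, so the DM prefers to wait as long as it safely can. The sketch brute-forces the worst-case reward of every monitoring schedule and reports the best one; all parameter names and the trigger policy class are illustrative choices.

```python
from itertools import combinations, product

T, K, D = 8, 2, 3  # horizon, number of monitoring times, danger level (assumed values)


def worst_case_reward(schedule, theta):
    """Worst-case reward of a trigger rule over all paths in the uncertainty set.

    Paths deteriorate by 0 or 1 per period; the DM stops at the first
    monitoring time where the state is >= theta, else at the last one.
    """
    worst = float("inf")
    for steps in product((0, 1), repeat=T):  # adversary picks the path
        path, x = [], 0
        for d in steps:
            x += d
            path.append(x)
        for t in schedule:  # monitoring times are 1-indexed
            if path[t - 1] >= theta or t == schedule[-1]:
                t_stop, x_stop = t, path[t - 1]
                break
        reward = t_stop if x_stop <= D else 0  # worthless if past danger level
        worst = min(worst, reward)
    return worst


# Static problem: maximize the worst-case reward over schedules and triggers.
best = max(
    (worst_case_reward(M, th), M, th)
    for M in combinations(range(1, T + 1), K)
    for th in range(1, D + 2)
)
print("best worst-case reward %d with schedule %s, trigger %d" % best)
```

With a richer policy class or a dynamic choice of monitoring times, the same max-min evaluation applies, but the enumeration grows quickly, which is the computational challenge the paper's adaptive uncertainty sets are meant to address.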
