We study a class of bi-objective integer programs known as bi-objective knapsack problems (BOKPs). Our research focuses on the development of innovative exact and approximate solution methods for BOKPs by synergizing algorithmic concepts from two distinct domains: multi-objective integer programming and (deep) reinforcement learning. While novel reinforcement learning techniques have been applied successfully to single-objective integer programming in recent years, a corresponding body of work is yet to be explored in the field of multi-objective integer programming. This study is an effort to bridge this existing gap in the literature. Through a computational study, we demonstrate that although it is feasible to develop exact reinforcement learning-based methods for solving BOKPs, they come with significant computational costs. Consequently, we recommend an alternative research direction: approximating the entire nondominated frontier using deep reinforcement learning-based methods. We introduce two such methods, which extend classical methods from the multi-objective integer programming literature, and illustrate their ability to rapidly produce high-quality approximations.