This paper extends gradient boosting by replacing trees with domain-specific models, taking mortality forecasting as the application. It makes two novel contributions: it uses well-known stochastic mortality models as the weak learners, and it adds a penalty that shrinks mortality forecasts in adjacent age groups and nearby geographical regions toward one another. Applied to US male mortality data from 1969 to 2019, the proposed method demonstrates superior forecasting performance, and its results can be interpreted and visualized. At the national level, the boosted model with age-based shrinkage yields the most accurate mortality forecast. At the state level, spatial shrinkage improves accuracy further, on top of the benefits of age-based shrinkage. This improvement can be attributed to data sharing across states with large and small populations in adjacent regions and across states with common risk factors.
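To make the idea of boosting with a mortality model as the weak learner concrete, the following is a minimal sketch under stated assumptions, not the paper's actual algorithm. It assumes a Lee-Carter-style rank-1 fit as the weak learner, squared-error loss on log-mortality, and a simplified age penalty applied by smoothing each boosting update across adjacent ages. The function names (`lee_carter_fit`, `age_smoother`, `boost`), the synthetic data, and all parameter values are hypothetical, and the sketch only fits in-sample; it does not forecast or apply spatial shrinkage.

```python
import numpy as np


def lee_carter_fit(residual):
    """Weak learner (assumed): a_x + b_x * k_t via SVD of the row-centred matrix."""
    a = residual.mean(axis=1, keepdims=True)          # age-specific level a_x
    u, s, vt = np.linalg.svd(residual - a, full_matrices=False)
    return a + s[0] * np.outer(u[:, 0], vt[0])        # leading rank-1 term b_x * k_t


def age_smoother(n_ages, lam):
    """Matrix S with S @ h = argmin_u ||u - h||^2 + lam * sum_x (u[x+1] - u[x])^2."""
    d = np.diff(np.eye(n_ages), axis=0)               # first differences across age
    return np.linalg.inv(np.eye(n_ages) + lam * d.T @ d)


def boost(log_m, n_rounds=20, lr=0.1, lam=0.0):
    """Boost on an (ages x years) log-mortality matrix; lam > 0 adds age shrinkage."""
    smooth = age_smoother(log_m.shape[0], lam)
    fit = np.zeros_like(log_m)
    losses = []
    for _ in range(n_rounds):
        residual = log_m - fit                        # negative gradient of 0.5 * squared error
        update = smooth @ lee_carter_fit(residual)    # pull adjacent ages together
        fit += lr * update
        losses.append(float(np.mean((log_m - fit) ** 2)))
    return fit, losses


# Synthetic log-mortality (hypothetical): rises with age, declines over time, plus noise.
rng = np.random.default_rng(0)
ages, years = np.arange(20), np.arange(30)
log_m = (-8.0 + 0.09 * ages[:, None]
         - 0.02 * (1 + 0.02 * ages[:, None]) * years[None, :]
         + rng.normal(0.0, 0.1, (20, 30)))

plain, plain_loss = boost(log_m, lam=0.0)             # no shrinkage
shrunk, shrunk_loss = boost(log_m, lam=5.0)           # age-based shrinkage
```

Comparing `plain_loss` with `shrunk_loss`, or plotting `plain` against `shrunk` by age, is one way to see the effect of the penalty. A spatial penalty could in principle be sketched the same way, with a difference matrix built from a state adjacency graph in place of adjacent ages.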