I have an RGB image divided into 4 squares with a bit of overlap between them. Each square is fed to a monocular depth estimator, which estimates its corresponding depth map. Then I stitch the predictions back together into the final depth estimate. The problem is that each depth map is predicted with an unknown scale and shift factor, so the depth value ranges differ between squares and don't match, causing a patchy result.
I know I can just feed in the whole RGB image at once or reduce the resolution, but that sometimes causes a loss of geometric detail, so I would like to keep the tiled approach. Do you have any ideas on how to account for these misalignments between depth maps? Is it possible to somehow estimate the normalization curve the monocular depth estimator applied to each prediction, so as to bring them all to the same scale?
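To make the question concrete, here is a minimal sketch of the kind of alignment I have in mind. It assumes that each prediction differs from a common reference only by a per-square scale and shift (an affine map), and it uses the overlap region to fit those two numbers by least squares. The function name `fit_scale_shift` and the synthetic data are made up for illustration; if the estimator actually outputs inverse depth, the same fit would presumably have to be done in that space instead.

```python
import numpy as np

def fit_scale_shift(src, ref):
    """Least-squares fit of s, t such that s * src + t ~= ref.

    src, ref: depth values of the SAME pixels (the overlap region),
    taken from two neighbouring predictions. Hypothetical helper.
    """
    x = np.asarray(src, dtype=np.float64).ravel()
    y = np.asarray(ref, dtype=np.float64).ravel()
    # Design matrix [x, 1] so the solution vector is (scale, shift).
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t

# Synthetic example: two horizontally neighbouring squares that share
# 32 columns. The "unknown" scale/shift of the right square is invented.
rng = np.random.default_rng(0)
true_depth = rng.uniform(1.0, 10.0, size=(64, 96))
left = true_depth[:, :64]                  # reference square
right = 0.5 * true_depth[:, 32:] + 2.0     # same scene, different scale/shift

# Overlap: last 32 columns of `left` == first 32 columns of `right`.
s, t = fit_scale_shift(right[:, :32], left[:, 32:])

# Bring the right square into the reference square's range before stitching.
right_aligned = s * right + t
```

With real predictions the overlap values would not agree exactly, so a robust fit (for example, discarding outliers before the least-squares step) and blending across the overlap would probably be needed on top of this. With 4 squares, each one could be aligned to a chosen reference square in turn, or all scale/shift pairs could be solved for jointly.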
Question from: https://stackoverflow.com/questions/65922808/aligning-grid-of-depth-maps