My task is to locate an object in a 3D coordinate system. Since I need almost exact X and Y coordinates, I decided to track a single color marker with a known Z coordinate, placed on top of the moving object, like the orange ball in this picture:
First, I calibrated the camera to get the intrinsic parameters. After that I used cv::solvePnP to get the rotation and translation vectors, as in the following code:
std::vector<cv::Point2f> imagePoints;
std::vector<cv::Point3f> objectPoints;
//img points are green dots in the picture
imagePoints.push_back(cv::Point2f(271.,109.));
imagePoints.push_back(cv::Point2f(65.,208.));
imagePoints.push_back(cv::Point2f(334.,459.));
imagePoints.push_back(cv::Point2f(600.,225.));
//object points are measured in millimeters because calibration is done in mm also
objectPoints.push_back(cv::Point3f(0., 0., 0.));
objectPoints.push_back(cv::Point3f(-511.,2181.,0.));
objectPoints.push_back(cv::Point3f(-3574.,2354.,0.));
objectPoints.push_back(cv::Point3f(-3400.,0.,0.));
cv::Mat rvec(1,3,cv::DataType<double>::type);
cv::Mat tvec(1,3,cv::DataType<double>::type);
cv::Mat rotationMatrix(3,3,cv::DataType<double>::type);
cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
cv::Rodrigues(rvec,rotationMatrix);
With all the matrices available, this equation transforms an image point to world coordinates:
$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M \left( R \begin{bmatrix} X \\ Y \\ Z_{const} \end{bmatrix} + t \right) $$
where M is cameraMatrix, R is rotationMatrix, t is tvec, and s is an unknown scale factor. Zconst is the height of the orange ball, 285 mm in this example. So I first need to solve this equation for s; after that I can find the X and Y coordinates for a selected image point:
$$ R^{-1} M^{-1} s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z_{const} \end{bmatrix} + R^{-1} t $$
Because Zconst is known, the last row of this matrix equation gives s. Here is the code for that:
cv::Mat uvPoint = (cv::Mat_<double>(3,1) << 363, 222, 1); // u = 363, v = 222, got this point using mouse callback
cv::Mat leftSideMat = rotationMatrix.inv() * cameraMatrix.inv() * uvPoint;
cv::Mat rightSideMat = rotationMatrix.inv() * tvec;
double s = (285 + rightSideMat.at<double>(2,0)) / leftSideMat.at<double>(2,0);
//285 represents the height Zconst
std::cout << "P = " << rotationMatrix.inv() * (s * cameraMatrix.inv() * uvPoint - tvec) << std::endl;
This gives the result: P = [-2629.5, 1272.6, 285.]
Compared with the measured position: Preal = [-2629.6, 1269.5, 285.]
the error is very small, which is good. But when I move the box toward the edges of the room, the errors grow to roughly 20-40 mm, and I would like to improve that. Does anyone have suggestions?
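For checking the algebra independently of OpenCV, the back-projection above can be sketched as a small standalone function. This is only a sketch under assumptions: the names `Vec3`, `Mat3`, `mulTransposed` and `pixelToWorldOnPlane` are hypothetical, the camera matrix is assumed to be skew-free (so M^-1 can be written out from fx, fy, cx, cy), R is assumed to be a proper rotation (so R^-1 equals its transpose), and lens distortion is ignored, just as in the `cameraMatrix.inv() * uvPoint` expression.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3 matrix

// Computes A^T * x. For a rotation matrix, A^T is the same as A^-1.
Vec3 mulTransposed(const Mat3& A, const Vec3& x) {
    Vec3 y = {0.0, 0.0, 0.0};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            y[r] += A[c][r] * x[c];
    return y;
}

// Back-projects pixel (u, v) onto the world plane Z = zConst.
// fx, fy, cx, cy: pinhole intrinsics (assumed skew-free, no distortion).
// R, t: world-to-camera rotation and translation, as returned by
// cv::solvePnP followed by cv::Rodrigues.
Vec3 pixelToWorldOnPlane(double u, double v,
                         double fx, double fy, double cx, double cy,
                         const Mat3& R, const Vec3& t, double zConst) {
    // M^-1 * [u v 1]^T for a skew-free camera matrix
    const Vec3 ray = {(u - cx) / fx, (v - cy) / fy, 1.0};
    const Vec3 left = mulTransposed(R, ray);  // R^-1 * M^-1 * [u v 1]^T
    const Vec3 right = mulTransposed(R, t);   // R^-1 * t

    // The viewing ray must not be parallel to the plane Z = zConst.
    assert(std::abs(left[2]) > 1e-12);

    // Third row of the equation: s * left_z = zConst + right_z
    const double s = (zConst + right[2]) / left[2];

    // P = R^-1 * (s * M^-1 * [u v 1]^T - t)
    return {s * left[0] - right[0],
            s * left[1] - right[1],
            s * left[2] - right[2]};
}
```

By construction, the Z component of the returned point equals `zConst`, which makes a quick check that R, t and the intrinsics were passed in the expected convention.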