Efficient Video Inpainting Based on Object Tracking and Interpolation |
|||||||||||
Jian Zhao, Vijay Venkatesh.M, Sen-ching S. Cheung |
|||||||||||
Object removal is the process of removing particular objects, moving or stationary, from a video and filling in the missing part (the hole) in a visually consistent manner. Such algorithms are of great use both to traditional video applications and to new applications such as privacy protection. Most existing object removal techniques are computationally intensive and cannot handle large holes. In this paper, we propose a video inpainting system that uses background subtraction and object tracking to extract a set of object templates, followed by fast inpainting using background replacement and optimal object transfer based on dynamic programming. Our object tracker tracks object shape and color simultaneously in a Kalman filter framework. To extract object templates accurately, we design a probabilistic framework that incorporates both the tracker output and estimated depth information. The background portion of the hole is completed using an adaptively updated background model. The foreground portion is filled by choosing the object templates that optimize the likelihood of the entire spatio-temporal hole using dynamic programming. We demonstrate our system on a set of indoor video sequences with different types of occlusion. |
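The background side of this pipeline can be illustrated with a minimal sketch: a running-average background model, a thresholded background subtraction, and background replacement inside the hole. The running-average update, the function names, and the `alpha` and `threshold` parameters are assumptions made for illustration; the page does not specify the background model actually used.

```python
from typing import List

Frame = List[List[float]]   # grayscale frame as rows of pixel intensities
Mask = List[List[bool]]     # True marks a foreground (or hole) pixel


def foreground_mask(background: Frame, frame: Frame, threshold: float = 25.0) -> Mask:
    """Background subtraction: a pixel is foreground if it differs
    from the background model by more than `threshold` (assumed value)."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def update_background(background: Frame, frame: Frame, mask: Mask, alpha: float = 0.05) -> Frame:
    """Adaptive update: blend the new frame into the model,
    but only at pixels currently classified as background."""
    return [
        [b if m else (1 - alpha) * b + alpha * p for b, p, m in zip(bg_row, fr_row, m_row)]
        for bg_row, fr_row, m_row in zip(background, frame, mask)
    ]


def fill_hole_with_background(frame: Frame, background: Frame, hole: Mask) -> Frame:
    """Background replacement: pixels inside the hole take the model's value."""
    return [
        [b if h else p for p, b, h in zip(fr_row, bg_row, h_row)]
        for fr_row, bg_row, h_row in zip(frame, background, hole)
    ]
```

Freezing the model under foreground pixels keeps the removed object from bleeding into the background estimate, which is one common design choice among several.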
|||||||||||
Demos |
|||||||||||
| (Click each image to see the demos, or click here to download all the demos.) | |||||||||||
The top left video shows a typical surveillance scene. The top right shows the subject whose privacy we need to protect. On the bottom left, we use image inpainting techniques to completely remove the subject from the scene. On the bottom right, the image of the subject is embedded back into the video as an invisible watermark. If you compare the bottom left image with the bottom right image, you will notice that the bottom right image is actually smoother, with fewer artifacts. The reason is that the random noise introduced by the watermark smooths out some of the visual artifacts introduced by compression. The drawback is that the video with the watermark is much harder to compress: it produces a file three to four times as large as the same video without any watermark [1].
|
|||||||||||
The top left video shows a more complex case, in which there is occlusion between the object to be removed and another foreground object. As shown in the top right video, tracking is needed to identify the object and replace it with the background image. At each step, the tracker updates its state for motion and color information and estimates the object's position in the next frame probabilistically, as shown in the bottom right video. The bottom left video shows that a well-designed tracker can separate the merged objects. Further research is needed to improve the state used to characterize the object (color histogram, shape, or silhouette), to decide which object is the foreground object, and to tolerate more complicated occlusions. |
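The predict-and-update cycle described above can be sketched with a small constant-velocity Kalman filter for a single image coordinate. This is only an illustration under stated assumptions: the class name, the noise variances, and the restriction to position and velocity are hypothetical, and the tracker described on this page also carries shape and color information that the sketch leaves out.

```python
from typing import List


class ConstantVelocityKalman:
    """Kalman filter over the state [position, velocity] for one coordinate.
    Hypothetical helper; noise values are assumptions, not the authors' settings."""

    def __init__(self, pos: float, vel: float = 0.0,
                 process_var: float = 1.0, meas_var: float = 4.0) -> None:
        self.x: List[float] = [pos, vel]
        # Large initial covariance: the starting state is poorly known.
        self.P: List[List[float]] = [[100.0, 0.0], [0.0, 100.0]]
        self.q = process_var
        self.r = meas_var

    def predict(self, dt: float = 1.0) -> float:
        """Propagate the state one frame ahead and return the predicted position."""
        p, v = self.x
        (a, b), (c, d) = self.P
        self.x = [p + dt * v, v]
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.x[0]

    def update(self, z: float) -> None:
        """Correct the state with a measured position z (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r                # innovation variance
        k0, k1 = a / s, c / s         # Kalman gain
        y = z - self.x[0]             # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
```

During an occlusion, one option is to call `predict` on every frame and skip `update` until the object becomes visible again, so that the track coasts on its estimated velocity.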
|||||||||||
In this demo we show how to deal with a large static hole in video inpainting. As the top video shows, the person walks behind a large obstacle. This creates two problems: 1) the video sequence does not contain a single frame of complete background, and 2) much of the information about the person is lost while they are behind the obstacle. To solve the first problem, we use image inpainting to fill in the hole and recover the background. This is easy if the background is homogeneous but extraordinarily hard if the background is complex, so we use a template-based inpainting model to fill in the hole. To solve the second problem, the foreground is inpainted in the occlusion region by minimizing the total squared error, over a three-frame window, between the extracted foreground images. One drawback is that a small jump is still noticeable when the person enters or leaves the obstacle.
|
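One possible reading of the three-frame criterion is sketched below: given a sequence of stored foreground templates and the three foreground images observed around the occlusion, pick the run of three consecutive templates with the smallest total squared error. The function names and the flattened-pixel representation are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Patch = Sequence[float]  # a foreground image flattened to a list of pixel values


def ssd(a: Patch, b: Patch) -> float:
    """Sum of squared differences between two equally sized patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_window_start(templates: List[Patch], window: List[Patch]) -> Tuple[int, float]:
    """Return (start index, cost) of the run of consecutive templates
    that minimizes the total squared error against the observed window."""
    n = len(window)
    if len(templates) < n:
        raise ValueError("need at least as many templates as window frames")
    best_index, best_cost = 0, float("inf")
    for i in range(len(templates) - n + 1):
        cost = sum(ssd(templates[i + k], window[k]) for k in range(n))
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index, best_cost
```

The templates that follow the best run can then be pasted into the occluded frames. Matching three frames at a time rather than one favors candidates that agree with the observed motion and not just with a single pose.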
|||||||||||
The top video shows a more complicated situation, in which two people occlude each other and the person to be removed is sometimes hidden behind the other. To complete the inpainting, both tracking and template propagation are needed. Tracking identifies the foreground objects; the dynamic programming matching step then finds the closest template and fills in the hole caused by the occlusion. |
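The template matching step can be sketched as a Viterbi-style dynamic program that assigns one template to each occluded frame. The cost structure here is an assumption for illustration: `data_cost[t][j]` stands for how poorly template `j` fits whatever is visible in frame `t`, and `transition_cost(i, j)` penalizes implausible jumps between consecutive templates. Neither quantity is defined on this page.

```python
from typing import Callable, List, Tuple


def select_templates(
    data_cost: List[List[float]],
    transition_cost: Callable[[int, int], float],
) -> Tuple[List[int], float]:
    """Return the template index per frame, and the total cost, that minimize
    the sum of per-frame data costs plus transition costs between frames."""
    n_templates = len(data_cost[0])
    cost = list(data_cost[0])            # best cost of any path ending in each template
    back: List[List[int]] = []           # back-pointers for frames 1..T-1
    for frame_costs in data_cost[1:]:
        new_cost, pointers = [], []
        for j in range(n_templates):
            i_best = min(range(n_templates),
                         key=lambda i: cost[i] + transition_cost(i, j))
            new_cost.append(cost[i_best] + transition_cost(i_best, j) + frame_costs[j])
            pointers.append(i_best)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path backwards from the last frame.
    j = min(range(n_templates), key=lambda k: cost[k])
    total = cost[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total
```

For `T` frames and `N` templates this runs in `O(T * N^2)` time, compared with `N^T` candidate sequences for exhaustive search, which is what makes a global optimum over the whole hole affordable.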
|||||||||||
| [1] Zhang, W., S.-C. Cheung, and M. Chen. 2005. Hiding privacy information in video surveillance system. Accepted to IEEE International Conference on Image Processing, ICIP 2005, September 11-14, Genova, Italy. http://www.vis.uky.edu/~cheung/doc/icip05B.pdf |
|||||||||||