Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens.
Elad Ben-Avraham, Roei Herzig, Karttikeya Mangalam, Amir Bar, Anna Rohrbach, Leonid Karlinsky, Trevor Darrell, Amir Globerson
Published in: CoRR (2022)
Keyphrases
- key frames
- scene structure
- image measurements
- multiple objects
- input image
- video frames
- image motion
- single image
- video sequences
- image features
- image segmentation
- multiscale
- video data
- ego motion
- region of interest
- feature points
- high resolution
- structure from motion
- 3D scene
- space time
- target object
- bundle adjustment
- position and orientation
- stereo vision
- frame rate
- least squares
- 3D objects