Self-supervised video pretraining yields robust and more human-aligned visual representations.

Published in: NeurIPS (2023)

Keyphrases