Publication: Finding Near-Replicas of Documents and Servers on the Web.