Publication: When printed hypertexts go digital: information extraction from the parsing of indices.