Publication: Text string extraction within mixed-mode documents.