Saturday, August 11, 2007

New Search Engine Ranks Tables By Title, Document Content, Text Reference

Penn State researchers have developed a search engine, TableSeer, that
can not only identify and extract tables from PDF documents but also
index them and rank the search results using factors including the
table’s title, text references to the table, and date of publication.


The engine’s innovative ranking algorithm, TableRank, can also
identify tables found in frequently cited documents and weigh that
factor in the search results, said Prasenjit Mitra, an assistant
professor in the Penn State College of Information Sciences and
Technology (IST) and one of the lead researchers in the development
of the search engine.
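
The article does not give TableRank's actual formula. As a rough
illustration of how the factors it names (table title, text references,
publication date, and how often the source document is cited) could be
combined into a single score, here is a minimal Python sketch. The
TableRecord fields, the WEIGHTS values, and the term_overlap and
rank_tables helpers are all hypothetical and chosen only for
illustration; they are not taken from TableSeer.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TableRecord:
    """Hypothetical metadata for one table extracted from a PDF."""
    title: str            # the table's title or caption
    reference_text: str   # document text that refers to the table
    published: date       # publication date of the source document
    citations: int        # number of times the source document is cited


# Illustrative weights only; the real TableRank weights are not stated.
WEIGHTS = {"title": 3.0, "reference": 1.0, "recency": 0.5, "citations": 1.5}


def term_overlap(query: str, text: str) -> float:
    """Return the fraction of query terms that occur in the text."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term in words for term in terms) / len(terms)


def score(query: str, table: TableRecord, today: date, max_citations: int) -> float:
    """Combine the four factors into one weighted score."""
    age_years = max((today - table.published).days, 0) / 365.25
    recency = 1.0 / (1.0 + age_years)  # newer documents score closer to 1
    cited = table.citations / max_citations if max_citations else 0.0
    return (
        WEIGHTS["title"] * term_overlap(query, table.title)
        + WEIGHTS["reference"] * term_overlap(query, table.reference_text)
        + WEIGHTS["recency"] * recency
        + WEIGHTS["citations"] * cited
    )


def rank_tables(query: str, tables: list[TableRecord], today: date) -> list[TableRecord]:
    """Return the tables sorted from highest to lowest score."""
    max_citations = max((t.citations for t in tables), default=0)
    return sorted(
        tables,
        key=lambda t: score(query, t, today, max_citations),
        reverse=True,
    )
```

Under these assumptions, a table whose title matches the query outranks
one that matches only in the referring text, and between two otherwise
identical tables the one from the more frequently cited document comes
first.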