ArXiv Project
ArXiv.org is a repository of physics, math and computer science articles
with roughly 250,000 documents and a user community of
over 40,000 researchers. We have run very preliminary experiments
applying Kleinberg's burst detection algorithm [1] to word
occurrences in arXiv titles, with tantalizing results, and wish to
extend this work into an on-line navigational tool.
This project would involve slightly adapting the basic burst
algorithm to the textual case at hand, and extending it for use in
conjunction with citation tree data. It will also involve
the development of visualization methods that produce interactive
output for the web interface. The burst detection algorithm will be
used to compare a large number (tens of thousands) of time series in
parallel to identify clusters of scientific works in time that can be
assembled into a narrative description of progress in a field over
time, and to facilitate navigation of it. The intent is to identify
the most important temporal patterns, and implement visualization
methods that are fun, intuitive and informative.
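As a rough illustration of the kind of computation involved, here is a
minimal sketch of a two-state, batched burst detector in the spirit of [1],
applied to one word's time series. It is not the project's implementation.
The function names, the parameters s (burst rate multiplier) and gamma
(cost scaling for entering the burst state), the transition cost of
gamma * ln(n), and the sample counts are all assumptions made for
illustration.

```python
import math


def burst_states(relevant, totals, s=2.0, gamma=1.0):
    """Label each time batch 0 (baseline) or 1 (burst) by minimizing total cost.

    relevant[t] is the number of titles in batch t containing the word,
    totals[t] is the number of titles in batch t. Both inputs are
    hypothetical here; real data would come from arXiv metadata.
    """
    n = len(relevant)
    if n == 0 or sum(relevant) == 0:
        return [0] * n
    eps = 1e-9
    # Baseline rate is the overall fraction; burst rate is s times higher.
    p0 = min(max(sum(relevant) / sum(totals), eps), 1 - eps)
    p1 = min(s * p0, 1 - eps)
    probs = (p0, p1)
    up_cost = gamma * math.log(n)  # assumed cost of moving from state 0 to 1

    def fit_cost(p, r, d):
        # Negative binomial log-likelihood, omitting the binomial
        # coefficient because it is the same for both states.
        return -(r * math.log(p) + (d - r) * math.log(1 - p))

    # Viterbi-style dynamic program, starting in the baseline state.
    cost = [0.0, float("inf")]
    back = []
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for j in (0, 1):
            options = [cost[i] + (up_cost if j > i else 0.0) for i in (0, 1)]
            best_i = 0 if options[0] <= options[1] else 1
            new_cost.append(options[best_i] + fit_cost(probs[j], r, d))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest state sequence backwards.
    state = 0 if cost[0] <= cost[1] else 1
    states = [0] * n
    for t in range(n - 1, -1, -1):
        states[t] = state
        state = back[t][state]
    return states


def burst_intervals(states):
    """Convert a 0/1 state sequence into (start, end) index pairs, inclusive."""
    intervals, start = [], None
    for t, state in enumerate(states):
        if state == 1 and start is None:
            start = t
        elif state == 0 and start is not None:
            intervals.append((start, t - 1))
            start = None
    if start is not None:
        intervals.append((start, len(states) - 1))
    return intervals


if __name__ == "__main__":
    # Made-up counts: a word appears in about 1% of titles, then jumps.
    counts = [1, 1, 1, 1, 10, 12, 11, 1, 1, 1]
    sizes = [100] * 10
    print(burst_intervals(burst_states(counts, sizes)))
```

Running a detector like this independently over each word's series is one
way the tens of thousands of series could be processed in parallel. Larger
values of gamma make the detector more reluctant to enter a burst, and
larger values of s require a sharper jump in frequency.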
Contact Person: Paul Houle (ph18@cornell.edu)
Interested Faculty: Paul Ginsparg, Jon Kleinberg
Credit Hours: 3-6 (Negotiable)
[1] http://www.cs.cornell.edu/home/kleinber/kdd02.html