Example Jigsaw Datafiles
Below is a collection of example Jigsaw datafiles. The Jigsaw datafile
format is simple XML in which each document is represented by a
node. Within the node are the document's metadata, the text of
the document, and finally its associated entities. These files use the
".jig" suffix. Please feel free to
use these datafiles in your own research. We simply ask that you
acknowledge where you acquired them.
- Autism webpages - Top ~500 web pages
about autism
- Bible - King James version of
the Bible
- CHI papers - The title,
abstract and metadata for every ACM CHI conference paper from 1999 to
2010.
- InfoVis and VAST
papers - The title, abstract and metadata for every IEEE
InfoVis and VAST conference paper from 1995 to 2017
(List of concept terms identified in
titles and abstracts)
- NSF awards - Award information including title, summary, and metadata from
2000-2009 (special thanks to Remco Chang for the original scrape of these)
  - CISE/CCF - CISE Computing and Communication Foundations (keywords identified within CISE awards)
  - CISE/CNS - CISE Computer and Network Systems
  - CISE/IIS - CISE Information and Intelligent Systems
  - EHR/DGE - EHR Graduate Education
  - EHR/DRL - EHR Research on Learning in Formal and Informal Settings
  - EHR/DUE - EHR Undergraduate Education
  - EHR/HRD - EHR Human Resource Development
  - ENG/IIP - ENG Industrial Innovation and Partnerships
- 9/11 Report - Each
page as a separate document
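
The per-document layout described above (metadata, then text, then entities inside one node) could be read with a few lines of Python. This is only a minimal sketch: the element names used here (`documents`, `document`, `docID`, `docSource`, `docText`, `person`, `location`) are assumptions made for illustration and are not confirmed by this page, so check them against an actual .jig file before relying on them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# All element names below are ASSUMPTIONS for illustration only;
# inspect a real .jig file and adjust them as needed.
DOC_TAG = "document"                  # hypothetical per-document node
TEXT_TAG = "docText"                  # hypothetical element holding the text
META_TAGS = {"docID", "docDate", "docSource"}  # hypothetical metadata fields


def parse_jig(xml_text):
    """Split each document node into metadata, text, and entities."""
    root = ET.fromstring(xml_text)
    docs = []
    for node in root.iter(DOC_TAG):
        meta = {}
        text = ""
        entities = defaultdict(list)
        for child in node:
            value = (child.text or "").strip()
            if child.tag == TEXT_TAG:
                text = value
            elif child.tag in META_TAGS:
                meta[child.tag] = value
            else:
                # Anything else is treated as an entity, grouped by tag name.
                entities[child.tag].append(value)
        docs.append({"meta": meta, "text": text, "entities": dict(entities)})
    return docs


# A made-up sample in the assumed layout, not taken from any real datafile.
SAMPLE = """<documents>
  <document>
    <docID>doc-1</docID>
    <docSource>example</docSource>
    <docText>Alice met Bob in Atlanta.</docText>
    <person>Alice</person>
    <person>Bob</person>
    <location>Atlanta</location>
  </document>
</documents>"""

if __name__ == "__main__":
    for doc in parse_jig(SAMPLE):
        print(doc["meta"], doc["entities"])
```

To try it on a downloaded file, read the file's contents into a string and pass that to `parse_jig`. Treating every unrecognized child element as an entity is a simplification chosen for this sketch, not a statement about how Jigsaw itself reads the files.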
Last modified: November 4, 2013