This data set is used for studying name disambiguation in digital libraries. It contains 110 author names and their disambiguation results (ground truth). Each author name corresponds to a raw file in the "raw-data" folder and an answer file (ground truth) in the "Answer" folder.
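A minimal sketch of pairing each raw file with its answer file is given here. It assumes that the two files for one author name share the same base file name (the part before the extension); this README does not state that convention, and the helper name pair_files is hypothetical.

```python
from pathlib import Path


def pair_files(raw_dir, answer_dir):
    """Map each base file name to (raw_path, answer_path).

    Assumption: a raw file and its answer file share the same base name.
    answer_path is None when no matching answer file is found.
    """
    # Index the answer files by base name (file name without extension).
    answers = {p.stem: p for p in Path(answer_dir).iterdir() if p.is_file()}
    pairs = {}
    for raw_path in sorted(Path(raw_dir).iterdir()):
        if raw_path.is_file():
            pairs[raw_path.stem] = (raw_path, answers.get(raw_path.stem))
    return pairs
```

Calling pair_files("raw-data", "Answer") from the data set's root folder should then yield one entry per author name, if the naming assumption holds.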
- Raw file: the raw file is an XML file in which the author name is associated with a number of publications. An example of a publication is as follows:
"
Explanation-based Failure Recovery
1987
Ajay Gupta
AAAI
13048
null
"
where
"Explanation-based Failure Recovery" is the title of the publication;
"1987" is the publication year;
"AAAI" is the publication venue;
"13048" is the publication id;
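
The publication fields described above could be read with Python's standard xml.etree.ElementTree module, roughly as sketched here. The element names used (publication, title, year, venue, id), the person root element and the helper name read_publications are all assumptions made for illustration; check an actual raw file and pass the real tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names below are assumptions,
# not necessarily the tags used in the real raw files.
SAMPLE = """
<person>
  <publication>
    <title>Explanation-based Failure Recovery</title>
    <year>1987</year>
    <venue>AAAI</venue>
    <id>13048</id>
  </publication>
</person>
""".strip()


def read_publications(xml_text, record_tag="publication",
                      fields=("title", "year", "venue", "id")):
    """Return one dict per publication element, keeping values as strings."""
    root = ET.fromstring(xml_text)
    records = []
    for pub in root.iter(record_tag):
        # findtext returns None for a missing child; store "" in that case.
        records.append({name: (pub.findtext(name) or "").strip()
                        for name in fields})
    return records


if __name__ == "__main__":
    for record in read_publications(SAMPLE):
        print(record)
```

Values are kept as strings so that nothing is lost if a field holds a non-numeric value.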