LIS Education and Data Science for the National Digital Platform (LEADS-4-NDP)

LEADS Fellows Application Process for Summer 2019

Drexel University’s College of Computing and Informatics (CCI), the Metadata Research Center, and project partners invite doctoral students to participate in the LIS Education and Data Science for the National Digital Platform (LEADS-4-NDP) program and become LEADS Fellows.

View PR flyer 
View 2018 LEADS fellow final lightning talks here

APPLICATIONS FOR 2019 LEADS FELLOWS DUE: Sunday, February 17, 2019, 11:59 PM EST

LEADS Fellowships are virtual, and you may participate from anywhere. LEADS Fellowship partners include: California Digital Library, University of California, Berkeley; Digital Curation Innovation Center (DCIC), University of Maryland’s iSchool; Digital Public Library of America (DPLA); Digital Research Services, University Penn Libraries; Free Library of Philadelphia; Historical Society of Pennsylvania; OCLC; and the Academy of Natural Sciences.


LEADS Fellows will receive a $5,000 stipend, plus additional financial support (approximately $3,000) for a 3-day Data Science Bootcamp at Drexel University (June 6-8), a visit to your NDP site during the summer, and conference travel during the 2019/2020 academic year to share project outcomes.

Important Dates

      • LEADS-4-NDP Application Deadline: Sunday, February 17, 2019, 11:59 PM EST
      • Notification of acceptance: Mid-March, 2019
      • Data Science Bootcamp at Drexel University: June 6-8, 2019

    LEADS Fellows will:

      • Complete an online, self-paced curriculum of approximately 7 to 10 hours of work (late May 2019).
      • Attend the 3-day Drexel Data Science Bootcamp with other LEADS Fellows.
      • Complete a virtual 10-week summer data science internship coordinated with a selected NDP site. (Note: some LEADS Fellows may have more on-site interaction, depending on their location. The virtual model follows the NSF DataONE DataNet and Research Data Alliance (RDA) models, both of which have been very successful.)
      • Develop a communication plan to connect with mentors on a regular basis.
      • Share the results of their summer experience with their home institution.

    Application requirements

      • Applicants must be doctoral students with an interest in data science and library science applications. Their doctoral degree program must be at an institution that also hosts an ALA-accredited master’s degree program.
      • Applicants must complete the application form at: and upload the application materials requested.
      • Applicants must rank their top three choices for their data science summer internship placement on the form.

    Application materials requested

      • One-page statement sharing your interest in the LEADS program and the selected NDP sites. Your one-page statement must address why you seek to learn more about the intersection of library science and data science, and your career goals related to becoming an educator and researcher.
      • Brief statement of your training and experience with at least one of the following statistical packages: Excel, SPSS, R, MATLAB, or SAS (or identify another package).
      • Brief statement of your training and programming skill/experience with any of the following: HTML, XML, JSON, JavaScript, Python, R, Java, or Scala. (The LEADS program anticipates applicants with a range of skills.)
      • 2-page biosketch (Any format is acceptable; for NSF template, link here.)
      • A letter of reference from your advisor or mentor.

    Criteria for selection

      • Clear interest in data science applications in the LIS domain.
      • Relevant connection to the selected NDP sites.
      • Strong letter of support from advisor or mentor.

    LEADS Fellowship Projects at NDP Sites

    LEADS is a virtual fellowship program; students located near their selected site may have more on-site interaction.

    Project partner/site, project title (link to full description), and project outcome:

    1. California Digital Library, University of California, Office of the President
       Project title: Making a Metadata Meritocracy
       Project outcome: Refined workflows for gaining acceptance of proposed metadata terms.

    2. Digital Curation Innovation Center (DCIC), University of Maryland’s iSchool
       Project title: Automating the Detection of Personally Identifiable Information (PII)
       Project outcome: The project will partner with in Seattle to work on processing and releasing these records to the public using community-vetted and approved access policies.

    3. Digital Public Library of America (DPLA)
       Project title: DPLA Resources and Vocabulary Enrichment for Analytics
       Project outcome: A method and an approach for identifying term variations and applying more consistent terminology to support analytics.

    4. Digital Research Services, University Penn Libraries
       Project title: Semi-automatically assigning keywords to medieval manuscripts on OPenn
       Project outcome: Will feed into an ongoing project to improve the search providing access to OPenn, and will also make visualization of the data possible in ways not possible now.

    5. Historical Society of Pennsylvania
       Project title: Enhancing access to historic biographical data through visualization tools
       Project outcome: Improved access to historic data for scholars and other user groups via user-friendly visualization tools. The ability to visualize connections between people, places, and institutions.

    6. OCLC
       Project title: Automatic Identification of Publisher Entities to Support Discovery and Navigation
       Project outcome: Seeks to advance our understanding of the publisher entity in library bibliographic data.

    7. Academy of Natural Sciences
       Project outcome: This project seeks to augment existing biodiversity specimen collection data by linking related descriptions found in the rich tradition of natural history literature.

    8. Digital Scholarship Center, Temple University
       Project title: Automating keyword assignments for the Nineteenth-Century Knowledge Project
       Project outcome: Will contribute to an ongoing project involving the automatic indexing of four digitized historical Encyclopedia Britannicas. This aspect of the project seeks to convert digitized historical controlled vocabularies into SKOS format.

    9. RAMP, Montana State University Library
       Project title: Analyzing the RAMP dataset to better understand content, use, and performance of institutional repositories
       Project outcome: Exploring the potential of the RAMP dataset, including: analyzing the scholarly record across institutional repositories (IR); demonstrating level of IR use; evaluating incentives for improving click-through rates; comparing RAMP download reports with vendor and IRUS download reports; automatically predicting disambiguated structured and ontological metadata, etc.