Virtual (or in-person) National Science Foundation Research Experiences for Undergraduates (REU) opportunity at the Metadata Research Center, Drexel University, as part of the Harnessing the Data Revolution (HDR) Institute for Data Driven Dynamical Design (ID4)
Dates: Mid-June through mid-September
(The start date is flexible, and there is an opportunity to continue work over the Fall ’24 term.)
REU stipend: $5,500
Deadline: Rolling basis (Friday, June 1st for first consideration)
Contacts: Interested applicants should send a resume and a brief statement of interest (one paragraph) explaining why they would like to participate in the REU program to:
- Senior Research Associate John Kunze: jakkbl@gmail.com
- Professor Mat Kelly: mrk335@drexel.edu
- Professor Jane Greenberg: jg3243@drexel.edu
REU Project title: Materials Science Vocabulary Building: Establishing a YAMZ Portal
Project overview and description: Agreement on terminology is critical for human and machine communication supporting scientific research. A shared vocabulary also provides a necessary foundation for data and metadata standards, as well as the basis for labels in machine learning pipelines. This REU project will develop and enhance YAMZ.net by creating a domain-specific portal for materials science and exploring AI integration. YAMZ is a general-purpose, crowdsourced online dictionary that uses reputation-based voting to support community discussion and consensus. REU participants will:
- Develop and test the domain-specific portal in the materials science subdomain
- Explore and pilot the integration of ChatGPT for drawing in definitions
- Document project procedures to enable a generalizable model that can, on demand, present users with a constrained view (or portal) restricted to terms from the materials science subdomain
- Collaborate with project mentors and project staff on a scholarly output (e.g., conference poster, presentation, research paper)
REU applicants for this project should have:
- Exposure to and instruction in at least one of the following disciplines: computer science, data science, chemistry, engineering, physics, or materials science
- Interest in semantic systems (terminology/vocabulary) and their value for representation, machine learning, and AI
- Knowledge of the value of data standards for human-to-human, human-to-machine, and machine-to-machine communication
- Knowledge of database and data science software (SQL, Tableau, Orange, etc.)
- Experience with Python, Flask or a similar web framework, or other coding experience
Applicant restrictions
- Must be a current undergraduate (not yet graduated) at an institution other than Drexel
- May work remotely or onsite
- Must be a U.S. citizen or permanent resident of the United States or its possessions
Research Goals
- Advance YAMZ.net features supporting domain-specific portals (e.g., tagging, group ownership of terms and portals)
- Explore and pilot AI integration into YAMZ.net
- Develop ways for domain-specific communities to be mostly self-sufficient in creating and managing portals
Learning Goals
- Gain R&D experience with a working online dictionary, and understand the tradeoffs between domain-agnostic and domain-specific portals
- Advance semantic research skills and data science/computer science skills
- Gain a better understanding of the complex questions surrounding terminology agreement and of its importance for scientific communication and research