ID4: Institute of Data Driven Dynamical Design
- DATE: April 15-16, 2024
- LOCATION: Quorum – University City Science Center, 3675 Market Street, Philadelphia PA, 19104
April 15, 2024
8:15 AM-9:00 AM | Coffee/light breakfast |
9:00 AM-9:15 AM | Session 1: Opening Session • Welcome (Professors Jane Greenberg and Yuan An, CCI/Drexel, ID4) • Workshop goals and ground rules (Jane Greenberg) |
9:15 AM-9:30 AM | Session 2: Ice breaker and breakout group activity (all workshop attendees) • What is AI-ready data? Group definitions • Why/when is it important to consider (or not consider) metadata and semantically-oriented ontological systems? |
9:30 AM-9:55 AM | Session 3: Keynote • FAIR AI-Ready Data and AI Models in Particle Physics (Mark Neubauer, Professor, University of Illinois at Urbana-Champaign, A3D3) (Session moderator: Christine Kirkpatrick, UC San Diego Supercomputer Center) |
9:55 AM-10:35 AM | Session 4: Research Bottlenecks: The BGNN to Imageomics Case Study • Enumerating and Addressing AI-Readiness Challenges from the Biology Guided Neural Networks HDR Project (Joel Pepper, et al., Drexel University) [Slides] • Our Journey on AIR: Problems and Solutions (Yasin Bakis, Tulane University Biodiversity Research Institute (TUBRI), BGNN/Imageomics) [Slides] • FAIR, Modular and Reproducible ML Workflows for Domain Scientists: An Imageomics Case Study (Hilmar Lapp, Duke University) [Slides] (Session moderator: David Breen, Drexel, BGNN/Imageomics) |
10:35 AM-10:50 AM | AM coffee break |
10:50 AM-11:40 AM | Session 5A: Human in the Loop: Curation, Data Annotation, and Metadata Generation for ML/AI • Curating Human-Robotics Training Datasets for Machine Learning (Maria Esteva, Texas Advanced Computing Center/HDR iHARP) [Slides] • Metadata, LLMs and Materials Synthesis Mining (Xintong Zhao et al., Drexel University, and Univ. Central Florida collaborators, ID4) [Slides] • Ground Truth: Metadata Accuracy Dilemmas in Training AI/ML Models (Bahareh Shakibajahromi, et al., ZF Passive Safety Systems) [Slides] (Session moderator: Richard Marciano, AI-Collaboratory, University of Maryland, College Park) |
11:40 AM-12:15 PM | Session 5B: Human in the Loop: Curation, Data Annotation, and Metadata Generation for ML/AI • Harnessing Generative AI to Support Exploration and Discovery in Library and Archival Collections (Lori Perine, Rajesh Kumar Gnanasekaran, & Richard Marciano, AI-Collaboratory, University of Maryland, College Park) [Slides] • Image Informatics: Automatic Metadata Extraction for ML Applications (Andrew Senin, Susquehanna Int’l Group, BGNN/Imageomics) [Slides] (Session moderators and group discussion/identifying themes: Jianwu Wang, UMBC, iHARP, and Mark Underwood, Information Security Strategic Initiatives Advisor) |
12:15 PM-1:15 PM | Lunch |
1:15 PM-1:45 PM | Session 6: Knowledge Extraction, Ontologies and Semantic Systems for AI • Semantic Technology and Artificial Intelligence Applications in Earth and Environmental Science (Anne Thessen, Anschutz School of Medicine, Univ. of Colorado) [Slides] • Preparing Vocabulary and Text for Algorithms and Machine Learning (Kio Polson, et al., Drexel University, ID4) [Slides] (Session moderator: Mark Neubauer, UIUC, A3D3) |
1:45 PM-2:45 PM | Session 7: AI Empowered Knowledge Graphs • Open Knowledge Networks, Knowledge Graphs and AI-ready Data (Florence Hudson, Northeast Big Data Hub) [Slides] • Large Language Models and Knowledge-Indicative Graphs (Bowen Jin, UIUC, I-GUIDE) [Slides] • Knowledge Graph Question Answering in Materials Science (KGQA4MAT) (Yuan An, et al., Drexel University, ID4) [Slides] • Graphical Materials Histories: Making the Invisible Visible (David Elbert, Johns Hopkins University) [Slides] (Session moderator: Alex Kalinowski, Drexel University) |
2:45 PM-3:00 PM | Afternoon coffee break/snacks |
3:00 PM-3:15 PM | Session 8: Another View of AI-Ready Data • AI-ready Geospatial Data: Structured Hypercube for Geospatial Knowledge Understanding (Wei Hu, UIUC, I-GUIDE) [Slides] (Session moderator: Fernando Uribe-Romo, University of Central Florida, ID4) |
3:15 PM-3:45 PM | Session 9: Group activity: Revising AI-ready definition/s and discussion topics Brief activity • AI-ready definition/s modifications • New ideas/modifications on why/when it’s important to consider metadata and semantically-oriented ontological systems? New focus • Identify educational challenges and opportunities specific to metadata and semantic ontologies for AI-ready data • What topic/s are missing from today’s discussion? • Prep for brief around-the-room group report-outs |
3:45 PM-4:00 PM | Session 10: Day 1: Closing session • Around-the-room group report-outs • Day 2 plans and preparations • Poster shout-it-outs (Session moderators: Jane Greenberg and Yuan An, Drexel, ID4) |
4:00 PM-5:30 PM | Poster session (College of Computing & Informatics, 10th floor lobby of 3675 Market Street, Philadelphia, PA; same building as the Quorum) |
April 16, 2024
8:15 AM-9:00 AM | Coffee/light breakfast |
9:00 AM-9:15 AM | Session 1: Opening Session • Welcome and workshop goals Day-2 (Jane Greenberg, Drexel, ID4) • Reflections on Day 1 topics (moderator/s t.b.c.) |
9:15 AM-10:00 AM | Session 2A: Data Management, FAIR practices, and Prepping for AI-Ready Pipelines • Dryad and re-curation (Ryan Scherle, Dryad Repository and CIC/Northeast Big Data Hub) [Slides] • DataFed: Making Science Repeatable with ML Pipelines (Joshua Brown, Oak Ridge National Laboratory) [Slides] • Showcase: Using FishAIR within a Data Production Pipeline (Xiaojun Wang, Bahadir Altintas, Tulane University Biodiversity Research Institute) [Slides] (Session moderator: Megan Force, Clarivate) |
10:00 AM-10:45 AM | Session 2B: Data Management, FAIR practices, and Prepping for AI-Ready Pipelines • FAIR Re-use: Implications for AI-Readiness (Lydia Fletcher, Texas Advanced Computing Center, iHARP) [Slides] • FAIR and ML, AI Readiness and AI Reproducibility (FARR) (Christine Kirkpatrick, UC San Diego Supercomputer Center) [Slides] • AI-ready data and distinctions from FAIR data (Zachary Trautt, National Institute of Standards and Technology) (Session moderator: Juliane Schneider, Pacific Northwest National Laboratory (PNNL)) |
10:45 AM-11:00 AM | AM coffee break |
11:00 AM-12:00 PM | Session 3: Standards Development, Adoption, and Implementation: Realities and Fictions • Standards Development, Adoption, and Implementation: Realities and Fictions (Nettie Lagace, NISO) [Slides] • Research Data Alliance-US (Robert Quick, Indiana University) [Slides] • Innovating the Standards Process with YAMZ: Yet Another Metadata Zoo and AI Implications (Isabel Moreira de Oliveira, et al., Princeton University, and Scott McClellan, Drexel University, et al., ID4) [Slides] • MaRDA/MaRCN: AI Efforts, Working Groups & Best Practices (Laura Bartolo, MaRCN) [Slides] (Session moderator: David Elbert, Johns Hopkins University) |
12:00 PM-1:15 PM | Lunch and optional tour/s • Drexel’s historical building and Philadelphia skyscape view • On your own, go see ENIAC • Rest/re-set for final session |
1:15 PM-2:15 PM | Session 4: Industry/Government panel • Semion Saikin, Kebotix, ID4 • Mark Underwood, Co-founder, Information Security Strategic Initiatives Advisor • Juliane Schneider, Pacific Northwest National Laboratory (PNNL) (Session moderator: Rachel Frick, OCLC) |
2:15 PM-3:00 PM | Session 5: AI Research Reproducibility, Validity, and Sharing Models • A Field Polarized by AI: How to Navigate the Conclusions and Delusions (Josh Agar, Drexel University, collaborator with A3D3 researchers) • Metadata for Reproducible Big Data Analytics in the Cloud (Jianwu Wang, UMBC, iHARP) [Slides] • AI Data Readiness and Model Sharing in Computational Health and Climate Sciences (Sanjay Purushotham, UMBC, iHARP) [Slides] (Session moderator: Shih-Chieh Hsu, University of Washington, A3D3) |
3:00 PM-3:15 PM | Afternoon coffee break |
3:15 PM-3:40 PM | Session 6: Final group activity/discussion • Finalize group definition/s on AI-ready data • Final statements on the importance of metadata and semantically-oriented ontological systems • Concrete steps and dream ideas for advancing AI-ready data approaches with metadata and semantic ontologies |
3:40 PM-4:00 PM | Session 7: Final reporting of groups/around the room |
4:00 PM-4:30 PM | Session 8: Workshop wrap-up • Collective white paper logistics • Workshop thank yous/closure |
NOTE: Agenda is fairly well set, but subject to change.