A crowdsourced metadata dictionary
Would you like experience developing and enhancing a crowdsourced system for building long-lived vocabularies? Here is a chance to significantly reduce the dreadful inefficiency of the traditional process of reaching metadata consensus and communicating that consensus. Our system has several facets calling for enhancements, each of which can be taken on as a small subproject.
What is YAMZ?
YAMZ.net is an open source system that enables anyone to use and comment on other people’s terms, to add and edit their own terms, and to rate others’ terms via reputation-based voting.
YAMZ aims to be a domain-agnostic vocabulary service that is as useful and compelling to ontology developers as Stack Overflow is to software developers. YAMZ maintains terms at any stage of their lifecycle, from evolving, to stable, to deprecated. The goal is to be a high-quality substrate of widely reviewed, well-tested terms.
With every term and definition linked by a unique, actionable persistent identifier (an ARK), YAMZ is well-placed to support applications in linked data, semantic web, and natural language processing (NLP). We’ve already seen use of YAMZ by domain experts interested in reducing duplicative terms (the same term string with different, uniquely identified definitions), in publicizing project-level divergence from domain norms, and in general interoperability.
The YAMZ term substrate is not itself a standard. Instead, it is a large pool of candidate terms for any part of metadata speech, from which informed selections (external to YAMZ) can be made by anyone designing a formal or informal ontology, whether it’s a 3-element pick list for a web app, a 17-column set of spreadsheet headers, or the 154 fields used by an international metadata standard. The practice of linking to YAMZ terms will improve scholarly and scientific communication (indexing, description, use), from cutting-edge research, to historic documents, and everything in between.
What skills we’re looking for
We’re looking for open source developers and analysts with interests and skills in one or more of the following:
- web service UX and API (term import, export)
- database design and manipulation
- social behavior around patterns of consensus, new terminology adoption, and language evolution (emerging, archaic)
- creation of controlled vocabularies (from pick lists to column headers to international metadata standards) based on pools of alternate candidate terms
- core technologies used by YAMZ (currently Python, Flask, SQLAlchemy, and PostgreSQL – see the YAMZ codebase on GitHub)
If you are interested, please contact jakkbl[at]gmail[dot]com.