The problem

Service delivery systems collect and manage similar data about the same people, but they model it differently. A person in one system has a "gender" field with codes 1/2/3. Another uses "M/F/O". A third uses "male/female/non-binary/prefer_not_to_say".

This is not just a technical annoyance. When systems cannot understand each other's data, organizations cannot identify who is receiving benefits across programs, refer beneficiaries between services, consolidate reporting across agencies, or detect duplication and gaps in coverage.

The usual response is to build integration middleware that translates between systems case by case. This is expensive, fragile, and does not scale.

The insight

The structural differences between systems are surprisingly small. When we mapped 6 major systems across social protection (OpenSPP, openIMIS, SPDCI, DHIS2, FHIR R4, OpenCRVS), we found that all 6 had some version of Person, Enrollment, and Payment. The structures converge. The vocabularies diverge.

The real interoperability challenge is vocabulary alignment: agreeing on what values mean, not what fields exist.

The vision

An open, composable vocabulary of concepts and properties for public services. Think schema.org, but for public service delivery.

Schema.org does not tell websites what data to collect. It provides a shared language: "here is what Person means, here are properties that can describe a Person, here are standard values for gender." Websites adopt the terms that apply to them.

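Similarly, PublicSchema provides a shared language for public service delivery: concepts, the properties that describe them, and vocabularies of standard values.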

Every element gets a stable URI. Everything is optional. Systems adopt what they need.

Design principles

Semantic, not structural. Concepts carry meaning. A Person is not a bag of fields. Definitions are written for domain practitioners, not for software developers. The vocabulary should be legible to a social protection policy officer.

Properties are independent (until they are not). A property like start_date is defined once and used across multiple concepts. However, when a shared property needs concept-specific value sets (e.g., status means different things on an Enrollment vs. a Grievance), the property definition specializes rather than pretending the differences do not exist.
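A minimal sketch of how a shared property might specialize per concept. The Property class, its field names, and the status values below are illustrative assumptions, not PublicSchema's actual definitions or value sets.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Property:
    """A vocabulary property, defined once and reused across concepts (hypothetical model)."""

    name: str
    # Value set used when a concept has no override; None means the value is unconstrained.
    default_values: frozenset[str] | None = None
    # Concept name -> value set that applies only when the property is used on that concept.
    per_concept_values: dict[str, frozenset[str]] = field(default_factory=dict)

    def allowed_values(self, concept: str) -> frozenset[str] | None:
        """Return the concept-specific value set if one exists, else the shared default."""
        return self.per_concept_values.get(concept, self.default_values)


# start_date means the same thing everywhere, so it needs no specialization.
start_date = Property(name="start_date")

# status is one property, but its value set depends on the concept (values are illustrative).
status = Property(
    name="status",
    per_concept_values={
        "Enrollment": frozenset({"active", "suspended", "exited"}),
        "Grievance": frozenset({"open", "under_review", "resolved"}),
    },
)
```

Here status.allowed_values("Enrollment") and status.allowed_values("Grievance") return different sets, while start_date behaves identically on every concept.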

Temporally grounded. Almost everything in public service delivery is time-bounded: enrollment periods, benefit cycles, eligibility windows, payment schedules. A status snapshot without a validity period is incomplete. Temporal context is a first-class concern, not an afterthought.
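To make the point concrete, here is a sketch of a status that carries its own validity period. The StatusPeriod class, the inclusive end date, and the sample history are assumptions made for illustration, not part of the vocabulary.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StatusPeriod:
    """A status value together with the period during which it holds (hypothetical model)."""

    value: str
    start_date: date
    end_date: date | None = None  # None means the status is still in effect

    def is_valid_on(self, day: date) -> bool:
        """True if `day` falls within the period; the end date is treated as inclusive."""
        if day < self.start_date:
            return False
        return self.end_date is None or day <= self.end_date


def status_on(history: list[StatusPeriod], day: date) -> str | None:
    """Return the first status in `history` that is valid on `day`, or None if none applies."""
    for period in history:
        if period.is_valid_on(day):
            return period.value
    return None


# Illustrative enrollment history: active for six months, then suspended with no end date.
history = [
    StatusPeriod("active", date(2024, 1, 1), date(2024, 6, 30)),
    StatusPeriod("suspended", date(2024, 7, 1)),
]
```

With this shape, the question "what was the status on 15 March 2024?" has an answer, which a bare snapshot value cannot provide.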

Vocabularies reference standards. Never invent what already exists. For gender, marital status, country, currency: adopt existing international standards and map system-specific codes to them. Only define value sets for domain-specific concepts where no standard exists.
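Mapping system-specific codes to a shared value set could look like the sketch below. The system names, the meaning assigned to the numeric codes 1/2/3, and the canonical values are all assumptions for illustration; in practice the canonical set would come from whichever international standard is adopted.

```python
from __future__ import annotations

# Hypothetical canonical value set standing in for an adopted standard.
CANONICAL_GENDER = {"male", "female", "other", "not_stated"}

# One mapping table per source system. The interpretation of each local code is assumed.
GENDER_MAPPINGS: dict[str, dict[str, str]] = {
    "system_a": {"1": "male", "2": "female", "3": "other"},
    "system_b": {"M": "male", "F": "female", "O": "other"},
    "system_c": {
        "male": "male",
        "female": "female",
        "non-binary": "other",
        "prefer_not_to_say": "not_stated",
    },
}


def to_canonical(system: str, code: object) -> str | None:
    """Translate a local code to the canonical value, or None if no mapping is known."""
    return GENDER_MAPPINGS.get(system, {}).get(str(code))


def unmapped_targets() -> set[str]:
    """Return any mapping targets that are not in the canonical set (should be empty)."""
    targets = {value for table in GENDER_MAPPINGS.values() for value in table.values()}
    return targets - CANONICAL_GENDER
```

Returning None for an unknown code, rather than guessing, keeps gaps in the mapping visible so they can be documented instead of silently coerced.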

Everything is optional. There is no "you must implement these fields to be compliant." Systems adopt the concepts, properties, and vocabularies that apply to them. The vocabulary is descriptive, not prescriptive.

Evidence-based. Convergence data from analysis of 6 systems across 18+ concepts informs which entities, properties, and vocabularies are prioritized. Where systems agree, PublicSchema codifies the consensus. Where they diverge, the vocabulary documents the variation.

VC-ready. Stable URIs, JSON-LD contexts, and vocabulary values are designed for use in Verifiable Credential schemas. Credential design accounts for selective disclosure, because credentials often contain sensitive personal data that should not be fully revealed in every presentation.
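The sketch below shows the data-shaping side of this idea: terms resolve to stable URIs through a JSON-LD style context, and a presentation carries only the claims the holder chooses to disclose. The base URI, the term names, and the build_subject helper are hypothetical. Real selective disclosure relies on cryptographic mechanisms that this example does not implement.

```python
from __future__ import annotations

import json

BASE = "https://vocab.example.org/"  # hypothetical base URI, not the real namespace

# Term -> stable URI. A real JSON-LD context would be published alongside the vocabulary.
CONTEXT = {
    "Person": BASE + "Person",
    "given_name": BASE + "given_name",
    "date_of_birth": BASE + "date_of_birth",
    "enrollment_status": BASE + "enrollment_status",
}


def build_subject(claims: dict[str, str], disclose: list[str]) -> dict:
    """Build a JSON-LD shaped subject containing only the disclosed claims."""
    missing = [term for term in disclose if term not in claims or term not in CONTEXT]
    if missing:
        raise KeyError(f"cannot disclose unknown terms: {missing}")
    # Include context entries only for the type and the terms actually present.
    context = {term: CONTEXT[term] for term in ["Person", *disclose]}
    subject = {"@context": context, "@type": "Person"}
    subject.update({term: claims[term] for term in disclose})
    return subject


# Illustrative claims held in a credential.
claims = {
    "given_name": "Amina",
    "date_of_birth": "1990-04-12",
    "enrollment_status": "active",
}

# Present only the enrollment status; name and date of birth stay undisclosed.
presentation = build_subject(claims, ["enrollment_status"])
print(json.dumps(presentation, indent=2))
```

Because every term maps to a URI, a verifier that has never seen the issuing system can still tell what enrollment_status refers to.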

Incremental. Start with what we know, get it right, extend when ready. No grand architecture upfront.

Starting with social protection

PublicSchema begins with social protection for three reasons: the domain is well understood and undergoing active digital transformation in many countries, multiple open-source systems exist alongside emerging standards, and the interoperability pain is immediate and concrete.

Initial concept coverage includes people and identity (Person, Household, Family, Group, Identifier, Address, Location), program delivery (Program, Enrollment, Entitlement, EligibilityDecision, AssessmentFramework, AssessmentEvent), payments (PaymentEvent), and accountability (Grievance, Referral).

As the vocabulary matures, it will extend to adjacent domains: civil registration, health referrals, education, humanitarian response, land administration. The structure is the same; only the concepts and vocabularies change.

Related standards

PublicSchema is designed to complement existing efforts, not compete with them. It sits at the delivery lifecycle vocabulary layer, between identity standards (EU Core Vocabularies), API interoperability (DCI/SPDCI, GovStack), and trust infrastructure (W3C VC, EBSI).

See Related Standards for a detailed comparison with DCI, EU Core Vocabularies, GovStack, FHIR, Schema.org, and other initiatives.

Governance

The project is maintained by a small team to enable fast, opinionated decisions in the early stages. Feedback is actively sought from domain experts, system implementers, and standards bodies.

As adoption grows, governance will expand: first to an advisory group of contributors and domain experts, then to a formal multi-stakeholder structure. The right governance model will emerge from who actually uses and contributes to the vocabulary.

Roadmap

Phase 1: Vocabulary foundation

Define concepts and properties with stable URIs, following the schema.org pattern. Write each definition for domain practitioners. Publish as a reference website. Identify champion systems for early validation.

Phase 2: Vocabulary standards and mappings

Research and adopt existing international standards for each value set. Define canonical value sets where no standard exists. Build cross-system vocabulary mappings. Pilot with at least one country deployment.

Phase 3: Adoption and extension

Formalize governance. Validate mappings with system implementers and country deployments. Document real-world adoption patterns. Extend to adjacent public service domains.