This document defines a mechanism for proof formats that supports selective disclosure without the need for a Holder to obtain a new Verifiable Credential from an Issuer.
Implementers are advised to consult this guide if they are directly involved with the W3C VC Working Group.
We use the term `proof` in place of `signature` throughout this document. This is important because not all cryptographic proving techniques rely exclusively on a single digital signature.
See [[DID-CORE]] for definitions of commonly-used DID terminology.
See [[VC-DATA-MODEL]] for definitions of commonly-used Verifiable Credential terminology.
Single message signature schemes make generic selective disclosure proofs difficult or impossible to implement on top of standard cryptographic tooling.
Traditional signature and proof formats have focused on single message signature and verification schemes.
For example, when a JWT encodes a Verifiable Credential, the input to the signature and verification algorithms is:
"base64url(JSON.stringify(header)).base64url(JSON.stringify(payload))"
{ "alg": "EdDSA", "kid": "did:key:z6MkokrsVo8DbGDsnMAjnoHhJotMbDZiHfvxM4j65d8prXUr#z6MkokrsVo8DbGDsnMAjnoHhJotMbDZiHfvxM4j65d8prXUr" }
{ "iss": "did:key:z6MkokrsVo8DbGDsnMAjnoHhJotMbDZiHfvxM4j65d8prXUr", "sub": "did:example:ebfeb1f712ebc6f1c276e12ec21", "vc": { "@context": [ "https://www.w3.org/2018/credentials/v1", "https://w3id.org/security/suites/jws-2020/v1" ], "id": "http://example.edu/credentials/3732", "type": [ "VerifiableCredential" ], "issuer": { "id": "did:key:z6MkokrsVo8DbGDsnMAjnoHhJotMbDZiHfvxM4j65d8prXUr" }, "issuanceDate": "2010-01-01T19:23:24Z", "credentialSubject": { "id": "did:example:ebfeb1f712ebc6f1c276e12ec21" }, "proof": { "type": "JsonWebSignature2020", "created": "2010-01-01T19:23:24Z", "verificationMethod": "did:key:z6MkokrsVo8DbGDsnMAjnoHhJotMbDZiHfvxM4j65d8prXUr#z6MkokrsVo8DbGDsnMAjnoHhJotMbDZiHfvxM4j65d8prXUr", "proofPurpose": "assertionMethod", "jws": "eyJhbGciOiJFZERTQSIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..k_7t6h5IGSWFAqIlqru3zyZ0FDPQGo88p9jDeKC1yw8oxd7xj6B70tZNSaspWkMyWbXFmZ5yCO8dlZZ9_kKbAQ" } }, "jti": "http://example.edu/credentials/3732", "nbf": 1262373804 }
In the case of JSON-LD Linked Data Proofs, the input to the signature is typically calculated like this:
// Excerpt: methods of a signature suite class. `jsonld` and `sha256` are assumed to be in scope.
async canonize(
  input,
  { documentLoader, expansionMap, skipExpansion }
) {
  return jsonld.canonize(input, {
    algorithm: 'URDNA2015',
    format: 'application/n-quads',
    documentLoader,
    expansionMap,
    skipExpansion,
    useNative: this.useNativeCanonize,
  });
}

async canonizeProof(proof, { documentLoader, expansionMap }) {
  // `jws` must not be included in the proof
  proof = { ...proof };
  delete proof.jws;
  return this.canonize(proof, {
    documentLoader,
    expansionMap,
    skipExpansion: false,
  });
}

async createVerifyData({ document, proof, documentLoader, expansionMap }) {
  // Canonize the proof options and the document separately, then hash each.
  const c14nProofOptions = await this.canonizeProof(proof, {
    documentLoader,
    expansionMap,
  });
  const c14nDocument = await this.canonize(document, {
    documentLoader,
    expansionMap,
  });
  // The signing input is the concatenation of the two digests.
  return Buffer.concat([
    await sha256(c14nProofOptions),
    await sha256(c14nDocument),
  ]);
}
While the JSON-LD approach is more complex, it performs the same function as the base64url and string encoding used by JOSE.
At the end of these "payload preparation" steps, a digital signature `sign` or `verify` operation is used.
Tampering with a payload breaks an associated signature.
This requires a holder to return to the issuer for a new verifiable credential when attempting to reveal a subset of the claims the issuer has attested to in the original verifiable credential.
Requiring a Holder to interact with the original issuer harms privacy, can be expensive in time and bandwidth, and can be impossible in offline scenarios.
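The effect described above can be illustrated with a minimal sketch, assuming Node.js and its built-in `crypto` module. The key pair, the claim names, and the helper names `b64u` and `signingInput` are hypothetical and exist only for this example; this is not the code of any particular JOSE library.

```typescript
import { generateKeyPairSync, sign, verify } from 'crypto';

// base64url-encode a UTF-8 string (Node.js Buffer supports the 'base64url' encoding).
const b64u = (s: string): string => Buffer.from(s, 'utf8').toString('base64url');

// Single-message signing input: base64url(header) + "." + base64url(payload).
const signingInput = (header: object, payload: object): Buffer =>
  Buffer.from(`${b64u(JSON.stringify(header))}.${b64u(JSON.stringify(payload))}`, 'ascii');

// Hypothetical issuer key pair, generated only for this example.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');

const header = { alg: 'EdDSA' };
const payload = { iss: 'did:example:issuer', sub: 'did:example:123', nickName: 'Bob' };

// Ed25519 takes no separate digest algorithm, so the first argument is null.
const signature = sign(null, signingInput(header, payload), privateKey);

// The holder drops one claim without going back to the issuer.
const redacted = { iss: payload.iss, sub: payload.sub };

const originalVerifies = verify(null, signingInput(header, payload), publicKey, signature);
const redactedVerifies = verify(null, signingInput(header, redacted), publicKey, signature);
console.log({ originalVerifies, redactedVerifies });
```

Because the whole payload is one signed message, removing a single claim changes the signing input and the original signature no longer verifies.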
How can a holder reveal some subset of issuer attested claims to a verifier, without contacting the issuer or asking the verifier to contact the issuer? Solutions to this problem are often referred to as Selective Disclosure.
A multi message proof provides cryptographic tamper protection and authentication capabilities for a set of messages.
Because the `payload` of the proof is broken up before the `sign` and `verify` operations, the holder can disclose parts of the `payload` and parts of the `proof` without breaking the cryptographic assurances.
There are a few examples of this approach under development:
A multi message proof that is applied to an object will require some stable transformations between `messages` and `object`. See the section .
This suite proposes a solution for selective disclosure of issuer attested claims (verifiable credentials).
Unlike previous solutions such as CL Signatures or BBS+ Signatures 2020, this approach does not rely on Zero Knowledge Proofs; instead, it relies on Merkle Proofs.
A key advantage of using merkle proofs is proving set membership by only relying on cryptographic hash functions.
Because a verifier will learn some information about undisclosed set members when verifying a proof for disclosed ones, this solution does leak some information. The information a verifier learns is the path from a leaf to a merkle root, which proves a member exists in the set, but this path is built from hashes of members of the set the prover may not be disclosing.
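The following is a minimal sketch of set membership proofs built only from a hash function, assuming Node.js and SHA-256. It is not the implementation used by the proof of concept: the function names, the rule for promoting an unpaired node, and the sample messages are assumptions made for illustration, and the sketch does not address second pre-image or unbalanced tree attacks.

```typescript
import { createHash } from 'crypto';

const sha256 = (data: Buffer): Buffer => createHash('sha256').update(data).digest();

// One step of a proof: a sibling hash and which side of the running hash it sits on.
type ProofStep = { sibling: Buffer; siblingOnLeft: boolean };

// Build every level of the tree, from hashed leaves (level 0) up to the root.
// Assumption for this sketch: an unpaired node is promoted to the next level unchanged.
const buildLevels = (messages: string[]): Buffer[][] => {
  let level: Buffer[] = messages.map(m => sha256(Buffer.from(m, 'utf8')));
  const levels: Buffer[][] = [level];
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(i + 1 < level.length ? sha256(Buffer.concat([level[i], level[i + 1]])) : level[i]);
    }
    levels.push(next);
    level = next;
  }
  return levels;
};

const merkleRoot = (levels: Buffer[][]): Buffer => levels[levels.length - 1][0];

// Collect the sibling hashes on the path from leaf `index` to the root.
const merkleProof = (levels: Buffer[][], index: number): ProofStep[] => {
  const proof: ProofStep[] = [];
  let i = index;
  for (const level of levels.slice(0, -1)) {
    const siblingIndex = i % 2 === 0 ? i + 1 : i - 1;
    if (siblingIndex < level.length) {
      proof.push({ sibling: level[siblingIndex], siblingOnLeft: i % 2 === 1 });
    }
    i = Math.floor(i / 2);
  }
  return proof;
};

// Recompute the root from a disclosed message and its proof.
const verifyProof = (message: string, proof: ProofStep[], root: Buffer): boolean => {
  let hash = sha256(Buffer.from(message, 'utf8'));
  for (const { sibling, siblingOnLeft } of proof) {
    hash = sha256(Buffer.concat(siblingOnLeft ? [sibling, hash] : [hash, sibling]));
  }
  return hash.equals(root);
};

const messages = ['{"/id":"urn:example:1"}', '{"/issuer":"did:example:issuer"}',
  '{"/credentialSubject/alsoKnownAs":"did:example:123"}',
  '{"/credentialSubject/nickName":"Bob"}', '{"/issuanceDate":"2010-01-01T19:23:24Z"}'];
const levels = buildLevels(messages);
const root = merkleRoot(levels);
const proof = merkleProof(levels, 2);
console.log(verifyProof(messages[2], proof, root));
```

Note that the first step of the proof for `messages[2]` is the plain hash of `messages[3]`, a member the holder may not be disclosing. This is the leak described above.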
A robust summary of merkle proofs is beyond the scope of this specification. The proof of concept we build relies on this implementation. The diagram below is from the Wikipedia page on merkle trees.
The most popular solution to encoding digital signatures that rely on standard cryptography in JSON is [[RFC7515]].
A robust summary of JSON Web Signatures is beyond the scope of this specification.
By using a standard digital signature approach to sign the `merkle root`, a holder can then disclose `messages` and `proofs`, which can be verified as originating from the issuer who produced the signature using their private key.
An advantage of building selective disclosure proofs on top of JWS is that keys already in use for single message proofs can be used with multi message selective disclosure proofs.
[[RFC7515]] has been implemented in many languages. JWS and JWT are used as the foundation of most modern identity assurance systems.
One of the disadvantages of merkle proofs is their size.
As you can see in the merkle tree diagram, the size of a single set membership proof is O(log n). Depending on the size of the associated hashes, this can make disclosures that reveal all but a few members of a set very expensive in proof size, because each revealed member needs its own proof.
Luckily, membership proofs share common nodes in the tree, allowing compression algorithms to provide a significant advantage when disclosing most of the members of a set.
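To make the size argument concrete, here is a small arithmetic sketch. It assumes a balanced binary tree, 32-byte hashes (as with SHA-256), and one independent proof per disclosed member; the function names are hypothetical.

```typescript
// Number of sibling hashes in one membership proof for a tree with n leaves:
// the tree height, found by halving the level width until one node remains.
const proofLength = (n: number): number => {
  let height = 0;
  for (let width = n; width > 1; width = Math.ceil(width / 2)) height++;
  return height;
};

// Uncompressed size in bytes when k members are disclosed, each with its own proof.
const uncompressedBytes = (n: number, k: number, hashBytes = 32): number =>
  k * proofLength(n) * hashBytes;

// Size in bytes of every node in the tree. No set of proofs can contain more
// distinct hashes than this, so anything above it is repetition.
const allNodesBytes = (n: number, hashBytes = 32): number => (2 * n - 1) * hashBytes;

const n = 64; // members in the set
const k = 63; // members disclosed (all but one)
console.log(proofLength(n), uncompressedBytes(n, k), allNodesBytes(n));
```

With 64 members and 63 of them disclosed, the independent proofs carry roughly three times as many bytes as the whole tree contains, which is why shared nodes compress well.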
In our proof of concept we use this compression implementation, which is essentially the same as gzip.
Compressed encoding of merkle proofs is an area where better standards are needed. The solution we have used is subject to BREAKING CHANGES.
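As a rough illustration of why compression helps, the sketch below deflates a set of proofs that share nodes. It assumes Node.js `zlib`, and the JSON-of-hex-strings serialization is invented for this example; the actual encoding used by the proof of concept is not specified here and may differ.

```typescript
import { createHash } from 'crypto';
import { deflateSync, inflateSync } from 'zlib';

const hexHash = (s: string): string => createHash('sha256').update(s).digest('hex');

// Hypothetical proofs: each has one unique sibling plus three nodes shared by all proofs.
const shared = ['node-a', 'node-b', 'node-c'].map(hexHash);
const proofs: string[][] = [];
for (let i = 0; i < 16; i++) {
  proofs.push([hexHash(`sibling-${i}`), ...shared]);
}

// Serialize, compress, and encode for embedding in a JSON proof object.
const raw = Buffer.from(JSON.stringify(proofs), 'utf8');
const encoded = deflateSync(raw).toString('base64');

// Reverse the steps to recover the proofs.
const decoded: string[][] = JSON.parse(
  inflateSync(Buffer.from(encoded, 'base64')).toString('utf8')
);

console.log({ rawBytes: raw.length, encodedChars: encoded.length });
```

The repeated shared nodes are stored once by the compressor, so the encoded form is noticeably shorter than the raw serialization even after base64 expansion.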
This suite specification describes an approach to selective disclosure proofs that is based on the original [[LD-PROOFS]] specification.
We are working with the community to develop this same proof technique for use without [[LD-PROOFS]] at the DIF Applied Cryptography Working Group. There is currently no registered way to encode multi message proofs in JOSE, but we are working with the community to remedy this.
There are 2 unsupported features which we require to enable multi message disclosure proofs in JOSE.
JSON-LD based proofs already support these requirements, as was first demonstrated in [[LDP-BBS2020]]. This suite takes a more generic approach to the problem in order to support normalization algorithms that operate on JSON (which might or might not be JSON-LD).
In order to support signing and verifying of objects where object members are disclosed or omitted, a bi-directional lossless message conversion process is required.
In our proof of concept we name two functions: `objectToMessages` and `messagesToObject`.
It is important that these processes be stable, such that chaining them together does not result in an object that is different than the input.
[[RFC6901]] defines operations over JSON objects that are sufficient for use with this suite.
Here is some TypeScript code that implements our required functions:
import pointer from 'json-pointer';

// Flatten an object into one message per leaf value, keyed by its JSON Pointer.
const objectToMessages = (obj: any): string[] => {
  const dict = pointer.dict(obj);
  // JSON.stringify keeps value types and escapes quotes, so the conversion is lossless.
  return Object.keys(dict).map(key => JSON.stringify({ [key]: dict[key] }));
};

// Rebuild an object from messages by setting each pointer to its value.
const messagesToObject = (messages: string[]): any => {
  const obj = {};
  messages
    .map(m => JSON.parse(m))
    .forEach(m => {
      const [key] = Object.keys(m);
      pointer.set(obj, key, m[key]);
    });
  return obj;
};

export { objectToMessages, messagesToObject };
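The stability requirement can be checked with a round trip. The sketch below is a dependency-free approximation of the two functions, written only so the example runs without installing a package; the names `toMessages` and `fromMessages` and the sample object are hypothetical, and empty objects and arrays are not preserved by this simplified version.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// RFC 6901 escaping: "~" becomes "~0" and "/" becomes "~1".
const escapeToken = (t: string): string => t.replace(/~/g, '~0').replace(/\//g, '~1');
const unescapeToken = (t: string): string => t.replace(/~1/g, '/').replace(/~0/g, '~');

// Flatten a value into one message per leaf: {"<json pointer>": <leaf value>}.
const toMessages = (value: Json, path = ''): string[] => {
  if (value === null || typeof value !== 'object') {
    return [JSON.stringify({ [path]: value })];
  }
  const container = value as { [key: string]: Json }; // arrays are indexed by "0", "1", ...
  const out: string[] = [];
  for (const key of Object.keys(container)) {
    out.push(...toMessages(container[key], `${path}/${escapeToken(key)}`));
  }
  return out;
};

// Rebuild an object by setting each pointer to its value.
const fromMessages = (messages: string[]): Json => {
  const root: any = {};
  for (const message of messages) {
    const parsed = JSON.parse(message);
    const ptr = Object.keys(parsed)[0];
    const tokens = ptr.split('/').slice(1).map(unescapeToken);
    let node: any = root;
    tokens.forEach((token, i) => {
      if (i === tokens.length - 1) {
        node[token] = parsed[ptr];
      } else {
        // Create an array when the next token is an array index, otherwise an object.
        if (node[token] === undefined) node[token] = /^\d+$/.test(tokens[i + 1]) ? [] : {};
        node = node[token];
      }
    });
  }
  return root;
};

const credential: Json = {
  id: 'http://example.edu/credentials/3732',
  type: ['VerifiableCredential'],
  credentialSubject: { alsoKnownAs: 'did:example:123', 'a/b': 1 },
};
const asMessages = toMessages(credential);
const rebuilt = fromMessages(asMessages);
console.log(asMessages);
```

Chaining the functions in either direction should return the input unchanged: `fromMessages(toMessages(credential))` reproduces the object, and `toMessages(fromMessages(asMessages))` reproduces the messages.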
[[RDF-DATASET-NORMALIZATION]] defines operations over JSON-LD objects that are sufficient for use with this suite.
This normalization approach is different from [[LD-PROOFS]] and [[LDP-BBS2020]]. The reason for the difference is to address a common way to encode object payloads as messages that is not bound to RDF, but remains compatible with it.
URDNA2015 normalization is not recommended due to its fragility with respect to context changes.
See the source code here.
In our proof of concept we use the SHA-256 hash algorithm and a binary encoding of merkle proofs.
Standard encoding of merkle proofs is an area for future work.
See the source code here.
In order to mitigate a verifier's ability to brute force set membership, this suite requires a disclosure to be derived with unique nonces deterministically generated from the original credential.
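The sketch below shows why nonces matter. It assumes Node.js `crypto`; the seed, the HMAC-based nonce derivation, and the function names are hypothetical choices for illustration and are not the derivation defined by this suite.

```typescript
import { createHash, createHmac } from 'crypto';

const sha256 = (data: string): Buffer => createHash('sha256').update(data, 'utf8').digest();

// Hypothetical derivation: nonce_i = HMAC-SHA256(seed, i). Both the seed and
// this construction are assumptions of the sketch.
const nonceFor = (seed: Buffer, index: number): Buffer =>
  createHmac('sha256', seed).update(String(index)).digest();

// A salted leaf commits to both the nonce and the message.
const saltedLeaf = (nonce: Buffer, message: string): Buffer =>
  createHash('sha256').update(nonce).update(message, 'utf8').digest();

const seed = sha256('hypothetical seed material held only by issuer and holder');
const withheld = '{"/credentialSubject/nickName":"Bob"}';

// Without a nonce, a verifier who sees the leaf hash in a proof path can test guesses.
const unsaltedLeaf = sha256(withheld);
const guessMatchesUnsalted = sha256('{"/credentialSubject/nickName":"Bob"}').equals(unsaltedLeaf);

// With a nonce the verifier does not know, the same guess no longer matches.
const leaf = saltedLeaf(nonceFor(seed, 3), withheld);
const guessMatchesSalted = sha256('{"/credentialSubject/nickName":"Bob"}').equals(leaf);

// Deterministic derivation lets the holder recompute the same nonce later.
const nonceIsStable = nonceFor(seed, 3).equals(nonceFor(seed, 3));
console.log({ guessMatchesUnsalted, guessMatchesSalted, nonceIsStable });
```

In this sketch a holder would reveal the nonce only together with the message it protects, so withheld leaves stay resistant to guessing.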
Unlike traditional single message proof schemes such as compact JWTs, we are only signing the `merkle root`.
This allows a Holder to adjust both `messages` and `proofs` to selectively disclose object members.
Because `messages` and `proofs` are not signed or verified, it is critical that the `merkle root signature` be verified first, before verifying `merkle proofs` for the individual messages.
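The ordering can be sketched as follows, assuming Node.js `crypto`, an Ed25519 issuer key, and a two-leaf tree. The types, function names, and sample messages are hypothetical; a real suite would carry the signature in a JWS rather than as a raw buffer.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from 'crypto';

const sha256 = (data: Buffer): Buffer => createHash('sha256').update(data).digest();

type ProofStep = { sibling: Buffer; siblingOnLeft: boolean };
type Disclosure = { message: string; proof: ProofStep[] };

// Recompute the root implied by a message and its proof.
const rootFromProof = (message: string, proof: ProofStep[]): Buffer =>
  proof.reduce<Buffer>(
    (hash, { sibling, siblingOnLeft }) =>
      sha256(Buffer.concat(siblingOnLeft ? [sibling, hash] : [hash, sibling])),
    sha256(Buffer.from(message, 'utf8'))
  );

const verifyDisclosure = (
  root: Buffer,
  rootSignature: Buffer,
  issuerKey: KeyObject,
  disclosures: Disclosure[]
): boolean => {
  // Step 1: authenticate the root. Without this, proofs say nothing about the issuer.
  if (!verify(null, root, issuerKey, rootSignature)) return false;
  // Step 2: only now check that each disclosed message hashes up to the signed root.
  return disclosures.every(d => rootFromProof(d.message, d.proof).equals(root));
};

// Issuer side: build a two-leaf tree and sign only its root.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const m0 = '{"/credentialSubject/alsoKnownAs":"did:example:123"}';
const m1 = '{"/credentialSubject/nickName":"Bob"}';
const h1 = sha256(Buffer.from(m1, 'utf8'));
const root = sha256(Buffer.concat([sha256(Buffer.from(m0, 'utf8')), h1]));
const rootSignature = sign(null, root, privateKey);

// Holder side: disclose m0 only, with the hash of m1 as its proof.
const disclosed: Disclosure[] = [{ message: m0, proof: [{ sibling: h1, siblingOnLeft: false }] }];
const accepted = verifyDisclosure(root, rootSignature, publicKey, disclosed);

// An attacker can always build a self-consistent tree for a forged message,
// but cannot produce the issuer's signature over its root.
const forgedMessage = '{"/credentialSubject/alsoKnownAs":"did:example:999"}';
const forgedRoot = sha256(Buffer.concat([sha256(Buffer.from(forgedMessage, 'utf8')), h1]));
const forgedAccepted = verifyDisclosure(forgedRoot, rootSignature, publicKey,
  [{ message: forgedMessage, proof: [{ sibling: h1, siblingOnLeft: false }] }]);
console.log({ accepted, forgedAccepted });
```

The forged disclosure has merkle proofs that are internally valid, which is exactly why checking them before (or instead of) the root signature would be unsafe.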
As mentioned earlier, merkle proofs can be large, especially when only a single message is withheld by a Holder and proofs must be provided for all the others.
In order to address this challenge, we rely on a proof representation that makes use of binary compression:
{ "type": "MerkleDisclosureProof2021", "created": "2021-08-22T19:36:43Z", "verificationMethod": "did:example:123#key-0", "proofPurpose": "assertionMethod", "normalization": "jsonPointer", "proofs": "eJzNzjeSq0gAANC7TNpbBQiEIGy8Ny0at7UBIIGEl4Q//f97hAmmauKXvH+/YPwMhwSEJabAxAOLpLiw7pLXZKRKfowbyt2zV/jmSpc9DJ91XMyE5YUIKOvzokey0gaHejnyOJh71jFIDYhC240BvOfb61CsWmw8gtY78UkSNl2S9+gs3T5wsU/LbI3eqFLNBMur8rFfp2hVz+URAUlib5DUlh6eEyLvEvzQvRIvF3wvH1//fEFFK5Vdoje3HsKamlSB7vmLxIrNwptqX3t1nbJP9zgYEv+WMoqf3JOoQPKWrQxbR5dRO1XJt8MYsm1geNGxIl90bo4kQ2egs0s/WznKjBDHQPP147o/QsBt8uJdG1KuqCy+KKNx/smyMwbyKq7c+b7onLJ4g6VxEmeLL25mfCU69vySRY31ljD1W8oomw7pgx0T1qVcG+2x2h9AH/6HtIgXBqwdeZWPPEzgDualrI39mh7Xtzn1804NEpEwN30uUj4XIs4vZUzm1uJVDdRIfhnda9WxBIs30YDZ4zw3nRwAj+KV0/wJr0WuXZLYb75bFmPu/r7xHUcFaEBStKWmJaBMDCJ9OyWYv5ZtrWY2cHT2t5Tv72UJHST54WiZGhd2GU4B9eJFhPiXbQCmkSbNShxPCaCyv2GGWdsPkBNhuhDyzY8Dl5szLzFPwFDcrHBEA8iz8pNlm9r7jI/bIqFP4f0ZRe6UxtG7CqUK1ZVDmew90f8KYzi/pRzngzCh4ipaZtcIDxK8qqQZiHTpk9YVW7/ko/qq3uQVVbA4K6XdiDMWtU0UykEYH52jDEVqFC22SaBKBAyqV+8UKoRvBaoNhXYfyeWiWh7ZEMzEzNQ0xS65T2GqOUASdelo/m8IpcIn161C2r1CmUrpY5Q2KevNsqKxnLZ+Hqrv6pmejuFPNgrn1hAHXcCpS6Y+sdhPOr54fwVEsLzX7ByWMM0raSmdDZq8LFVnJiJD8GRqnd48jXXalVDbnH+ceukerik66TxY1+82TG5Y7Xly2+es7xbCdnbZ6/aASUzbptBsO3PoppuLKNV+rPHfHzccgQI=", "jws": "eyJhbGciOiJFZERTQSJ9.ImkrMUVBbU9mMDJUM2JwdHdTcW5DNG1sNlc5TGNmYUU1cGVSY3JLbHdvUnc9Ig.ZXlKaGJHY2lPaUpGWkVSVFFTSXNJbUkyTkNJNlptRnNjMlVzSW1OeWFYUWlPbHNpWWpZMElsMTkuLmpEUFJMbW9taVJmc1kwX1hFOFdwVVNTZXdOeEUwRHI4LVlxNXBOeGdoZUJmVnhORlQ3aFZlMnBsU3NsT05PLXMwUzlLcGpTcXhqM2I2alowdDFqSERR" }
In order to derive a new Verifiable Credential which discloses a subset of the original, the holder must filter the messages associated with the original object, and the proofs associated with those messages.
Here is a TypeScript example:
const suite = new MerkleDisclosureProof2021();
const derivationResult = await suite.deriveProof({
  inputDocumentWithProof: { ...originalDocumentWithProof },
  outputDocument: {
    '@context': [
      'https://www.w3.org/2018/credentials/v1',
      'https://w3id.org/security/suites/merkle-jws-2021/v1',
      {
        alsoKnownAs: 'https://www.w3.org/ns/activitystreams#alsoKnownAs',
        // nickName: 'https://www.w3.org/ns/activitystreams#nickName',
      },
    ],
    id: 'http://example.edu/credentials/3732',
    type: ['VerifiableCredential'],
    issuer: 'https://example.edu/issuers/14',
    issuanceDate: '2010-01-01T19:23:24Z',
    credentialSubject: {
      alsoKnownAs: 'did:example:ebfeb1f712ebc6f1c276e12ec21',
      // nickName: 'Bob',
    },
  },
  documentLoader,
});
const { document, proof } = derivationResult;
Unlike [[LDP-BBS2020]], our proof of concept does not rely on [[JSON-LD-FRAMING]].
This is due to also not relying exclusively on [[RDF-DATASET-NORMALIZATION]].
Instead we compute the `messages` and `proofs` by taking the set difference of the original and derived document objects.
This approach works with any stable normalization algorithm, and is the reason for the difference in our normalization process compared to [[LDP-BBS2020]].
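A minimal sketch of this filtering step, assuming both documents have already been normalized to messages. The `MessageProof` type, the function name `selectDisclosed`, and the placeholder proof strings are hypothetical; real proofs would be merkle paths rather than labels.

```typescript
// A message from the original document paired with its (opaque) membership proof.
type MessageProof = { message: string; proof: string };

const selectDisclosed = (original: MessageProof[], derivedMessages: string[]) => {
  const derived = new Set(derivedMessages);
  const known = new Set(original.map(e => e.message));

  // A derived document may only remove members, never add or change them.
  const unknown = derivedMessages.filter(m => !known.has(m));
  if (unknown.length > 0) {
    throw new Error(`Derived document contains unsigned messages: ${unknown.join(', ')}`);
  }

  return {
    // Messages present in both documents are disclosed, together with their proofs.
    disclosed: original.filter(e => derived.has(e.message)),
    // The set difference (original minus derived) is what the holder withholds.
    withheld: original.filter(e => !derived.has(e.message)).map(e => e.message),
  };
};

const original: MessageProof[] = [
  { message: '{"/credentialSubject/alsoKnownAs":"did:example:123"}', proof: 'proof-0' },
  { message: '{"/credentialSubject/nickName":"Bob"}', proof: 'proof-1' },
  { message: '{"/issuer":"https://example.edu/issuers/14"}', proof: 'proof-2' },
];
const derivedMessages = [original[0].message, original[2].message];

const result = selectDisclosed(original, derivedMessages);
console.log(result.disclosed.map(e => e.proof), result.withheld);
```

Because the comparison is made on normalized messages rather than on RDF statements, the same step applies to any normalization that is stable in both directions.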
These use cases are hypothetical.
The GS1 Digital Link https://id.gs1.org/01/9506000134352 is also known as Dal Giardino Risotto Rice with Mushrooms 411g.
Perhaps not all manufacturing details are necessary to disclose until a recall is issued, at which point sensitive product and supply chain details (costs, locations, times) can be disclosed from associated original credentials.
During an investigation, supply chain participants might be compelled to fully disclose credentials to an auditor or trusted third party.
Sometimes an authority or public registry maintainer may know that a single entity is known by multiple pseudonymous identifiers. For example:
The driver's license Q6780 22812 41253 might also be known as Pearline Abshire. During an investigation, her legal counsel might want to be able to prove that she used to be known as Katarina Kozey with driver's license number 9375599 when she worked as an informant on narcotics activity in Alaska before being relocated under a witness protection program.
Data processors should not collect sensitive information they do not need.
Data subjects should not need to expose sensitive information that is not required.
The following section describes security considerations that developers implementing this specification should be aware of in order to create secure software.
Per the [[VC-DATA-MODEL]], `issuanceDate` is required, and can be used to correlate the subject when disclosed (as is required for data model conformance).
Additionally, the merkle root which is required by this suite to verify the disclosed claims can also be used as a unique identifier for correlating the subject. This issue is also common when working with JWTs.
Need to address second pre-image attacks.
Need to address unbalanced merkle tree attacks.