IEP-0009 |
diff --git a/search/search_index.json b/search/search_index.json
index 9e943a8..5447e33 100755
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"ISCC - Enhancement Proposals (IEPs)","text":""},{"location":"#what-is-an-iep","title":"What is an IEP?","text":"IEP stands for ISCC Enhancement Proposal. An IEP is a design document providing information to the ISCC community, or describing a new feature for the ISCC or its processes or environment.
IEPs are a mechanism for proposing new features, for collecting community input on an issue, and for documenting design decisions. The IEP author is responsible for building consensus within the community and documenting dissenting opinions.
Because the IEPs are maintained as text files in a versioned repository, their revision history is the historical record of the feature proposal.
"},{"location":"#list-of-ieps","title":"List of IEPs","text":"ID | Title | Type | Status
IEP-0000 | IEP Purpose and Guidelines | Process | Draft
IEP-0001 | ISCC Structure and Format | Core | Draft
IEP-0002 | ISCC-UNIT Meta-Code | Core | Draft
IEP-0003 | ISCC-UNIT Content-Code Text | Core | Draft
IEP-0004 | ISCC-UNIT Content-Code Image | Core | Draft
IEP-0005 | ISCC-UNIT Content-Code Audio | Core | Draft
IEP-0006 | ISCC-UNIT Content-Code Video | Core | Draft
IEP-0007 | ISCC-UNIT Content-Code Mixed | Core | Draft
IEP-0008 | ISCC-UNIT Data-Code | Core | TBD
IEP-0009 | ISCC-UNIT Instance-Code | Core | TBD
IEP-0010 | ISCC-CODE | Core | TBD
IEP-0011 | ISCC-ID | Core | TBD
IEP-0012 | ISCC Metadata | Core | TBD
IEP-0013 | ISCC Decentralized Content Registry | Core | Draft
IEP-0014 | EVM Based ISCC Registries | Core | TBD
IEP-0015 | ISCC DID Method | Core | Draft"},{"location":"iep-0000/","title":"IEP-0: IEP Purpose and Guidelines","text":"IEP: 0000 Title: IEP Purpose and Guidelines Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/2 Status: Draft Type: Process License: BSD-2-Clause Created: 2022-08-28 Updated: 2022-09-23"},{"location":"iep-0000/#1-what-is-an-iep","title":"1. What is an IEP?","text":"An ISCC Enhancement Proposal (IEP) is a design document providing information to the ISCC community, or describing a new feature for the ISCC or its processes or environment. An IEP should provide a concise technical specification of a feature and a rationale for the feature. IEPs have no special status except that accorded by the community.
IEPs are a mechanism for proposing new features, for collecting community input on an issue, and for documenting design decisions. The IEP author is responsible for building consensus within the community and documenting dissenting opinions.
Because the IEPs are maintained as text files in a versioned repository, their revision history is the historical record of the feature proposal.
"},{"location":"iep-0000/#2-iep-audience","title":"2. IEP audience","text":"The typical primary audience for IEPs is the developers of ISCC implementations.
However, other parts of the ISCC community may also choose to use the process (particularly for Informational IEPs) to document expected API conventions and to manage complex design coordination problems that require collaboration across multiple projects.
"},{"location":"iep-0000/#3-iep-workflow","title":"3. IEP workflow","text":"The IEP process begins with a new idea for the ISCC. Each potential IEP must have a champion - someone who writes the IEP using the style and format described below, shepherds the discussions in the appropriate forums, and attempts to build community consensus around the idea.
Small enhancements or patches to a particular piece of software often don't require coordination between multiple projects or implementations; these don't need an IEP and should be injected into the relevant project-specific development workflow with a patch submission to the applicable issue tracker.
After investigating past work, the best way to proceed is by opening an issue for discussion about the new idea. Following a discussion, the proposal should be submitted to the IEPs git repository as a pull request. This draft must be written in IEP style as described below, and named with an alias such as \"iep-johndoe-new-semantic-id\" until an editor has assigned it an IEP number (authors MUST NOT self-assign IEP numbers).
When the IEP draft is complete, an IEP editor will assign the IEP a number, label it as Core, Informational, or Process, and merge the pull request to the IEPs git repository. The IEP editors will not unreasonably reject an IEP. Reasons for rejecting IEPs include duplication of effort, disregard for formatting rules, being too unfocused or too broad, being technically unsound, not providing proper motivation, or not addressing backwards compatibility. For an IEP to be accepted it must meet certain minimum criteria. It must be a clear and complete description of the proposed enhancement. The enhancement must represent a net improvement. The proposed implementation, if applicable, must be solid and must not complicate the protocol unduly.
The IEP author may update the draft as necessary in the git repository. Updates to drafts should also be submitted by the author as pull requests.
"},{"location":"iep-0000/#31-transferring-iep-ownership","title":"3.1 Transferring IEP ownership","text":"It occasionally becomes necessary to transfer ownership of IEPs to a new champion. A good reason to transfer ownership is that the original author no longer has the time or interest to update the IEP or follow through with the IEP process, or is unreachable or not responding to email.
If you are interested in assuming ownership of an IEP, send a message asking to take over, addressed to both the original author and the IEP editors. If the original author doesn't respond to email in a timely manner, the IEP editors will make a unilateral decision.
"},{"location":"iep-0000/#32-iep-editors","title":"3.2 IEP editors","text":"The current IEP editors are:
- Titusz Pan tp@iscc.foundation
"},{"location":"iep-0000/#33-iep-editor-responsibilities","title":"3.3 IEP editor responsibilities","text":"The IEP editors subscribe to the IEP issue tracker. Correspondence outside the issue tracker should be sent (or CC'd) to the IEP editors.
For each new IEP that comes in, an editor does the following:
- Read the IEP to check if it is ready: sound and complete. The ideas must make technical sense.
- The title should accurately describe the content.
- Motivation and backward compatibility (when applicable) must be addressed.
- Licensing terms must be acceptable for IEPs.
- If the IEP isn't ready, the editor will send it back to the author for revision, with specific instructions.
- Once the IEP is ready for the repository it should be submitted as a \"pull request\" to the IEPs git repository where it may get further feedback.
The IEP editor will:
- Assign an IEP number in the pull request.
- Merge the pull request when it is ready.
- List the IEP in README.md.
The IEP editors are intended to fulfill administrative and editorial responsibilities. The IEP editors monitor IEP changes, and update IEP headers as appropriate.
"},{"location":"iep-0000/#4-iep-format-and-structure","title":"4. IEP format and structure","text":"IEPs should be written in Markdown format.
Each IEP should have the following parts:
- Preamble - Headers containing metadata about the IEP (see below).
- Abstract - A short (~200 word) description of the issue being addressed.
- Motivation - A motivation section is critical for IEPs that want to change pre-existing specifications. It should clearly explain why the existing specification is inadequate to address the problem that the IEP solves. This section may be omitted if the motivation is evident.
- Specification - The specification should describe the syntax and semantics of any new feature. The specification should be detailed enough to allow competing, interoperable implementations.
- Rationale - The rationale fleshes out the specification by describing what motivated the design and why particular design decisions were made. It should describe alternate designs that were considered and related work. The rationale should provide evidence of consensus within the community and discuss important objections or concerns raised during discussion around the IEP.
- Backwards compatibility - All IEPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The IEP must explain how the author proposes to deal with these incompatibilities.
- Reference implementation - A reference implementation must be completed before a Core IEP is given status Stable, but it need not be completed before the IEP is accepted. The stable implementation must include test code and documentation.
- Copyright - The IEP must be explicitly licensed under acceptable copyright terms (see below).
"},{"location":"iep-0000/#41-iep-header-preamble","title":"4.1 IEP header preamble","text":"Each IEP must begin with a header preamble. The headers must appear in the following order. Headers marked with \"*\" are optional and are described below. All other headers are required.
IEP: <IEP number, or \"?\" before being assigned>\n Title: <IEP title; maximum 44 characters>\n Author: <list of authors' names and email addresses>\n Comments: <link to issue page for comments>\n Status: <Draft | Deferred | Withdrawn | Proposed | Rejected | Stable | Obsolete>\n Type: <Core | Informational | Process>\n License: <abbreviation for approved license(s)>\n Created: <date created on, in ISO 8601 (yyyy-mm-dd) format>\n Updated: <date updated on, in ISO 8601 (yyyy-mm-dd) format>\n* Replaces: <IEP number>\n* Superseded-By: <IEP number>\n
The Author header lists the names and email addresses of all the authors/owners of the IEP. The format of the Author header value must be
Random J. User <address@dom.ain>\n
If there are multiple authors, each should be on a separate line.
The Type header specifies the type of IEP: Core, Informational, or Process.
The Created header records the date that the IEP was assigned a number. Dates should be in yyyy-mm-dd format, e.g. 2001-08-14.
IEPs may have a Superseded-By header indicating that an IEP has been rendered obsolete by a later document; the value is the number of the IEP that replaces the current document. The newer IEP must have a Replaces header containing the number of the IEP that it rendered obsolete.
"},{"location":"iep-0000/#5-iep-types","title":"5. IEP Types","text":"There are three kinds of IEPs:
- A Core IEP describes any change that affects most or all ISCC implementations, such as a change to the algorithms for creating ISCCs, or any change or addition that affects the interoperability of applications using the ISCC. Core IEPs consist of two parts, a design document and a reference implementation.
- An Informational IEP describes an ISCC design issue, or provides general guidelines or information to the ISCC community, but does not propose a new feature. Informational IEPs do not necessarily represent an ISCC community consensus or recommendation, so users and implementors are free to ignore Informational IEPs or follow their advice.
- A Process IEP describes a process surrounding ISCC, or proposes a change to (or an event in) a process. Process IEPs are like Core IEPs but apply to areas other than the ISCC algorithms themselves. Examples include procedures, guidelines, changes to the decision-making process, and changes to the tools or environment used in ISCC development.
"},{"location":"iep-0000/#6-iep-status","title":"6. IEP status","text":"The typical paths of the status of IEPs are as follows:
flowchart LR\n B[Draft]\n B --> C[Proposed]\n B <--> D[Deferred]\n B <--> E[Withdrawn]\n C --> F[Stable]\n C --> G[Rejected]\n F --> H[Obsolete]\n
Champions of an IEP may decide on their own to change the status between Draft, Deferred, or Withdrawn. An IEP editor may also change the status to Deferred when no progress is being made on the IEP.
An IEP may only change status from Draft to Proposed when the author deems it complete, it has a working implementation (where applicable), and there are community plans to progress it to the Stable status.
IEPs should be changed from Draft or Proposed status, to Rejected status, upon request by any person, if they have not made progress in three years. Such an IEP may be changed to Draft status if the champion provides revisions that meaningfully address public criticism of the proposal, or to Proposed status if it meets the criteria required as described in the previous paragraph.
An IEP may change status from Draft to Stable when it achieves rough consensus on the issue tracker and sufficient real-world adoption. Such a proposal is said to have rough consensus if it has been open to discussion on the issue tracker for at least one month, and no person maintains any unaddressed substantiated objections to it. Addressed or obstructive objections may be ignored/overruled by general agreement that they have been sufficiently addressed, but clear reasoning must be given in such circumstances.
Software authors are encouraged to publish summaries of what IEPs their software supports to aid in verification of status changes.
Should an IEP become actually and unambiguously adopted despite not meeting the criteria outlined here, it should still be updated to Stable status.
When a Stable IEP is no longer relevant, its status may be changed to Obsolete. This change must also be objectively verifiable and/or discussed.
"},{"location":"iep-0000/#7-iep-licensing","title":"7. IEP licensing","text":"New IEPs may be accepted with the following licenses. Each new IEP must identify at least one acceptable license in its preamble. The License header in the preamble must be placed before the Created header. Each license must be referenced by its respective abbreviation given below.
IEPs are not required to be exclusively licensed under approved terms, and may also be licensed under unacceptable licenses in addition to at least one acceptable license. In this case, only the acceptable license(s) should be listed in the License header.
"},{"location":"iep-0000/#71-acceptable-licenses","title":"7.1 Acceptable licenses","text":" - BSD-2-Clause: OSI-approved BSD 2-clause license
- BSD-3-Clause: OSI-approved BSD 3-clause license
- CC0-1.0: Creative Commons CC0 1.0 Universal
- Apache-2.0: Apache License, version 2.0
- CC-BY-4.0: Creative Commons Attribution 4.0 International
- CC-BY-SA-4.0: Creative Commons Attribution-ShareAlike 4.0 International
- MIT: Expat/MIT/X11 license
"},{"location":"iep-0000/#8-history","title":"8. History","text":"This document was derived heavily from Bitcoin\u2019s BIP-0002 which in turn was derived from Python\u2019s PEP-0001. In many places text was simply copied and modified. The original authors of BIP-0002 and PEP-0001 are not responsible for its use in the ISCC Enhancement Proposals, and should not be bothered with technical questions specific to ISCC or the IEPs.
"},{"location":"iep-0001/","title":"ISCC Structure and Format","text":"IEP: 0001 Title: ISCC Structure and Format Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/6 Status: Draft Type: Core License: CC-BY-4.0 Created: 2022-09-23 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0001/#1-abstract","title":"1. Abstract","text":"This document describes the coding scheme for the International Standard Content Code (ISCC).
"},{"location":"iep-0001/#2-motivation","title":"2. Motivation","text":"The ISCC is a similarity preserving identifier for all kinds of digital content. As such the ISCC requires a universal coding scheme to meet a broad set of use cases and support different media types. The coding scheme for all ISCCs should be:
- compact
- interoperable
- modular
- extensible
- self-describing
"},{"location":"iep-0001/#3-general-structure-of-iscc","title":"3. General structure of ISCC","text":" - An ISCC shall be composed of an ISCC-HEADER and an ISCC-BODY.
- The ISCC-HEADER shall describe the MainType, SubType, Version, and Length of its ISCC-BODY.
- An ISCC-UNIT shall be an ISCC based on one specific algorithm.
- An ISCC-CODE shall be an ISCC composed of two or more different ISCC-UNITs.
Figure 1 - General Structure of an ISCC"},{"location":"iep-0001/#4-iscc-header","title":"4. ISCC-HEADER","text":"The ISCC-HEADER is a variable sized bitstream composed of an ordered sequence of the 4 header-fields MainType, SubType, Version, Length.
Each header-field is a bitstream with a length between 4 and 16 bits and encodes an integer value between 0 and 4679 with the following encoding scheme:
- The total bit-length of a header-field shall be determined by its prefix-bits.
- The prefix-bits shall be followed by data-bits.
- The data-bits shall be interpreted as an unsigned integer value plus the first value of the field's integer range (one more than the maximum value of the preceding range), so that 1000 0000 encodes 8.
- If the total length of all header-fields in number of bits is not divisible by 8, the header shall be padded with 4 zero bits (0000) on the right side.
Table 1 \u2013 Variable length ISCC-HEADER field encoding
Prefix bits | Number of nibbles | Number of data bits | Integer range
0 | 1 | 3 | 0-7
10 | 2 | 6 | 8-71
110 | 3 | 9 | 72-583
1110 | 4 | 12 | 584-4679
Header-field examples
0 = 0000\n1 = 0001\n\u2026\n7 = 0111\n8 = 1000 0000\n9 = 1000 0001\n
The interpretation of the integer value of a header-field shall be context dependent:
- For the MainType and SubType fields it shall be an identifier for the designated type.
- For the Version field it shall be the literal version number.
- For the Length field of ISCC-UNITs it shall be a number used as a multiplier to calculate the bit length of the ISCC-BODY.
- For the Length field of ISCC-CODEs it shall be a bit-pattern encoding the combination of ISCC-UNITs and the bit-length of the ISCC-BODY.
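The header-field encoding of Table 1 can be sketched as follows. This is a minimal illustration and not the reference implementation; the names RANGES, encode_field, decode_field and encode_header are hypothetical. It assumes that each range starts right after the previous one ends, which is what the header-field examples (8 = 1000 0000) show.

```python
# Variable-length header-field codec following Table 1.
# Each row: (prefix bits, number of data bits, first integer of the range).
RANGES = [
    ('0', 3, 0),        # 1 nibble:  0-7
    ('10', 6, 8),       # 2 nibbles: 8-71
    ('110', 9, 72),     # 3 nibbles: 72-583
    ('1110', 12, 584),  # 4 nibbles: 584-4679
]


def encode_field(value):
    # Return one header-field as a string of 0/1 characters.
    for prefix, nbits, offset in RANGES:
        if offset <= value < offset + (1 << nbits):
            return prefix + format(value - offset, '0{}b'.format(nbits))
    raise ValueError('header-field value out of range 0-4679')


def decode_field(bits):
    # Decode one header-field from the start of a bit string.
    # Returns (value, remaining bits).
    for prefix, nbits, offset in RANGES:
        if bits.startswith(prefix):
            end = len(prefix) + nbits
            return int(bits[len(prefix):end], 2) + offset, bits[end:]
    raise ValueError('invalid header-field prefix')


def encode_header(main_type, sub_type, version, length):
    # Concatenate the four fields and pad with 0000 to a whole number of bytes.
    bits = ''.join(encode_field(v) for v in (main_type, sub_type, version, length))
    if len(bits) % 8:
        bits += '0000'
    return int(bits, 2).to_bytes(len(bits) // 8, 'big')
```

For example, encode_header(5, 1, 0, 5) describes MainType ISCC, SubType IMAGE, Version 0 and Length pattern 5, and fits into two bytes because every field value is below 8.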
"},{"location":"iep-0001/#41-maintypes","title":"4.1 MainTypes","text":"The MainType header-field shall signify the type of the ISCC.
Backward incompatible updates to an algorithm associated with a MainType shall be indicated by incrementing the version field of the ISCC-HEADER of the respective MainType.
Note
The first edition of the standard specifies initial algorithms (version 0) for all reserved MainTypes except for the SEMANTIC type which is not currently defined.
Table 2 \u2013 Reserved ISCC MainTypes
ID | Symbol | Bits | Definition
0 | META | 0000 | An ISCC-UNIT that matches on metadata similarity
1 | SEMANTIC | 0001 | An ISCC-UNIT that matches on semantic content similarity
2 | CONTENT | 0010 | An ISCC-UNIT that matches on perceptual content similarity
3 | DATA | 0011 | An ISCC-UNIT that matches on data similarity
4 | INSTANCE | 0100 | An ISCC-UNIT that matches on data identity
5 | ISCC | 0101 | An ISCC-CODE composed of two or more headerless ISCC-UNITs for multi-modal matching"},{"location":"iep-0001/#42-subtypes","title":"4.2 SubTypes","text":"The MainTypes META, DATA, and INSTANCE shall have a single default SubType NONE encoded with the bits 0000.
The MainTypes SEMANTIC, CONTENT, and ISCC shall have SubTypes that signify the perceptual mode.
Table 3 \u2013 Reserved SubTypes for MainTypes ISCC, SEMANTIC, and CONTENT
ID | Symbol | Bits | Definition
0 | TEXT | 0000 | Match on text similarity
1 | IMAGE | 0001 | Match on image similarity
2 | AUDIO | 0010 | Match on audio similarity
3 | VIDEO | 0011 | Match on video similarity
4 | MIXED | 0100 | Match on multi-modal similarity
Table 4 \u2013 Additional Reserved SubTypes for the MainType ISCC
ID | Symbol | Bits | Definition
5 | SUM | 0101 | Composite of ISCC-UNITs including only Data- and Instance-Code
6 | NONE | 0110 | Composite ISCC-UNITs including Meta-, Data- and Instance-Code"},{"location":"iep-0001/#43-version","title":"4.3 Version","text":"All ISCC-HEADERs shall have a version header-field of 0000 for the first edition of the standard.
Table 5 \u2013 Reserved ISCC Versions
ID | Symbol | Bits | Definition
0 | V0 | 0000 | Initial version of ISCC-UNITs and ISCC-CODE"},{"location":"iep-0001/#44-length","title":"4.4 Length","text":"The encoding of the Length header-field shall be specific to the MainType.
"},{"location":"iep-0001/#441-length-of-iscc-units","title":"4.4.1 Length of ISCC-UNITs","text":"For ISCC-UNITs of the MainTypes META, SEMANTIC, CONTENT, DATA, and INSTANCE the length value shall be encoded as the number of 32-bit blocks of the ISCC-BODY in addition to the minimum length of 32 bits.
Table 6 \u2013 Reserved length field values (multiples of 32 bit)
ID | Symbol | Bits | Definition
0 | L32 | 0000 | Length of body is 32 bits (minimum length)
1 | L64 | 0001 | Length of body is 64 bits (default length)
2 | L96 | 0010 | Length of body is 96 bits
3 | L128 | 0011 | Length of body is 128 bits
4 | L160 | 0100 | Length of body is 160 bits
5 | L192 | 0101 | Length of body is 192 bits
6 | L224 | 0110 | Length of body is 224 bits
7 | L256 | 0111 | Length of body is 256 bits"},{"location":"iep-0001/#442-length-of-iscc-codes","title":"4.4.2 Length of ISCC-CODEs","text":" - For ISCC-CODEs the length value shall designate the composition of ISCC-UNITs.
- The Data-Code and Instance-Code shall be mandatory 64-bit components of an ISCC-CODE.
- The first data-bit shall designate the presence of a 64-bit Meta-Code.
- The second data-bit shall designate the presence of a 64-bit Semantic-Code.
- The third data-bit shall designate the presence of a 64-bit Content-Code.
- The length of an ISCC-CODE shall be calculated as the number of active data-bits times 64 plus 128 bits of mandatory data.
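The two length rules (4.4.1 for ISCC-UNITs and the list above for ISCC-CODEs) reduce to short arithmetic. The following sketch uses hypothetical function names and only illustrates the calculation.

```python
def unit_body_bits(length_value):
    # ISCC-UNIT: the Length value counts 32-bit blocks beyond the 32-bit minimum.
    return (length_value + 1) * 32


def code_body_bits(length_value):
    # ISCC-CODE: three flag bits (Meta, Semantic, Content) on top of the
    # mandatory 64-bit Data-Code and 64-bit Instance-Code.
    if not 0 <= length_value <= 7:
        raise ValueError('expected a 3-bit pattern')
    active = bin(length_value).count('1')
    return active * 64 + 128


def code_units(length_value):
    # List the ISCC-UNITs present in an ISCC-CODE body.
    flags = format(length_value, '03b')
    names = ('Meta', 'Semantic', 'Content')
    optional = [name for name, bit in zip(names, flags) if bit == '1']
    return optional + ['Data', 'Instance']
```

With these definitions the bit pattern 0101 yields Meta-, Content-, Data- and Instance-Code with a 256-bit body.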
Table 7 \u2013 Reserved length field values (for MainType ISCC)
ID | Symbol | Bits | Definition
0 | SUM | 0000 | No optional ISCC-UNITs. Length of body is 128 bits.
1 | CDI | 0001 | Includes Content-Code. Length of body is 192 bits
2 | SDI | 0010 | Includes Semantic-Code. Length of body is 192 bits
3 | SCDI | 0011 | Includes Semantic- and Content-Code. Length of body is 256 bits
4 | MDI | 0100 | Includes Meta-Code. Length of body is 192 bits
5 | MCDI | 0101 | Includes Meta-Code and Content-Code. Length of body is 256 bits
6 | MSDI | 0110 | Includes Meta-Code and Semantic-Code. Length of body is 256 bits
7 | MSCDI | 0111 | Includes Meta-, Semantic-, and Content-Code. Length is 320 bits"},{"location":"iep-0001/#5-iscc-body","title":"5. ISCC-BODY","text":" - The preceding MainType, SubType, and Version fields shall qualify the semantics of an ISCC-BODY.
- The Length field shall determine the number of bits of an ISCC-BODY.
"},{"location":"iep-0001/#6-encoding-of-iscc","title":"6. Encoding of ISCC","text":""},{"location":"iep-0001/#61-canonical-form","title":"6.1 Canonical form","text":"The printable canonical form of an ISCC shall be its RFC 4648 Base32 encoded representation without padding and prefixed with \u201cISCC:\u201d.
Canonical ISCC-CODE example
ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY
"},{"location":"iep-0001/#62-uri-encoding","title":"6.2 URI encoding","text":" - An ISCC shall be encodable as a Uniform Resource Identifier (URI) according to RFC 3986.
- The URI representation shall have the format <scheme>:<path>.
- The URI scheme shall be the string \u201ciscc\u201d.
- The URI path shall be the lower-cased base32 representation of an ISCC without padding.
URI encoded ISCC-CODE example
iscc:kec43hjlpushvazt66ylpuwnvacwypiv533trqmwf2iuqysp5la4cty
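The relation between the canonical form and the URI form can be sketched with the Python standard library. The function names are hypothetical; the only assumption beyond the text is that the padding stripped from the canonical form has to be restored before calling base64.b32decode.

```python
import base64


def decode_canonical(iscc):
    # Strip the ISCC: prefix and decode the unpadded Base32 payload to bytes.
    prefix, _, payload = iscc.partition(':')
    if prefix.upper() != 'ISCC':
        raise ValueError('missing ISCC prefix')
    payload = payload.upper()
    payload += '=' * (-len(payload) % 8)  # restore the padding b32decode expects
    return base64.b32decode(payload)


def encode_canonical(raw):
    # RFC 4648 Base32 without padding, prefixed with ISCC:
    return 'ISCC:' + base64.b32encode(raw).decode('ascii').rstrip('=')


def to_uri(iscc):
    # Round-trip through bytes so the input is validated before re-encoding.
    return encode_canonical(decode_canonical(iscc)).lower()
```

Applied to the canonical ISCC-CODE example from 6.1, to_uri returns the URI shown above, and the first two decoded bytes are the ISCC-HEADER.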
"},{"location":"iep-0001/#63-multiformats-encoding","title":"6.3 Multiformats encoding","text":" - The ISCC shall be encodable as a multibase string.
- The multicodec identifier of an ISCC shall be 0xcc01.
- A Multiformat representation of an ISCC shall be prefixed with a Multibase code.
- The encoding scheme shall be <multibase><multicodec><iscc-header><iscc-body>.
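The encoding scheme above can be sketched for the multibase encodings that the Python standard library covers directly. The function name to_multiformat is hypothetical; base32hex and base58btc are left out of this sketch to keep it dependency-free.

```python
import base64

MULTICODEC_ISCC = bytes.fromhex('cc01')  # multicodec identifier 0xcc01


def to_multiformat(raw, encoding='base32'):
    # raw is the ISCC-HEADER followed by the ISCC-BODY as bytes.
    data = MULTICODEC_ISCC + raw
    if encoding == 'base16':
        return 'f' + data.hex()
    if encoding == 'base32':
        return 'b' + base64.b32encode(data).decode('ascii').rstrip('=').lower()
    if encoding == 'base64url':
        return 'u' + base64.urlsafe_b64encode(data).decode('ascii').rstrip('=')
    raise ValueError('unsupported multibase encoding: ' + encoding)
```

The first character of the result is the multibase code, so a decoder can pick the matching alphabet before it looks for the multicodec identifier.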
ISCC shall support the following multibase encodings:
Table 8 \u2013 Supported multibase encodings
Encoding | Code | Definition
base16 | f | hexadecimal
base32 | b | RFC4648 case-insensitive - no padding
base32hex | v | RFC4648 case-insensitive - no padding - highest char
base58btc | z | base58 bitcoin
base64url | u | RFC4648 no padding
Table 9 \u2013 Examples of ISCCs in multiformats encoding
Encoding | Example
MF base16 | fcc015105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
MF base32 | bzqavcbontuvx2jd2qmz7pmfx2lg2qblmhuk655zyyglc5ekimjh6vqobj4
MF base32hex | vpg0l21edjklnq93qgcpvfc5nqb6qg1bc7kauttpoo6b2t4a8c97ulge19s
MF base58btc | z2Yr3BMx3Rj56fyYkNvfa19PCk4SjspQhpVWoLSGg9yXr4vUGsx
MF base64url | uzAFRBc2dK30keoMz97C30s2oBWw9Fe73OMGWLpFIYk_qwcFP"},{"location":"iep-0001/#64-readable-encoding","title":"6.4 Readable encoding","text":" - The ISCC shall be encodable in human-readable representation.
- The readable representation shall encode the header-fields with their symbols and the ISCC-BODY in base16 lower-case.
- The header-fields and the ISCC-BODY shall be separated with hyphens.
Example of human-readable ISCC-CODE
ISCC-IMAGE-V0-MCDI-cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
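A readable representation can be derived from the raw bytes by decoding the four header-fields and looking up their symbols in Tables 2 to 7. The sketch below is illustrative only: the names are hypothetical, it does not validate SubType values against the MainType, and it assumes the symbols of Table 6 (L32 to L256) are used for the Length field of ISCC-UNITs.

```python
MAINTYPES = ['META', 'SEMANTIC', 'CONTENT', 'DATA', 'INSTANCE', 'ISCC']
MODES = ['TEXT', 'IMAGE', 'AUDIO', 'VIDEO', 'MIXED', 'SUM', 'NONE']
CODE_LENGTHS = ['SUM', 'CDI', 'SDI', 'SCDI', 'MDI', 'MCDI', 'MSDI', 'MSCDI']
FIELD_RANGES = (('0', 3, 0), ('10', 6, 8), ('110', 9, 72), ('1110', 12, 584))


def read_field(bits):
    # Decode one variable-length header-field (Table 1); returns (value, rest).
    for prefix, nbits, offset in FIELD_RANGES:
        if bits.startswith(prefix):
            end = len(prefix) + nbits
            return int(bits[len(prefix):end], 2) + offset, bits[end:]
    raise ValueError('invalid header-field prefix')


def to_readable(raw):
    bits = ''.join(format(byte, '08b') for byte in raw)
    fields = []
    for _ in range(4):
        value, bits = read_field(bits)
        fields.append(value)
    main, sub, version, length = fields
    # Drop the 4 padding bits if the header did not end on a byte boundary.
    if len(bits) % 8:
        bits = bits[4:]
    body = int(bits, 2).to_bytes(len(bits) // 8, 'big') if bits else b''
    main_symbol = MAINTYPES[main]
    if main_symbol == 'ISCC':
        sub_symbol, length_symbol = MODES[sub], CODE_LENGTHS[length]
    elif main_symbol in ('SEMANTIC', 'CONTENT'):
        sub_symbol, length_symbol = MODES[sub], 'L{}'.format((length + 1) * 32)
    else:
        # META, DATA and INSTANCE have the single default SubType NONE.
        sub_symbol, length_symbol = 'NONE', 'L{}'.format((length + 1) * 32)
    version_symbol = 'V{}'.format(version)
    return '-'.join([main_symbol, sub_symbol, version_symbol, length_symbol, body.hex()])
```

Feeding the bytes of the canonical ISCC-CODE example from 6.1 into to_readable reproduces the human-readable example above.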
"},{"location":"iep-0001/#7-reference-implementation","title":"7. Reference implementation","text":"The reference implementation of this coding scheme is published in the iscc-core python package in the codec.py module.
"},{"location":"iep-0002/","title":"ISCC-UNIT Meta-Code","text":"IEP: 0002 Title: ISCC-UNIT Meta-Code Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/7 Status: Draft Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0002/#1-general","title":"1. General","text":"The Meta-Code is a similarity hash generated from referent seed metadata as defined in IEP-0012
"},{"location":"iep-0002/#2-purpose","title":"2. Purpose","text":"The Meta-Code shall support the following use cases:
- clustering of digital assets based on their metadata;
- discovery of digital assets with similar metadata;
- verification or manual disambiguation of matching codes.
"},{"location":"iep-0002/#3-format","title":"3. Format","text":"The Meta-Code shall have the data format as illustrated in Figure 2:
Figure 2 - Data format of the Meta-Code
EXAMPLE: 64-bit Meta-Code in its canonical form:
ISCC:AAAUL6P7RMVNT4UJ
EXAMPLE: 256-bit Meta-Code in its canonical form:
ISCC:AADUL6P7RMVNT4UJJ4SMTDXBL5JFZ5XPCDKO42XYPJEVQ4L7PTYDORQ
"},{"location":"iep-0002/#4-inputs","title":"4. Inputs","text":"Seed metadata is the metadata that is used as the input to calculate the Meta-Code and has three possible elements:
- name (required): the name or title of the work manifested by the digital asset;
- description (optional): a disambiguating textual description of the digital asset;
- meta (optional): subject, industry, or use-case specific metadata.
NOTE 1
Because seed metadata is used to construct the Meta-Code, changes to its value may produce different (and therefore no longer matching) Meta-Codes. Seed metadata is stored and carried along unaltered with ISCC Metadata if automated verification of the Meta-Code based on the original seed metadata is required.
NOTE 2
The identifier standards and their schemas defined by ISO/TC 46/SC 9 provide helpful guidance in selecting seed metadata.
"},{"location":"iep-0002/#41-name-element","title":"4.1 name element","text":"The text input for the name element shall be pre-processed before similarity hashing as follows:
- Apply ISO/IEC 10646 NFKC Unicode Normalization (see Unicode Normalization Forms https://unicode.org/reports/tr15/#Norm_Forms).
- Remove control characters (see Unicode Character Database https://www.unicode.org/ucd/).
- Strip leading and trailing whitespace.
- Trim the end of the text such that the UTF-8 encoded size does not exceed 128 bytes.
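The four pre-processing steps for the name element can be sketched with the unicodedata module. The function name clean_name is hypothetical, and treating control characters as the Unicode general category Cc is an assumption of this sketch; the reference implementation may use a broader definition.

```python
import unicodedata


def clean_name(text, max_bytes=128):
    # 1. NFKC normalization.
    text = unicodedata.normalize('NFKC', text)
    # 2. Remove control characters (assumed here: general category Cc).
    text = ''.join(ch for ch in text if unicodedata.category(ch) != 'Cc')
    # 3. Strip leading and trailing whitespace.
    text = text.strip()
    # 4. Trim to at most max_bytes of UTF-8 without splitting a character.
    return text.encode('utf-8')[:max_bytes].decode('utf-8', 'ignore')
```

Cutting the encoded bytes and decoding with the ignore error handler drops a multi-byte character that would otherwise be split at the 128-byte boundary.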
"},{"location":"iep-0002/#42-description-element","title":"4.2 description element","text":"Text input for the description element shall be pre-processed before similarity hashing as follows:
- Apply NFKC Unicode Normalization.
- Remove control characters (as specified by Unicode Character Database) except for the following newline characters:
  - U+000A - Line Feed;
  - U+000B - Vertical Tab;
  - U+000C - Form Feed;
  - U+000D - Carriage Return;
  - U+0085 - Next Line;
  - U+2028 - Line Separator;
  - U+2029 - Paragraph Separator.
- Collapse more than two consecutive newline characters to a maximum of two consecutive newlines.
- Strip leading and trailing whitespace characters.
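The description pre-processing differs from the name pre-processing in that newline characters survive and long runs of them are collapsed. A minimal sketch, with hypothetical names and again assuming that control characters means the Unicode general category Cc:

```python
import re
import unicodedata

# Newline characters that are exempt from control-character removal.
NEWLINES = ''.join(chr(cp) for cp in (0x0A, 0x0B, 0x0C, 0x0D, 0x85, 0x2028, 0x2029))
# Three or more consecutive newline characters of any kind.
NEWLINE_RUN = re.compile('[' + NEWLINES + ']{3,}')


def clean_description(text):
    text = unicodedata.normalize('NFKC', text)
    # Assumption: control characters are those in general category Cc.
    text = ''.join(
        ch for ch in text
        if ch in NEWLINES or unicodedata.category(ch) != 'Cc'
    )
    # Keep only the first two newline characters of any longer run.
    text = NEWLINE_RUN.sub(lambda match: match.group(0)[:2], text)
    return text.strip()
```

The sketch counts every listed character separately, so a Carriage Return followed by a Line Feed counts as two newline characters.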
"},{"location":"iep-0002/#43-meta-element","title":"4.3 meta element","text":" - The value of the meta element shall be wrapped in a RFC 2397 Data-URL.
- The value of the meta element may include any conceivable and supportive metadata, for example:
  - JSON serialized metadata (data:application/json;base64,<data>);
  - JSON-LD serialized metadata (data:application/ld+json;base64,<data>);
  - XML serialized metadata (data:application/xml;base64,<data>);
  - MARC21 XML (data:application/xml;base64,<data>);
  - IPTC NewsML (data:application/vnd.iptc.g2.newsitem+xml;base64,<data>);
  - a file header (data:application/octet-stream;base64,<data>);
  - a thumbnail image (data:image/png;base64,<data>);
  - an audio sample (data:audio/mp4;base64,<data>).
- If the value of the meta element is JSON or JSON-LD it shall be serialized with RFC 8785 JCS canonicalization before being wrapped in a Data-URL.
- If the value of the meta element is XML it shall be serialized as Canonical XML.
- The Data-URL shall be pre-processed before similarity hashing as follows:
- Decode the base64 encoded data section of the Data URL to a raw bitstream without further interpretation.
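Wrapping JSON metadata in a Data-URL and recovering the raw bitstream can be sketched as follows. The function names are hypothetical, and the json.dumps call only approximates RFC 8785: it sorts keys and removes insignificant whitespace, but it does not implement the number serialization and UTF-16 based key ordering that full JCS requires.

```python
import base64
import json


def json_to_data_url(obj, media_type='application/json'):
    # Approximate JCS: sorted keys, compact separators, UTF-8 output.
    canonical = json.dumps(obj, sort_keys=True, separators=(',', ':'), ensure_ascii=False)
    payload = base64.b64encode(canonical.encode('utf-8')).decode('ascii')
    return 'data:{};base64,{}'.format(media_type, payload)


def data_url_payload(data_url):
    # Decode the base64 data section to raw bytes without interpreting it.
    header, sep, data = data_url.partition(',')
    if not sep or not header.startswith('data:') or not header.endswith(';base64'):
        raise ValueError('expected a base64 encoded Data-URL')
    return base64.b64decode(data)
```

Because the serialization is canonical, two dictionaries with the same content but different key order produce the same Data-URL and therefore the same input for similarity hashing.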
"},{"location":"iep-0002/#5-outputs","title":"5. Outputs","text":"Meta-Code processing shall generate the following output elements for inclusion into the produced ISCC metadata:
- iscc (required): the ISCC Meta-Code in its canonical form;
- name (required): the pre-processed value of the name element;
- meta (optional): the unaltered value of the meta element;
- description (optional): the pre-processed value of the description element;
- metahash (required): a cryptographic hash of the seed metadata.
NOTE 1
The reference implementation uses a multihash [1] encoded BLAKE3 [2] value for the metahash element.
NOTE 2
An ISCC processor may produce other custom output elements, which are helpful to identify the digital asset.
"},{"location":"iep-0002/#6-seed-metadata","title":"6. Seed metadata","text":""},{"location":"iep-0002/#61-meta-code-processing","title":"6.1 Meta-Code processing","text":"The Meta-Code shall be constructed from 2 similarity hashes interleaved in 32-bit chunks by selecting the elements according to the algorithm illustrated in Figure 3.
Figure 3 - Meta-Code processing logic
- If the name element is unavailable, Meta-Code generation shall be skipped.
- The first part of the similarity hash for the Meta-Code shall be generated from the name element.
- The second part of the similarity hash shall be generated from the meta element.
- If the meta element is unavailable, the second part of the similarity hash shall be generated from the description element.
- If the description element is unavailable, the second part of the similarity hash shall also be generated from the name element.
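The selection rules above and the 32-bit interleaving can be sketched as follows. The similarity hash function itself is not defined in this section, so the sketch only covers input selection and interleaving of two already computed digests. The function names are hypothetical, and treating an empty value as unavailable is an assumption.

```python
def select_meta_code_inputs(name, description=None, meta=None):
    # Returns the inputs for the first and second similarity hash,
    # or None when Meta-Code generation is skipped.
    if not name:
        return None
    if meta is not None:
        return name, meta
    if description:
        return name, description
    return name, name


def interleave(first_hash, second_hash, chunk_size=4):
    # Alternate 32-bit (4-byte) chunks of two equally long digests.
    if len(first_hash) != len(second_hash):
        raise ValueError('digests must have the same length')
    out = bytearray()
    for i in range(0, len(first_hash), chunk_size):
        out += first_hash[i:i + chunk_size]
        out += second_hash[i:i + chunk_size]
    return bytes(out)
```

Interleaving keeps bits from both hashes in every prefix of the result, so a shortened Meta-Code still reflects the name as well as the second input.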
"},{"location":"iep-0002/#62-meta-hash-processing","title":"6.2 Meta-Hash processing","text":"The Meta-Hash shall be constructed from the seed metadata by selecting input elements according to the algorithm illustrated in Figure 4.
Figure 4 - Meta-Hash processing logic
- If the name element is unavailable, Meta-Hash generation shall be skipped.
- If the meta element is available, the decoded raw and un-interpreted data of the Data-URL shall be used as sole input to the cryptographic hash function.
- If the meta element is unavailable, but the description element is available, the space-concatenated value of the pre-processed name and description shall be the input to the cryptographic hash function.
- If only the name element is available, its pre-processed value shall be the input to the cryptographic hash function.
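A minimal sketch of the input selection for the Meta-Hash, under stated assumptions: name and description are already pre-processed, text is encoded as UTF-8 before hashing, and the Data-URL parser is a simplified stand-in. The helper names are hypothetical and the cryptographic hash itself is not shown.

```python
import base64
from urllib.parse import unquote_to_bytes

def decode_data_url(url):
    # Simplified Data-URL decoding (assumption): split at the first comma
    # and honour an optional base64 marker at the end of the header.
    header, _, payload = url.partition(',')
    if header.endswith(';base64'):
        return base64.b64decode(payload)
    return unquote_to_bytes(payload)

def select_meta_hash_input(name=None, description=None, meta=None):
    # Returns the bytes to feed into the hash function, or None to skip.
    if not name:
        return None
    if meta:
        return decode_data_url(meta)
    if description:
        return (name + ' ' + description).encode('utf-8')
    return name.encode('utf-8')
```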
"},{"location":"iep-0002/#7-metadata-embedding","title":"7. Metadata embedding","text":" - Seed metadata shall be embedded into the processed digital asset if:
- seed metadata values have been provided explicitly to an ISCC processor;
- the ISCC processor supports metadata embedding for the given media type.
- If the media type supports ISO 16684 XMP metadata-embedding, an ISCC processor shall use the namespace http://purl.org/iscc/schema and embed seed metadata values under the names:
- Xmp.iscc.name
- Xmp.iscc.description
- Xmp.iscc.meta
- If the media type does not support ISO 16684 XMP metadata-embedding the ISCC processor may choose other suitable format-specific fields for embedding seed metadata.
- If seed metadata is to be embedded, it shall be embedded before processing other ISCC-UNITs.
- An ISCC processor should document for which media types it supports metadata-embedding and how it maps seed metadata to format specific elements.
"},{"location":"iep-0002/#8-metadata-extraction","title":"8. Metadata extraction","text":" - An ISCC processor shall try to extract seed metadata from the digital asset if:
- seed metadata has not been provided explicitly to the ISCC processor;
- the ISCC processor supports metadata extraction for the given media type.
- Seed metadata shall be extracted with the following precedence:
- Extract seed metadata from XMP metadata under the namespace http://purl.org/iscc/schema.
- Extract seed metadata from suitable, format-specific embedded metadata.
- Use the filename of the asset as a value for the name element, discarding the file extension and replacing the characters \u201c-\u201d and \u201c_\u201d with spaces.
- An ISCC processor shall document for which media types it supports metadata-extraction and how it maps seed metadata to format specific elements.
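The filename fallback in the precedence list above could look like the following sketch. The function name is hypothetical, and only the last suffix is treated as the file extension, which is an assumption.

```python
from pathlib import PurePosixPath

def name_from_filename(path):
    # Drop directories and the final extension, then replace hyphens
    # and underscores with spaces.
    stem = PurePosixPath(path).stem
    return stem.replace('-', ' ').replace('_', ' ')

print(name_from_filename('docs/my-file_v2.pdf'))  # my file v2
```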
Bibliography
- [1] IETF, draft-multiformats-multihash-05: The Multihash Data Format. Available at https://datatracker.ietf.org/doc/html/draft-multiformats-multihash-05
- [2] O\u2019Connor, J., Aumasson, J.P., Neves, S., Wilcox-O\u2019Hearn, Z., BLAKE3: one function, fast everywhere. Version 20211102173700, accessed July 2022. Available at https://github.com/BLAKE3-team/BLAKE3-specs/blob/master/blake3.pdf
"},{"location":"iep-0003/","title":"ISCC-UNIT Content-Code Text","text":"IEP: 0003 Title: ISCC-UNIT Condent-Code Text Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/8 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-30 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0003/#1-general","title":"1. General","text":" - The Content-Code Subtype Text (Text-Code) shall be a Content-Code generated from the plain text content extracted from a digital asset that contains text.
- The Text-Code shall be robust against text document format conversion and minor edits.
"},{"location":"iep-0003/#2-format","title":"2. Format","text":"The Text-Code shall have the data format illustrated in Figure 5.
Figure 5 - Data format of the Text-Code
EXAMPLE 1: 64-bit Text-Code in its canonical form with text input \u201cHello World\u201d:
ISCC:EAASKDNZNYGUUF5A
EXAMPLE 2: 256-bit Text-Code in its canonical form with text input \u201cHello World\u201d:
ISCC:EADSKDNZNYGUUF5AMFEJLZ5P66CP5YKCOA3X7F36RWE4CIRCBTUWXYY
"},{"location":"iep-0003/#3-inputs","title":"3. Inputs","text":" - The input for calculating the Text-Code shall be the plain text as extracted from a digital document.
- An ISCC implementation may use any text extraction mechanism, including optical character recognition, to extract plain text from a digital document.
- Plain text used as input for Text-Code calculation shall not include any processing instructions (SGML, HTML, Markdown, and other markup information).
"},{"location":"iep-0003/#4-outputs","title":"4. Outputs","text":"Text-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Text-Code in its canonical form (required);
- characters: The number of characters of the source text after pre-processing (optional);
- Additional metadata extracted from the document (optional).
"},{"location":"iep-0003/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process Text input as follows:
- Apply NFD Unicode Normalization.
- Remove all whitespace characters from the text.
- Convert text to lower case in accordance with Unicode \u2018Case Folding Properties\u2019 (see https://www.unicode.org/Public/UCD/latest/ucd/CaseFolding.txt).
- Remove all characters from Unicode categories Mark (M), Punctuation (P) and Other (C).
- Apply NFKC Unicode Normalization.
An ISCC processor shall calculate the Text-Code as follows:
- Split the pre-processed text into n-grams of 13 characters by sliding over the text character-wise.
- Create a list of 32-bit unsigned integers by hashing the UTF-8 encoded representation of the n-grams using the XXH32 algorithm.
- Apply the Minhash256 algorithm to the list of integers to calculate the ISCC-BODY of the Text-Code.
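The pre-processing and n-gram steps above can be sketched with the Python standard library. This is an illustration, not the reference implementation: using str.casefold for the lower-casing step and returning a single n-gram for texts shorter than 13 characters are assumptions, and the XXH32 and Minhash256 steps are omitted.

```python
import unicodedata

def preprocess_text(text):
    # Steps 1 to 5 of the pre-processing list, applied in order.
    text = unicodedata.normalize('NFD', text)
    text = ''.join(ch for ch in text if not ch.isspace())
    text = text.casefold()
    text = ''.join(ch for ch in text if unicodedata.category(ch)[0] not in 'MPC')
    return unicodedata.normalize('NFKC', text)

def ngrams(text, width=13):
    # Character-wise sliding window; short texts yield one n-gram (assumption).
    count = max(len(text) - width + 1, 1)
    return [text[i:i + width] for i in range(count)]

print(preprocess_text('Hello, World!'))  # helloworld
```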
"},{"location":"iep-0003/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating a Text-Code is specified only for UTF-8 encoded text input. An implementation of the Text-Code algorithm shall be regarded as conforming to the standard as long as it creates the same Text-Code as the reference implementation for the same UTF-8 encoded text input.
The normative behaviour of an ISCC processor in generating a Text-Code from other sources is not specified and different codes can be generated depending on the tools used for text extraction from the source. Implementers seeking to guarantee interoperability with each other in these circumstances should select the same tool for text extraction.
NOTE
For further technical details see source-code in the modules code_content_text.py and minhash.py of the reference implementation.
"},{"location":"iep-0004/","title":"ISCC-UNIT Content-Code Image","text":"IEP: 0004 Title: ISCC-UNIT Condent-Code Image Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/9 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-31 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0004/#1-general","title":"1. General","text":" - The Content-Code Subtype Image (Image-Code) shall be a perceptual similarity hash of the input image.
- The Image-Code shall be robust against image format conversion, scaling, compression and minor edits.
"},{"location":"iep-0004/#2-format","title":"2. Format","text":"The Image-Code shall have the data format illustrated in Figure 6.
Figure 6 - Data format of the Image-Code
EXAMPLE 1: 64-bit Image-Code in its canonical form:
ISCC:EEA4GQZQTY6J5DTH
EXAMPLE 2: 256-bit Image-Code in its canonical form:
ISCC:EED4GQZQTY6J5DTHQ2DWCPDZHQOM6QZQTY6J5DTFZ2DWCPDZHQOMXDI
"},{"location":"iep-0004/#3-inputs","title":"3. Inputs","text":" - The input for calculating the Image-Code shall be an image file.
- An ISCC processor shall at least support the JPEG and PNG image formats.
"},{"location":"iep-0004/#4-outputs","title":"4. Outputs","text":"Image-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Image-Code in its canonical form (required);
- width: width of the original input image in number of pixels (optional);
- height: height of the original input image in number of pixels (optional);
- thumbnail: a thumbnail of the original image encoded as Data-URL (optional);
- Additional metadata extracted from the image (optional).
"},{"location":"iep-0004/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process the image file input as follows:
- Transpose the image according to its orientation tag (if available).
- Add white background to image if it contains alpha transparency.
- Crop uniformly colored borders if applicable.
- Convert image to grayscale.
- Resize grayscale image to 32x32 pixels using bicubic interpolation.
An ISCC processor shall calculate the Image-Code as follows:
- Apply discrete cosine transform to the 32x32 grayscale pixel matrix.
- Calculate the median of the upper-left 8x8 values of the transformed matrix.
- For each value of the upper-left 8x8 square set a 1-bit if the value is larger than the median and a 0-bit if it is smaller than or equal to the median.
- The collected bits are the first 64 bits of the body of the Image-Code.
- To extend the perceptual hash up to 256 bits repeat steps 2-3 for the top-right, bottom-left and bottom-right 8x8 squares in the given order.
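The median thresholding for the first 64 bits can be sketched as below. The sketch assumes the 32x32 matrix has already been transformed and is given as a list of rows; the function name is hypothetical, and the extension to 256 bits is left out because the exact block positions should be taken from the reference implementation.

```python
from statistics import median

def first_64_bits(dct_matrix):
    # Flatten the upper-left 8x8 block row by row.
    block = [value for row in dct_matrix[:8] for value in row[:8]]
    threshold = median(block)
    # 1-bit if strictly larger than the median, otherwise 0-bit.
    return [1 if value > threshold else 0 for value in block]
```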
"},{"location":"iep-0004/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating an Image\u2013Code is specified only for the pre-processed 32x32 pixel grayscale input. An implementation of the Image-Code algorithm shall be regarded as conforming to the standard if it creates the same Image-Code as the reference implementation for the same 32x32 grayscale pixel values.
Implementers seeking to guarantee interoperability with each other in these circumstances should select the same tool for image pre-processing.
NOTE
For further technical details see source-code in modules code_content_image.py and dct.py of the reference implementation.
"},{"location":"iep-0005/","title":"ISCC-UNIT Content-Code Audio","text":"IEP: 0005 Title: ISCC-UNIT Condent-Code Audio Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/10 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-31 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0005/#1-general","title":"1. General","text":" - The Content-Code Subtype Audio (Audio-Code) shall be a similarity hash of the audio input.
- The Audio-Code shall be robust against audio format conversion, compression, and minor edits.
"},{"location":"iep-0005/#2-format","title":"2. Format","text":"The Audio-Code shall have the data format illustrated in Figure 7.
Figure 7 - Data format of the Audio-Code
EXAMPLE 1: 64-bit Audio-Code in its canonical form:
ISCC:EIAWUJFCEZZOJYVD
EXAMPLE 2: 256-bit Audio-Code in its canonical form:
ISCC:EIDWUJFCEZZOJYVDHJHIRB3KQSQCM2REUITDUTVAQNRGJIRENCCCULY
"},{"location":"iep-0005/#3-inputs","title":"3. Inputs","text":"The input for calculating the Audio-Code shall be the Chromaprint fingerprint (array of 32-bit signed integers) from the original audio data.
"},{"location":"iep-0005/#4-outputs","title":"4. Outputs","text":"Audio-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Audio-Code in its canonical form (required).
- duration: duration of audio in seconds (optional).
- Additional metadata extracted from the audio file (optional).
"},{"location":"iep-0005/#5-processing","title":"5. Processing","text":"An ISCC processor shall calculate the Audio-Code as follows:
- Calculate a 32-bit similarity hash from the input array.
- Additionally, calculate 32-bit similarity hashes separately for each quarter of values from the input array and concatenate them to the result from step 1 in order to extend the similarity hash up to a total of 160 bits.
- Sort the input array by value in ascending order and calculate similarity hashes separately for each third of the values and concatenate them to the result from step 2 in order to extend the similarity hash of the final ISCC-BODY of the Audio-Code up to the maximum of 256 bits.
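The three steps above could be sketched as follows. Several details are assumptions and may differ from the reference implementation: the similarity hash is a per-bit majority vote with ties resolved to 1, chunks are split as evenly as possible with earlier chunks taking the remainder, and an empty chunk hashes to 0. All function names are hypothetical.

```python
def simhash32(values):
    # Majority vote per bit position over 32-bit values (two's complement).
    if not values:
        return 0
    result = 0
    for bit in range(32):
        ones = sum((value >> bit) & 1 for value in values)
        if ones * 2 >= len(values):
            result |= 1 << bit
    return result

def split_evenly(values, parts):
    # Split into consecutive chunks whose sizes differ by at most one.
    size, extra = divmod(len(values), parts)
    chunks, start = [], 0
    for index in range(parts):
        end = start + size + (1 if index < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks

def audio_code_body(fingerprint):
    # 1 overall hash + 4 quarter hashes + 3 hashes over sorted thirds.
    hashes = [simhash32(fingerprint)]
    hashes += [simhash32(chunk) for chunk in split_evenly(fingerprint, 4)]
    hashes += [simhash32(chunk) for chunk in split_evenly(sorted(fingerprint), 3)]
    return b''.join(h.to_bytes(4, 'big') for h in hashes)
```

The eight 32-bit hashes add up to the 256-bit maximum named in step 3.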
"},{"location":"iep-0005/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating an Audio\u2013Code is specified only for the Chromaprint input array. An implementation of the Audio-Code algorithm shall be regarded as conforming to the standard if it creates the same Audio-Code as the reference implementation for the same Chromaprint array of 32-bit values.
NOTE
For further technical details see source-code in modules code_content_audio.py and simhash.py of the reference implementation.
"},{"location":"iep-0006/","title":"ISCC-UNIT Content-Code Video","text":"IEP: 0006 Title: ISCC-UNIT Condent-Code Video Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/11 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2024-01-01 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0006/#1-general","title":"1. General","text":" - The Content-Code Subtype Video (Video-Code) shall be a similarity hash of the input video.
- The Video-Code shall be robust against format conversions, scaling, compression, changes of framerate and minor edits.
"},{"location":"iep-0006/#2-format","title":"2. Format","text":"The Video-Code shall have the data format illustrated in Figure 8.
Figure 8 - Data format of the Video-Code
EXAMPLE 1: 64-bit Video-Code in its canonical form:
ISCC:EMA7KERCWROEVL6F
EXAMPLE 2: 256-bit Video-Code in its canonical form:
ISCC:EMD7KERCWROEVL6FU3SISZAZPJCBAZKXEZLZTSRQPGASTENCMSYFOAY
"},{"location":"iep-0006/#3-inputs","title":"3. Inputs","text":"The input for calculation of the Video-Code shall be the MPEG-7 video frame signatures, as specified in ISO/IEC 15938, from the original video data.
"},{"location":"iep-0006/#4-outputs","title":"4. Outputs","text":"Video-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Video-Code in its canonical form (required);
- width: width of the original input video in number of pixels (optional);
- height: height of the original input video in number of pixels (optional);
- duration: duration of video in seconds (optional);
- fps: frames per second of the original video (optional);
- thumbnail: a thumbnail from the original video encoded as Data-URL (optional);
- Additional metadata extracted from the video (optional).
"},{"location":"iep-0006/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process the video file input as follows:
- Generate the MPEG-7 video signature at 5 frames per second.
An ISCC processor shall calculate the Video-Code as follows:
- For each of the 380 integer values of the MPEG-7 frame signature calculate its sum over all frames.
- Apply WTA hash to the resulting array of 380 integer values to calculate the ISCC-BODY of the Video-Code at the desired bit-length.
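The first calculation step, summing each signature position over all frames, can be sketched as below. The function name is hypothetical and each frame signature is assumed to be a sequence of 380 integers. The WTA hash step is not shown.

```python
def sum_frame_signatures(frames):
    # Column-wise sum: one total per signature position across all frames.
    if not frames:
        return []
    if any(len(frame) != 380 for frame in frames):
        raise ValueError('each frame signature needs 380 values')
    return [sum(column) for column in zip(*frames)]
```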
The normative behaviour of an ISCC processor in generating a Video-Code is specified only for the MPEG-7 video frame signature input. An implementation of the Video-Code algorithm shall be regarded as conforming to the standard if it creates the same Video-Code as the reference implementation for the same MPEG-7 video frame signature input.
NOTE
For further technical details see source-code in modules code_content_video.py and wtahash.py of the reference implementation.
"},{"location":"iep-0007/","title":"ISCC-UNIT Content-Code Mixed","text":"IEP: 0007 Title: ISCC-UNIT Condent-Code Mixed Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/12 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2024-01-02 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0007/#1-general","title":"1. General","text":" - The Content-Code Subtype Mixed (Mixed-Code) shall be a similarity preserving hash of a collection of assets of the same or different media types combined into a single multimedia file.
- An ISCC processor that supports the creation of Mixed-Codes shall publicly document the supported file formats and the rules by which it divides the different parts of a multimedia file.
- The Mixed-Code shall be robust against format conversions, scaling, compression, and minor edits of the individual parts of the multimedia file.
"},{"location":"iep-0007/#2-format","title":"2. Format","text":"The Mixed-Code shall have the data format illustrated in Figure 9:
Figure 9 - Data format of the Mixed-Code
EXAMPLE 1: 64-bit Mixed-Code in its canonical form:
ISCC:EQASD57JXX7U73P7
EXAMPLE 2: 256-bit Mixed-Code in its canonical form:
ISCC:EQDSD57JXX7U73P7HPPH2P3U5OXZM7PL65T3HZ5JZ76H577P77NO5ZY
"},{"location":"iep-0007/#3-inputs","title":"3. Inputs","text":" - The input for calculating the Mixed-Code shall be the Content-Codes of the individual parts of the multimedia file.
- At least two Content-Codes shall be required as input to calculate a Mixed-Code.
"},{"location":"iep-0007/#4-outputs","title":"4. Outputs","text":"Mixed-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Mixed-Code in its canonical form (required);
- parts: the list of Content-Codes used for calculating the Mixed-Code (recommended);
- Additional metadata extracted from the multimedia file (optional).
"},{"location":"iep-0007/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process the multimedia file as follows:
- Generate individual Content-Codes for each part of the multimedia file according to the specifications in IEP-0003, IEP-0004, IEP-0005 and IEP-0006.
An ISCC processor shall calculate the Mixed-Code as follows:
- Create a byte sequence from each Content-Code retaining the first byte of the ISCC-HEADER concatenated with the bytes of the ISCC-BODY.
- Apply the similarity hash to the list of byte sequences from step 1 to calculate the ISCC-BODY of the Mixed-Code.
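Step 1 can be sketched as below, assuming the canonical form is an ISCC: prefix followed by unpadded base32 and that the ISCC-HEADER of a Content-Code occupies two bytes. Both helper names are hypothetical, and the similarity hash of step 2 is not shown.

```python
import base64

def decode_unit(code):
    # Strip the optional prefix and restore base32 padding before decoding.
    text = code.upper()
    if text.startswith('ISCC:'):
        text = text[5:]
    text += '=' * (-len(text) % 8)
    return base64.b32decode(text)

def mixed_code_digest(code):
    # First header byte followed by the body (assumes a 2-byte header).
    raw = decode_unit(code)
    return raw[:1] + raw[2:]
```

For the 64-bit Text-Code example ISCC:EAASKDNZNYGUUF5A this yields 9 bytes: one header byte and the 8-byte body.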
"},{"location":"iep-0007/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating a Mixed Code is specified only for Content-Code inputs. An implementation of the Mixed-Code algorithm shall be regarded as conforming to the standard if it creates the same Mixed-Code as the reference implementation for the same Content-Code inputs.
NOTE
For further technical details see source-code in modules code_content_mixed.py and simhash.py of the reference implementation.
"},{"location":"iep-0008/","title":"ISCC-UNIT Data-Code","text":"IEP: 0008 Title: ISCC-UNIT Data-Code Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/13 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0009/","title":"ISCC-UNIT Instance-Code","text":"IEP: 0009 Title: ISCC-UNIT Instance-Code Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/14 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0010/","title":"ISCC-CODE","text":"IEP: 0010 Title: ISCC-CODE Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/15 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0011/","title":"ISCC-ID","text":"IEP: 0011 Title: ISCC-ID Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/16 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2022-09-28"},{"location":"iep-0012/","title":"ISCC Metadata","text":"IEP: 0012 Title: ISCC Metadata Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/17 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0013/","title":"ISCC Decentralized Content Registry","text":"IEP: 0013 Title: ISCC Decentralized Content Registry Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/18 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-30"},{"location":"iep-0013/#1-status-of-this-document","title":"1. Status of This Document","text":"This document is a draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organization.
"},{"location":"iep-0013/#2-introduction","title":"2. Introduction","text":"The purpose of a decentralized content registry is to connect Actors to Digital Content in a permissionless decentralized environment and provide a global and verifiable data space for content identification and matching.
Actors authenticate themselves with their blockchain accounts, which they use to sign ISCC-CODE declarations (ledger transactions). Digital Content is identified by ISCC-CODEs. The ISCC-ID is derived from an ISCC-CODE, a blockchain account and the history of previous declarations. ISCC-IDs are globally unique, persistent, authenticated, and resolve to exactly one ISCC-CODE and a blockchain account. The ISCC-IDs are not required to be generated or stored on the participating ledgers. ISCC-IDs are the result of processing the history of transactions according to the Minting Protocol.
"},{"location":"iep-0013/#3-protocol-overview","title":"3. Protocol Overview","text":"The protocol to declare an ISCC-CODE and trigger the minting of an ISCC-ID is divided into 3 parts, the Declaration Protocol, the Minting Protocol and the Resolution Protocol.
- The declaration protocol defines how an ISCC-CODE has to be written to a ledger to become a valid input for the off-chain minting protocol.
- The minting protocol defines how a ledger's history has to be parsed to mint a valid ISCC-ID.
- The resolution protocol defines how an ISCC resolver answers queries about ISCC-CODEs and ISCC-IDs.
"},{"location":"iep-0013/#4-declaration-protocol","title":"4. Declaration Protocol","text":"To participate in the ISCC declaration protocol, a ledger MUST establish exactly one globally unique Ledger-ID (Variable Length Integer) that will be used as a prefix for ISCC-IDs that are minted from its ISCC declarations.
Note
An ISCC-ID comes into existence only after an ISCC declaration has been confirmed on a ledger that participates in the protocol.
The following minimal information (Declaration-Set) MUST be provided and made publicly available for a valid ISCC declaration:
- An ISCC-CODE (a valid sequence of ISCC-UNITs)
- A blockchain account (actor's identifier) of the declaring party
- A valid signature of the declaring party (transaction signature)
We define the party that signs the ISCC declaration as the DECLARER.
Note
The DECLARER is merely the controller of the ISCC-ID minted from the declaration. The declarer is not required to be the creator or a rights-holder of the declared digital content.
An ISCC declaration MAY additionally include:
- A link to external metadata as defined by IEP-0012 - ISCC Metadata
- A processing instruction for the minting protocol
The on-chain link to ISCC metadata SHOULD point to a public and integrity preserving resource (e.g. IPFS CID or a hashlink URI). Permissioned, confidential or mutable data SHOULD be referenced from ISCC metadata via URI.
A ledger that wants to accept ISCC declarations and trigger the minting of valid ISCC-IDs MUST fulfill the following minimum requirements:
- The ledger must provide an immutable, complete, time-ordered, append-only sequence of transactions.
- The ledger's transaction format must allow for embedding and signing the data required for an ISCC declaration.
- The Declaration-Set MUST be publicly readable (permissionless).
- ISCC declarations on the ledger MAY be write-permissioned.
A participating ledger or framework MUST provide documentation of its implementation of the declaration protocol.
- The documentation MUST specify how ISCC declarations can be parsed to decode the Declaration-Set
- The documentation MUST provide sufficient information to the public such that third parties can independently verify transaction signatures and implement the Minting Protocol
- The documentation MUST define how a public observer can distinguish between a transaction that declares an ISCC-CODE and other unrelated transactions.
"},{"location":"iep-0013/#5-minting-protocol","title":"5. Minting Protocol","text":"TBD
"},{"location":"iep-0013/#6-resolution-protocol","title":"6. Resolution Protocol","text":"TBD
"},{"location":"iep-0013/#7-reference-implementation","title":"7. Reference Implementation","text":" - ISCC-ID construction: https://github.com/iscc/iscc-core/blob/main/iscc_core/iscc_id.py
- EVM registry: https://github.com/iscc/iscc-evm
- EVM observer: https://github.com/iscc/iscc-observer-evm
- ISCC-ID resolver: https://github.com/iscc/iscc-registry
"},{"location":"iep-0014/","title":"EVM Based ISCC Registries","text":"IEP: 0014 Title: EVM Based ISCC Registries Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/19 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2022-09-28"},{"location":"iep-0015/","title":"ISCC DID Method","text":"IEP: 0015 Title: ISCC DID Method Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/20 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-30 Abstract
A DID method that identifies decentralized declarations of digital content using ISCC-IDs.
Status of This Document
This document is a draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organization.
"},{"location":"iep-0015/#1-introduction","title":"1. Introduction","text":""},{"location":"iep-0015/#11-preface","title":"1.1 Preface","text":"The ISCC DID method specification conforms to the requirements specified in the Decentralized Identifiers v1.0 Specification DID-CORE.
"},{"location":"iep-0015/#12-motivation","title":"1.2 Motivation","text":"The need for a universal identifier for digital content has emerged as an increasing amount of dynamic, short-lived and granular digital content is produced, consumed and processed. Commercial interests of many stakeholders depend on proper identification of digital content.
Professionally produced digital content but also semi-professional and user-generated content are the currency of the information age. A variety of specific content identifier standards already exist, but a universal content-dependent identifier for digital media has not yet been developed.
In particular, the structure and management of identifiers for digital content have a substantial impact on the level of possible adoption, automation, and the potential for machine-to-machine communication and innovation within and across different industry sectors.
Digital content is dynamic, always in motion, and acted upon globally by a variety of entities with different interests and requirements. Digital content continuously re-encodes, resizes, and re-compresses, changing its underlying data as it travels through a complex network of actors and systems. These circumstances require a special design for a universal identifier that is capable of matching transcoded or otherwise transformed content.
"},{"location":"iep-0015/#13-the-iscc","title":"1.3 The ISCC","text":"The ISCC (International Standard Content Code) is a universal and open identification system for text, audio, image, and video content. ISCC-CODEs can be created from media assets by anybody using open source software. Similar content can then be matched by comparing ISCC-CODEs only.
Example ISCC-CODE
ISCC:KECYCPU3OKIUDZ7TYBRK5HZ4JGPTILLAT2IW7TY7EYIJI4QSK5I353I\nDecoded: ISCC-IMAGE-V0-MCDI-813e9b729141e7f3c062ae9f3c499f342d609e916fcf1f26109472125751beed\n
Users can also register ISCC-CODEs on any supported public blockchain to obtain a short and globally unique ISCC-ID. The ISCC-ID is under the control of the registrant and resolves to an ISCC-CODE, on-chain metadata and optional off-chain metadata. ISCC-IDs are globally unique even if the same ISCC-CODE is registered multiple times by different entities. An ISCC-ID is minted deterministically by observing participating ledgers and can be reproduced by anybody who observes the public and immutable registration events.
Example ISCC-ID
ISCC:MIAGWPTV4J2Z57CI\nDecoded: ID-ETHEREUM-V0-64-6b3e75e2759efc48\n
"},{"location":"iep-0015/#14-iscc-id-as-did","title":"1.4 ISCC-ID as DID","text":"The ISCC DID method creates a mechanism to reference digital content with a globaly unique persistent identifier that does not require a centralized registration authority. Instead, the ISCC system defines an open and voluntary cross-chain registration protocol using cryptography and distributed ledger technology.
Integrating ISCC with the DID system improves ISCC interoperability. DID documents provide standardized ways to discover services related to the referenced content and its registrant.
Verifiable credentials discovered through the DID document service property can improve trust in otherwise permissionless content registrations. Additionally, the use of decentralized web nodes allows for interoperable discovery and data sovereignty of hosted verifiable credentials.
At the same time, ISCC would bring open content identification to the Decentralized Identifiers ecosystem.
"},{"location":"iep-0015/#2-method-syntax","title":"2. Method Syntax","text":""},{"location":"iep-0015/#21-method-name","title":"2.1 Method Name","text":" - The name that shall identify this DID method is:
iscc
. - A DID that uses this method MUST begin with the following prefix:
did:iscc:
. - According to the DID specification, this string MUST be in lowercase.
"},{"location":"iep-0015/#22-method-specific-identifier","title":"2.2 Method Specific Identifier","text":"The ISCC DID scheme conforms to the DID Syntax and is defined by the follwing ABNF:
ISCC DID scheme ABNF
iscc-did = \"did:iscc:\" iscc\niscc = 10*88(numbers / letters)\nnumbers = %x32-37 ; 2-7\nletters = %x61-7A ; a-z\n
- The data structure of the ISCC is
<MainType><SubType><Version><Length><ISCC-BODY>
- The method specific identifier is a lower-cased base32 representation of the ISCC structure.
- The regular expression for this DID method is
^did:iscc:[2-7a-z]{10,88}$
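A short sketch of how the regular expression above might be used to validate an ISCC DID and to derive one from a canonical ISCC-ID. The function names are hypothetical. The conversion simply drops the ISCC: prefix and lower-cases the base32 string, following the rules listed above.

```python
import re

# Pattern copied from the method specification above.
ISCC_DID_PATTERN = re.compile(r'^did:iscc:[2-7a-z]{10,88}$')

def is_valid_iscc_did(did):
    # fullmatch also rejects a trailing newline.
    return ISCC_DID_PATTERN.fullmatch(did) is not None

def iscc_to_did(iscc):
    code = iscc[5:] if iscc.upper().startswith('ISCC:') else iscc
    return 'did:iscc:' + code.lower()

print(iscc_to_did('ISCC:MIAGWPTV4J2Z57CI'))  # did:iscc:miagwptv4j2z57ci
```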
"},{"location":"iep-0015/#23-iscc-did-example","title":"2.3 ISCC DID Example","text":"DID representation of an ISCC-ID
did:iscc:miagwptv4j2z57ci\n
"},{"location":"iep-0015/#3-method-operations","title":"3. Method Operations","text":""},{"location":"iep-0015/#31-creation","title":"3.1 Creation","text":" - An ISCC DID MUST be created by a signed and confirmed ledger transaction that declares an ISCC-CODE in accordance with the cross-chain declaration protocol.
- The initial controller of a newly created ISCC DID MUST be the did:pkh representation of the blockchain account that signed the declaration transaction.
- The controller MAY set a custom DID Document at declaration time by embedding or referencing it from ISCC Metadata.
"},{"location":"iep-0015/#32-read","title":"3.2 Read","text":" - A basic DID document is implicitly created with every ISCC declaration and MUST be deterministically derived from on-chain metadata.
- Extended DID document properties MAY be imported from externally referenced ISCC Metadata.
"},{"location":"iep-0015/#33-update-deactivate","title":"3.3 Update & Deactivate","text":"The ISCC DID MAY be updated or deactivated in accordance with the chain-specific implementation of the ISCC declaration protocol.
"},{"location":"iep-0015/#4-verifiable-data-registry","title":"4. Verifiable Data Registry","text":"The verifiable data registry or \"target system\" for ISCC DIDs is a federation of existing public ledgers that support the ISCC declaration protocol. The protocol can be implemented on most public ledgers (even without smart contracts) that provide an ordered, immutable, append-only history of signed transactions.
Figure 1 - ISCC Verifiable Data Registry"},{"location":"iep-0015/#5-did-document","title":"5. DID Document","text":"DID documents are sourced from on-chain metadata and optionally from immutably or mutably referenced off-chain metadata.
All information required to construct a minimal valid DID document from an ISCC declaration is available on-chain and can be dynamically transformed and presented as a DID document by a DID driver implementation.
Minimal ISCC DID Document example
{\n \"@context\": \"https://www.w3id.org/ns/did/v1\",\n \"id\": \"did:iscc:miagwptv4j2z57ci\",\n \"controller\": \"did:pkh:eip155:1:0x901ee44e3bddf4bc1c08a2ed229498512f8bcfdc\",\n \"alsoKnownAs\": \"iscc:kecycpu3okiudz7tybrk5hz4jgptillat2iw7ty7eyiji4qsk5i353i\",\n \"service\": [{\n \"id\": \"did:iscc:miagwptv4j2z57ci#iscc-metadata\",\n \"type\": \"IsccMetadata\",\n \"serviceEndpoint\": \"ipfs://bafybeiccys7kilr3rynlhoelrdn6ragpbfoti73h4e3oszbgd5inthicja/iscc-metadata/43.json\"\n }]\n}\n
- The DID subject (
id
-property) MUST be the ISCC-ID in DID representation. - The DID controller (
controller
-property) MUST be constructed deterministically by converting the blockchain account that signed the declaration transaction to a did:pkh
. - The
alsoKnownAs
-property MUST be set to the ISCC-CODE registered by the transaction. - If the original ISCC declaration includes a link to off-chain metadata the DID document MUST include the reference via an entry into the
service
-property with type \"IsccMetadata\". The referenced serviceEndpoint
SHOULD return a document of type http://purl.org/iscc/context.
Info
Properties like verificationMethod
, authentication
, assertionMethod
etc. are left out intentionally, as their authoritative values are managed by the DID document associated with the controller
that can be resolved separately.
To be defined
Additional or optional DID document data MAY be added off-chain in mutable or immutable modes and retrieved and injected by the DID driver in real time to compose an extended DID document that includes other properties like service
.
"},{"location":"iep-0015/#6-resolving-dids","title":"6. Resolving DIDs","text":" - An ISCC DID can be resolved by querying an instantiation of an ISCC content registry.
- The authenticity of the response can be verified through the referenced on-chain transaction.
Figure 2 - ISCC DID Architecture"},{"location":"iep-0015/#7-security-considerations","title":"7. Security Considerations","text":"Implementers should be aware that ISCC-CODEs are not cryptographic hashes but descriptors or similarity preserving (soft) hashes. As such they leak information about the structure of the identified content. This is by design and necessary to support similarity matching with ISCC-CODEs.
An ISCC DID document does not need to contain a proof property. All operations are authenticated with the signature of the transaction payload sent to the network of the originating ledger. This signature is generated using a key specified in the corresponding DID Document.
"},{"location":"iep-0015/#8-privacy-considerations","title":"8. Privacy Considerations","text":"ISCC declarations do not publish any personal data on-chain. Declarers may optionally reference off-chain metadata related to their content registration. Such metadata may contain personal data such as creator and rightsholder information. The assumption is that creators have an interest in proper attribution. Applications that implement ISCC declarations are advised to inform users about any privacy related matters specific to their application.
"},{"location":"iep-0015/#9-reference-implementation","title":"9. Reference Implementation","text":"An end-to-end reference implementation of the decentralized content registry is manifested by the following modules:
- Codec and Algorithms: https://github.com/iscc/iscc-core
- ISCC Metadata: https://github.com/iscc/iscc-schema
- EVM Smart Contracts: https://github.com/iscc/iscc-evm
- EVM Chain Observer: https://github.com/iscc/iscc-observer-evm
- ISCC Content Registry: https://github.com/iscc/iscc-registry / https://iscc.id
- ISCC DID driver: https://github.com/iscc/iscc-did-driver / https://did.iscc.id
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"ISCC - Enhancement Proposals (IEPs)","text":""},{"location":"#what-is-an-iep","title":"What is an IEP?","text":"IEP stands for ISCC Enhancement Proposal. An IEP is a design document providing information to the ISCC community, or describing a new feature for the ISCC or its processes or environment.
IEPs are a mechanism for proposing new features, for collecting community input on an issue, and for documenting design decisions. The IEP author is responsible for building consensus within the community and documenting dissenting opinions.
Because the IEPs are maintained as text files in a versioned repository, their revision history is the historical record of the feature proposal.
"},{"location":"#list-of-ieps","title":"List of IEPs","text":"ID Title Type Status IEP-0000 IEP Purpose and Guidelines Process Draft IEP-0001 ISCC Structure and Format Core Draft IEP-0002 ISCC-UNIT Meta-Code Core Draft IEP-0003 ISCC-UNIT Content-Code Text Core Draft IEP-0004 ISCC-UNIT Content-Code Image Core Draft IEP-0005 ISCC-UNIT Content-Code Audio Core Draft IEP-0006 ISCC-UNIT Content-Code Video Core Draft IEP-0007 ISCC-UNIT Content-Code Mixed Core Draft IEP-0008 ISCC-UNIT Data-Code Core Draft IEP-0009 ISCC-UNIT Instance-Code Core TBD IEP-0010 ISCC-CODE Core TBD IEP-0011 ISCC-ID Core TBD IEP-0012 ISCC Metadata Core TBD IEP-0013 ISCC Decentralized Content Registry Core Draft IEP-0014 EVM Based ISCC Registries Core TBD IEP-0015 ISCC DID Method Core Draft"},{"location":"iep-0000/","title":"IEP-0: IEP Purpose and Guidelines","text":"IEP: 0000 Title: IEP Purpose and Guidelines Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/2 Status: Draft Type: Process License: BSD-2-Clause Created: 2022-08-28 Updated: 2022-09-23"},{"location":"iep-0000/#1-what-is-an-iep","title":"1. What is an IEP?","text":"An ISCC Enhancement Proposal (IEP) is a design document providing information to the ISCC community, or describing a new feature for the ISCC or its processes or environment. An IEP should provide a concise technical specification of a feature and a rationale for the feature. IEPs have no special status except that accorded by the community.
IEPs are a mechanism for proposing new features, for collecting community input on an issue, and for documenting design decisions. The IEP author is responsible for building consensus within the community and documenting dissenting opinions.
Because the IEPs are maintained as text files in a versioned repository, their revision history is the historical record of the feature proposal.
"},{"location":"iep-0000/#2-iep-audience","title":"2. IEP audience","text":"The primary audience for IEPs is typically the developers of ISCC implementations.
However, other parts of the ISCC community may also choose to use the process (particularly for Informational IEPs) to document expected API conventions and to manage complex design coordination problems that require collaboration across multiple projects.
"},{"location":"iep-0000/#3-iep-workflow","title":"3. IEP workflow","text":"The IEP process begins with a new idea for the ISCC. Each potential IEP must have a champion - someone who writes the IEP using the style and format described below, shepherds the discussions in the appropriate forums, and attempts to build community consensus around the idea.
Small enhancements or patches to a particular piece of software often don't require coordination between multiple projects or implementations; these don't need an IEP and should be injected into the relevant project-specific development workflow with a patch submission to the applicable issue tracker.
After investigating past work, the best way to proceed is by opening an issue for discussion about the new idea. Following a discussion, the proposal should be submitted to the IEPs git repository as a pull request. This draft must be written in IEP style as described below, and named with an alias such as \"iep-johndoe-new-semantic-id\" until an editor has assigned it an IEP number (authors MUST NOT self-assign IEP numbers).
When the IEP draft is complete, an IEP editor will assign the IEP a number, label it as Core, Informational, or Process, and merge the pull request to the IEPs git repository. The IEP editors will not unreasonably reject an IEP. Reasons for rejecting IEPs include duplication of effort, disregard for formatting rules, being too unfocused or too broad, being technically unsound, not providing proper motivation or addressing backwards compatibility. For an IEP to be accepted it must meet certain minimum criteria. It must be a clear and complete description of the proposed enhancement. The enhancement must represent a net improvement. The proposed implementation, if applicable, must be solid and must not complicate the protocol unduly.
The IEP author may update the draft as necessary in the git repository. Updates to drafts should also be submitted by the author as pull requests.
"},{"location":"iep-0000/#31-transferring-iep-ownership","title":"3.1 Transferring IEP ownership","text":"It occasionally becomes necessary to transfer ownership of IEPs to a new champion. A good reason to transfer ownership is that the original author no longer has the time or interest to update the IEP or follow through with the IEP process, or is unreachable or not responding to email.
If you are interested in assuming ownership of an IEP, send a message asking to take over, addressed to both the original author and the IEP editors. If the original author doesn't respond to email in a timely manner, the IEP editors will make a unilateral decision.
"},{"location":"iep-0000/#32-iep-editors","title":"3.2 IEP editors","text":"The current IEP editors are:
- Titusz Pan tp@iscc.foundation
"},{"location":"iep-0000/#33-iep-editor-responsibilities","title":"3.3 IEP editor responsibilities","text":"The IEP editors subscribe to the IEP issue tracker. Correspondence outside the issue tracker should be sent (or CC'd) to the IEP editors.
For each new IEP that comes in, an editor does the following:
-
Read the IEP to check if it is ready: sound and complete. The ideas must make technical sense.
-
The title should accurately describe the content.
-
Motivation and backward compatibility (when applicable) must be addressed.
-
Licensing terms must be acceptable for IEPs.
-
If the IEP isn't ready, the editor will send it back to the author for revision, with specific instructions.
-
Once the IEP is ready for the repository it should be submitted as a \"pull request\" to the IEPs git repository where it may get further feedback.
The IEP editor will:
- Assign an IEP number in the pull request.
- Merge the pull request when it is ready.
- List the IEP in README.md.
The IEP editors are intended to fulfill administrative and editorial responsibilities. The IEP editors monitor IEP changes, and update IEP headers as appropriate.
"},{"location":"iep-0000/#4-iep-format-and-structure","title":"4. IEP format and structure","text":"IEPs should be written in Markdown format.
Each IEP should have the following parts:
- Preamble - Headers containing metadata about the IEP (see below).
- Abstract - A short (~200 word) description of the issue being addressed.
- Motivation - A motivation section is critical for IEPs that want to change pre-existing specifications. It should clearly explain why the existing specification is inadequate to address the problem that the IEP solves. This section may be omitted if the motivation is evident.
- Specification - The specification should describe the syntax and semantics of any new feature. The specification should be detailed enough to allow competing, interoperable implementations.
- Rationale - The rationale fleshes out the specification by describing what motivated the design and why particular design decisions were made. It should describe alternate designs that were considered and related work. The rationale should provide evidence of consensus within the community and discuss important objections or concerns raised during discussion around the IEP.
- Backwards compatibility - All IEPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The IEP must explain how the author proposes to deal with these incompatibilities.
- Reference implementation - A reference implementation must be completed before a Core IEP is given status Stable, but it need not be completed before the IEP is accepted. The stable implementation must include test code and documentation.
- Copyright - The IEP must be explicitly licensed under acceptable copyright terms (see below).
"},{"location":"iep-0000/#41-iep-header-preamble","title":"4.1 IEP header preamble","text":"Each IEP must begin with a header preamble. The headers must appear in the following order. Headers marked with \"*\" are optional and are described below. All other headers are required.
IEP: <IEP number, or \"?\" before being assigned>\n Title: <IEP title; maximum 44 characters>\n Author: <list of authors' names and email addresses>\n Comments: <link to issue page for comments>\n Status: <Draft | Deferred | Withdrawn | Proposed | Rejected | Stable | Obsolete>\n Type: <Core | Informational | Process>\n License: <abbreviation for approved license(s)>\n Created: <date created on, in ISO 8601 (yyyy-mm-dd) format>\n Updated: <date updated on, in ISO 8601 (yyyy-mm-dd) format>\n* Replaces: <IEP number>\n* Superseded-By: <IEP number>\n
The Author header lists the names and email addresses of all the authors/owners of the IEP. The format of the Author header value must be
Random J. User <address@dom.ain>\n
If there are multiple authors, each should be on a separate line.
The Type header specifies the type of IEP: Core, Informational, or Process.
The Created header records the date that the IEP was assigned a number. Dates should be in yyyy-mm-dd format, e.g. 2001-08-14.
IEPs may have a Superseded-By header indicating that an IEP has been rendered obsolete by a later document; the value is the number of the IEP that replaces the current document. The newer IEP must have a Replaces header containing the number of the IEP that it rendered obsolete.
"},{"location":"iep-0000/#5-iep-types","title":"5. IEP Types","text":"There are three kinds of IEPs:
- A Core IEP describes any change that affects most or all ISCC implementations, such as a change to the algorithms for creating ISCCs, or any change or addition that affects the interoperability of applications using the ISCC. Core IEPs consist of two parts, a design document and a reference implementation.
- An Informational IEP describes an ISCC design issue, or provides general guidelines or information to the ISCC community, but does not propose a new feature. Informational IEPs do not necessarily represent an ISCC community consensus or recommendation, so users and implementors are free to ignore Informational IEPs or follow their advice.
- A Process IEP describes a process surrounding ISCC, or proposes a change to (or an event in) a process. Process IEPs are like Core IEPs but apply to areas other than the ISCC algorithms themselves. Examples include procedures, guidelines, changes to the decision-making process, and changes to the tools or environment used in ISCC development.
"},{"location":"iep-0000/#6-iep-status","title":"6. IEP status","text":"The typical paths of the status of IEPs are as follows:
flowchart LR\n B[Draft]\n B --> C[Proposed]\n B <--> D[Deferred]\n B <--> E[Withdrawn]\n C --> F[Stable]\n C --> G[Rejected]\n F --> H[Obsolete]\n
Champions of an IEP may decide on their own to change the status between Draft, Deferred, or Withdrawn. An IEP editor may also change the status to Deferred when no progress is being made on the IEP.
An IEP may only change status from Draft to Proposed, when the author deems it is complete, has a working implementation (where applicable), and has community plans to progress it to the Stable status.
IEPs should be changed from Draft or Proposed status, to Rejected status, upon request by any person, if they have not made progress in three years. Such an IEP may be changed to Draft status if the champion provides revisions that meaningfully address public criticism of the proposal, or to Proposed status if it meets the criteria required as described in the previous paragraph.
An IEP may change status from Draft to Stable when it achieves rough consensus on the issue tracker and sufficient real-world adoption. Such a proposal is said to have rough consensus if it has been open to discussion on the issue tracker for at least one month, and no person maintains any unaddressed substantiated objections to it. Addressed or obstructive objections may be ignored/overruled by general agreement that they have been sufficiently addressed, but clear reasoning must be given in such circumstances.
Software authors are encouraged to publish summaries of what IEPs their software supports to aid in verification of status changes.
Should an IEP become actually and unambiguously adopted despite not meeting the criteria outlined here, it should still be updated to Stable status.
When a Stable IEP is no longer relevant, its status may be changed to Obsolete. This change must also be objectively verifiable and/or discussed.
"},{"location":"iep-0000/#7-iep-licensing","title":"7. IEP licensing","text":"New IEPs may be accepted with the following licenses. Each new IEP must identify at least one acceptable license in its preamble. The License header in the preamble must be placed before the Created header. Each license must be referenced by its respective abbreviation given below.
IEPs are not required to be exclusively licensed under approved terms, and may also be licensed under unacceptable licenses in addition to at least one acceptable license. In this case, only the acceptable license(s) should be listed in the License header.
"},{"location":"iep-0000/#71-acceptable-licenses","title":"7.1 Acceptable licenses","text":" - BSD-2-Clause: OSI-approved BSD 2-clause license
- BSD-3-Clause: OSI-approved BSD 3-clause license
- CC0-1.0: Creative Commons CC0 1.0 Universal
- Apache-2.0: Apache License, version 2.0
- CC-BY-4.0: Creative Commons Attribution 4.0 International
- CC-BY-SA-4.0: Creative Commons Attribution-ShareAlike 4.0 International
- MIT: Expat/MIT/X11 license
"},{"location":"iep-0000/#8-history","title":"8. History","text":"This document was derived heavily from Bitcoin\u2019s BIP-0002 which in turn was derived from Python\u2019s PEP-0001. In many places text was simply copied and modified. The original authors of BIP-0002 and PEP-0001 are not responsible for its use in the ISCC Enhancement Proposals, and should not be bothered with technical questions specific to ISCC or the IEPs.
"},{"location":"iep-0001/","title":"ISCC Structure and Format","text":"IEP: 0001 Title: ISCC Structure and Format Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/6 Status: Draft Type: Core License: CC-BY-4.0 Created: 2022-09-23 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138.
"},{"location":"iep-0001/#1-abstract","title":"1. Abstract","text":"This document describes the coding scheme for the International Standard Content Code (ISCC).
"},{"location":"iep-0001/#2-motivation","title":"2. Motivation","text":"The ISCC is a similarity preserving identifier for all kinds of digital content. As such the ISCC requires a universal coding scheme to meet a broad set of use cases and support different media types. The coding scheme for all ISCCs should be:
- compact
- interoperable
- modular
- extensible
- self-describing
"},{"location":"iep-0001/#3-general-structure-of-iscc","title":"3. General structure of ISCC","text":" - An ISCC shall be composed of an ISCC-HEADER and an ISCC-BODY.
- The ISCC-HEADER shall describe the MainType, SubType, Version, and Length of its ISCC-BODY.
- An ISCC-UNIT shall be an ISCC based on one specific algorithm.
- An ISCC-CODE shall be an ISCC composed of two or more different ISCC-UNITs.
Figure 1 - General Structure of an ISCC"},{"location":"iep-0001/#4-iscc-header","title":"4. ISCC-HEADER","text":"The ISCC-HEADER is a variable sized bitstream composed of an ordered sequence of the 4 header-fields MainType, SubType, Version, Length.
Each header-field is a bitstream with a length between 4 and 16 bits and encodes an integer value between 0 and 4679 with the following encoding scheme:
- The total bit-length of a header-field shall be determined by its prefix-bits.
- The prefix-bits shall be followed by data-bits.
- The data-bits shall be interpreted as an unsigned integer value plus the first value of the field's integer range in Table 1 (that is, the maximum value of the preceding range plus one).
- If the total length of all header-fields in number of bits is not divisible by 8, the header shall be padded with 4 zero bits (0000) on the right side.
Table 1 \u2013 Variable length ISCC-HEADER field encoding
Prefix bits Number of nibbles Number of data bits Integer range 0 1 3 0-7 10 2 6 8-71 110 3 9 72-583 1110 4 12 584-4679 Header-field examples
0 = 0000\n1 = 0001\n\u2026\n7 = 0111\n8 = 1000 0000\n9 = 1000 0001\n
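The variable-length scheme of Table 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the reference implementation: the names `RANGES`, `encode_header_field` and `decode_header_field` are hypothetical, and bitstreams are modelled as strings of `0` and `1` characters for readability.

```python
# (prefix bits, number of data bits, first value of the range), per Table 1
RANGES = [
    ('0', 3, 0),
    ('10', 6, 8),
    ('110', 9, 72),
    ('1110', 12, 584),
]


def encode_header_field(value):
    # Return the header-field for value as a string of 0 and 1 characters.
    for prefix, data_bits, first in RANGES:
        last = first + 2 ** data_bits - 1
        if first <= value <= last:
            return prefix + format(value - first, '0{}b'.format(data_bits))
    raise ValueError('header-field value must be in the range 0-4679')


def decode_header_field(bits):
    # Decode one header-field from the start of bits.
    # Returns a tuple of (value, remaining bits).
    for prefix, data_bits, first in RANGES:
        if bits.startswith(prefix):
            end = len(prefix) + data_bits
            if len(bits) < end:
                raise ValueError('truncated header-field')
            return first + int(bits[len(prefix):end], 2), bits[end:]
    raise ValueError('invalid header-field prefix')
```

Calling `decode_header_field` four times in a row on the bits of an ISCC-HEADER yields MainType, SubType, Version, and Length in that order.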
The interpretation of the integer value of a header-field shall be context dependent:
- For the MainType and SubType fields it shall be an identifier for the designated type.
- For the Version field it shall be the literal version number.
- For the Length field of ISCC-UNITs it shall be a number used as a multiplier to calculate the bit length of the ISCC-BODY.
- For the Length field of ISCC-CODEs it shall be a bit-pattern encoding the combination of ISCC-UNITs and the bit-length of the ISCC-BODY.
"},{"location":"iep-0001/#41-maintypes","title":"4.1 MainTypes","text":"The MainType header-field shall signify the type of the ISCC.
Backward incompatible updates to an algorithm associated with a MainType shall be indicated by incrementing the version field of the ISCC-HEADER of the respective MainType.
Note
The first edition of the standard specifies initial algorithms (version 0) for all reserved MainTypes except for the SEMANTIC type which is not currently defined.
Table 2 \u2013 Reserved ISCC MainTypes
ID Symbol Bits Definition 0 META 0000 An ISCC-UNIT that matches on metadata similarity 1 SEMANTIC 0001 An ISCC-UNIT that matches on semantic content similarity 2 CONTENT 0010 An ISCC-UNIT that matches on perceptual content similarity 3 DATA 0011 An ISCC-UNIT that matches on data similarity 4 INSTANCE 0100 An ISCC-UNIT that matches on data identity 5 ISCC 0101 An ISCC-CODE composed of two or more headerless ISCC-UNITs for multi-modal matching"},{"location":"iep-0001/#42-subtypes","title":"4.2 SubTypes","text":"The MainTypes META, DATA, and INSTANCE shall have a single default SubType NONE encoded with the bits 0000.
The MainTypes SEMANTIC, CONTENT, and ISCC shall have SubTypes that signify the perceptual mode.
Table 3 \u2013 Reserved SubTypes for MainTypes ISCC, SEMANTIC, and CONTENT
ID Symbol Bits Definition 0 TEXT 0000 Match on text similarity 1 IMAGE 0001 Match on image similarity 2 AUDIO 0010 Match on audio similarity 3 VIDEO 0011 Match on video similarity 4 MIXED 0100 Match on multi-modal similarity Table 4 \u2013 Additional Reserved SubTypes for the MainType ISCC
ID Symbol Bits Definition 5 SUM 0101 Composite of ISCC-UNITs including only Data- and Instance-Code 6 NONE 0110 Composite ISCC-UNITs including Meta-, Data- and Instance-Code"},{"location":"iep-0001/#43-version","title":"4.3 Version","text":"All ISCC-HEADERs shall have a version header-field of 0000 for the first edition of the standard.
Table 5 \u2013 Reserved ISCC Versions
ID Symbol Bits Definition 0 V0 0000 Initial version of ISCC-UNITs and ISCC-CODE"},{"location":"iep-0001/#44-length","title":"4.4 Length","text":"The encoding of the Length header-field shall be specific to the MainType.
"},{"location":"iep-0001/#441-length-of-iscc-units","title":"4.4.1 Length of ISCC-UNITs","text":"For ISCC-UNITs of the MainTypes META, SEMANTIC, CONTENT, DATA, and INSTANCE the length value shall be encoded as the number of 32-bit blocks of the ISCC-BODY in addition to the minimum length of 32 bits.
Table 6 \u2013 Reserved length field values (multiples of 32 bit)
ID Symbol Bits Definition 0 L32 0000 Length of body is 32 bits (minimum length) 1 L64 0001 Length of body is 64 bits (default length) 2 L96 0010 Length of body is 96 bits 3 L128 0011 Length of body is 128 bits 4 L160 0100 Length of body is 160 bits 5 L192 0101 Length of body is 192 bits 6 L224 0110 Length of body is 224 bits 7 L256 0111 Length of body is 256 bits"},{"location":"iep-0001/#442-length-of-iscc-codes","title":"4.4.2 Length of ISCC-CODEs","text":" -
For ISCC-CODEs the length value shall designate the composition of ISCC-UNITs.
-
The Data-Code and Instance-Code shall be mandatory 64-bit components of an ISCC-CODE.
-
The first data-bit shall designate the presence of a 64-bit Meta-Code.
-
The second data-bit shall designate the presence of a 64-bit Semantic-Code.
-
The third data-bit shall designate the presence of a 64-bit Content-Code.
-
The length of an ISCC-CODE shall be calculated as the number of active data-bits times 64 plus 128 bits of mandatory data.
Table 7 \u2013 Reserved length field values (for MainType ISCC)
ID Symbol Bits Definition 0 SUM 0000 No optional ISCC-UNITs. Length of body is 128 bits. 1 CDI 0001 Includes Content-Code. Length of body is 192 bits 2 SDI 0010 Includes Semantic-Code. Length of body is 192 bits 3 SCDI 0011 Includes Semantic- and Content-Code. Length of body is 256 bits 4 MDI 0100 Includes Meta-Code. Length of body is 192 bits 5 MCDI 0101 Includes Meta-Code and Content-Code. Length of body is 256 bits 6 MSDI 0110 Includes Meta-Code and Semantic-Code. Length of body is 256 bits 7 MSCDI 0111 Includes Meta-, Semantic-, and Content-Code. Length is 320 bits"},{"location":"iep-0001/#5-iscc-body","title":"5. ISCC-BODY","text":" - The preceding MainType, SubType, and Version fields shall qualify the semantics of an ISCC-BODY.
- The Length field shall determine the number of bits of an ISCC-BODY.
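The two interpretations of the Length field from sections 4.4.1 and 4.4.2 can be sketched as follows. The function names are hypothetical, and the sketch assumes the already decoded integer value of the Length header-field (0-7 in both cases, as listed in Tables 6 and 7).

```python
def unit_body_bits(length_field):
    # ISCC-UNIT: minimum length of 32 bits plus one 32-bit block per step.
    if not 0 <= length_field <= 7:
        raise ValueError('reserved length values are 0-7')
    return 32 * (length_field + 1)


def code_units(length_field):
    # ISCC-CODE: the three data-bits flag the optional Meta-, Semantic-
    # and Content-Code; Data- and Instance-Code are always present.
    if not 0 <= length_field <= 7:
        raise ValueError('reserved length values are 0-7')
    flags = [(0b100, 'Meta'), (0b010, 'Semantic'), (0b001, 'Content')]
    optional = [name for mask, name in flags if length_field & mask]
    return optional + ['Data', 'Instance']


def code_body_bits(length_field):
    # ISCC-CODE: every ISCC-UNIT contributes 64 bits to the body, so the
    # result is 64 bits per optional unit plus 128 bits of mandatory data.
    return 64 * len(code_units(length_field))
```

For the length value 5 (symbol MCDI), `code_units` lists Meta, Content, Data, and Instance, and `code_body_bits` gives 256 bits, in line with Table 7.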
"},{"location":"iep-0001/#6-encoding-of-iscc","title":"6. Encoding of ISCC","text":""},{"location":"iep-0001/#61-canonical-form","title":"6.1 Canonical form","text":"The printable canonical form of an ISCC shall be its RFC 4648 Base32 encoded representation without padding and prefixed with \u201cISCC:\u201d.
Canonical ISCC-CODE example
ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY
"},{"location":"iep-0001/#62-uri-encoding","title":"6.2 URI encoding","text":" - An ISCC shall be encodable as a Uniform Resource Identifier (URI) according to RFC 3986.
- The URI representation shall have the format
<scheme>:<path>
. - The URI scheme shall be the string \u201ciscc\u201d.
- The URI path shall be the lower-cased base32 representation of an ISCC without padding.
URI encoded ISCC-CODE example
iscc:kec43hjlpushvazt66ylpuwnvacwypiv533trqmwf2iuqysp5la4cty
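The relationship between the canonical form and the URI encoding can be sketched with the Python standard library alone. The helper names below are hypothetical. The sketch assumes that the omitted base32 padding can simply be restored before decoding.

```python
import base64


def iscc_to_bytes(iscc):
    # Accepts the canonical (ISCC:...) or the URI (iscc:...) form and
    # returns the raw ISCC-HEADER + ISCC-BODY bytes.
    scheme, sep, payload = iscc.partition(':')
    if not sep or scheme.lower() != 'iscc':
        raise ValueError('expected an ISCC: or iscc: prefix')
    payload = payload.upper()
    # Restore the padding that the ISCC encodings leave out.
    padding = '=' * (-len(payload) % 8)
    return base64.b32decode(payload + padding)


def bytes_to_canonical(raw):
    # RFC 4648 Base32 without padding, prefixed with ISCC:
    return 'ISCC:' + base64.b32encode(raw).decode('ascii').rstrip('=')


def to_uri(iscc):
    # The URI form is the lower-cased canonical form.
    return bytes_to_canonical(iscc_to_bytes(iscc)).lower()
```

Applied to the canonical example from section 6.1, `iscc_to_bytes` returns 34 bytes: a 2-byte header followed by a 256-bit body.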
"},{"location":"iep-0001/#63-multiformats-encoding","title":"6.3 Multiformats encoding","text":" - The ISCC shall be encodable as a multibase string.
- The multicodec identifier of an ISCC shall be
0xcc01
. - A Multiformat representation of an ISCC shall be prefixed with a Multibase code.
- The encoding scheme shall be
<multibase><multicodec><iscc-header><iscc-body>
.
ISCC shall support the following multibase encodings:
Table 8 \u2013 Supported multibase encodings
Encoding Code Definition base16 f hexadecimal base32 b RFC4648 case-insensitive - no padding base32hex v RFC4648 case-insensitive - no padding - highest char base58btc z base58 bitcoin base64url u RFC4648 no padding Table 9 \u2013 Examples of ISCCs in multiformats encoding
Encoding Example MF base16 fcc015105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f MF base32 bzqavcbontuvx2jd2qmz7pmfx2lg2qblmhuk655zyyglc5ekimjh6vqobj4 MF base32hex vpg0l21edjklnq93qgcpvfc5nqb6qg1bc7kauttpoo6b2t4a8c97ulge19s MF base58btc z2Yr3BMx3Rj56fyYkNvfa19PCk4SjspQhpVWoLSGg9yXr4vUGsx MF base64url uzAFRBc2dK30keoMz97C30s2oBWw9Fe73OMGWLpFIYk_qwcFP"},{"location":"iep-0001/#64-readable-encoding","title":"6.4 Readable encoding","text":" - The ISCC shall be encodable in human-readable representation.
- The readable representation shall encode the header-fields with their symbols and the ISCC-BODY in base16 lower-case.
- The header-fields and the ISCC-BODY shall be separated with hyphens.
Example of human-readable ISCC-CODE
ISCC-IMAGE-V0-MCDI-cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
"},{"location":"iep-0001/#7-reference-implementation","title":"7. Reference implementation","text":"The reference implementation of this coding scheme is published in the iscc-core Python package in the codec.py module.
"},{"location":"iep-0002/","title":"ISCC-UNIT Meta-Code","text":"IEP: 0002 Title: ISCC-UNIT Meta-Code Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/7 Status: Draft Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138.
"},{"location":"iep-0002/#1-general","title":"1. General","text":"The Meta-Code is a similarity hash generated from referent seed metadata as defined in IEP-0012.
"},{"location":"iep-0002/#2-purpose","title":"2. Purpose","text":"The Meta-Code shall support the following use cases:
- clustering of digital assets based on their metadata;
- discovery of digital assets with similar metadata;
- verification or manual disambiguation of matching codes.
"},{"location":"iep-0002/#3-format","title":"3. Format","text":"The Meta-Code shall have the data format as illustrated in Figure 2:
Figure 2 - Data format of the Meta-Code EXAMPLE: 64-bit Meta-Code in its canonical form:
ISCC:AAAUL6P7RMVNT4UJ
EXAMPLE: 256-bit Meta-Code in its canonical form:
ISCC:AADUL6P7RMVNT4UJJ4SMTDXBL5JFZ5XPCDKO42XYPJEVQ4L7PTYDORQ
"},{"location":"iep-0002/#4-inputs","title":"4. Inputs","text":"Seed metadata is the metadata that is used as the input to calculate the Meta-Code and has three possible elements:
- name (required): the name or title of the work manifested by the digital asset;
- description (optional): a disambiguating textual description of the digital asset;
- meta (optional): subject, industry, or use-case specific metadata.
NOTE 1
Because seed metadata is used to construct the Meta-Code, changes to its value may produce different (and therefore no longer matching) Meta-Codes. Seed metadata is stored and carried along unaltered with ISCC Metadata if automated verification of the Meta-Code based on the original seed metadata is required.
NOTE 2
The identifier standards and their schemas defined by ISO/TC 46/SC 9 provide helpful guidance in selecting seed metadata.
"},{"location":"iep-0002/#41-name-element","title":"4.1 name element","text":"The text input for the name element shall be pre-processed before similarity hashing as follows:
- Apply ISO/IEC 10646 NFKC Unicode Normalization (see Unicode Normalization Forms https://unicode.org/reports/tr15/#Norm_Forms).
- Remove control characters (see Unicode Character Database https://www.unicode.org/ucd/).
- Strip leading and trailing whitespace.
- Trim the end of the text such that the UTF-8 encoded size does not exceed 128 bytes.
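The four pre-processing steps for the name element can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `clean_name` is hypothetical, and reading control characters as Unicode general category Cc is an assumption.

```python
import unicodedata


def clean_name(name, max_bytes=128):
    # 1. NFKC normalization.
    text = unicodedata.normalize('NFKC', name)
    # 2. Remove control characters (assumption: general category Cc).
    text = ''.join(ch for ch in text if unicodedata.category(ch) != 'Cc')
    # 3. Strip leading and trailing whitespace.
    text = text.strip()
    # 4. Trim to max_bytes of UTF-8; a cut-off partial character is dropped.
    trimmed = text.encode('utf-8')[:max_bytes]
    return trimmed.decode('utf-8', errors='ignore')
```

Trimming is done on the encoded bytes rather than on characters, so the result never exceeds the byte limit and never ends in a broken multi-byte sequence.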
"},{"location":"iep-0002/#42-description-element","title":"4.2 description element","text":"Text input for the description element shall be pre-processed before similarity hashing as follows:
- Apply NFKC Unicode Normalization.
- Remove control characters (as specified by Unicode Character Database) except for the following newline characters:
- U+000A - Line Feed;
- U+000B - Vertical Tab;
- U+000C - Form Feed;
- U+000D - Carriage Return;
- U+0085 - Next Line;
- U+2028 - Line Separator;
- U+2029 - Paragraph Separator.
- Collapse more than two consecutive newline characters to a maximum of two consecutive newlines.
- Strip leading and trailing whitespace characters.
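A possible sketch of the description pre-processing is shown below. The function name `clean_description` is hypothetical, control characters are again assumed to be general category Cc, and a run of mixed newline characters is collapsed by keeping its first two characters, which is one possible reading of the collapse rule.

```python
import re
import unicodedata

# Newline characters that are exempt from control-character removal.
NEWLINES = ''.join(chr(c) for c in (0x0A, 0x0B, 0x0C, 0x0D, 0x85, 0x2028, 0x2029))
# Three or more consecutive newline characters of any kind.
LONG_RUN = re.compile('[' + re.escape(NEWLINES) + ']{3,}')


def clean_description(description):
    # Apply NFKC normalization.
    text = unicodedata.normalize('NFKC', description)
    # Remove control characters (assumption: category Cc) except newlines.
    text = ''.join(
        ch for ch in text
        if ch in NEWLINES or unicodedata.category(ch) != 'Cc'
    )
    # Collapse runs of more than two newlines to the first two of the run.
    text = LONG_RUN.sub(lambda m: m.group(0)[:2], text)
    # Strip leading and trailing whitespace.
    return text.strip()
```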
"},{"location":"iep-0002/#43-meta-element","title":"4.3 meta element","text":" - The value of the meta element shall be wrapped in an RFC 2397 Data-URL.
- The value of the meta element may include any conceivable and supportive metadata, such as:
- JSON serialized metadata (data:application/json;base64,<data>);
- JSON-LD serialized metadata (data:application/ld+json;base64,<data>);
- XML serialized metadata (data:application/xml;base64,<data>);
- MARC21 XML (data:application/xml;base64,<data>);
- IPTC NewsML (data:application/vnd.iptc.g2.newsitem+xml;base64,<data>);
- a file header (data:application/octet-stream;base64,<data>);
- a thumbnail image (data:image/png;base64,<data>);
- an audio sample (data:audio/mp4;base64,<data>).
- If the value of the meta element is JSON or JSON-LD it shall be serialized with RFC 8785 JCS canonicalization before being wrapped in a Data-URL.
- If the value of the meta element is XML it shall be serialized as Canonical XML.
- The Data-URL shall be pre-processed before similarity hashing as follows:
- Decode the base64 encoded data section of the Data URL to a raw bitstream without further interpretation.
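Decoding the data section of a Data-URL could look like the sketch below. The helper name `decode_data_url` is hypothetical, and the sketch only handles the base64 form used in the examples of this section.

```python
import base64


def decode_data_url(data_url):
    # Split 'data:<mediatype>;base64,<data>' at the first comma.
    header, sep, payload = data_url.partition(',')
    if not header.startswith('data:') or not sep:
        raise ValueError('not a Data-URL')
    if not header.endswith(';base64'):
        raise ValueError('this sketch only handles base64 Data-URLs')
    # Return the raw bytes without interpreting the media type.
    return base64.b64decode(payload)
```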
"},{"location":"iep-0002/#5-outputs","title":"5. Outputs","text":"Meta-Code processing shall generate the following output elements for inclusion into the produced ISCC metadata:
- iscc (required): the ISCC Meta-Code in its canonical form;
- name (required): the pre-processed value of the name element;
- meta (optional): the unaltered value of the meta element;
- description (optional): the pre-processed value of the description element;
- metahash (required): a cryptographic hash of the seed metadata.
NOTE 1
The reference implementation uses a multihash [1] encoded BLAKE3 [2] value for the metahash element.
NOTE 2
An ISCC processor may produce other custom output elements, which are helpful to identify the digital asset.
"},{"location":"iep-0002/#6-seed-metadata","title":"6. Seed metadata","text":""},{"location":"iep-0002/#61-meta-code-processing","title":"6.1 Meta-Code processing","text":"The Meta-Code shall be constructed from 2 similarity hashes interleaved in 32-bit chunks by selecting the elements according to the algorithm illustrated in Figure 3.
Figure 3 - Meta-Code processing logic
- If the name element is unavailable, Meta-Code generation shall be skipped.
- The first part of the similarity hash for the Meta-Code shall be generated from the name element.
- The second part of the similarity hash shall be generated from the meta element.
- If the meta element is unavailable, the second part of the similarity hash shall be generated from the description element.
- If the description element is unavailable, the second part of the similarity hash shall also be generated from the name element.
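The selection rules above, together with the 32-bit interleaving, can be sketched as follows. Both function names are hypothetical, the similarity hashing itself is not shown, and any truncation of the interleaved result to the target bit length is left out.

```python
def select_meta_code_inputs(name, description=None, meta=None):
    # Without a name element, Meta-Code generation is skipped.
    if not name:
        return None
    # Second part: meta first, then description, then name again.
    if meta is not None:
        second = meta
    elif description:
        second = description
    else:
        second = name
    return name, second


def interleave_32bit(first_hash, second_hash):
    # Alternate 4-byte (32-bit) chunks taken from both similarity hashes.
    out = bytearray()
    for i in range(0, len(first_hash), 4):
        out += first_hash[i:i + 4]
        out += second_hash[i:i + 4]
    return bytes(out)
```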
"},{"location":"iep-0002/#62-meta-hash-processing","title":"6.2 Meta-Hash processing","text":"The Meta-Hash shall be constructed from the seed metadata by selecting input elements according to the algorithm illustrated in Figure 4.
Figure 4 - Meta-Hash processing logic
- If the name element is unavailable, Meta-Hash generation shall be skipped.
- If the meta element is available, the decoded raw and un-interpreted data of the Data-URL shall be used as sole input to the cryptographic hash function.
- If the meta element is unavailable, but the description element is available, the space-concatenated value of the pre-processed name and description shall be the input to the cryptographic hash function.
- If only the name element is available, its pre-processed value shall be the input to the cryptographic hash function.
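The choice of input for the cryptographic hash can be sketched like this. The function name `meta_hash_input` is hypothetical, the inputs are assumed to be already pre-processed, and encoding text as UTF-8 is an assumption. The hash function itself is not shown.

```python
def meta_hash_input(name, description=None, meta_bytes=None):
    # Without a name element, Meta-Hash generation is skipped.
    if not name:
        return None
    # Decoded Data-URL payload takes precedence and is used as sole input.
    if meta_bytes is not None:
        return meta_bytes
    # Otherwise name and description are joined with a single space.
    if description:
        return (name + ' ' + description).encode('utf-8')
    # Only the name is available.
    return name.encode('utf-8')
```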
"},{"location":"iep-0002/#7-metadata-embedding","title":"7. Metadata embedding","text":" - Seed metadata shall be embedded into the processed digital asset if:
- seed metadata values have been provided explicitly to an ISCC processor;
- the ISCC processor supports metadata embedding for the given media type.
- If the media type supports ISO 16684 XMP metadata-embedding, an ISCC processor shall use the namespace http://purl.org/iscc/schema and embed seed metadata values under the names:
- Xmp.iscc.name
- Xmp.iscc.description
- Xmp.iscc.meta
- If the media type does not support ISO 16684 XMP metadata-embedding the ISCC processor may choose other suitable format-specific fields for embedding seed metadata.
- If seed metadata is to be embedded, it shall be embedded before processing other ISCC-UNITs.
- An ISCC processor should document for which media types it supports metadata-embedding and how it maps seed metadata to format specific elements.
"},{"location":"iep-0002/#8-metadata-extraction","title":"8. Metadata extraction","text":" - An ISCC processor shall try to extract seed metadata from the digital asset if:
- seed metadata has not been provided explicitly to the ISCC processor;
- the ISCC processor supports metadata extraction for the given media type.
- Seed metadata shall be extracted with the following precedence:
- Extract seed metadata from XMP metadata under the namespace http://purl.org/iscc/schema.
- Extract seed metadata from suitable, format-specific embedded metadata.
- Use the filename of the asset as a value for the name element, discarding the file extension and replacing the characters \u201c-\u201d and \u201c_\u201d with spaces.
- An ISCC processor shall document for which media types it supports metadata-extraction and how it maps seed metadata to format specific elements.
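The filename fallback in the precedence list above can be illustrated with a short sketch. The function name `name_from_filename` is hypothetical, and only the last file extension is discarded here.

```python
import os


def name_from_filename(path):
    # Drop the directory part and the (last) file extension.
    stem, _ext = os.path.splitext(os.path.basename(path))
    # Replace hyphens and underscores with spaces.
    return stem.replace('-', ' ').replace('_', ' ')
```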
Bibliography
- [1] IETF, draft-multiformats-multihash-05 \u2014 The Multihash Data Format. Available at https://datatracker.ietf.org/doc/html/draft-multiformats-multihash-05
- [2] O\u2019Connor, J., Aumasson, J.P., Neves, S., Wilcox-O\u2019Hearn, Z., BLAKE3: one function, fast everywhere. Version 20211102173700, accessed July 2022. Available at https://github.com/BLAKE3-team/BLAKE3-specs/blob/master/blake3.pdf
"},{"location":"iep-0003/","title":"ISCC-UNIT Content-Code Text","text":"IEP: 0003 Title: ISCC-UNIT Content-Code Text Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/8 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-30 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0003/#1-general","title":"1. General","text":" - The Content-Code Subtype Text (Text-Code) shall be a Content-Code generated from the plain text content extracted from a digital asset that contains text.
- The Text-Code shall be robust against text document format conversion and minor edits.
"},{"location":"iep-0003/#2-format","title":"2. Format","text":"The Text-Code shall have the data format illustrated in Figure 5.
Figure 5 - Data format of the Text-Code
EXAMPLE 1: 64-bit Text-Code in its canonical form with text input \u201cHello World\u201d:
ISCC:EAASKDNZNYGUUF5A
EXAMPLE 2: 256-bit Text-Code in its canonical form with text input \u201cHello World\u201d:
ISCC:EADSKDNZNYGUUF5AMFEJLZ5P66CP5YKCOA3X7F36RWE4CIRCBTUWXYY
"},{"location":"iep-0003/#3-inputs","title":"3. Inputs","text":" - The input for calculating the Text-Code shall be the plain text as extracted from a digital document.
- An ISCC implementation may use any text extraction mechanism, including optical character recognition, to extract plain text from a digital document.
- Plain text used as input for Text-Code calculation shall not include any processing instructions (SGML, HTML, Markdown, and other markup information).
"},{"location":"iep-0003/#4-outputs","title":"4. Outputs","text":"Text-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Text-Code in its canonical form (required);
- characters: The number of characters of the source text after pre-processing (optional);
- Additional metadata extracted from the document (optional).
"},{"location":"iep-0003/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process Text input as follows:
- Apply NFD Unicode Normalization.
- Remove all whitespace characters from the text.
- Convert text to lower case in accordance with Unicode \u2018Case Folding Properties\u2019 (see https://www.unicode.org/Public/UCD/latest/ucd/CaseFolding.txt).
- Remove all characters from Unicode categories Mark (M), Punctuation (P) and Other (C).
- Apply NFKC Unicode Normalization.
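The pre-processing steps above can be sketched with the Python standard library. The function name `text_collapse` is illustrative, and using `str.lower()` for the lower-casing step is an assumption. Whether `str.casefold()` would be the closer match to the cited case folding properties is left open here.

```python
import unicodedata


def text_collapse(text):
    # Decompose characters so that marks become separate code points.
    text = unicodedata.normalize('NFD', text)
    # Remove all whitespace and convert to lower case.
    text = ''.join(text.split()).lower()
    # Drop characters whose category starts with M, P or C.
    text = ''.join(ch for ch in text if unicodedata.category(ch)[0] not in 'MPC')
    # Recompose with NFKC.
    return unicodedata.normalize('NFKC', text)
```

Decomposing first means an accented letter loses only its mark in the filter step, so the base letter survives.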
An ISCC processor shall calculate the Text-Code as follows:
- Split the pre-processed text into n-grams of 13 characters by sliding over the text character-wise.
- Create a list of 32-bit unsigned integers by hashing the UTF-8 encoded representation of the n-grams using the XXH32 algorithm.
- Apply the Minhash256 algorithm to the list of integers to calculate the ISCC-BODY of the Text-Code.
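The character-wise sliding window of the first step can be sketched as below. The function name `sliding_ngrams` is hypothetical, and returning a single shorter n-gram for inputs of fewer than 13 characters is an assumption. Hashing the n-grams with XXH32 and applying Minhash256 are not shown.

```python
def sliding_ngrams(text, width=13):
    # One window per start position; at least one window for short text.
    count = max(len(text) - width + 1, 1)
    return [text[i:i + width] for i in range(count)]


def ngram_bytes(text, width=13):
    # UTF-8 encoded n-grams, ready to be passed to a 32-bit hash function.
    return [gram.encode('utf-8') for gram in sliding_ngrams(text, width)]
```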
"},{"location":"iep-0003/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating a Text-Code is specified only for UTF-8 encoded text input. An implementation of the Text-Code algorithm shall be regarded as conforming to the standard as long as it creates the same Text-Code as the reference implementation for the same UTF-8 encoded text input.
The normative behaviour of an ISCC processor in generating a Text-Code from other sources is not specified and different codes can be generated depending on the tools used for text extraction from the source. Implementers seeking to guarantee interoperability with each other in these circumstances should select the same tool for text extraction.
NOTE
For further technical details see source-code in the modules code_content_text.py and minhash.py of the reference implementation.
"},{"location":"iep-0004/","title":"ISCC-UNIT Content-Code Image","text":"IEP: 0004 Title: ISCC-UNIT Content-Code Image Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/9 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-31 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0004/#1-general","title":"1. General","text":" - The Content-Code Subtype Image (Image-Code) shall be a perceptual similarity hash of the input image.
- The Image-Code shall be robust against image format conversion, scaling, compression and minor edits.
"},{"location":"iep-0004/#2-format","title":"2. Format","text":"The Image-Code shall have the data format illustrated in Figure 6.
Figure 6 - Data format of the Image-Code
EXAMPLE 1: 64-bit Image-Code in its canonical form:
ISCC:EEA4GQZQTY6J5DTH
EXAMPLE 2: 256-bit Image-Code in its canonical form:
ISCC:EED4GQZQTY6J5DTHQ2DWCPDZHQOM6QZQTY6J5DTFZ2DWCPDZHQOMXDI
"},{"location":"iep-0004/#3-inputs","title":"3. Inputs","text":" - The input for calculating the Image-Code shall be an image file.
- An ISCC processor shall at least support the JPEG and PNG image formats.
"},{"location":"iep-0004/#4-outputs","title":"4. Outputs","text":"Image-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Image-Code in its canonical form (required);
- width: width of the original input image in number of pixels (optional);
- height: height of the original input image in number of pixels (optional);
- thumbnail: a thumbnail of the original image encoded as Data-URL (optional);
- Additional metadata extracted from the image (optional).
"},{"location":"iep-0004/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process the image file input as follows:
- Transpose the image according to its orientation tag (if available).
- Add white background to image if it contains alpha transparency.
- Crop uniformly colored borders if applicable.
- Convert image to grayscale.
- Resize grayscale image to 32x32 pixels using bicubic interpolation.
An ISCC processor shall calculate the Image-Code as follows:
- Apply discrete cosine transform to the 32x32 grayscale pixel matrix.
- Calculate the median value of the upper-left 8x8 values of the transformed matrix.
- For each value of the upper-left 8x8 square set a 1-bit if the value is larger than the median and a 0-bit if it is smaller or equal to the median value.
- The collected bits are the first 64 bits of the body of the Image-Code.
- To extend the perceptual hash up to 256 bits repeat steps 2-3 for the top-right, bottom-left and bottom-right 8x8 squares in the given order.
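The median thresholding of one 8x8 square can be sketched as follows. The discrete cosine transform is not shown, and the function name `block_bits` is hypothetical. The origin of the square is a parameter because the exact offsets of the additional squares are not given here and would need to be taken from the reference implementation.

```python
from statistics import median


def block_bits(dct_matrix, row0=0, col0=0, size=8):
    # Flatten the size x size square starting at (row0, col0), row by row.
    values = [
        dct_matrix[r][c]
        for r in range(row0, row0 + size)
        for c in range(col0, col0 + size)
    ]
    med = median(values)
    # 1-bit if strictly larger than the median, otherwise 0-bit.
    return [1 if v > med else 0 for v in values]
```

With the default origin the 64 returned bits correspond to the first 64 bits of the Image-Code body.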
"},{"location":"iep-0004/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating an Image-Code is specified only for the pre-processed 32x32 pixel grayscale input. An implementation of the Image-Code algorithm shall be regarded as conforming to the standard if it creates the same Image-Code as the reference implementation for the same 32x32 grayscale pixel values.
Implementers seeking to guarantee interoperability with each other when starting from image files should select the same tool for image pre-processing.
NOTE
For further technical details see source-code in modules code_content_image.py and dct.py of the reference implementation.
"},{"location":"iep-0005/","title":"ISCC-UNIT Content-Code Audio","text":"IEP: 0005 Title: ISCC-UNIT Content-Code Audio Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/10 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-31 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0005/#1-general","title":"1. General","text":" - The Content-Code Subtype Audio (Audio-Code) shall be a similarity hash of the audio input.
- The Audio-Code shall be robust against audio format conversion, compression, and minor edits.
"},{"location":"iep-0005/#2-format","title":"2. Format","text":"The Audio-Code shall have the data format illustrated in Figure 7.
Figure 7 - Data format of the Audio-Code
EXAMPLE 1: 64-bit Audio-Code in its canonical form:
ISCC:EIAWUJFCEZZOJYVD
EXAMPLE 2: 256-bit Audio-Code in its canonical form:
ISCC:EIDWUJFCEZZOJYVDHJHIRB3KQSQCM2REUITDUTVAQNRGJIRENCCCULY
"},{"location":"iep-0005/#3-inputs","title":"3. Inputs","text":"The input for calculating the Audio-Code shall be the Chromaprint fingerprint (array of 32-bit signed integers) from the original audio data.
"},{"location":"iep-0005/#4-outputs","title":"4. Outputs","text":"Audio-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Audio-Code in its canonical form (required).
- duration: duration of audio in seconds (optional).
- Additional metadata extracted from the audio file (optional).
"},{"location":"iep-0005/#5-processing","title":"5. Processing","text":"An ISCC processor shall calculate the Audio-Code as follows:
- Calculate a 32-bit similarity hash from the input array.
- Additionally, calculate 32-bit similarity hashes separately for each quarter of values from the input array and concatenate them to the result from step 1 in order to extend the similarity hash up to a total of 160 bits.
- Sort the input array by value in ascending order and calculate similarity hashes separately for each third of the values and concatenate them to the result from step 2 in order to extend the similarity hash of the final ISCC-BODY of the Audio-Code up to the maximum of 256 bits.
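How the three steps add up to 256 bits (32 + 4 x 32 + 3 x 32) can be sketched as below. All function names are hypothetical. The `simhash` shown is a generic bitwise majority vote, so its bit ordering and tie handling are assumptions that may differ from the reference implementation, as is the way the array is divided into quarters and thirds.

```python
def simhash(digests):
    # Bitwise majority vote over equally sized byte strings.
    n_bits = len(digests[0]) * 8
    counts = [0] * n_bits
    for digest in digests:
        value = int.from_bytes(digest, 'big')
        for i in range(n_bits):
            counts[i] += (value >> (n_bits - 1 - i)) & 1
    out = 0
    for i, cnt in enumerate(counts):
        if cnt * 2 >= len(digests):
            out |= 1 << (n_bits - 1 - i)
    return out.to_bytes(n_bits // 8, 'big')


def divide(seq, parts):
    # Contiguous parts whose sizes differ by at most one element.
    q, r = divmod(len(seq), parts)
    out, start = [], 0
    for i in range(parts):
        end = start + q + (1 if i < r else 0)
        out.append(seq[start:end])
        start = end
    return out


def hash_or_zero(digests):
    # Empty parts contribute 32 zero bits in this sketch.
    return simhash(digests) if digests else bytes(4)


def audio_digest(fingerprint):
    to_bytes = lambda v: v.to_bytes(4, 'big', signed=True)
    digests = [to_bytes(v) for v in fingerprint]
    # Step 1: one hash over the whole array (32 bits).
    parts = [hash_or_zero(digests)]
    # Step 2: one hash per quarter (up to 160 bits).
    parts += [hash_or_zero(chunk) for chunk in divide(digests, 4)]
    # Step 3: one hash per third of the sorted array (up to 256 bits).
    ordered = [to_bytes(v) for v in sorted(fingerprint)]
    parts += [hash_or_zero(chunk) for chunk in divide(ordered, 3)]
    return b''.join(parts)
```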
"},{"location":"iep-0005/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating an Audio-Code is specified only for the Chromaprint input array. An implementation of the Audio-Code algorithm shall be regarded as conforming to the standard if it creates the same Audio-Code as the reference implementation for the same Chromaprint array of 32-bit values.
NOTE
For further technical details see source-code in modules code_content_audio.py and simhash.py of the reference implementation.
"},{"location":"iep-0006/","title":"ISCC-UNIT Content-Code Video","text":"IEP: 0006 Title: ISCC-UNIT Content-Code Video Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/11 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2024-01-01 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0006/#1-general","title":"1. General","text":" - The Content-Code Subtype Video (Video-Code) shall be a similarity hash of the input video.
- The Video-Code shall be robust against format conversions, scaling, compression, changes of framerate and minor edits.
"},{"location":"iep-0006/#2-format","title":"2. Format","text":"The Video-Code shall have the data format illustrated in Figure 8.
Figure 8 - Data format of the Video-Code
EXAMPLE 1: 64-bit Video-Code in its canonical form:
ISCC:EMA7KERCWROEVL6F
EXAMPLE 2: 256-bit Video-Code in its canonical form:
ISCC:EMD7KERCWROEVL6FU3SISZAZPJCBAZKXEZLZTSRQPGASTENCMSYFOAY
"},{"location":"iep-0006/#3-inputs","title":"3. Inputs","text":"The input for calculation of the Video-Code shall be the MPEG-7 video frame signatures, as specified in ISO/IEC 15938, from the original video data.
"},{"location":"iep-0006/#4-outputs","title":"4. Outputs","text":"Video-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Video-Code in its canonical form (required);
- width: width of the original input video in number of pixels (optional);
- height: height of the original input video in number of pixels (optional);
- duration: duration of video in seconds (optional);
- fps: frames per second of the original video (optional);
- thumbnail: a thumbnail from the original video encoded as Data-URL (optional);
- Additional metadata extracted from the video (optional).
"},{"location":"iep-0006/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process the video file input as follows:
- Generate the MPEG-7 video signature at 5 frames per second.
An ISCC processor shall calculate the Video-Code as follows:
- For each of the 380 integer values of the MPEG-7 frame signature calculate its sum over all frames.
- Apply WTA hash to the resulting array of 380 integer values to calculate the ISCC-BODY of the Video-Code at the desired bit-length.
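The first calculation step, summing each signature position over all frames, can be sketched as follows. The function name `sum_frame_signatures` is hypothetical, and the input is assumed to be a list of frames with 380 integers each. The WTA hash is not shown.

```python
def sum_frame_signatures(frame_signatures):
    # Column-wise sum: position i of the result is the sum of
    # position i over all frame signatures.
    return [sum(column) for column in zip(*frame_signatures)]
```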
The normative behaviour of an ISCC processor in generating a Video-Code is specified only for the MPEG-7 video frame signature input. An implementation of the Video-Code algorithm shall be regarded as conforming to the standard if it creates the same Video-Code as the reference implementation for the same MPEG-7 video frame signature input.
NOTE
For further technical details see source-code in modules code_content_video.py and wtahash.py of the reference implementation.
"},{"location":"iep-0007/","title":"ISCC-UNIT Content-Code Mixed","text":"IEP: 0007 Title: ISCC-UNIT Content-Code Mixed Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/12 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2024-01-02 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0007/#1-general","title":"1. General","text":" - The Content-Code Subtype Mixed (Mixed-Code) shall be a similarity preserving hash of a collection of assets of the same or different media types combined into a single multimedia file.
- An ISCC processor that supports the creation of Mixed-Codes shall publicly document the supported file formats and the rules by which it divides the different parts of a multimedia file.
- The Mixed-Code shall be robust against format conversions, scaling, compression, and minor edits of the individual parts of the multimedia file.
"},{"location":"iep-0007/#2-format","title":"2. Format","text":"The Mixed-Code shall have the data format illustrated in Figure 9:
Figure 9 - Data format of the Mixed-Code
EXAMPLE 1: 64-bit Mixed-Code in its canonical form:
ISCC:EQASD57JXX7U73P7
EXAMPLE 2: 256-bit Mixed-Code in its canonical form:
ISCC:EQDSD57JXX7U73P7HPPH2P3U5OXZM7PL65T3HZ5JZ76H577P77NO5ZY
"},{"location":"iep-0007/#3-inputs","title":"3. Inputs","text":" - The input for calculating the Mixed-Code shall be the Content-Codes of the individual parts of the multimedia file.
- At least two Content-Codes shall be required as input to calculate a Mixed-Code.
"},{"location":"iep-0007/#4-outputs","title":"4. Outputs","text":"Mixed-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Mixed-Code in its canonical form (required);
- parts: the list of Content-Codes used for calculating the Mixed-Code (recommended);
- Additional metadata extracted from the multimedia file (optional).
"},{"location":"iep-0007/#5-processing","title":"5. Processing","text":"An ISCC processor shall pre-process the multimedia file as follows:
- Generate individual Content-Codes for each part of the multimedia file according to the specifications in IEP-0003, IEP-0004, IEP-0005 and IEP-0006.
An ISCC processor shall calculate the Mixed-Code as follows:
- Create a byte sequence from each Content-Code retaining the first byte of the ISCC-HEADER concatenated with the bytes of the ISCC-BODY.
- Apply the similarity hash to the list of byte sequences from step 1 to calculate the ISCC-BODY of the Mixed-Code.
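Step 1 can be sketched as below, assuming each Content-Code has already been decoded into its raw ISCC-HEADER and ISCC-BODY bytes. The function name `mixed_code_inputs` and the tuple layout are hypothetical. Decoding the canonical form and the similarity hash itself are not shown.

```python
def mixed_code_inputs(codes):
    # codes: list of (header_bytes, body_bytes) tuples, one per Content-Code.
    if len(codes) < 2:
        raise ValueError('at least two Content-Codes are required')
    # Keep the first header byte and append the body bytes.
    return [header[:1] + body for header, body in codes]
```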
"},{"location":"iep-0007/#6-conformance","title":"6. Conformance","text":"The normative behaviour of an ISCC processor in generating a Mixed-Code is specified only for Content-Code inputs. An implementation of the Mixed-Code algorithm shall be regarded as conforming to the standard if it creates the same Mixed-Code as the reference implementation for the same Content-Code inputs.
NOTE
For further technical details see source-code in modules code_content_mixed.py and simhash.py of the reference implementation.
"},{"location":"iep-0008/","title":"ISCC-UNIT Data-Code","text":"IEP: 0008 Title: ISCC-UNIT Data-Code Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/13 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2024-01-03 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0008/#1-general","title":"1. General","text":" - The Data-Code shall be a similarity hash for any kind of data regardless of its media type.
- The Data-Code shall cluster digital assets that have near-identical data.
- Small differences (as a proportion of the whole) in referent data shall yield identical Data-Codes.
- More significant differences in referent data shall produce similar Data-Codes that can be compared against each other to estimate the data-similarity of the referents.
- The Data-Code shall be resistant to data shifting and reordering sequences of data within referent data.
NOTE
Changes of the Data-Code do not reflect semantic or syntactic changes of the content.
"},{"location":"iep-0008/#2-format","title":"2. Format","text":"The Data-Code shall have the data format illustrated in Figure 10:
Figure 10 - Data format of the Data-Code
EXAMPLE 1: 64-bit Data-Code in its canonical form:
ISCC:GAAWAIBQLNWP7X32
EXAMPLE 2: 256-bit Data-Code in its canonical form:
ISCC:GADWAIBQLNWP7X32J3INMAMDUJ4QMN67BBQKVTVZIWHXQ7QJIKHYTBY
"},{"location":"iep-0008/#3-inputs","title":"3. Inputs","text":"The input for calculating the Data-Code shall be the bytes of a file, without reference to their meaning or structure.
"},{"location":"iep-0008/#4-outputs","title":"4. Outputs","text":"Data-Code processing shall generate the following output elements:
- iscc: the Data-Code in its canonical form (required).
"},{"location":"iep-0008/#5-processing","title":"5. Processing","text":"An ISCC processor shall calculate the Data-Code as follows:
- Split the data into variable sized chunks with an average chunk size of 1024 bytes using the content defined chunking (CDC) algorithm.
- Calculate the 32-bit integer hash of each chunk using the XXH32 algorithm.
- Apply the minhash algorithm to the array of 32-bit integers to calculate the ISCC-BODY of the Data-Code with appropriate length.
NOTE
For further technical details see source-code in modules code_data.py and minhash.py of the reference implementation.
"},{"location":"iep-0008/#6-conformance","title":"6. Conformance","text":"An implementation of the Data-Code algorithm shall be regarded as conforming to the standard if it creates the same Data-Code as the reference implementation for the same data input.
NOTE
The ISCC reference implementation uses the open source xxHash library [1] for XXH32 chunk hashing, and appropriate use of this software will generate the same codes as the reference implementation.
Bibliography
- [1] Collet, Yann. xxHash: Extremely fast hash algorithm. Accessed July 2022, available at https://cyan4973.github.io/xxHash/
"},{"location":"iep-0009/","title":"ISCC-UNIT Instance-Code","text":"IEP: 0009 Title: ISCC-UNIT Instance-Code Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/14 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0010/","title":"ISCC-CODE","text":"IEP: 0010 Title: ISCC-CODE Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/15 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0011/","title":"ISCC-ID","text":"IEP: 0011 Title: ISCC-ID Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/16 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2022-09-28"},{"location":"iep-0012/","title":"ISCC Metadata","text":"IEP: 0012 Title: ISCC Metadata Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/17 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-28 Note
This document is a DRAFT contributed as input to ISO TC 46/SC 9/WG 18. The final version is developed at the International Organization for Standardization as ISO/DIS 24138
"},{"location":"iep-0013/","title":"ISCC Decentralized Content Registry","text":"IEP: 0013 Title: ISCC Decentralized Content Registry Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/18 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-30"},{"location":"iep-0013/#1-status-of-this-document","title":"1. Status of This Document","text":"This document is a draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organization.
"},{"location":"iep-0013/#2-introduction","title":"2. Introduction","text":"The purpose of a decentralized content registry is to connect Actors to Digital Content in a permissionless decentralized environment and provide a global and verifiable data space for content identification and matching.
Actors authenticate themselves with their blockchain accounts, which they use to sign ISCC-CODE declarations (ledger transactions). Digital Content is identified by ISCC-CODEs. The ISCC-ID is derived from an ISCC-CODE, a blockchain account and the history of previous declarations. ISCC-IDs are globally unique, persistent, authenticated, and resolve to exactly one ISCC-CODE and a blockchain account. The ISCC-IDs are not required to be generated or stored on the participating ledgers. ISCC-IDs are the result of processing the history of transactions according to the Minting Protocol.
"},{"location":"iep-0013/#3-protocol-overview","title":"3. Protocol Overview","text":"The protocol to declare an ISCC-CODE and trigger the minting of an ISCC-ID is divided into 3 parts, the Declaration Protocol, the Minting Protocol and the Resolution Protocol.
- The declaration protocol defines how an ISCC-CODE has to be written to a ledger to become a valid input for the off-chain minting protocol.
- The minting protocol defines how a ledger's history has to be parsed to mint a valid ISCC-ID.
- The resolution protocol defines how an ISCC resolver answers queries about ISCC-CODEs and ISCC-IDs.
"},{"location":"iep-0013/#4-declaration-protocol","title":"4. Declaration Protocol","text":"To participate in the ISCC declaration protocol, a ledger MUST establish exactly one globally unique Ledger-ID (Variable Length Integer) that will be used as a prefix for ISCC-IDs that are minted from its ISCC declarations.
Note
An ISCC-ID comes into existence only after an ISCC declaration has been confirmed on a ledger that participates in the protocol.
The following minimal information (Declaration-Set) MUST be provided and made publicly available for a valid ISCC declaration:
- An ISCC-CODE (a valid sequence of ISCC-UNITs)
- A blockchain account (actor's identifier) of the declaring party
- A valid signature of the declaring party (transaction signature)
We define the party that signs the ISCC declaration as the DECLARER.
Note
The DECLARER is merely the controller of the ISCC-ID minted from the declaration. The declarer is not required to be the creator or a rights-holder of the declared digital content.
An ISCC declaration MAY additionally include:
- A link to external metadata as defined by IEP-0012 - ISCC Metadata
- A processing instruction for the minting protocol
The on-chain link to ISCC metadata SHOULD point to a public and integrity preserving resource (e.g. IPFS CID or a hashlink URI). Permissioned, confidential or mutable data SHOULD be referenced from ISCC metadata via URI.
A ledger that wants to accept ISCC declarations and trigger the minting of valid ISCC-IDs MUST fulfill the following minimum requirements:
- The ledger must provide an immutable, complete, time-ordered, append-only sequence of transactions.
- The ledger's transaction format must allow for embedding and signing the data required for an ISCC declaration.
- The Declaration-Set MUST be publicly readable (permissionless).
- ISCC declarations on the ledger MAY be write-permissioned.
A participating ledger or framework MUST provide documentation of its implementation of the declaration protocol:
- The documentation MUST specify how ISCC declarations can be parsed to decode the Declaration-Set.
- The documentation MUST provide sufficient information to the public such that third parties can independently verify transaction signatures and implement the Minting Protocol.
- The documentation MUST define how a public observer can distinguish between a transaction that declares an ISCC-CODE and other unrelated transactions.
"},{"location":"iep-0013/#5-minting-protocol","title":"5. Minting Protocol","text":"TBD
"},{"location":"iep-0013/#6-resolution-protocol","title":"6. Resolution Protocol","text":"TBD
"},{"location":"iep-0013/#7-reference-implementation","title":"7. Reference Implementation","text":" - ISCC-ID construction: https://github.com/iscc/iscc-core/blob/main/iscc_core/iscc_id.py
- EVM registry: https://github.com/iscc/iscc-evm
- EVM observer: https://github.com/iscc/iscc-observer-evm
- ISCC-ID resolver: https://github.com/iscc/iscc-registry
"},{"location":"iep-0014/","title":"EVM Based ISCC Registries","text":"IEP: 0014 Title: EVM Based ISCC Registries Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/19 Status: TBD Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2022-09-28"},{"location":"iep-0015/","title":"ISCC DID Method","text":"IEP: 0015 Title: ISCC DID Method Author: Titusz Pan tp@iscc.foundation Comments: https://github.com/iscc/iscc-ieps/issues/20 Status: DRAFT Type: Core License: CC-BY-4.0 Created: 2022-09-28 Updated: 2023-12-30 Abstract
A DID method that identifies decentralized declarations of digital content using ISCC-IDs.
Status of This Document
This document is a draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organization.
"},{"location":"iep-0015/#1-introduction","title":"1. Introduction","text":""},{"location":"iep-0015/#11-preface","title":"1.1 Preface","text":"The ISCC DID method specification conforms to the requirements specified in the Decentralized Identifiers v1.0 Specification DID-CORE.
"},{"location":"iep-0015/#12-motivation","title":"1.2 Motivation","text":"The need for a universal identifier for digital content has emerged as an increasing amount of dynamic, short-lived and granular digital content is produced, consumed and processed. Commercial interests of many stakeholders depend on proper identification of digital content.
Professionally produced, semi-professional, and user-generated digital content are the currency of the information age. A variety of specific content identifier standards already exist, but a universal content-dependent identifier for digital media has not yet been developed.
In particular, the structure and management of identifiers for digital content have a substantial impact on the level of possible adoption, automation, and the potential for machine-to-machine communication and innovation within and across different industry sectors.
Digital content is dynamic, always in motion, and acted upon globally by a variety of entities with different interests and requirements. It is continuously re-encoded, resized, and re-compressed, so its underlying data changes as it travels through a complex network of actors and systems. These circumstances require a special design for a universal identifier that is capable of matching transcoded or otherwise transformed content.
"},{"location":"iep-0015/#13-the-iscc","title":"1.3 The ISCC","text":"The ISCC (International Standard Content Code) is a universal and open identification system for text, audio, image, and video content. ISCC-CODEs can be created from media assets by anybody using open source software. Similar content can then be matched by comparing ISCC-CODEs only.
Example ISCC-CODE
ISCC:KECYCPU3OKIUDZ7TYBRK5HZ4JGPTILLAT2IW7TY7EYIJI4QSK5I353I\nDecoded: ISCC-IMAGE-V0-MCDI-813e9b729141e7f3c062ae9f3c499f342d609e916fcf1f26109472125751beed\n
Users can also register ISCC-CODEs on any supported public blockchain to obtain a short and globally unique ISCC-ID. The ISCC-ID is under the control of the registrant and resolves to an ISCC-CODE, on-chain metadata and optional off-chain metadata. ISCC-IDs are globally unique even if the same ISCC-CODE is registered multiple times by different entities. An ISCC-ID is minted deterministically by observing participating ledgers and can be reproduced by anybody who observes the public and immutable registration events.
Example ISCC-ID
ISCC:MIAGWPTV4J2Z57CI\nDecoded: ID-ETHEREUM-V0-64-6b3e75e2759efc48\n
"},{"location":"iep-0015/#14-iscc-id-as-did","title":"1.4 ISCC-ID as DID","text":"The ISCC DID method creates a mechanism to reference digital content with a globaly unique persistent identifier that does not require a centralized registration authority. Instead, the ISCC system defines an open and voluntary cross-chain registration protocol using cryptography and distributed ledger technology.
Integrating ISCC with the DID system improves ISCC interoperability. DID documents provide standardized ways to discover services related to the referenced content and its registrant.
Verifiable credentials discovered through the DID document service
property can improve trust in otherwise permissionless content registrations. Additionally, the use of decentralized web nodes allows for interoperable discovery and data sovereignty of hosted verifiable credentials.
At the same time, ISCC would bring open content identification to the Decentralized Identifiers ecosystem.
"},{"location":"iep-0015/#2-method-syntax","title":"2. Method Syntax","text":""},{"location":"iep-0015/#21-method-name","title":"2.1 Method Name","text":" - The name that shall identify this DID method is:
iscc
. - A DID that uses this method MUST begin with the following prefix:
did:iscc:
. - According to the DID specification, this string MUST be in lowercase.
"},{"location":"iep-0015/#22-method-specific-identifier","title":"2.2 Method Specific Identifier","text":"The ISCC DID scheme conforms to the DID Syntax and is defined by the follwing ABNF:
ISCC DID scheme ABNF
iscc-did = \"did:iscc:\" iscc\niscc = 10*88(numbers / letters)\nnumbers = %x32-37 ; 2-7\nletters = %x61-7A ; a-z\n
- The data structure of the ISCC is
<MainType><SubType><Version><Length><ISCC-BODY>
- The method specific identifier is a lower-cased base32 representation of the ISCC structure.
- The regular expression for this DID method is
^did:iscc:[2-7a-z]{10,88}$
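As a minimal sketch of the syntax rules above, the following Python snippet checks a string against the regular expression and derives a DID from an ISCC-ID in canonical form. The function names `is_iscc_did` and `iscc_to_did` are hypothetical and not part of any ISCC library. The conversion assumes that the canonical form carries an `ISCC:` prefix followed by the base32 body, as in the examples of section 1.3.

```python
import re

# Pattern taken from section 2.2 of this document.
ISCC_DID_PATTERN = re.compile(r'^did:iscc:[2-7a-z]{10,88}$')


def is_iscc_did(value: str) -> bool:
    '''Return True if value matches the ISCC DID syntax.'''
    # fullmatch also rejects a trailing newline, which a bare $ would tolerate.
    return ISCC_DID_PATTERN.fullmatch(value) is not None


def iscc_to_did(iscc: str) -> str:
    '''Convert a canonical ISCC string to an ISCC DID (hypothetical helper).'''
    prefix, sep, body = iscc.partition(':')
    if sep != ':' or prefix.upper() != 'ISCC':
        raise ValueError('expected a string of the form ISCC:<base32>')
    # The method specific identifier is the lower-cased base32 representation.
    did = 'did:iscc:' + body.lower()
    if not is_iscc_did(did):
        raise ValueError('not a valid ISCC DID: ' + did)
    return did


print(iscc_to_did('ISCC:MIAGWPTV4J2Z57CI'))  # did:iscc:miagwptv4j2z57ci
```

The sketch only validates the character set and length. It does not decode the ISCC header, so a syntactically valid DID may still carry a body that is not a well-formed ISCC.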
"},{"location":"iep-0015/#23-iscc-did-example","title":"2.3 ISCC DID Example","text":"DID representation of an ISCC-ID
did:iscc:miagwptv4j2z57ci\n
"},{"location":"iep-0015/#3-method-operations","title":"3. Method Operations","text":""},{"location":"iep-0015/#31-creation","title":"3.1 Creation","text":" - An ISCC DID MUST be created by a signed and confirmed ledger transaction that declares an ISCC-CODE in accordance with the cross-chain declaration protocol.
- The initial controller of a newly created ISCC DID MUST be the did:pkh representation of the blockchain account that signed the declaration transaction.
- The controller MAY set a custom DID Document at declaration time by embedding or referencing it from ISCC Metadata
"},{"location":"iep-0015/#32-read","title":"3.2 Read","text":" - A basic DID document is implicitly created with every ISCC declaration and MUST be deterministicaly derived from on-chain metadata.
- Extended DID document properties MAY be imported from externaly referenced ISCC Metadata
"},{"location":"iep-0015/#33-update-deactivate","title":"3.3 Update & Deactivate","text":"The ISCC DID MAY be updated or deactivated in accordence with the chain specific implementation of the ISCC declaration protocol.
"},{"location":"iep-0015/#4-verifiable-data-registry","title":"4. Verifiable Data Registry","text":"The verifiable data registry or \"target system\" for ISCC DIDs is a federation of existing public ledgers that support the ISCC declaration protocol. The protocol can be implemented on most public ledgers (even without smart contracts) that provide an orderd, immutable, append-only history of signed transactions.
Figure 1 - ISCC Verifiable Data Registry"},{"location":"iep-0015/#5-did-document","title":"5. DID Document","text":"DID documents are sourced from on-chain metadata and optionally from immutably or mutably referenced off-chain metadata.
All information required to construct a minimal valid DID document from an ISCC declaration is available on-chain and can be dynamically transformed and presented as DID document by a DID driver implementation.
Minimal ISCC DID Document example
{\n \"@context\": \"https://www.w3id.org/ns/did/v1\",\n \"id\": \"did:iscc:miagwptv4j2z57ci\",\n \"controller\": \"did:pkh:eip155:1:0x901ee44e3bddf4bc1c08a2ed229498512f8bcfdc\",\n \"alsoKnownAs\": \"iscc:kecycpu3okiudz7tybrk5hz4jgptillat2iw7ty7eyiji4qsk5i353i\",\n \"service\": [{\n \"id\":\"did:iscc:miagwptv4j2z57ci#iscc-metadata\",\n \"type\": \"IsccMetadata\",\n \"serviceEndpoint\": \"ipfs://bafybeiccys7kilr3rynlhoelrdn6ragpbfoti73h4e3oszbgd5inthicja/iscc-metadata/43.json\"\n }]\n}\n
- The DID subject (
id
-property) MUST be the ISCC-ID in DID representation. - The DID controller (
controller
-property) MUST be constructed deterministically by converting the blockchain account that signed the declaration transaction to a did:pkh
. - The
alsoKnownAs
-property MUST be set to the ISCC-CODE registered by the transaction. - If the original ISCC declaration includes a link to off-chain metadata the DID document MUST include the reference via an entry into the
service
-property with type \"IsccMetadata\". The referenced serviceEndpoint
SHOULD return a document of type http://purl.org/iscc/context.
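The property rules above can be sketched as a small Python function. This is an illustration only, not the DID driver's actual code: `build_minimal_did_document` and its parameters are hypothetical, the declaring account is assumed to be an EVM address (hence the `eip155` namespace in the did:pkh), and the lower-cased `alsoKnownAs` value simply mirrors the example document above.

```python
import json
from typing import Optional


def build_minimal_did_document(
    iscc_id: str,
    iscc_code: str,
    chain_id: int,
    account: str,
    metadata_url: Optional[str] = None,
) -> dict:
    '''Assemble a minimal ISCC DID document (hypothetical helper).'''
    # DID subject: the ISCC-ID in DID representation (lower-cased base32 body).
    did = 'did:iscc:' + iscc_id.split(':', 1)[1].lower()
    doc = {
        '@context': 'https://www.w3id.org/ns/did/v1',
        'id': did,
        # Assumption: EVM account, so did:pkh:eip155:<chain id>:<address>.
        'controller': f'did:pkh:eip155:{chain_id}:{account}',
        # The ISCC-CODE registered by the declaration transaction.
        'alsoKnownAs': iscc_code.lower(),
    }
    # The service entry is only present if the declaration links metadata.
    if metadata_url is not None:
        doc['service'] = [{
            'id': did + '#iscc-metadata',
            'type': 'IsccMetadata',
            'serviceEndpoint': metadata_url,
        }]
    return doc


document = build_minimal_did_document(
    iscc_id='ISCC:MIAGWPTV4J2Z57CI',
    iscc_code='ISCC:KECYCPU3OKIUDZ7TYBRK5HZ4JGPTILLAT2IW7TY7EYIJI4QSK5I353I',
    chain_id=1,
    account='0x901ee44e3bddf4bc1c08a2ed229498512f8bcfdc',
)
print(json.dumps(document, indent=2))
```

Because every input comes from the declaration transaction itself, two independent observers calling such a function on the same transaction would arrive at the same document.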
Info
Properties like verificationMethod
, authentication
, assertionMethod
etc. are left out intentionally, as their authoritative values are managed by the DID document associated with the controller
that can be resolved separately.
To be defined
Additional/Optional DID document data MAY be added off-chain in mutable or immutable modes and retrieved and injected by the DID driver in real time to compose an extended DID document that includes other properties like service
.
"},{"location":"iep-0015/#6-resolving-dids","title":"6. Resolving DIDs","text":" - An ISCC DID can be resolved by querying an instantiation of an ISCC content registry.
- The authenticity of the response can be verified through the referenced on-chain transaction.
Figure 2 - ISCC DID Architecture"},{"location":"iep-0015/#7-security-considerations","title":"7. Security Considerations","text":"Implementers should be aware that ISCC-CODEs are not cryptographic hashes but descriptors or similarity preserving (soft) hashes. As such they leak information about the structure of the identified content. This is by design and necessary to support similarity matching with ISCC-CODEs.
An ISCC DID document does not need to contain a proof property. All operations are authenticated with the signature of the transaction payload sent to the network of the originating ledger. This signature is generated using a key specified in the corresponding DID Document.
"},{"location":"iep-0015/#8-privacy-considerations","title":"8. Privacy Considerations","text":"ISCC declarations do not publish any personal data on-chain. Declarers may optionally reference off-chain metadata related to their content registration. Such metadata may contain personal data such as creator and rightsholder information. The assumption is that creators have an interest in proper attribution. Applications that implement ISCC declarations are advised to inform users about any privacy related matters specific to their application.
"},{"location":"iep-0015/#9-reference-implementation","title":"9. Reference Implementation","text":"An end-to-end reference implementation of the decentralized content registry is manifested by the following modules:
- Codec and Algorithms: https://github.com/iscc/iscc-core
- ISCC Metadata: https://github.com/iscc/iscc-schema
- EVM Smart Contracts: https://github.com/iscc/iscc-evm
- EVM Chain Observer: https://github.com/iscc/iscc-observer-evm
- ISCC Content Registry: https://github.com/iscc/iscc-registry / https://iscc.id
- ISCC DID driver: https://github.com/iscc/iscc-did-driver / https://did.iscc.id
"}]}
\ No newline at end of file
diff --git a/sitemap.xml b/sitemap.xml
index 09a49ae..f218e7f 100755
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,92 +2,92 @@
https://ieps.iscc.codes/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0000/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0001/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0002/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0003/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0004/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0005/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0006/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0007/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0008/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0009/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0010/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0011/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0012/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0013/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0014/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/iep-0015/
- 2024-01-02
+ 2024-01-03
daily
https://ieps.iscc.codes/includes/abbreviations/
- 2024-01-02
+ 2024-01-03
daily
\ No newline at end of file
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index a366b3574af10463f971e66a0fbaf393c1734033..ff42eccea6cd621f0080aa1c5c6adbb7d251e249 100755
GIT binary patch
literal 284
zcmV+%0ptE3iwFqWQI%x^|8r?{Wo=<_E_iKh0L_+7a>6hSh4-8y<9+-SXbVhYc0ECR
z0OPocXB?YGhQRHs0Mp5~n~t=xBtOgF?x{XphfrMH2tLJzDK1&zqD`%jUBf;<-pegJ
z=v93l1GrqEI6}kvoKJfprfK5fvm49_cp5taw#Nd*vneFsF}*4<<^hlF9(71o&r<-$
zc!$qDSj+KW2cj19?NV$Y52@9atTv@orCg)r*;~kLZ(`@ZM)$2-VBku>U#rg;TonDVd3jhFLg@$AR
literal 284
zcmV+%0ptE3iwFoq-IHYk|8r?{Wo=<_E_iKh0L_+7Qo}F|h4(#0#{2ju6lfX8-Ejgq
zfa{pVGmcX%w}IQEv>hf}HjK2fBtOgF?nxiM##mi^hA^d;DK1&%lbgDb`j)-Fz1BN+
zFdKc&5qzmooS_>_AAZomS>y(uK$Gqb5MmI1fy8ahtSEK>l-
z1dq=HILGl{4?>IOb}hD0#;G%nY__#jwQNvw@2zBRu&MVS*?*eOhWM?;s&k_dzZic3
z!n&3iWXTj+vW6_#16i^yS+Ym6WKU$tp2?E!$dc`;k}0wtpvZcFBI^N)tOqEv9-t_C
iKyvZCB=&gB{t|4#W%eKT$m@Y?vH1bJ;rc5^3jhHAK8kVx