A lightweight, textual tagging system aimed at DJs for managing custom metadata.
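A gig tag is a flat structure with the following pre-defined fields or components: label, facet, and props.
All components are optional, with the restriction that a valid gig tag must have a label or props; a facet alone does not form a valid tag.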
A label is a non-empty string that contains arbitrary text without leading/trailing whitespace.
Labels are supposed to be edited by users and are displayed verbatim in the UI.
|Label|Comment|
|---|---|
|`Wishlist`|a single word|
|`FloorFiller`|multiple words concatenated in PascalCase|
|`Floor Filler`|multiple words separated by whitespace|

The same content rules that apply to labels also apply to facets.
Moreover, facets must not start with a slash (`/`) character, which would otherwise interfere with the serialization format (see below).
Facets serve a different semantic purpose than labels. They are used for categorizing, namespacing or grouping a set of labels or for defining the context of associated properties.
Facets are supposed to represent pre-defined identifiers that are neither editable nor directly displayed in the UI.
Facets that consist of 8 decimal digits and nothing else are reserved for encoding date information. Those numbers encode ISO 8601 calendar dates without a time zone in the format `yyyyMMdd`.
These so-called date facets are used for anchoring tags chronologically.
|Facet|Comment|
|---|---|
|`audio-features`|a tag for encoding Spotify/EchoNest audio features|
|`20220625`|a date facet that denotes the calendar day 2022-06-25 in any time zone|
|`20220625 Some Text`|an ordinary facet that does not denote a date, even though it is prefixed with 8 decimal digits that could denote a date|

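
The content rules for labels and facets can be sketched as a few predicates. This is a minimal illustration, not the reference implementation: the function names are hypothetical, and the date check only tests for the 8-digit shape, not whether the digits form an existing calendar date.

```rust
/// Labels must be non-empty and free of leading/trailing whitespace.
fn is_valid_label(label: &str) -> bool {
    !label.is_empty() && label.trim() == label
}

/// Facets follow the same content rules as labels and, in addition,
/// must not start with a slash.
fn is_valid_facet(facet: &str) -> bool {
    is_valid_label(facet) && !facet.starts_with('/')
}

/// A date facet consists of exactly 8 ASCII decimal digits (yyyyMMdd).
/// Assumption: only the shape is checked here, not the calendar validity.
fn is_date_facet(facet: &str) -> bool {
    facet.len() == 8 && facet.bytes().all(|b| b.is_ascii_digit())
}
```

With these predicates, `20220625` is recognized as a date facet, while `20220625 Some Text` is a valid but ordinary facet.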
Custom properties, abbreviated as props, can be attached to tags.
Properties are represented as a non-empty, ordered list of key/value pairs.
Keys are non-empty strings that contain arbitrary text without leading/trailing whitespace. There are no restrictions regarding the uniqueness of keys, i.e. duplicate keys are permitted.
Values are arbitrary strings without any restrictions. Empty values are permitted.
Applications are responsible for interpreting the keys and values in their respective context. Facets can be used for defining this context.
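
Because duplicate keys are permitted and order matters, props fit an ordered list of pairs better than a map. The following data model is a sketch under that assumption; the type `GigTag` and the helper `values_of` are hypothetical names chosen for illustration.

```rust
/// One gig tag; every component is optional.
#[derive(Debug, Clone, Default, PartialEq)]
struct GigTag {
    label: Option<String>,
    facet: Option<String>,
    /// Ordered key/value pairs. Duplicate keys are allowed,
    /// so a `Vec` is used instead of a map.
    props: Vec<(String, String)>,
}

impl GigTag {
    /// All values stored under `key`, in their original order.
    fn values_of<'a>(&'a self, key: &'a str) -> impl Iterator<Item = &'a str> + 'a {
        self.props
            .iter()
            .filter(move |(k, _)| k.as_str() == key)
            .map(|(_, v)| v.as_str())
    }
}
```

Looking up a key yields every matching value, including empty ones, which leaves the interpretation of duplicates to the application.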
Individual tags are encoded as URIs:
URI = scheme ":" ["//" authority] path ["?" query] ["#" fragment]
authority = [userinfo "@"] host [":" port]
Only the path, query, and fragment components may be present. All other components must be absent, i.e. the URI string must contain neither a scheme nor an authority component.
The following table defines the component mapping:
|Tag component|URI component|
|---|---|
|label|fragment|
|facet|path|
|props|query|

Tags, or rather their URIs, are serialized as text and percent-encoded according to RFC 2396/1738.
Empty components are considered absent when parsing a gig tag from a URI string.
A valid gig tag URI contains either a single `?` character, or a single `#` character, or both of them. This is also beneficial for distinguishing encoded gig tags from arbitrary text.
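
The component mapping can be illustrated with a small encoder. This is a sketch, not the reference implementation: the function names are hypothetical, and the set of characters left unescaped (ASCII letters, digits, and `-._~`) is a conservative assumption rather than the exact set that the specification prescribes.

```rust
/// Percent-encodes every byte except ASCII letters, digits, and `-._~`.
/// Assumption: this conservative character set is for illustration only.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Builds `facet?key=value&key=value#label`, omitting absent components.
/// Returns `None` if neither a label nor props are present.
fn encode_gig_tag(
    facet: Option<&str>,
    props: &[(&str, &str)],
    label: Option<&str>,
) -> Option<String> {
    let label = label.filter(|l| !l.is_empty());
    if label.is_none() && props.is_empty() {
        return None;
    }
    // facet -> path
    let mut uri = facet.map(percent_encode).unwrap_or_default();
    // props -> query
    if !props.is_empty() {
        let query: Vec<String> = props
            .iter()
            .map(|(k, v)| format!("{}={}", percent_encode(k), percent_encode(v)))
            .collect();
        uri.push('?');
        uri.push_str(&query.join("&"));
    }
    // label -> fragment
    if let Some(label) = label {
        uri.push('#');
        uri.push_str(&percent_encode(label));
    }
    Some(uri)
}
```

The encoder always emits at least one `?` or `#`, so a facet on its own never produces an encoded tag.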
The following examples show variations of the encoded string with empty components that are ignored when decoding the URI.
|Encoded|Facet|Label|Props: Keys|Props: Values|
|---|---|---|---|---|
|`#MyTag`<br>`?#MyTag`||`MyTag`|||
|`20220625#Someone%27s%20wishlist%20for%20this%20day`<br>`20220625?#Someone%27s%20wishlist%20for%20this%20day`|`20220625`|`Someone's wishlist for this day`|||
|`audio-features?energy=0.78&valence=0.61`<br>`audio-features?energy=0.78&valence=0.61#`|`audio-features`||`energy`<br>`valence`|`0.78`<br>`0.61`|

The following tokens do not represent valid gig tags:
|Encoded|Comment|
|---|---|
|`https://#MyTag`|scheme is present|
|`MyTag`|only a facet, but neither a label nor props|
|`#`|empty label is considered as absent|
|`?`|empty facet and props are considered as absent|
|`?#`|empty facet, props, and label are considered as absent|

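
Decoding a single token could look roughly like the following sketch. It is not the reference implementation and does not enforce every rule of the format: the function names are hypothetical, the scheme/authority check is a simplified heuristic (a colon in the first path segment or a leading slash), and whitespace rules for decoded labels and keys are not checked.

```rust
/// Decodes `%XX` escapes; returns `None` on malformed escapes or invalid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hex = input.get(i + 1..i + 3)?;
            if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            out.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

/// (facet, label, props)
type Decoded = (Option<String>, Option<String>, Vec<(String, String)>);

fn decode_gig_tag(token: &str) -> Option<Decoded> {
    // Split off the fragment (label) first, then the query (props).
    let (rest, fragment) = token.split_once('#').unwrap_or((token, ""));
    let (path, query) = rest.split_once('?').unwrap_or((rest, ""));
    // Simplified heuristic (assumption): a colon in the first path segment
    // indicates a scheme, a leading slash an authority or absolute path.
    let first_segment = path.split('/').next().unwrap_or("");
    if first_segment.contains(':') || path.starts_with('/') {
        return None;
    }
    let mut props = Vec::new();
    for pair in query.split('&').filter(|p| !p.is_empty()) {
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        if key.is_empty() {
            return None; // keys must be non-empty, values may be empty
        }
        props.push((percent_decode(key)?, percent_decode(value)?));
    }
    let facet = percent_decode(path)?;
    let label = percent_decode(fragment)?;
    // Empty components count as absent; a tag needs a label or props.
    if label.is_empty() && props.is_empty() {
        return None;
    }
    let non_empty = |s: String| if s.is_empty() { None } else { Some(s) };
    Some((non_empty(facet), non_empty(label), props))
}
```

Under these assumptions, `#MyTag` and `?#MyTag` decode to the same tag, while `MyTag`, `#`, `?`, and `?#` are all rejected.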
Multiple tags are formatted and stored as text by concatenating the corresponding encoded URIs. Consecutive URIs are separated by whitespace, e.g. a single ASCII space character.
Often it is not possible to store the encoded gig tags in a reserved field. In this case gig tags can be appended to any text field by separating them with arbitrary whitespace from the preceding text.
Text is split into tokens that are separated by whitespace. Parsing starts with the last token and continues from back to front. It stops when encountering a token that cannot be parsed as a valid gig tag.
The first token that cannot be parsed as a valid gig tag is considered the last token of the preceding text. The preceding text, including this token and the whitespace up to the first valid gig tag token, must be preserved as an undecoded prefix.
When re-encoding the gig tags, the undecoded prefix that was captured during parsing must be prepended to the re-encoded gig tags string. This rule ensures that only whitespace characters can get lost during a decode/re-encode roundtrip, even when arbitrary words from the preceding text are unintentionally parsed as valid gig tags (false positives).
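
The back-to-front parsing and the prefix rule can be sketched as follows. The function names are hypothetical, and the validity check is passed in as a closure so that the sketch stays independent of any particular tag decoder.

```rust
/// Splits `text` into the undecoded prefix and the trailing gig tag tokens.
/// `is_valid_tag` decides whether a single token is a valid gig tag.
fn split_trailing_tags<'a>(
    text: &'a str,
    is_valid_tag: impl Fn(&str) -> bool,
) -> (&'a str, Vec<&'a str>) {
    let mut tags = Vec::new();
    // `rest` shrinks from the back while the trailing token is a valid tag.
    let mut rest = text;
    loop {
        let trimmed = rest.trim_end();
        // Byte offset where the last whitespace-separated token starts.
        let start = trimmed
            .char_indices()
            .rev()
            .find(|(_, c)| c.is_whitespace())
            .map_or(0, |(i, c)| i + c.len_utf8());
        let token = &trimmed[start..];
        if token.is_empty() || !is_valid_tag(token) {
            // `rest` still includes the whitespace after the last text token.
            break;
        }
        tags.push(token);
        rest = &trimmed[..start];
    }
    tags.reverse();
    (rest, tags)
}

/// Prepends the preserved prefix to the re-encoded tags.
fn join_prefix_and_tags(prefix: &str, tags: &[&str]) -> String {
    format!("{}{}", prefix, tags.join(" "))
}
```

Since the prefix keeps its trailing whitespace, a roundtrip can only lose the whitespace between tags and after the last tag.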
The text with the encoded gig tags is appended (separated by whitespace) to the Content Group field of audio files:

|Format|Field|
|---|---|
|ID3v2|`GRP1` (primary/preferred) / `TIT1` (traditional/fallback)|
|Vorbis comments|`GROUPING`|
|MP4|`©grp`|

Licensed under the Mozilla Public License 2.0 (MPL-2.0) (see MPL-2.0.txt or https://www.mozilla.org/MPL/2.0/).
Permissions of this copyleft license are conditioned on making available source code of licensed files and modifications of those files under the same license (or in certain cases, one of the GNU licenses). Copyright and license notices must be preserved. Contributors provide an express grant of patent rights. However, a larger work using the licensed work may be distributed under different terms and without source code for files added in the larger work.
Any contribution intentionally submitted for inclusion in the work by you shall be licensed under the Mozilla Public License 2.0 (MPL-2.0).
It is required to add the following header with the corresponding SPDX short identifier to the top of each file:
`// SPDX-License-Identifier: MPL-2.0`