BDV

Unambiguous Delimited Values. Similar to CSV, but consistent, unambiguous, and predictable.

Description

The format uses leading delimiters and simple character escapes. This allows simple, unambiguous introduction of units and records, unambiguous header declaration, unambiguous concatenation of documents, a clear distinction between 0 fields and 1 blank field, and the use of arbitrary binary data.

The EBNF is as follows, where each all-caps name is a configurable single-byte delimiter:

```ebnf
stream = {garbage}, { message, {garbage} }, [ ENDSTREAM ];
garbage = (* - (STARTMESSAGE | STARTHEADER | ENDSTREAM) );
message = [header], STARTMESSAGE, { record }, ENDMESSAGE;
header = STARTHEADER, units;
record = STARTRECORD, units;
units = { STARTUNIT, unit };
unit = { (* - control) | (ESCAPE, control) };
control = ENDSTREAM | STARTHEADER | STARTMESSAGE | ENDMESSAGE | STARTRECORD | STARTUNIT | ESCAPE;
```
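To make the grammar concrete, here is a minimal parser sketch in Python. It is an illustration under assumptions, not a reference implementation: the names `parse_stream`, `Message`, and `_units` are hypothetical, the delimiters are hard-coded to this document's defaults, and raising `ValueError` on malformed input is one possible choice that the grammar itself does not prescribe.

```python
from typing import List, NamedTuple, Optional

# Default delimiters as byte values (assumed; they are configurable in the format).
STARTHEADER, STARTMESSAGE, ENDMESSAGE = ord("#"), ord(">"), ord("<")
STARTRECORD, STARTUNIT, ESCAPE, ENDSTREAM = ord("\n"), ord(","), ord("\\"), ord("!")
CONTROL = {STARTHEADER, STARTMESSAGE, ENDMESSAGE,
           STARTRECORD, STARTUNIT, ESCAPE, ENDSTREAM}


class Message(NamedTuple):
    header: Optional[List[bytes]]  # None when the message has no header
    records: List[List[bytes]]


def _units(data: bytes, i: int):
    """Parse `{ STARTUNIT, unit }` from index i; return (units, next index)."""
    units = []
    while i < len(data) and data[i] == STARTUNIT:
        i += 1
        unit = bytearray()
        while i < len(data):
            b = data[i]
            if b == ESCAPE:
                # The grammar only allows ESCAPE in front of a control byte.
                if i + 1 >= len(data) or data[i + 1] not in CONTROL:
                    raise ValueError(f"ESCAPE not followed by a control byte at {i}")
                unit.append(data[i + 1])
                i += 2
            elif b in CONTROL:
                break  # an unescaped control byte ends the unit
            else:
                unit.append(b)
                i += 1
        units.append(bytes(unit))
    return units, i


def parse_stream(data: bytes) -> List[Message]:
    messages, i = [], 0
    while i < len(data):
        b = data[i]
        if b == ENDSTREAM:
            break
        if b not in (STARTHEADER, STARTMESSAGE):
            i += 1  # garbage between messages is skipped
            continue
        header = None
        if b == STARTHEADER:
            header, i = _units(data, i + 1)
        if i >= len(data) or data[i] != STARTMESSAGE:
            raise ValueError(f"expected STARTMESSAGE at {i}")
        i += 1
        records = []
        while i < len(data) and data[i] == STARTRECORD:
            units, i = _units(data, i + 1)
            records.append(units)
        if i >= len(data) or data[i] != ENDMESSAGE:
            raise ValueError(f"expected ENDMESSAGE at {i}")
        i += 1
        messages.append(Message(header, records))
    return messages


parse_stream(b">\n<")[0].records   # [[]]     one record with zero units
parse_stream(b">\n,<")[0].records  # [[b'']]  one record with one empty unit
```

Because every unit is introduced by its own STARTUNIT, the parser can tell a record with zero units from a record with one empty unit without any special casing.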

The default delimiters:

```ebnf
STARTHEADER = "#";
STARTMESSAGE = ">";
ENDMESSAGE = "<";
STARTRECORD = ? ASCII newline ?;
STARTUNIT = ",";
ESCAPE = "\";
ENDSTREAM = "!";
```
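As a rough illustration of how these defaults are used when writing data, the following Python sketch encodes one message. The function names `escape_unit`, `encode_units`, and `encode_message` are hypothetical and chosen only for this example; the only rule taken from the grammar is that every control byte inside a unit is prefixed with ESCAPE.

```python
from typing import Optional, Sequence

# The default delimiters listed above, as single bytes.
STARTHEADER, STARTMESSAGE, ENDMESSAGE = b"#", b">", b"<"
STARTRECORD, STARTUNIT, ESCAPE, ENDSTREAM = b"\n", b",", b"\\", b"!"
CONTROL = b"".join([STARTHEADER, STARTMESSAGE, ENDMESSAGE,
                    STARTRECORD, STARTUNIT, ESCAPE, ENDSTREAM])


def escape_unit(unit: bytes) -> bytes:
    """Prefix every control byte inside a unit with ESCAPE."""
    out = bytearray()
    for b in unit:
        if b in CONTROL:
            out += ESCAPE
        out.append(b)
    return bytes(out)


def encode_units(units: Sequence[bytes]) -> bytes:
    """Emit each unit with its leading STARTUNIT delimiter."""
    return b"".join(STARTUNIT + escape_unit(u) for u in units)


def encode_message(records: Sequence[Sequence[bytes]],
                   header: Optional[Sequence[bytes]] = None) -> bytes:
    out = bytearray()
    if header is not None:
        out += STARTHEADER + encode_units(header)
    out += STARTMESSAGE
    for record in records:
        out += STARTRECORD + encode_units(record)
    out += ENDMESSAGE
    return bytes(out)


encoded = encode_message(
    [[b"1", b"taylor", b"developer"],
     [b"2", b"namewith,comma", b"valuewith\nnewline"]],
    header=[b"id", b"name", b"value"],
)
# encoded == b"#,id,name,value>\n,1,taylor,developer\n,2,namewith\\,comma,valuewith\\\nnewline<"
```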

Examples, using default delimiters

Single message with a header and two records

```
#,id,name,value>
,1,taylor,developer
,2,namewith\,comma,valuewith\
newline<
```

Single message with no header and two records

```
>
,1,taylor,developer
,2,namewith\,comma,valuewith\
newline<
```

Single message with a header and no records

```
#,id,name,value><
```

Single message with a header and one empty record

```
#,id,name,value>
<
```

Single message with a header with an empty unit, and a record of all empty units

```
#,id,name,,value>
,,,,<
```

The shortest valid message

```
><
```

Single message with no header and one record with one empty unit

```
>
,<
```

Single message with no header and one record with zero empty units, one record with one empty unit, and one record with two empty units

```
>

,
,,<
```

All the previous examples concatenated as a stream of messages, with an ENDSTREAM character to delimit the end

This relies on the rule that any amount of garbage data may appear before any STARTMESSAGE, STARTHEADER, or ENDSTREAM character, so the trailing newline after each message causes no issues.

```
#,id,name,value>
,1,taylor,developer
,2,namewith\,comma,valuewith\
newline<
>
,1,taylor,developer
,2,namewith\,comma,valuewith\
newline<
#,id,name,value><
#,id,name,value>
<
#,id,name,,value>
,,,,<
><
>
,<
>

,
,,<
!
```
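The concatenation property can be sketched in a few lines of Python. The `split_messages` helper below is hypothetical and deliberately lenient: it assumes the default delimiters, only looks for message boundaries (honoring ESCAPE), and does not validate the structure inside a message.

```python
from typing import Iterator

# Default delimiters as byte values (assumed; they are configurable in the format).
STARTHEADER, STARTMESSAGE, ENDMESSAGE = ord("#"), ord(">"), ord("<")
ESCAPE, ENDSTREAM = ord("\\"), ord("!")


def split_messages(stream: bytes) -> Iterator[bytes]:
    """Yield the raw bytes of each message, dropping inter-message garbage."""
    i = 0
    while i < len(stream):
        b = stream[i]
        if b == ENDSTREAM:
            return
        if b not in (STARTHEADER, STARTMESSAGE):
            i += 1  # garbage, for example the newline after a message
            continue
        start = i
        # Scan to the first unescaped ENDMESSAGE; an ESCAPE hides the next byte.
        while i < len(stream) and stream[i] != ENDMESSAGE:
            i += 2 if stream[i] == ESCAPE else 1
        if i >= len(stream):
            raise ValueError("stream ended inside a message")
        i += 1
        yield stream[start:i]


docs = [b"#,id>\n,1<", b">\n,<", b"><"]
stream = b"\n".join(docs) + b"\n!"   # plain concatenation, newline-separated
assert list(split_messages(stream)) == docs
```

Joining the documents needs no rewriting or re-escaping: each message carries its own start and end delimiters, and the newlines between them count as garbage.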

The shortest valid stream

!

Advantages over CSV

```ebnf
STARTHEADER = SOH;
STARTMESSAGE = STX;
ENDMESSAGE = ETX;
STARTRECORD = RS;
STARTUNIT = US;
ESCAPE = ESC;
ENDSTREAM = EOT;
```

If a stream consists mostly of text messages, this delimiter set helps serialize it into a compact stream with as little escaping as possible, because these control characters rarely occur in text.
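A small Python sketch makes the difference visible by counting how many bytes of a sample text would need an ESCAPE prefix under each delimiter set. The helper `escapes_needed` and the sample text are made up for this illustration.

```python
# ASCII control codes named in the delimiter set above.
SOH, STX, ETX, EOT, ESC, RS, US = 0x01, 0x02, 0x03, 0x04, 0x1B, 0x1E, 0x1F

DEFAULT = frozenset(b"#><\n,\\!")                      # printable defaults
CONTROL_CODES = frozenset([SOH, STX, ETX, RS, US, ESC, EOT])


def escapes_needed(unit: bytes, delimiters: frozenset) -> int:
    """Count the bytes in `unit` that would need an ESCAPE prefix."""
    return sum(1 for b in unit if b in delimiters)


text = b"Hello, world! Is 3 < 4?\nYes."
print(escapes_needed(text, DEFAULT))        # 4  (the comma, "!", "<" and newline)
print(escapes_needed(text, CONTROL_CODES))  # 0
```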

Disadvantages