This is a crate (Schema-Based Random Data GENerator, i.e. SBRD GEN) that can generate random dummy data based on a schema. It is available both as a library and as a CLI tool.
See About Schema for schema and schema generators, and List of generators that can be specified for generators and their builders.
This program uses serde to parse the schema and format the generated results.
If you want to use it as a library, there are two ways: How to generate with a single generator and How to combine multiple generators with a schema.
The single-generator approach suits cases where the output is simple enough that generators do not need to be combined. The same output can, of course, also be produced with How to combine multiple generators with a schema.
The usage is as follows:
1. Prepare a builder with `new_xx` (where xx varies) in `GeneratorBuilder`. If you want to be able to generate nulls, add the `nullable` specification.
2. Convert the builder into a generator with `build`.
3. Generate dummy data by passing a random number generator and a context to the generator.
The following is an example of an actual description.
```rust
use rand::thread_rng;
use sbrd_gen::builder::GeneratorBuilder;
use sbrd_gen::value::DataValueMap;

fn main() {
    // Build a nullable integer generator over the range 0..=100.
    let builder = GeneratorBuilder::new_int(Some((0..=100).into())).nullable();
    let generator = builder.build().unwrap();
    // Generate one value with a thread-local RNG and an empty context.
    let generated_value = generator
        .generate(&mut thread_rng(), &DataValueMap::new())
        .unwrap();

    println!("generated: {}", generated_value);
}
```
If you want to use multiple generators, you can use this method.
The procedure is as follows:
1. Prepare a list of `ParentGeneratorBuilder` as the list of generators you want to use. Note that this list is used for generation from top to bottom, so if you declare the generators in the wrong order, Script and Format, which can replace keys with generated values, will not function properly.
2. Prepare a list of the keys to be output from among the generators you want to use.
3. Construct a `SchemaBuilder` with the list of keys you want to output and the list of generators you want to use as arguments.
4. Build the `SchemaBuilder` to convert it into a `Schema`.
5. Call `generate` on the resulting `Schema` to generate dummy data, or write it to a writer with `write_xx` (where xx varies) in the `GeneratedValueWriter` trait.
See all_builder.rs for an actual writing example.
When used as a CLI tool, dummy data can be generated by specifying the file path of the schema file. The CLI allows you to specify the file format of the schema file, the number of files to output, and the format of the output. For details, please see the CLI help.
There are several ways to install, the most common being Install using Cargo and Install from GitHub release page.
With the `cargo` command available, run the following commands.
If you get a help message with `sbrd-gen --help`, the installation was successful.
```bash
cargo install sbrd-gen
sbrd-gen --help
```
To install from the GitHub release page, download the desired version from the release page.
After extracting the downloaded archive, add the location of the binary file to your path.
If you get a help message with `sbrd-gen --help`, the installation was successful.
Run the command with the syntax `sbrd-gen [OPTIONS] <SCHEMA_FILE_PATH>` once the executable file (e.g. `sbrd-gen.exe` on Windows) is available on your path.
The following describes the arguments and options that can be specified. They can also be viewed in the help message displayed by `sbrd-gen --help`.
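* `<SCHEMA_FILE_PATH>` : File path of the file containing the schema to be used for generation.
* `--parser <PARSER_TYPE>`, `-p <PARSER_TYPE>` : File format of the schema file. Specify it in `<PARSER_TYPE>`.
* `--type <OUTPUT_TYPE>`, `-t <OUTPUT_TYPE>` : Format of the output. Specify it in `<OUTPUT_TYPE>`.
* `--num <COUNT>`, `-n <COUNT>` : Number of times to generate the `keys` in the schema. Specify the number in `<COUNT>`.
* `--no-header`
* `--dry-run`
* `--help`, `-h`
* `--version`, `-V`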
The schema is specified by a Map (KVS) consisting of a sequence of keys to be output, with `keys` as the key, and a sequence of generator builders, with `generators` as the key.
The formats supported are Yaml and Json.
For example descriptions, see all.yaml and all.json.
When generating dummy data from the schema, the generators specified in the schema are executed from the top. The generated values are stored in a Map (KVS) data structure called a Value Context. In other words, the pairs that can be referenced in the Value Context at any point are the key/value pairs of the generators that have already run successfully. The Value Context is used to look up the value for each key to be output, and to evaluate a Script or Format after replacing each "{key}" notation (no space between the braces and the key) with the value currently associated with that key.
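The top-to-bottom evaluation and the "{key}" replacement can be illustrated with a small standalone sketch. It does not use the sbrd_gen API: the `substitute` function and the generator list are hypothetical, values are plain strings, and only the simple "{key}" form is handled.

```rust
use std::collections::BTreeMap;

/// Replace every "{key}" in `template` with the value stored for `key`.
/// Keys missing from the context are left untouched.
fn substitute(template: &str, context: &BTreeMap<String, String>) -> String {
    let mut result = template.to_string();
    for (key, value) in context {
        result = result.replace(&format!("{{{}}}", key), value);
    }
    result
}

fn main() {
    // Hypothetical generators as (key, format) pairs, run from top to bottom.
    let generators = [
        ("first", "Ada"),
        ("last", "Lovelace"),
        ("full", "{first} {last}"),
    ];

    let mut context = BTreeMap::new();
    for (key, format) in generators {
        // Only keys generated earlier are visible at this point.
        let value = substitute(format, &context);
        context.insert(key.to_string(), value);
    }

    println!("{}", context["full"]); // prints "Ada Lovelace"
}
```

If `full` were declared before `first` and `last` in this sketch, its placeholders would stay unreplaced, which is why the declaration order matters.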
The parent generator is specified by a Map (KVS) consisting of keys and builder options.
The structure is `ParentGeneratorBuilder`.
A key to identify the generator, specified as a string with `key` as the key.
It is also used as a substitution key when evaluating Script or Format.
You can specify the generator options listed in List of generators that can be specified. Type determines which generator is built, and the remaining options are interpreted according to that generator.
Generators that can be specified as a schema or a single generator are as follows.
This module consists of a collection of generators that assemble strings based on the results generated by other generators.
* duplicate permutation generator
  * Description : Generator that combines generated results into a string. Generates values as many times as specified by Range and joins them with Separator to create a string. Default for Separator is an empty string ("").
  * Remarks : None
  * Struct : DuplicatePermutationGenerator
  * Type : duplicate-permutation
  * Required options : Type, Separator, One or more in parentheses (List of child generators, Character list, List of Values, External file path)
  * Available options : Type, Nullable, Range (Integer), Separator, List of child generators, Character list, List of Values, External file path
  * Generate value type : String
* format generator
  * Description : Generator that constructs a string by applying the context to the specified Format.
  * Remarks : None
  * Struct : FormatGenerator
  * Type : format
  * Required options : Type, Format
  * Available options : Type, Nullable, Format
  * Generate value type : String
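As a rough illustration of the duplicate permutation idea, the sketch below calls a child generator a fixed number of times and joins the results with a separator. It is not the crate's implementation: `duplicate_permutation` is a hypothetical helper, the repetition count is passed in directly rather than drawn from Range, and the child is a deterministic closure instead of a random generator.

```rust
/// Call `child` `count` times and join the results with `separator`.
fn duplicate_permutation<F>(count: usize, separator: &str, mut child: F) -> String
where
    F: FnMut() -> String,
{
    let parts: Vec<String> = (0..count).map(|_| child()).collect();
    parts.join(separator)
}

fn main() {
    // A deterministic stand-in for a random child generator.
    let mut next: u32 = 0;
    let child = || {
        next += 1;
        next.to_string()
    };
    println!("{}", duplicate_permutation(3, "-", child)); // prints "1-2-3"
}
```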
This module consists of a collection of generators that generate random numbers based on a distribution function.
* normal generator
  * Description : Generator that generates random numbers according to a normal distribution.
  * Remarks : Parameters can be the mean as a Real-number (`mean`) and the standard deviation as a Real-number (`std_dev`). Defaults are 0.0 and 1.0, respectively.
  * Struct : NormalGenerator
  * Type : dist-normal
  * Required options : Type, Parameters
  * Available options : Type, Nullable, Parameters
  * Generate value type : Real-number
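To show how `mean` and `std_dev` shape the output, here is a minimal sketch using the Box-Muller transform. This is a swapped-in textbook technique chosen because it needs only the standard library; the source does not say how sbrd_gen samples the distribution, and `normal_from_uniform` is a hypothetical name.

```rust
use std::f64::consts::PI;

/// Box-Muller transform: map two uniform samples to one normal sample.
/// `u1` must lie in (0, 1] and `u2` in [0, 1).
fn normal_from_uniform(u1: f64, u2: f64, mean: f64, std_dev: f64) -> f64 {
    // `z` follows the standard normal distribution when u1 and u2 are uniform.
    let z = (-2.0 * u1.ln()).sqrt() * (2.0 * PI * u2).cos();
    // Shift and scale to the requested mean and standard deviation.
    mean + std_dev * z
}

fn main() {
    // Fixed inputs keep the example deterministic; a real generator would
    // draw u1 and u2 from a random number generator.
    let sample = normal_from_uniform(0.5, 0.25, 0.0, 1.0);
    println!("sample = {}", sample);
}
```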
This module consists of a collection of generators that evaluate a given expression and output a value.
* eval generator
  * Description : Generator that outputs the result of evaluating the specified Script.
  * Remarks : None
  * Struct : EvalGenerator
  * Type : eval-int (Integer), eval-real (Real-number), eval-bool (Boolean), eval-string (String)
  * Required options : Type, Script
  * Available options : Type, Nullable, Script
  * Generate value type : Integer (eval-int), Real-number (eval-real), Boolean (eval-bool), String (eval-string)
This module consists of a collection of generators that change sequentially, such as increasing by a certain amount each time they are executed.
* increment id generator
  * Description : Generator whose value advances by the step of the specified Increment on each generation. The first value is the initial value of the specified Increment.
  * Remarks : By default, Increment starts at 1 and increases by 1.
  * Struct : IncrementIdGenerator
  * Type : increment-id
  * Required options : Type
  * Available options : Type, Nullable, Increment (Integer)
  * Generate value type : Integer
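The behaviour described for increment-id can be sketched as a small iterator. `IncrementId` here is a hypothetical standalone type, not the crate's `IncrementIdGenerator`; it assumes the first generated value is the initial value and that the step is added before each later generation.

```rust
/// Hypothetical counter mirroring the Increment option: `initial` is the
/// first value returned and `step` is added before every later generation.
struct IncrementId {
    next: i64,
    step: i64,
}

impl IncrementId {
    fn new(initial: i64, step: i64) -> Self {
        IncrementId { next: initial, step }
    }
}

impl Default for IncrementId {
    // The documented default starts at 1 and increases by 1.
    fn default() -> Self {
        IncrementId::new(1, 1)
    }
}

impl Iterator for IncrementId {
    type Item = i64;

    fn next(&mut self) -> Option<i64> {
        let current = self.next;
        self.next += self.step;
        Some(current)
    }
}

fn main() {
    let ids: Vec<i64> = IncrementId::default().take(3).collect();
    println!("{:?}", ids); // prints [1, 2, 3]
}
```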
This module consists of a collection of generators that generate basic values.
* int generator
  * Description : Generator that generates an Integer within the specified Range, where the Default range is between the minimum value of i16 (-32768) and the maximum value of i16 (32767).
  * Remarks : None
  * Struct : IntGenerator
  * Type : int
  * Required options : Type
  * Available options : Type, Nullable, Range (Integer)
  * Generate value type : Integer
* real generator
  * Description : Generator that generates a Real-number within the specified Range, where the Default range is between the minimum value of i16 (-32768) and the maximum value of i16 (32767).
  * Remarks : The larger the absolute value of the generated value, the fewer digits after the decimal point; the smaller the absolute value, the more digits after the decimal point.
  * Struct : RealGenerator
  * Type : real
  * Required options : Type
  * Available options : Type, Nullable, Range (Real-number)
  * Generate value type : Real-number
* bool generator
  * Description : Generator that generates true or false with 50% probability each.
  * Remarks : None
  * Struct : BoolGenerator
  * Type : bool
  * Required options : Type
  * Available options : Type, Nullable
  * Generate value type : Boolean
* date time generator
  * Description : Generator that generates a date and time in the format specified by Format.
  * Remarks : The format of the date and time specified by Range is "%Y-%m-%d %H:%M:%S". The Default value of Format is the same format. See here for the format. The Default value of Range is from 1900-01-01 00:00:00 (inclusive) to 2151-01-01 00:00:00 (exclusive). An unspecified boundary is assumed to have its Default value.
  * Struct : DateTimeGenerator
  * Type : date-time
  * Required options : Type
  * Available options : Type, Nullable, Range (DateTime-String), Format
  * Generate value type : String
* date generator
  * Description : Generator that generates a date in the format specified by Format.
  * Remarks : The format of the date specified by Range is "%Y-%m-%d". The Default value of Format is the same format. See here for the format. The Default value of Range is from 1900-01-01 (inclusive) to 2151-01-01 (exclusive). An unspecified boundary is assumed to have its Default value.
  * Struct : DateGenerator
  * Type : date
  * Required options : Type
  * Available options : Type, Nullable, Range (Date-String), Format
  * Generate value type : String
* time generator
  * Description : Generator that generates a time in the format specified by Format.
  * Remarks : The format of the time specified by Range is "%H:%M:%S". The Default value of Format is the same format. See here for the format. The Default value of Range is from 00:00:00 to 23:59:59, both inclusive. An unspecified boundary is assumed to have its Default value.
  * Struct : TimeGenerator
  * Type : time
  * Required options : Type
  * Available options : Type, Nullable, Range (Time-String), Format
  * Generate value type : String
* always null generator
  * Description : Generator that always generates null.
  * Remarks : None
  * Struct : AlwaysNullGenerator
  * Type : always-null
  * Required options : Type
  * Available options : Type, Nullable
  * Generate value type : Null
This module consists of a collection of generators that generate values using List of child generators.
* case when generator
  * Description : Generator that evaluates each Condition in the order of declaration and generates with the first child generator whose Condition is true.
  * Remarks : A child generator for the Default condition (i.e., with no Condition specified) is needed in case no Condition matches.
  * Struct : CaseWhenGenerator
  * Type : case-when
  * Required options : Type, List of child generators with Condition specified
  * Available options : Type, Nullable, List of child generators with Condition specified
  * Generate value type : Generate value type of the child generator used for generation
* random child generator
  * Description : Generator that generates with a child generator selected at random according to Weight.
  * Remarks : None
  * Struct : RandomChildGnerator
  * Type : random-child
  * Required options : Type, List of child generators with Weight specified
  * Available options : Type, Nullable, List of child generators with Weight specified
  * Generate value type : Generate value type of the child generator used for generation
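The case-when selection rule can be sketched as "first matching condition wins, with an unconditional child as the fallback". The types below are hypothetical and standalone: in sbrd_gen a Condition is a script string evaluated against the Value Context, whereas this sketch uses plain function pointers over an integer input.

```rust
/// A hypothetical child entry: an optional condition plus the value it yields.
struct Child {
    condition: Option<fn(i64) -> bool>,
    value: &'static str,
}

/// Child that is used only when `condition` holds.
fn when(condition: fn(i64) -> bool, value: &'static str) -> Child {
    Child { condition: Some(condition), value }
}

/// Child without a condition, acting as the default branch.
fn otherwise(value: &'static str) -> Child {
    Child { condition: None, value }
}

/// Return the value of the first child, in declaration order, whose
/// condition holds. Returns None when nothing matches and no default exists.
fn case_when(children: &[Child], input: i64) -> Option<&'static str> {
    children
        .iter()
        .find(|child| child.condition.map_or(true, |cond| cond(input)))
        .map(|child| child.value)
}

fn main() {
    let children = [
        when(|n| n < 0, "negative"),
        when(|n| n == 0, "zero"),
        otherwise("positive"),
    ];
    println!("{:?}", case_when(&children, 5)); // prints Some("positive")
}
```

Dropping the `otherwise` entry makes `case_when` return `None` for positive inputs, which corresponds to the situation the Remarks warn about.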
This module consists of a collection of generators that generate values using Character list, List of Values, and External file path.
* select generator
  * Description : Generator that randomly selects a value from those specified by Character list, List of Values, or External file path.
  * Remarks : None
  * Struct : SelectGenerator
  * Type : select-int (Integer), select-real (Real-number), select-string (String)
  * Required options : Type, One or more in parentheses (Character list, List of Values, External file path)
  * Available options : Type, Nullable, Character list, List of Values, External file path
  * Generate value type : Integer (select-int), Real-number (select-real), String (select-string)
* get value at generator
  * Description : Generator that retrieves the value at the index obtained by evaluating Script from a list of input values.
  * Remarks : None
  * Struct : GetValueAtGenerator
  * Type : get-int-value-at (Integer), get-real-value-at (Real-number), get-string-value-at (String)
  * Required options : Type, Script, One or more in parentheses (Character list, List of Values, External file path)
  * Available options : Type, Nullable, Script, Character list, List of Values, External file path
  * Generate value type : Integer (get-int-value-at), Real-number (get-real-value-at), String (get-string-value-at)
* get value index generator
  * Description : Generator that returns a randomly chosen valid index into the list of input values.
  * Remarks : None
  * Struct : GetValueIndexGenerator
  * Type : get-value-index
  * Required options : Type, One or more in parentheses (Character list, List of Values, External file path)
  * Available options : Type, Nullable, Character list, List of Values, External file path
  * Generate value type : Integer (Not negative)
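The relationship between the three input options and an index lookup can be sketched as follows. Merging the inputs into a single candidate list, and the order in which they are merged, are assumptions made for this sketch; `candidates` and `value_at` are hypothetical helpers, and the file is represented by its contents rather than read from disk.

```rust
/// Collect candidate values from the three input options. Treating them as
/// one merged list, in this order, is an assumption made for this sketch.
fn candidates(chars: &str, values: &[&str], file_contents: &str) -> Vec<String> {
    // Each character of the character list is one candidate.
    let mut all: Vec<String> = chars.chars().map(|c| c.to_string()).collect();
    all.extend(values.iter().map(|v| v.to_string()));
    // An external file lists one value per line.
    all.extend(file_contents.lines().map(|line| line.to_string()));
    all
}

/// Look up the candidate at `index`, if it exists.
fn value_at(all: &[String], index: usize) -> Option<&str> {
    all.get(index).map(|s| s.as_str())
}

fn main() {
    let all = candidates("ab", &["north", "south"], "red\nblue\n");
    println!("{:?}", all);
    println!("{:?}", value_at(&all, 2)); // prints Some("north")
}
```

In this picture, a "get value index" style generator would produce a random index in `0..all.len()`, and a "get value at" style generator would pass an evaluated index to `value_at`.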
The following options can be specified to build the generator. The available options vary from generator to generator; options that a generator does not support are ignored.
* Type
  * Rust type : `GeneratorType`
  * Key : `type`
  * Value type : String
* Nullable
  * Description : A flag indicating whether null can be generated in addition to the value generated by the generator. If true, null can be generated. Default is false.
  * Rust type : `bool`
  * Key : `nullable`
  * Value type : Boolean
* Format
  * Description : This format is used for key/value pairs in Value Context (let's say the pair is (key, value)). is in turn evaluated as a String after replacing the string "{key}" or "{key:\
  * Rust type : `String`
  * Key : `format`
  * Value type : String
* Script
  * Description : This script is a key/value pair (let's say the pair is (key, value)) in Value Context. is in turn replaced by the string "{key}" or "{key:\
  * Rust type : `String`
  * Key : `script`
  * Value type : String
* Separator
  * Description : A string used as a delimiter in string construction, etc.
  * Rust type : `String`
  * Key : `separator`
  * Value type : String
* Range
  * Description : This option is used to specify the range of the number of iterations and the range of values to be generated.
  * Rust type : `ValueBound`
  * Key : `range`
  * Value type : Map (KVS) consisting of the key `start` with a value of the value type, the key `end` with a value of the value type, and the key `include_end` with a flag indicating whether the value of `end` is included, each of which is optional. The default value of `include_end` is true.
* Increment
  * Description : Option to specify the initial value and the amount of change in the value that is updated each time the generator is run.
  * Rust type : `ValueStep`
  * Key : `increment`
  * Value type : Map (KVS) consisting of the key `initial` with a value of the value type as the initial value and the key `step` with a value of the value type representing the amount of change, where `initial` is required and `step` is optional.
* List of child generators
  * Description : This option specifies a sequence of generators, each specified with the options in List of generator options. A generator specified here is called a child generator and, unlike the parent generator, can additionally take the options in List of options for child generator.
  * Rust type : `Vec<ChildGeneratorBuilder>`
  * Key : `children`
  * Value type : Sequence of child generators
* Character list
  * Description : Option to enumerate characters for random selection.
  * Rust type : `String`
  * Key : `chars`
  * Value type : String
* List of Values
  * Description : Option to enumerate values for random selection.
  * Rust type : `Vec<DataValue>`
  * Key : `values`
  * Value type : Sequence consisting of Integer, Real-number, or String type
* External file path
  * Description : This option specifies the path of a file that enumerates the values for random selection, one value per line. In addition to an absolute path, the path can be specified relative to the schema file.
  * Rust type : `PathBuf`
  * Key : `filepath`
  * Value type : String
* Parameters
  * Description : This option is used to specify the parameters needed to construct the distribution function. See each generator in Distribution system for the keys and values to specify.
  * Rust type : `DataValueMap<String>`
  * Key : `parameters`
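The Range option described above, with its optional `start`, `end`, and `include_end` keys, can be pictured with the following standalone sketch. `Bound` is a hypothetical type, not the crate's `ValueBound`, and it is limited to integers. One simplification to keep in mind: a missing side is treated as unbounded here, whereas each generator substitutes its own default boundary.

```rust
/// Hypothetical mirror of the Range option: `start` and `end` are optional,
/// and `include_end` defaults to true.
struct Bound {
    start: Option<i64>,
    end: Option<i64>,
    include_end: bool,
}

impl Bound {
    fn new(start: Option<i64>, end: Option<i64>) -> Self {
        Bound { start, end, include_end: true }
    }

    /// Switch to a half-open range that excludes `end`.
    fn exclusive_end(mut self) -> Self {
        self.include_end = false;
        self
    }

    /// True when `value` lies inside the bound. A missing side is treated
    /// as unbounded in this sketch.
    fn contains(&self, value: i64) -> bool {
        let above_start = self.start.map_or(true, |s| value >= s);
        let below_end = self.end.map_or(true, |e| {
            if self.include_end { value <= e } else { value < e }
        });
        above_start && below_end
    }
}

fn main() {
    let inclusive = Bound::new(Some(0), Some(100));
    let exclusive = Bound::new(Some(0), Some(100)).exclusive_end();
    // prints "true false"
    println!("{} {}", inclusive.contains(100), exclusive.contains(100));
}
```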
The child generator can specify the options listed below in addition to the options that can be specified by the generator.
* Condition
  * Rust type : `String`
  * Key : `condition`
  * Value type : String
* Weight
  * Description : Option to specify the weight for random selection of child generators. The higher the weight, the more often the child generator is selected. Default weight is 1.
  * Rust type : `Weight`
  * Key : `weight`
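How a weight translates into selection frequency can be shown with a cumulative-weight lookup. This is a common technique and only an illustration: the source does not describe how sbrd_gen performs the selection, `pick_weighted` is a hypothetical helper, and the random draw is replaced by iterating over every possible roll so the result is deterministic.

```rust
/// Pick the index whose cumulative weight range contains `roll`.
/// `roll` is expected to be uniform in `0..total_weight`.
fn pick_weighted(weights: &[u32], roll: u32) -> Option<usize> {
    let mut upper = 0u32;
    for (index, weight) in weights.iter().enumerate() {
        upper += *weight;
        if roll < upper {
            return Some(index);
        }
    }
    // `roll` was outside the total weight.
    None
}

fn main() {
    let weights: [u32; 2] = [1, 3];
    let total: u32 = weights.iter().sum();
    let mut counts = [0usize; 2];
    // Enumerate every possible roll instead of sampling at random.
    for roll in 0..total {
        if let Some(index) = pick_weighted(&weights, roll) {
            counts[index] += 1;
        }
    }
    println!("{:?}", counts); // prints [1, 3]
}
```

With weights 1 and 3, the second child covers three of the four possible rolls, so it is selected three times as often as the first.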
MIT