👉 The TOML language is widely used in Rust projects, and a TOML configuration file is created whenever you run cargo new for a new project. It is therefore worth learning TOML syntax alongside Rust, so that you can confidently edit the configuration files in Rust projects.
👉 This article is based on the Chinese translation of the toml-lang specification. It is included here as a single reference for Rust learners. If you find errors or areas that need improvement, please report or correct them in the original project repository of toml.io.
TOML v1.0.0#
Full name: Tom's Obvious, Minimal Language.
Authors: Tom Preston-Werner, Pradyun Gedam, and others.
Purpose#
TOML aims to be a minimal configuration file format that is semantically clear and easy to read.
TOML is designed to unambiguously map to a hash table.
TOML should be easily parsed into data structures in various languages.
Table of Contents#
- Specification
- Comments
- Key-Value Pairs
- Key Names
- Strings
- Integers
- Floating Point Numbers
- Booleans
- Offset Date-Time
- Local Date-Time
- Local Dates
- Local Times
- Arrays
- Tables
- Inline Tables
- Table Arrays
- File Extensions
- MIME Types
- ABNF Syntax
Specification#
- TOML is case-sensitive.
- TOML files must be valid UTF-8 encoded Unicode documents.
- Whitespace refers to tabs (0x09) or spaces (0x20).
- Line breaks refer to LF (0x0A) or CRLF (0x0D0A).
Comments#
The hash character marks the remainder of the line as a comment, unless it is inside a string.
# This is a full line comment
key = "value" # This is an end-of-line comment
another = "# This is not a comment"
Control characters other than tabs are not allowed in comments.
Key-Value Pairs#
The most basic building block of a TOML document is a key-value pair.
The key name is on the left side of the equals sign, and the value is on the right side.
Whitespace around the key name and value is ignored.
The key, equals sign, and value must be on the same line (although some values can span multiple lines).
key = "value"
The value must be one of the following types.
- Strings
- Integers
- Floating Point Numbers
- Booleans
- Offset Date-Time
- Local Date-Time
- Local Dates
- Local Times
- Arrays
- Inline Tables
Not specifying a value is illegal.
key = # Illegal
A line break must follow a key-value pair (or end the file).
(Exceptions are noted in Inline Tables)
first = "Tom" last = "Preston-Werner" # Illegal
Key Names#
Key names can be bare, quoted, or dot-separated.
Bare keys can only contain ASCII letters, ASCII digits, underscores, and hyphens (A-Za-z0-9_-).
Note that bare keys can consist solely of ASCII digits, such as 1234, but they are always interpreted as strings.
key = "value"
bare_key = "value"
bare-key = "value"
1234 = "value"
Quoted keys follow the same rules as basic strings or literal strings and allow you to use a wider range of key names.
Unless clearly necessary, using bare keys is best practice.
"127.0.0.1" = "value"
"character encoding" = "value"
"ʎǝʞ" = "value"
'key2' = "value"
'quoted "value"' = "value"
Bare keys cannot be empty, but empty quoted keys are allowed (though not recommended).
= "no key name" # Illegal
"" = "blank" # Legal but discouraged
'' = 'blank' # Legal but discouraged
Dot-separated keys are a series of bare keys or quoted keys connected by dots.
This allows you to group related attributes together:
name = "Orange"
physical.color = "orange"
physical.shape = "round"
site."google.com" = true
Equivalent to the following JSON structure:
{
"name": "Orange",
"physical": {
"color": "orange",
"shape": "round"
},
"site": {
"google.com": true
}
}
For detailed information on defining dot-separated keys, see the Tables section below.
Whitespace around dot separators is ignored.
However, best practice is to avoid any unnecessary whitespace.
fruit.name = "banana" # This is best practice
fruit. color = "yellow" # Equivalent to fruit.color
fruit . flavor = "banana" # Equivalent to fruit.flavor
Indentation is treated as whitespace and ignored.
Defining the same key multiple times is illegal.
# Don't do this
name = "Tom"
name = "Pradyun"
Note that bare keys and quoted keys are equivalent:
# This is not allowed
spelling = "favorite"
"spelling" = "favourite"
As long as a key has not been directly defined yet, you can still assign values to it and its subordinate key names.
# This makes the "fruit" key exist as a table.
fruit.apple.smooth = true
# So next you can add content to the "fruit" table like this:
fruit.orange = 2
# The following is illegal
# This defines the value of fruit.apple as an integer.
fruit.apple = 1
# But then this treats fruit.apple as a table.
# An integer cannot become a table.
fruit.apple.smooth = true
Defining dot-separated keys out of order is discouraged.
# Legal but discouraged
apple.type = "fruit"
orange.type = "fruit"
apple.skin = "thin"
orange.skin = "thick"
apple.color = "red"
orange.color = "orange"
# Recommended
apple.type = "fruit"
apple.skin = "thin"
apple.color = "red"
orange.type = "fruit"
orange.skin = "thick"
orange.color = "orange"
Since bare keys can consist solely of ASCII digits, it is possible to write what looks like a floating-point number, but is actually a two-part dot-separated key.
Unless you have a compelling reason (which is unlikely), do not do this.
3.14159 = "pi"
The above TOML corresponds to the following JSON.
{ "3": { "14159": "pi" } }
Strings#
There are four ways to represent strings: basic strings, multi-line basic strings, literal strings, and multi-line literal strings.
All strings can only contain valid UTF-8 characters.
Basic strings are wrapped in quotation marks (").
Any Unicode character can be used except those that must be escaped: quotes, backslashes, and control characters (U+0000 to U+0008, U+000A to U+001F, U+007F).
str = "I am a string. \"You can quote me\". Name\tJos\u00E9\nLocation\tSan Francisco."
For convenience, some common characters have shorthand escape sequences.
\b - backspace (U+0008)
\t - tab (U+0009)
\n - linefeed (U+000A)
\f - form feed (U+000C)
\r - carriage return (U+000D)
\" - quote (U+0022)
\\ - backslash (U+005C)
\uXXXX - unicode (U+XXXX)
\UXXXXXXXX - unicode (U+XXXXXXXX)
Any Unicode character can be escaped using the form \uXXXX or \UXXXXXXXX.
Escape codes must be valid Unicode scalar values.
All other escape sequences not listed above are reserved; if used, TOML should produce an error.
Sometimes you need to represent a small piece of text (like a translation) or want to wrap a very long string.
TOML simplifies this.
Multi-line basic strings are wrapped in three quotes, allowing line breaks.
The line break immediately following the opening quote will be removed.
Other whitespace and line breaks will be preserved as is.
str1 = """
Roses are red
Violets are blue"""
A TOML parser may normalize line breaks to whatever is appropriate for its platform.
# On Unix systems, the above multi-line string may be equivalent to:
str2 = "Roses are red\nViolets are blue"
# On Windows systems, it may be equivalent to:
str3 = "Roses are red\r\nViolets are blue"
If you want to write long strings without introducing irrelevant whitespace, you can use "line-ending backslashes."
When the last non-whitespace character of a line is an unescaped \, it will be removed along with all whitespace (including line breaks) until the next non-whitespace character or the closing quote.
All escape sequences valid for basic strings are also valid for multi-line basic strings.
# Each of the following strings is exactly the same:
str1 = "The quick brown fox jumps over the lazy dog."
str2 = """
The quick brown \
fox jumps over \
the lazy dog."""
str3 = """\
The quick brown \
fox jumps over \
the lazy dog.\
"""
Any Unicode character can be used, except those that must be escaped: backslashes and control characters (U+0000 to U+0008, U+000B, U+000C, U+000E to U+001F, U+007F).
You can write a quote or two adjacent quotes anywhere within a multi-line basic string.
They can also be written right next to the delimiter.
str4 = """This has two quotes: "". Simple enough."""
# str5 = """This has three quotes: """.""" # Illegal
str5 = """This has three quotes: ""\"."""
str6 = """This has fifteen quotes: ""\"""\"""\"""\"""\"."""
# "This," she said, "is just a meaningless clause."
str7 = """"This," she said, "is just a meaningless clause.""""
If you often need to specify Windows paths or regular expressions, escaping backslashes can become tedious and error-prone.
To help with this, TOML supports literal strings, which do not allow any escapes.
Literal strings are wrapped in single quotes.
Like basic strings, they can only appear as a single line:
# As is.
winpath = 'C:\Users\nodejs\templates'
winpath2 = '\\ServerX\admin$\system32\'
quoted = 'Tom "Dubs" Preston-Werner'
regex = '<\i\c*\s*>'
Since there are no escapes, you cannot write a single quote within a literal string wrapped in single quotes.
Fortunately, TOML supports a multi-line version of literal strings to solve this problem.
Multi-line literal strings are wrapped in three single quotes on each side, allowing line breaks.
Like literal strings, there are no escapes.
The line break immediately following the starting delimiter will be removed.
All other content between the start and end delimiters will be treated as is.
regex2 = '''I [dw]on't need \d{2} apples'''
lines = '''
The first line break in the raw string is removed.
All other whitespace
is preserved.
'''
You can write one or two single quotes anywhere within a multi-line literal string, but sequences of three or more single quotes are not allowed.
quot15 = '''This has fifteen quotes: """""""""""""""'''
# apos15 = '''This has fifteen apostrophes: '''''''''''''''''' # Illegal
apos15 = "This has fifteen apostrophes: '''''''''''''''"
# 'That,' she said, 'still makes no sense.'
str = ''''That,' she said, 'still makes no sense.''''
Control characters other than tab are not allowed in literal strings.
Therefore, for binary data, it is recommended to use Base64 or other suitable ASCII or UTF-8 encodings.
The handling of those encodings will be left to the application itself.
Integers#
Integers are whole numbers.
Positive numbers can have a plus sign prefix.
Negative numbers have a minus sign prefix.
int1 = +99
int2 = 42
int3 = 0
int4 = -17
For large numbers, you can use underscores between digits to enhance readability.
Each underscore must be surrounded by at least one digit on each side.
int5 = 1_000
int6 = 5_349_221
int7 = 53_49_221 # Indian number system grouping
int8 = 1_2_3_4_5 # Legal but discouraged
Leading zeros are not allowed.
The integer values -0 and +0 are valid and equivalent to zero without a prefix.
Non-negative integer values can also be represented in hexadecimal, octal, or binary.
In these formats, a leading + is not allowed, while leading zeros (after the prefix) are allowed.
Hexadecimal values are case insensitive.
Underscores between digits are allowed (but cannot exist between the prefix and the value).
# Hexadecimal with `0x` prefix
hex1 = 0xDEADBEEF
hex2 = 0xdeadbeef
hex3 = 0xdead_beef
# Octal with `0o` prefix
oct1 = 0o01234567
oct2 = 0o755 # Useful for representing Unix file permissions
# Binary with `0b` prefix
bin1 = 0b11010110
Any 64-bit signed integer (from −2^63 to 2^63−1) should be accepted and processed without loss.
If an integer cannot be represented without loss, an error must be thrown.
Floating Point Numbers#
Floating point numbers should be implemented as IEEE 754 binary64 values.
A floating point number consists of an integer part (following the same rules as decimal integer values) followed by a fractional part and/or an exponent part.
If both the fractional part and the exponent part are present, the fractional part must come before the exponent part.
# Decimal
flt1 = +1.0
flt2 = 3.1415
flt3 = -0.01
# Exponent
flt4 = 5e+22
flt5 = 1e06
flt6 = -2E-2
# Both
flt7 = 6.626e-34
The fractional part is a decimal point followed by one or more digits.
An exponent part is an E (case insensitive) followed by an integer part (following the same rules as decimal integer values, but can include leading zeros).
The decimal point, if used, must be adjacent to at least one digit on each side.
# Illegal floating point numbers
invalid_float_1 = .7
invalid_float_2 = 7.
invalid_float_3 = 3.e+20
Similar to integers, you can use underscores to enhance readability.
Each underscore must be surrounded by at least one digit on each side.
flt8 = 224_617.445_991_228
Floating point values -0.0 and +0.0 are valid and should follow IEEE 754.
Special floating point values can also be represented.
They are always lowercase.
# Infinity
sf1 = inf # Positive infinity
sf2 = +inf # Positive infinity
sf3 = -inf # Negative infinity
# NaN
sf4 = nan # Corresponds to signaling NaN or quiet NaN, depending on implementation
sf5 = +nan # Equivalent to `nan`
sf6 = -nan # Valid, actual value depends on implementation
Booleans#
Booleans are just as you would expect.
They are always lowercase.
bool1 = true
bool2 = false
Offset Date-Time#
To unambiguously represent a specific instant in time, you can use the date-time format specified by RFC 3339 with a timezone offset.
odt1 = 1979-05-27T07:32:00Z
odt2 = 1979-05-27T00:32:00-07:00
odt3 = 1979-05-27T00:32:00.999999-07:00
For readability, you can replace the T between the date and time with a space character (this is allowed in section 5.6 of RFC 3339).
odt4 = 1979-05-27 07:32:00Z
Millisecond precision is required.
Higher precision fractional seconds depend on the implementation.
If its value exceeds the precision supported by the implementation, the excess must be discarded, not rounded.
Local Date-Time#
If you omit the timezone offset from an RFC 3339 date-time, it represents that date-time without any relation to an offset or timezone.
Without additional information, it cannot be converted to a specific instant in time.
If conversion is still requested, the result depends on the implementation.
ldt1 = 1979-05-27T07:32:00
ldt2 = 1979-05-27T00:32:00.999999
Millisecond precision is required.
Higher precision fractional seconds depend on the implementation.
If its value exceeds the precision supported by the implementation, the excess must be discarded, not rounded.
Local Dates#
If you only write the date part of the RFC 3339 date-time, it represents a whole day and does not involve a timezone offset.
ld1 = 1979-05-27
Local Times#
If you only write the time part of the RFC 3339 date-time, it will only represent that moment in a day, unrelated to any specific date and not involving a timezone offset.
lt1 = 07:32:00
lt2 = 00:32:00.999999
Millisecond precision is required.
Higher precision fractional seconds depend on the implementation.
If its value exceeds the precision supported by the implementation, the excess must be discarded, not rounded.
Arrays#
Arrays are enclosed in square brackets containing values.
Whitespace is ignored.
Sub-elements are separated by commas.
Arrays can contain values of the same data types allowed in key-value pairs.
Different types of values can be mixed.
integers = [ 1, 2, 3 ]
colors = [ "red", "yellow", "green" ]
nested_array_of_ints = [ [ 1, 2 ], [3, 4, 5] ]
nested_mixed_array = [ [ 1, 2 ], ["a", "b", "c"] ]
string_array = [ "all", 'strings', """are the same""", '''type''' ]
# Mixed-type arrays are allowed
numbers = [ 0.1, 0.2, 0.5, 1, 2, 5 ]
contributors = [
"Foo Bar <foo@example.com>",
{ name = "Baz Qux", email = "bazqux@example.com", url = "https://example.com/bazqux" }
]
Arrays can span multiple lines.
A terminating comma (also known as a trailing comma) is allowed after the last value in the array.
Any number of line breaks and comments can exist before values, commas, and the closing bracket.
Indentation between array values and commas is treated as whitespace and ignored.
integers2 = [
1, 2, 3
]
integers3 = [
1,
2, # This is allowed
]
Tables#
Tables (also known as hash tables or dictionaries) are a collection of key-value pairs.
They are defined by table headers, which appear on a line by themselves in square brackets.
You can tell table headers apart from arrays because arrays only ever appear as values.
[table]
Below it, until the next table header or the end of the file, are the key-value pairs of this table.
Key-value pairs within a table are not guaranteed to be in any particular order.
[table-1]
key1 = "some string"
key2 = 123
[table-2]
key1 = "another string"
key2 = 456
The rules for table names are the same as for key names (see the earlier Key Names definition).
[dog."tater.man"]
type.name = "pug"
Equivalent to the following JSON structure:
{ "dog": { "tater.man": { "type": { "name": "pug" } } } }
Whitespace around key names is ignored.
However, best practice is to avoid any unnecessary whitespace.
[a.b.c] # This is best practice
[ d.e.f ] # Equivalent to [d.e.f]
[ g . h . i ] # Equivalent to [g.h.i]
[ j . "ʞ" . 'l' ] # Equivalent to [j."ʞ".'l']
Indentation is treated as whitespace and ignored.
You do not need to write out every parent table if you do not want to.
TOML knows what to do.
# [x] you
# [x.y] don't
# [x.y.z] need these
[x.y.z.w] # for this to work
[x] # Defining a parent table afterward is allowed
Empty tables are allowed and simply contain no key-value pairs.
Similar to key names, you cannot redefine a table.
Doing so is illegal.
# Don't do this
[fruit]
apple = "red"
[fruit]
orange = "orange"
# Also don't do this
[fruit]
apple = "red"
[fruit.apple]
texture = "smooth"
Defining tables out of order is discouraged.
# Valid but discouraged
[fruit.apple]
[animal]
[fruit.orange]
# Recommended
[fruit.apple]
[fruit.orange]
[animal]
The top-level table, also known as the root table, starts at the beginning of the document and ends just before the first table header (or the end of the file).
Unlike other tables, it has no name and cannot be relocated.
# Top-level table starts.
name = "Fido"
breed = "pug"
# Top-level table ends.
[owner]
name = "Regina Dogman"
member_since = 1999-08-04
Dot-separated keys create and define a table for each key name before the last key name, if those tables have not yet been created.
fruit.apple.color = "red"
# Defines a table named fruit
# Defines a table named fruit.apple
fruit.apple.taste.sweet = true
# Defines a table named fruit.apple.taste
# fruit and fruit.apple have already been created
Since tables cannot be defined more than once, it is not allowed to redefine such a table using [table] headers.
Similarly, it is not allowed to redefine a table that has already been defined in the form of [table] using dot-separated keys.
However, the [table] form can be used to define sub-tables within tables defined by dot-separated keys.
[fruit]
apple.color = "red"
apple.taste.sweet = true
# [fruit.apple] # Illegal
# [fruit.apple.taste] # Illegal
[fruit.apple.texture] # You can add sub-tables
smooth = true
Inline Tables#
Inline tables provide a more compact syntax for representing tables.
This is especially useful for otherwise verbose grouped data.
Inline tables are fully defined within curly braces: { and }.
Inside the braces, there can be zero or more comma-separated key-value pairs.
Key-value pairs take the same form as those in standard tables.
Any type of value can be included, including inline tables.
Inline tables must appear on the same line.
A terminating comma (also known as a trailing comma) is not allowed after the last key-value pair in an inline table.
No line breaks are allowed within the braces, unless they are legal within the values.
Even then, it is strongly discouraged to make an inline table span multiple lines.
If you find yourself needing to do so, it means you should use a standard table.
name = { first = "Tom", last = "Preston-Werner" }
point = { x = 1, y = 2 }
animal = { type.name = "pug" }
The above inline tables are equivalent to the following standard table definitions:
[name]
first = "Tom"
last = "Preston-Werner"
[point]
x = 1
y = 2
[animal]
type.name = "pug"
Inline tables are self-contained, defining all keys and sub-tables internally.
You cannot add keys or sub-tables outside the braces.
[product]
type = { name = "Nail" }
# type.edible = false # Illegal
Similarly, inline tables cannot be used to add keys or sub-tables to an already defined table.
[product]
type.name = "Nail"
# type = { edible = false } # Illegal
Table Arrays#
The last syntax we have not yet discussed lets you write table arrays.
These are expressed by writing the table name in double square brackets in the table header.
The first instance of the header defines the array and its first table element, while each subsequent instance creates and defines a new table element in that array.
These tables are inserted into the array in the order they appear.
[[products]]
name = "Hammer"
sku = 738594937
[[products]] # Empty table in the array
[[products]]
name = "Nail"
sku = 284758393
color = "gray"
Equivalent to the following JSON structure.
{
"products": [
{ "name": "Hammer", "sku": 738594937 },
{ },
{ "name": "Nail", "sku": 284758393, "color": "gray" }
]
}
Any reference to a table array points to the most recently defined table element in that array.
This allows you to define sub-tables within the most recent table, even nested table arrays.
[[fruits]]
name = "apple"
[fruits.physical] # Sub-table
color = "red"
shape = "round"
[[fruits.varieties]] # Nested table array
name = "red delicious"
[[fruits.varieties]]
name = "granny smith"
[[fruits]]
name = "banana"
[[fruits.varieties]]
name = "plantain"
The above TOML is equivalent to the following JSON structure.
{
"fruits": [
{
"name": "apple",
"physical": {
"color": "red",
"shape": "round"
},
"varieties": [
{ "name": "red delicious" },
{ "name": "granny smith" }
]
},
{
"name": "banana",
"varieties": [
{ "name": "plantain" }
]
}
]
}
If the parent of a table or table array is an array element, that element must be defined before its children.
Attempting to reverse that order must produce an error at parse time.
# Illegal TOML document
[fruit.physical] # Sub-table, but which parent element should it belong to?
color = "red"
shape = "round"
[[fruit]] # The parser must throw an error when it discovers "fruit" is an array rather than a table
name = "apple"
If you attempt to append content to a statically defined array, even if the array is empty, it must throw an error during parsing.
# Illegal TOML document
fruits = []
[[fruits]] # Not allowed
If you attempt to define a table with a name that has already been determined to be an array, it must throw an error during parsing.
Similarly, redefining an array as a regular table must also throw an error during parsing.
# Illegal TOML document
[[fruits]]
name = "apple"
[[fruits.varieties]]
name = "red delicious"
# Illegal: This table conflicts with the previously defined table array
[fruits.varieties]
name = "granny smith"
[fruits.physical]
color = "red"
shape = "round"
# Illegal: This table array conflicts with the previously defined table
[[fruits.physical]]
color = "green"
You can also use inline tables where appropriate:
points = [ { x = 1, y = 2, z = 3 },
{ x = 7, y = 8, z = 9 },
{ x = 2, y = 4, z = 8 } ]
File Extensions#
TOML files should use the .toml extension.
MIME Types#
When transmitting TOML files over the internet, the appropriate MIME type is application/toml.
ABNF Syntax#
The rigorous specification of TOML syntax is provided in a separate ABNF file.