A Tour of PDS Clausewitz Syntax

Date written: October 8, 2020

Last updated: 2021-04-07

Paradox Development Studio (PDS) develops a game engine called Clausewitz that consumes and produces files in a proprietary format. This format is undocumented, so I decided to showcase the happy path and, more importantly, the edge cases, so that anyone interested in writing a parser (myself included) can plan accordingly. And there are a lot of parsers: #1, #2, #3, #4, #5, #6, #7, #8, #9, #10, #11, #12, #13, #14, #15, #16, #17, #18. And these are only the ones I’ve found after a quick open source search!

So if anyone wants to write a parser for Europa Universalis IV, Crusader Kings III, Stellaris, Hearts of Iron IV, or Imperator, this should be a good starting point. Before getting started with the tour: I have seen some attempts to describe the format formally with EBNF, and while this may be possible, reality is a bit messier. The data format is undocumented, and any parser will need to be flexible enough to ingest whatever the engine produces or accepts.

To keep the scope of this post limited, we’ll only cover the plain text format. The binary format, used predominantly for save files, will be for another time. How to write scripted game files encoded in the Clausewitz format won’t be covered either, as this layer above Clausewitz is called Jomini and has at least some documentation. As an aside, it’s a bit frustrating, or just unfortunate, to be the author of a Clausewitz parser also called Jomini that predates Paradox’s implementation by several years. I guess naming is hard or great minds think alike!

Two things to keep in mind before we begin:

The Tour

The simplest of examples can use TOML syntax highlighting:

# This is a line comment
cid = 1 # This is an inline comment
name = "Rakaly Rulz"

The above depicts a nice 1-to-1 key-value mapping that any language worth its salt can store in an ergonomic data structure.

But before we turn off syntax highlighting, let’s visit the first edge case: duplicate and unordered keys.

# This is a line comment
cid = 1 # This is an inline comment
name = "Rakaly Rulz"
cid = 2

In this case, cid wouldn’t map to a singular value but instead to a list of values. This format is commonly seen in EU4 saves.
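One way to cope with duplicate keys is to collect every value under its key, in document order. Below is a minimal Python sketch, assuming the key-value pairs have already been extracted from the text; `group_pairs` is a hypothetical helper and not taken from any of the parsers linked above.

```python
from collections import defaultdict


def group_pairs(pairs):
    """Group (key, value) pairs so that a repeated key keeps every value."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# The pairs from the example above, in document order.
pairs = [("cid", "1"), ("name", "Rakaly Rulz"), ("cid", "2")]
print(group_pairs(pairs))
```

Storing a list for every key is wasteful when most keys occur once, so a real parser may prefer to upgrade a single value to a list only on the second occurrence.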

But that’s about as far as we can take syntax highlighting so future examples will be plain.


A value in a key-value pair that contains the smallest unit of measurement is called a scalar. Shown below is an example demonstrating a smattering of scalars.

aaa=foo         # a plain scalar
bbb=-1          # an integer scalar
ccc=1.000       # a decimal scalar
ddd=yes         # a true scalar
eee=no          # a false scalar
fff="foo"       # a quoted scalar
ggg=1821.1.1    # a date scalar in Y.M.D format
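To make the categories above concrete, here is a rough Python sketch that guesses the kind of a scalar from its text. The function name, the regular expressions, and the month / day range checks are my own assumptions for illustration; the engine itself decides what a scalar means from context, so treat the result as a hint only.

```python
import re

INTEGER = re.compile(r"-?\d+")
DECIMAL = re.compile(r"-?\d+\.\d+")
DATE = re.compile(r"(-?\d+)\.(\d{1,2})\.(\d{1,2})")


def classify_scalar(text, quoted=False):
    """Guess the kind of a scalar. `quoted` says if the source used quotes."""
    if quoted:
        return "quoted"
    if text in ("yes", "no"):
        return "boolean"
    if INTEGER.fullmatch(text):
        return "integer"
    if DECIMAL.fullmatch(text):
        return "decimal"
    match = DATE.fullmatch(text)
    if match and 1 <= int(match.group(2)) <= 12 and 1 <= int(match.group(3)) <= 31:
        return "date"
    return "plain"


for sample in ["foo", "-1", "1.000", "yes", "no", "1821.1.1"]:
    print(sample, "->", classify_scalar(sample))
```

A lazy design, where the parser stores the raw text and the caller asks for an integer or a date when it needs one, avoids guessing wrong up front.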

Some notes:

Keys are scalars:

@my_var="ccc"    # define a variable

One can have multiple key-value pairs per line as long as a boundary character separates them:

a=1 b=2 c=3

Whitespace is considered a boundary, but we’ll see more.

Quoted scalars are by far the trickiest as they have several escape rules:

hhh="a\"b"      # escaped quote. Equivalent to `a"b`
iii="\\"        # escaped escape. Equivalent to `\`
mmm="\\\""      # multiple escapes. Equivalent to `\"`

# a multiline quoted scalar

# Quotes can contain escape codes! Imperator uses them as
# color codes (somehow `0x15` is translated to `#` in the
# parsing process)
nnn="ab <0x15>D ( ID: 691 )<0x15>!"
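The escape rules shown above can be sketched in a few lines of Python. This assumes the surrounding quotes have already been removed and that a backslash simply makes the next character literal; only `\"` and `\\` appear in the examples, so how the engine treats other escaped characters is an assumption here. `unescape` is a hypothetical helper name.

```python
def unescape(body):
    """Resolve backslash escapes in the text between the surrounding quotes."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally; a trailing lone backslash
            # is kept as-is.
            out.append(next(chars, "\\"))
        else:
            out.append(ch)
    return "".join(out)


print(unescape(r'a\"b'))  # body of hhh
print(unescape(r'\\'))    # body of iii
print(unescape(r'\\\"'))  # body of mmm
```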

Arrays / Objects

Arrays and objects are values that contain either multiple values or multiple key-value pairs.

Below, flags is an object.


And an array looks quite similar:


And one can have arrays of objects:

campaign_stats={ {
} {
    localization="Henry VI"
} }


There are more operators than equality separating keys from values:

intrigue >= high_skill_rating
age > 16
count < 2
scope:attacker.primary_title.tier <= tier_county

The other operators are typically reserved for game files (save files only use equals).
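As a small illustration, the Python sketch below splits a single line into key, operator, and value. It only knows the five operators shown above and assumes one statement per line with an unquoted value, which is far stricter than the real format; `split_statement` is a hypothetical helper.

```python
import re

# Two-character operators are listed first so `>=` is not read as `>`.
STATEMENT = re.compile(r"\s*([^\s<>=]+)\s*(>=|<=|=|>|<)\s*(\S+)\s*")


def split_statement(line):
    """Split `key OP value` into a (key, operator, value) tuple."""
    match = STATEMENT.fullmatch(line)
    if match is None:
        raise ValueError(f"not a simple key/operator/value line: {line!r}")
    return match.groups()


print(split_statement("intrigue >= high_skill_rating"))
print(split_statement("scope:attacker.primary_title.tier <= tier_county"))
```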

Boundary Characters

As mentioned earlier, boundary characters are what separate values. Boundary characters are:

Thus, one can make some pretty condensed documents.

a={b=1 c=d}foo=bar
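A tokenizer makes the role of boundary characters easier to see. The Python sketch below assumes that whitespace, braces, operators, quotes, and `#` all end a scalar; that set is my assumption for this example rather than a confirmed list, and `tokenize` is a hypothetical helper.

```python
import re

TOKEN = re.compile(
    r'(?P<comment>#[^\n]*)'             # comment runs to the end of the line
    r'|(?P<quoted>"(?:\\.|[^"\\])*")'   # quoted scalar, escapes included
    r'|(?P<brace>[{}])'
    r'|(?P<operator>>=|<=|=|>|<)'
    r'|(?P<scalar>[^\s{}=<>"#]+)',      # anything up to the next boundary
    re.DOTALL,
)


def tokenize(text):
    """Return the tokens of `text`, dropping comments and whitespace."""
    tokens = []
    for match in TOKEN.finditer(text):
        if match.lastgroup != "comment":
            tokens.append(match.group())
    return tokens


print(tokenize("a={b=1 c=d}foo=bar"))
```

Note that no whitespace is needed between `}` and `foo`: the closing brace ends one value and the next key starts right after it.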

And though I don’t have confirmation, I believe there are a couple more boundary characters:

So the below document could be possible:


The Weeds

An object / array value does not need to be prefixed with an operator:

foo{bar=qux}
# is equivalent to `foo={bar=qux}`

A value of {} could mean an empty array or empty object depending on the context. I like to leave it up to the caller to decide.


Any number of empty objects / arrays can occur in an object and should be skipped.

history={{} {} 1629.11.10={core=AAA}}

An object can be both an array and an object at the same time:

brittany_area = { #5
    color = { 118  99  151 }
    169 170 171 172 4384
}
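One way to represent such a mixed value is a container that keeps key-value pairs and bare values in separate lists. The Python sketch below is a toy recursive parser built on that idea; the token set, the `parse` function, and the `pairs` / `items` layout are all assumptions made for illustration, and only the `=` operator is handled.

```python
import re

# Assumed token set: comments, quoted scalars, braces, `=`, and anything else.
TOKEN = re.compile(r'#[^\n]*|"(?:\\.|[^"\\])*"|[{}=]|[^\s{}="#]+', re.DOTALL)


def tokenize(text):
    return [tok for tok in TOKEN.findall(text) if not tok.startswith("#")]


def parse(tokens, pos=0):
    """Parse until a closing brace or the end; return (container, next_pos)."""
    container = {"pairs": [], "items": []}
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "}":
            return container, pos + 1
        if tok == "{":
            # A bare nested container is an array element.
            value, pos = parse(tokens, pos + 1)
            container["items"].append(value)
        elif tokens[pos + 1:pos + 2] == ["="]:
            if pos + 2 >= len(tokens):
                raise ValueError(f"key {tok!r} has no value")
            if tokens[pos + 2] == "{":
                value, pos = parse(tokens, pos + 3)
            else:
                value, pos = tokens[pos + 2], pos + 3
            container["pairs"].append((tok, value))
        else:
            # A scalar that is not followed by `=` is an array element.
            container["items"].append(tok)
            pos += 1
    return container, pos


text = """
brittany_area = { #5
    color = { 118  99  151 }
    169 170 171 172 4384
}
"""
doc, _ = parse(tokenize(text))
area = dict(doc["pairs"])["brittany_area"]
print(area["items"])
print(dict(area["pairs"])["color"]["items"])
```

Keeping both lists on every container means the caller never has to know in advance whether a value is an array, an object, or both.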

Scalars can have non-alphanumeric characters:

province_id = event_target:agenda_province
@planet_standard_scale = 11

Don’t try to blanket-store all numbers as 64-bit floating point, as some values are 64-bit unsigned integers that would lose precision as floating point:


# converted to floating point would equal:
# identity=18446744073709548000
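The precision loss is easy to reproduce. The Python sketch below tries an exact integer first and only falls back to floating point for decimals. The value used is an illustrative 64-bit unsigned integer picked for this example, and `parse_number` is a hypothetical helper; Python integers are arbitrary precision, so in other languages the exact path would be a 64-bit unsigned or signed type.

```python
def parse_number(text):
    """Prefer an exact integer; fall back to float only for decimals."""
    try:
        return int(text)
    except ValueError:
        return float(text)


big = "18446744073709547616"  # illustrative value close to 2**64
exact = parse_number(big)
lossy = int(float(big))       # what a float-only parser would keep

print(exact)
print(lossy)
print(exact == lossy)
```

Doubles near 2**64 are spaced 2048 apart, so almost every large identifier gets rounded to a neighbor when stored as floating point.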

Equivalent quoted and unquoted scalars are not always interpreted the same way by EU4, so one should preserve whether a value was quoted in whatever internal structure is used. It is unknown if other games suffer from this phenomenon. The most well known example is how EU4 will only accept the unquoted value for a field:

unit_type="western"  # bad: save corruption
unit_type=western    # good
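A simple way to preserve quoting is to store a flag next to the text. Here is a minimal Python sketch; the `Scalar` class and the two helpers are hypothetical names, and the text between the quotes is kept raw (escapes are not resolved) so that writing it back reproduces the input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    text: str     # raw text, without the surrounding quotes
    quoted: bool  # whether the source wrapped the text in quotes


def read_scalar(token):
    """Build a Scalar from a token, remembering if it was quoted."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return Scalar(token[1:-1], True)
    return Scalar(token, False)


def write_scalar(scalar):
    """Emit the scalar exactly as it was read."""
    return f'"{scalar.text}"' if scalar.quoted else scalar.text


print(write_scalar(read_scalar("western")))
print(write_scalar(read_scalar('"western"')))
```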

Victoria II has instances where unquoted keys contain non-ASCII characters (specifically Windows-1252, which matches the Victoria II save file encoding).

jean_jaurès = { }
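In practice this means decoding the raw bytes with the right encoding before tokenizing. A small Python sketch, where the byte string is a hand-written stand-in for what such a save might contain and `decode_save` is a hypothetical helper:

```python
def decode_save(raw, encoding="windows-1252"):
    """Decode raw save bytes; Windows-1252 is assumed by default."""
    return raw.decode(encoding)


raw = b"jean_jaur\xe8s = { }"  # 0xE8 is `è` in Windows-1252

print(decode_save(raw))

try:
    raw.decode("utf-8")
except UnicodeDecodeError:
    print("the same bytes are not valid UTF-8")
```

A parser that works on bytes and leaves decoding to the caller sidesteps the question of which game uses which encoding.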

A scalar has at least one character:

# `=` is the key and `bar` is the value

Unless the empty string is quoted:


The type of an object or array can be externally tagged:

color = rgb { 100 200 150 }
color = hsv { 0.43 0.86 0.61 }
color = hsv360{ 25 75 63 }
color = hex { aabbccdd }
mild_winter = LIST { 3700 3701 }
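As an illustration, the Python sketch below pulls the key, the tag, and the values out of a single tagged line. It assumes one flat, non-nested container per line, which is much narrower than what a full parser has to handle; `parse_tagged` is a hypothetical helper.

```python
import re

# key = tag { values }, where whitespace before the brace is optional.
TAGGED = re.compile(r"\s*([^\s={}]+)\s*=\s*([^\s={}]+)\s*\{([^{}]*)\}\s*")


def parse_tagged(line):
    """Return (key, tag, values) for a line such as `color = rgb { 1 2 3 }`."""
    match = TAGGED.fullmatch(line)
    if match is None:
        raise ValueError(f"not a tagged container: {line!r}")
    key, tag, body = match.groups()
    return key, tag, body.split()


print(parse_tagged("color = rgb { 100 200 150 }"))
print(parse_tagged("color = hsv360{ 25 75 63 }"))
```

In a token-based parser the same idea amounts to checking whether a scalar value is immediately followed by an opening brace.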

The EU4 1.26 (Dharma) patch introduced parameter syntax that hasn’t been seen in other PDS titles. From the changelog:

Syntax is [[var_name] code here ] for if variable is defined or [[!var_name] code here ] for if it is not.

An example of the parameter syntax:

generate_advisor = {
  [[!skill] if = {} ]
}

An object can have a leading scalar (though I prefer to view it as an array whose last element is an object):

levels={ 10 0=2 1=2 }
# I view it as equivalent to
# levels={ { 10 } { 0=2 1=2 } }

Objects can be nested arbitrarily deep. I’ve seen modded EU4 saves contain recursive events that reach several hundred levels deep.
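Deep nesting is a reason to be careful with recursive descent, since each level consumes call stack. The Python sketch below handles nesting with an explicit stack instead; it works on an already tokenized input, ignores keys and operators to stay short, and quietly drops closing braces that have no matching opener. `parse_nested` is a hypothetical helper.

```python
def parse_nested(tokens):
    """Build nested lists from brace tokens without using recursion."""
    root = []
    stack = [root]
    for tok in tokens:
        if tok == "{":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == "}":
            # Ignore a closing brace that has no matching opener.
            if len(stack) > 1:
                stack.pop()
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unclosed brace")
    return root


# 5000 levels would exceed Python's default recursion limit in a
# recursive parser, but is no problem here.
deep = ["{"] * 5000 + ["x"] + ["}"] * 5000
node, depth = parse_nested(deep), 0
while node and isinstance(node[0], list):
    node, depth = node[0], depth + 1
print(depth, node)
```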


The first line of a save file indicates the format of the save and shouldn’t be considered part of the standard syntax.
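A parser can peel that line off before tokenizing. The Python sketch below uses a heuristic of my own: the first line counts as a header when it is a single bare word with no operator, brace, quote, or comment. The `EU4txt` marker in the usage example is what I would expect at the top of a plain text EU4 save, and is an assumption here; `split_header` is a hypothetical helper.

```python
def split_header(text):
    """Return (header, body); header is None if the first line is ordinary."""
    first, _, rest = text.partition("\n")
    candidate = first.strip()
    # Heuristic: a header is one bare word with no syntax characters in it.
    if candidate and not any(ch in candidate for ch in ' \t={}#"'):
        return candidate, rest
    return None, text


print(split_header("EU4txt\ndate=1444.11.11\n"))
print(split_header("date=1444.11.11\n"))
```

The heuristic would misfire on a file that legitimately starts with a bare array value on its own line, so checking against a list of known markers is the safer choice.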


It is valid for a file to have extraneous closing braces, which can be seen in Victoria II saves, CK2 saves, and EU4 game files (looking at you verona.txt):

a = { 1 }
}
b = 2

Save files can reach 100 MB in size and over 7 million lines in length, so any parser must have performance as a focus.


With all these edge cases, a parser needs to be flexible. There are many ways to parse data, and I won’t say which one is correct. In the list of open source parsers there’s a good mix of regular expressions, parser generators, pull parsers, push parsers, DOM parsers, and my favorite: tape parsers. Which approach is best may come down to the specific situation.

Good luck!

Feel free to get in contact via Discord or hi [(at)] rakaly.com