#### Generating recursive data – Hypothesis

Sometimes you want to generate data which is *recursive*.

That is, in order to draw some data you may need to draw

some more data from the same strategy. For example we might

want to generate a tree structure, or arbitrary JSON.

Hypothesis has the *recursive* function in the hypothesis.strategies

module to make this easier to do. This is an article about how to

use it.

Lets start with a simple example of drawing tree shaped data:

In our example a tree is either a single boolean value (a leaf

node), or a tuple of two child trees. So a tree might be e.g. True,

or (False, True), or ((True, True), False), etc.

First off, it might not be obvious that you *need* the recursive

strategy. In principle you could just do this with composite:

```
import hypothesis.strategies as st
@st.composite
def composite_tree(draw):
return draw(st.one_of(
st.booleans(),
st.tuples(composite_tree(), composite_tree()),
))
```

If you try drawing examples from this you’ll probably see one of

three scenarios:

- You’ll get a single boolean value
- You’ll get a very large tree
- You’ll get a RecursionError from a stack overflow

It’s unlikely that you’ll see any non-trivial small examples.

The reason for this is that this sort of recursion tends to

explode in size: If this were implemneted as a naive random

generation process then the expected size of the tree would

be infinite. Hypothesis has some built in limiters to stop

it ever trying to actually generate infinitely large amounts

of data, but it will still tend to draw trees that are very

large if they’re not trivial, and it can’t do anything about

the recursion problem.

So instead of using this sort of unstructured recursion,

Hypothesis exposes a way of doing recursion in a slightly more

structured way that lets it control the size of the

generated data much more effectively. This is the recursive

strategy.

In order to use the recursive strategy you need two parts:

- A base strategy for generating “simple” instances of the

data that you want. - A function that takes a child strategy that generates data

of the type you want and returns a new strategy generating

“larger” instances.

So for example for our trees of booleans and tuples we could

use booleans() for the first and something for returning tuples

of children for the second:

```
recursive_tree = st.recursive(
st.booleans(), lambda children: st.tuples(children, children)
)
```

The way to think about the recursive strategy is that you’re

repeatedly building up a series of strategies as follows:

```
s1 = base
s2 = one_of(s1, extend(s1)
s3 = one_of(s2, extend(s2))
...
```

So at each level you augment the things from the previous

level with your extend function. Drawing from the resulting

recursive strategy then picks one of this infinite sequence

of strategies and draws from it (this isn’t quite what happens

in practice, but it’s pretty close).

The resulting strategy does a much better job of drawing small

and medium sized trees than our original composite based one

does, and should never raise a RecursionError:

```
>>> recursive_tree.example()
((False, True), ((True, True), False))
>>> recursive_tree.example()
((((False, False), True), False), False)
>>> recursive_tree.example()
(False, True)
>>> recursive_tree.example()
True
```

You can also control the size of the trees it draws with the

third parameter to recursive:

```
>>> st.recursive(st.booleans(), lambda children: st.tuples(children, children), max_leaves=2).example()
True
>>> st.recursive(st.booleans(), lambda children: st.tuples(children, children), max_leaves=2).example()
(True, False)
```

The max_leaves parameter controls the number of values drawn from

the ‘base’ strategy. It defaults to 50, which will tend to give you

moderately sized values. This helps keep example sizes under control,

as otherwise it can be easy to create extend functions which cause the

size to grow very rapidly.

In this particular example, Hypothesis will typically not hit the default,

but consider something like the following:

```
>>> st.recursive(st.booleans(), lambda children: st.lists(children, min_size=3)).example()
[[False,
True,
False,
False,
False,
True,
True,
True,
False,
False,
False,
True,
True,
False],
False,
[False, True, False, True, False],
[True, False, True, False, False, False]]
```

In this case the size of the example will tend to push up against the max_leaves value

because extend() grows the strategy in size quite rapidly, so if you want larger

examples you will need to turn up max_leaves.

## Leave a Reply

You must be logged in to post a comment.