scvalex.net

18. Sized QuickChecks

I recently ran into an issue with QuickCheck where one of my tests seemed to hang. My mistake was that I was ignoring how QuickCheck generates sized arbitrary values, and ended up creating very large structures.

Consider S-Expressions (sexps). In the following, we define a sexp as either an atom containing a string, or a list of sexps. We define an arbitrary sexp as being, with equal probability, an atom containing an arbitrary string, or a list of arbitrary sexps. Although this definition sometimes works, it may generate very deep and very big sexps.

data Sexp = List [Sexp] | Atom ByteString
          deriving ( Eq, Show )

instance Arbitrary Sexp where
    arbitrary = do
        n <- choose (1, 2 :: Int)
        case n of
            1 -> Atom <$> arbitrary
            2 -> List <$> arbitrary
            _ -> fail "can't touch this"

The issue is how QuickCheck controls the size of the generated structures. Basically, the Gen monad, in which arbitrary runs, passes around a size parameter, which the various Arbitrary instances use. For instance, an arbitrary list will have a random number of elements between \(0\) and size. In our case, if you generate an arbitrary List of at most \(100\) elements, each of which may be an arbitrary List of at most \(100\) elements, and so on, you may generate a very big List.

The solution is to ensure that arbitrary sublists are smaller than their parent lists by some factor. In the following code, each sublist is at most half the size of the parent list, so the maximum depth of the top-level list is \(log_2(size)\).

instance Arbitrary Sexp where
    arbitrary = sized $ \sz -> do
        n <- choose (1, 2 :: Int)
        case n of
            1 -> Atom <$> arbitrary
            2 -> List <$> (resize (sz `div` 2) arbitrary)
            _ -> fail "can't touch this"