By Ugorji Nwoke   16 Dec 2014 (updated 02 Jul 2019)   /blog   technology go-codec

Benchmarks!!! Serialization in Go!!!

View articles in the go-codec series, source at http://github.com/ugorji/go

Let’s have some fun with some numbers.

In the Serialization in Go article, we discussed a number of encoding formats and their libraries in Go.

In this article, we will compare them on these metrics:

  • speed: clock time
  • memory usage: number of bytes allocated
  • memory allocations: number of allocations
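As a rough illustration of how these three metrics are obtained, the sketch below uses the standard testing package to time an encode loop and report bytes and allocations per operation. The sample type and the measureEncode helper are hypothetical stand-ins (the real benchmark uses TestStruc and each library's own encoder); encoding/json is used only because it ships with Go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// sample is a hypothetical, tiny stand-in for the much larger TestStruc.
type sample struct {
	S     string
	I64   int64
	F64   float64
	Slice []int64
}

// measureEncode runs a JSON encode loop under the testing harness and
// returns a result carrying clock time, bytes and allocation counts.
func measureEncode(v interface{}) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := json.Marshal(v); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	r := measureEncode(sample{S: "hello", I64: -42, F64: 3.25, Slice: []int64{1, 2, 3}})
	fmt.Printf("speed:              %d ns/op\n", r.NsPerOp())
	fmt.Printf("memory usage:       %d B/op\n", r.AllocedBytesPerOp())
	fmt.Printf("memory allocations: %d allocs/op\n", r.AllocsPerOp())
}
```

These are the same three columns that appear as ns/op, B/op and allocs/op in the raw data further down.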

We will first compare the benchmark results in visual charts and explain the data. Thereafter, we will show the raw data, which includes the amount of memory allocated and the number of allocations.

Show me the numbers

For each encoding library, we will encode and decode a TestStruc value. This value is described in detail in a later section. The encoded length for each library is shown below:

Below, we will compare the encoding and decoding speed of each library, in 2 different scenarios:

  • Runtime Introspection
  • Code Generation (for libraries which support code generation)

Let us look at the encoding speed of the different libraries below.

Let us also look at the decoding speed of the different libraries below.

Requirement to participate: How do libraries make the cut?

To be a candidate, a library must be able to encode and decode the TestStruc type as described in https://github.com/ugorji/go-codec-bench/blob/master/values_test.go.

A snapshot (from 2018-02-26) is captured below:

type TestStruc struct {
	// _struct struct{} `json:",omitempty"` //set omitempty for every field

	TestStrucCommon

	Mtsptr     map[string]*TestStruc
	Mts        map[string]TestStruc
	Its        []*TestStruc
	Nteststruc *TestStruc
}

type TestStrucCommon struct {
	S string

	I64 int64
	I32 int32
	I16 int16
	I8  int8

	I64n int64
	I32n int32
	I16n int16
	I8n  int8

	Ui64 uint64
	Ui32 uint32
	Ui16 uint16
	Ui8  uint8

	F64 float64
	F32 float32

	B  bool
	By uint8 // byte: msgp doesn't like byte

	Sslice    []string
	I64slice  []int64
	I16slice  []int16
	Ui64slice []uint64
	Ui8slice  []uint8
	Bslice    []bool
	Byslice   []byte

	Iptrslice []*int64

	WrapSliceInt64  wrapSliceUint64
	WrapSliceString wrapSliceString

	Msi64 map[string]int64

	Simplef testSimpleFields

	SstrUi64T []stringUint64T

	AnonInTestStruc

	NotAnon AnonInTestStruc

	// R          Raw // Testing Raw must be explicitly turned on, so use standalone test
	// Rext RawExt // Testing RawExt is tricky, so use standalone test

	Nmap   map[string]bool //don't set this, so we can test for nil
	Nslice []byte          //don't set this, so we can test for nil
	Nint64 *int64          //don't set this, so we can test for nil
}

type wrapSliceUint64 []uint64
type wrapSliceString []string
type wrapUint64 uint64
type wrapString string
type wrapUint64Slice []wrapUint64
type wrapStringSlice []wrapString

type stringUint64T struct {
	S string
	U uint64
}

type AnonInTestStruc struct {
	AS         string
	AI64       int64
	AI16       int16
	AUi64      uint64
	ASslice    []string
	AI64slice  []int64
	AUi64slice []uint64
	AF64slice  []float64
	AF32slice  []float32

	// AMI32U32  map[int32]uint32
	// AMU32F64 map[uint32]float64 // json/bson do not like it
	AMSU16 map[string]uint16

	// use these to test 0-len or nil slices/maps/arrays
	AI64arr0    [0]int64
	A164slice0  []int64
	AUi64sliceN []uint64
	AMSU16N     map[string]uint16
	AMSU16E     map[string]uint16
}

// testSimpleFields is a sub-set of TestStrucCommon
type testSimpleFields struct {
	S string

	I64 int64
	I8  int8

	Ui64 uint64
	Ui8  uint8

	F64 float64
	F32 float32

	B bool

	Sslice    []string
	I16slice  []int16
	Ui64slice []uint64
	Ui8slice  []uint8
	Bslice    []bool

	Iptrslice []*int64

	WrapSliceInt64  wrapSliceUint64
	WrapSliceString wrapSliceString

	Msi64 map[string]int64
}

TestStruc is designed to exercise parsing of a fully representative struct:

  • It is large enough, with adequate representation of basic types, including strings containing UTF-8 characters in the BMP and SMP planes, and floats whose mantissa cannot fit into a uint64.
  • It can be expanded through self-references, e.g. type A has a field which contains multiple *A.
  • It has interfaces, which can optionally be turned off.
  • It uses anonymous fields, including anonymous values and pointer fields, e.g. type A struct { Anon1; *Anon2 } where Anon1 and Anon2 are named types.
  • It uses custom types with extensions attached.
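To make the self-reference point concrete, here is a minimal sketch of how a self-referential type lets a test value grow with a depth parameter. The node type, newNode and count are hypothetical names for illustration; they only mirror the shape of the Its, Mtsptr and Nteststruc fields, and are not how go-codec-bench builds its value.

```go
package main

import "fmt"

// node is a hypothetical, much smaller analogue of TestStruc.
type node struct {
	S    string
	Its  []*node          // compare TestStruc.Its
	Mts  map[string]*node // compare TestStruc.Mtsptr
	Next *node            // compare TestStruc.Nteststruc
}

// newNode builds a value whose size grows with depth and fanout,
// because every level embeds further values of the same type.
func newNode(depth, fanout int) *node {
	n := &node{S: fmt.Sprintf("depth-%d", depth)}
	if depth == 0 {
		return n
	}
	n.Mts = make(map[string]*node, fanout)
	for i := 0; i < fanout; i++ {
		n.Its = append(n.Its, newNode(depth-1, fanout))
		n.Mts[fmt.Sprintf("k%d", i)] = newNode(depth-1, fanout)
	}
	n.Next = newNode(depth-1, fanout)
	return n
}

// count returns the total number of nodes reachable from n.
func count(n *node) int {
	if n == nil {
		return 0
	}
	total := 1 + count(n.Next)
	for _, c := range n.Its {
		total += count(c)
	}
	for _, c := range n.Mts {
		total += count(c)
	}
	return total
}

func main() {
	for depth := 0; depth <= 3; depth++ {
		fmt.Printf("depth=%d nodes=%d\n", depth, count(newNode(depth, 2)))
	}
}
```

Raising the depth by one multiplies the work for the encoder and decoder, which is a convenient way to scale a benchmark value without changing its type.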

Running the benchmark myself: Libraries/Formats compared

We have built a go-codec-bench repository which contains the following:

Provided by go-codec:

  1. msgpack: http://github.com/msgpack/msgpack
  2. binc: http://github.com/ugorji/binc
  3. cbor: http://cbor.io http://tools.ietf.org/html/rfc7049
  4. simple:
  5. json: http://json.org http://tools.ietf.org/html/rfc7159

Other codecs compared:

  1. go.mongodb.org/mongo-driver/bson
  2. https://github.com/philhofer/msgp

Other libraries considered, which did not make the cut:

To run the benchmarks, first ensure that you have all the libraries being benchmarked against:

  go get -u -t github.com/ugorji/go-codec-bench

There is a script in there with usage information for running:

  ./bench.sh -?

Sample test execution, including setup for codecgen and execution:

# download the different modules that we test
./bench.sh -d

# Generate the implementations for codec supported formats, msgp, easyjson, etc
./bench.sh -c

# Run tests to see the encoded size of each, and whether their decoding works
./bench.sh -tx

# Run benchmark (different permutations)
./bench.sh -s
./bench.sh -sg
./bench.sh -sgx

# Run simple short suites 
./bench.sh -j
./bench.sh -q

Benchmark Raw Data

This benchmark gathers results over the following axes:

  • runtime introspection
  • code generation (for libraries which support code generation)
  • code generation vs runtime introspection (for libraries which support both modes)
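The difference between the first two axes can be sketched with a toy encoder. Below, encodeReflect discovers the fields of a value at run time, while encodeGenerated is the kind of type-specific code a generator could emit ahead of time. Both functions, the pair type and the JSON-like output are assumptions made for illustration; this is not the code that codecgen or msgp actually produce.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// pair is a hypothetical two-field struct, similar in shape to stringUint64T.
type pair struct {
	S string
	U uint64
}

// encodeReflect walks the struct fields at run time (runtime introspection).
// It only handles the string and uint64 kinds needed for this sketch.
func encodeReflect(v interface{}) []byte {
	rv := reflect.ValueOf(v)
	out := []byte{'{'}
	for i := 0; i < rv.NumField(); i++ {
		if i > 0 {
			out = append(out, ',')
		}
		out = strconv.AppendQuote(out, rv.Type().Field(i).Name)
		out = append(out, ':')
		switch f := rv.Field(i); f.Kind() {
		case reflect.String:
			out = strconv.AppendQuote(out, f.String())
		case reflect.Uint64:
			out = strconv.AppendUint(out, f.Uint(), 10)
		}
	}
	return append(out, '}')
}

// encodeGenerated is what a code generator could emit for pair:
// field names and types are known at compile time, so no reflection is needed.
func (p pair) encodeGenerated() []byte {
	out := []byte(`{"S":`)
	out = strconv.AppendQuote(out, p.S)
	out = append(out, `,"U":`...)
	out = strconv.AppendUint(out, p.U, 10)
	return append(out, '}')
}

func main() {
	p := pair{S: "a", U: 7}
	fmt.Printf("%s\n%s\n", encodeReflect(p), p.encodeGenerated())
}
```

Both paths produce the same bytes; the generated path simply skips the per-field type inspection, which is where the speed and allocation differences in the tables below tend to come from.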

Encoded Lengths

==== X Baseline ====
     	ApproxDeepSize Of benchmark Struct: 75588 bytes
     Benchmark One-Pass Run (with Unscientific Encode/Decode times): 
     	   msgpack: len: 37070 bytes,	 
     	      binc: len: 36742 bytes,	 
     	    simple: len: 38627 bytes,	 
     	      cbor: len: 36846 bytes,	 
     	      json: len: 45372 bytes,	 
     	  std-json: len: 45124 bytes,	 
     	       gob: len: 34923 bytes,	 encoded != decoded
     	 json-iter: len: 45096 bytes,	 encoded != decoded
     	 v-msgpack: len: 36870 bytes,	 
     	      bson: len: 50022 bytes,	 
     	   mgobson: len: 50227 bytes,	 encoded != decoded
     	    sereal: len: 27566 bytes,	 encoded != decoded

==== X Generated ====
     	ApproxDeepSize Of benchmark Struct: 75588 bytes
     Benchmark One-Pass Run (with Unscientific Encode/Decode times): 
     	   msgpack: len: 37070 bytes,	 
     	      binc: len: 36742 bytes,	 
     	    simple: len: 38627 bytes,	 
     	      cbor: len: 36846 bytes,	 
     	      json: len: 45372 bytes,	 
     	      msgp: len: 36978 bytes,	 encoded != decoded
     	  easyjson: len: 43864 bytes,	 encoded != decoded
     	    ffjson: len: 45040 bytes,	 
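The "len" and "encoded != decoded" columns above come from a one-pass round trip. A minimal sketch of such a check, assuming a reflect.DeepEqual comparison and using only standard-library codecs, might look like the following. The point type, the format struct and the roundTrip helper are hypothetical; the actual comparison performed by go-codec-bench may differ.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"fmt"
	"reflect"
)

// point is a hypothetical stand-in for TestStruc.
type point struct {
	Name string
	Vals []int64
	Nmap map[string]bool // left unset, like Nmap in TestStrucCommon
}

// format pairs an encode and a decode function for one library.
type format struct {
	name string
	enc  func(v interface{}) ([]byte, error)
	dec  func(data []byte, v interface{}) error
}

func gobEnc(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(v)
	return buf.Bytes(), err
}

func gobDec(data []byte, v interface{}) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(v)
}

// formats lists the standard-library codecs used in this sketch.
func formats() []format {
	return []format{
		{"std-json", json.Marshal, json.Unmarshal},
		{"gob", gobEnc, gobDec},
	}
}

// roundTrip encodes in, decodes the bytes into out, and reports the
// encoded length and whether out is deeply equal to in.
func roundTrip(f format, in, out interface{}) (int, bool, error) {
	data, err := f.enc(in)
	if err != nil {
		return 0, false, err
	}
	if err := f.dec(data, out); err != nil {
		return len(data), false, err
	}
	return len(data), reflect.DeepEqual(in, out), nil
}

func main() {
	in := &point{Name: "a", Vals: []int64{1, 2, 3}}
	for _, f := range formats() {
		n, same, err := roundTrip(f, in, new(point))
		if err != nil {
			fmt.Printf("%10s: error: %v\n", f.name, err)
			continue
		}
		note := ""
		if !same {
			note = "encoded != decoded"
		}
		fmt.Printf("%10s: len: %d bytes,\t %s\n", f.name, n, note)
	}
}
```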
        

Runtime Introspection Libraries

The following libraries are supported:

  1. go-codec: with all its supported formats (msgpack, cbor, binc, json)
  2. standard library provided: gob, json
  3. github.com/vmihailenco/msgpack
  4. gopkg.in/mgo.v2/bson

    ENCODING
    
    Benchmark__Msgpack____Encode-8         	   14409	     83038 ns/op	    3192 B/op	      44 allocs/op
    Benchmark__Binc_______Encode-8         	   14032	     85238 ns/op	    3192 B/op	      44 allocs/op
    Benchmark__Simple_____Encode-8         	   14312	     84302 ns/op	    3192 B/op	      44 allocs/op
    Benchmark__Cbor_______Encode-8         	   14319	     84576 ns/op	    3192 B/op	      44 allocs/op
    Benchmark__Json_______Encode-8         	    6260	    187597 ns/op	    3256 B/op	      44 allocs/op
    Benchmark__Std_Json___Encode-8         	    5685	    215119 ns/op	   74473 B/op	     444 allocs/op
    Benchmark__Gob________Encode-8         	    6942	    176537 ns/op	  170415 B/op	     592 allocs/op
    Benchmark__JsonIter___Encode-8         	    7154	    165144 ns/op	   57064 B/op	     142 allocs/op
    Benchmark__Bson_______Encode-8         	    5170	    236868 ns/op	  222828 B/op	     364 allocs/op
    Benchmark__Mgobson____Encode-8         	    3356	    358042 ns/op	  323169 B/op	    3386 allocs/op
    Benchmark__VMsgpack___Encode-8         	    4528	    272714 ns/op	  166016 B/op	     469 allocs/op
    Benchmark__Sereal_____Encode-8         	    3604	    332795 ns/op	  272004 B/op	    3218 allocs/op
    
    DECODING
    
    Benchmark__Msgpack____Decode-8         	    6058	    198857 ns/op	   67388 B/op	     913 allocs/op
    Benchmark__Binc_______Decode-8         	    5620	    211878 ns/op	   67392 B/op	     913 allocs/op
    Benchmark__Simple_____Decode-8         	    5928	    200791 ns/op	   67386 B/op	     913 allocs/op
    Benchmark__Cbor_______Decode-8         	    5821	    203224 ns/op	   67377 B/op	     913 allocs/op
    Benchmark__Json_______Decode-8         	    3206	    376412 ns/op	   89305 B/op	    1041 allocs/op
    Benchmark__Std_Json___Decode-8         	    1404	    839196 ns/op	  138557 B/op	    3032 allocs/op
    Benchmark__Gob________Decode-8         	    4137	    291493 ns/op	  156160 B/op	    2242 allocs/op
    Benchmark__JsonIter___Decode-8         	    3722	    326241 ns/op	  129244 B/op	    2504 allocs/op
    Benchmark__Bson_______Decode-8         	    2480	    463973 ns/op	  183867 B/op	    4085 allocs/op
    Benchmark__Mgobson____Decode-8         	    2326	    507473 ns/op	  171280 B/op	    6956 allocs/op
    
    

Code Generation Libraries

The following libraries support code generation:

  1. go-codec: with all its supported formats (msgpack, cbor, binc, json)
  2. https://github.com/philhofer/msgp

msgp uses unsafe when type-switching on struct field names, and does not support interfaces fully, as full support for interfaces requires falling back to reflection.

We compare using the unsafe option to codecgen, and run the benchmark with the interface field (aka *AnonInTestStrucIntf) set to nil. This gives a like-for-like setup for both.

ENCODING

Benchmark__Msgpack____Encode-8         	   29084	     40229 ns/op	     288 B/op	       2 allocs/op
Benchmark__Binc_______Encode-8         	   27104	     44053 ns/op	     288 B/op	       2 allocs/op
Benchmark__Simple_____Encode-8         	   27472	     43519 ns/op	     288 B/op	       2 allocs/op
Benchmark__Cbor_______Encode-8         	   28280	     42594 ns/op	     288 B/op	       2 allocs/op
Benchmark__Json_______Encode-8         	    8953	    134502 ns/op	     352 B/op	       2 allocs/op
Benchmark__Msgp_______Encode-8         	   43567	     27626 ns/op	       0 B/op	       0 allocs/op
Benchmark__Easyjson___Encode-8         	    8083	    149802 ns/op	   50576 B/op	      12 allocs/op
Benchmark__Ffjson_____Encode-8         	    4281	    284832 ns/op	  128208 B/op	    1033 allocs/op

DECODING

Benchmark__Msgpack____Decode-8         	   10000	    117719 ns/op	   64117 B/op	     871 allocs/op
Benchmark__Binc_______Decode-8         	    9328	    131288 ns/op	   64117 B/op	     871 allocs/op
Benchmark__Simple_____Decode-8         	   10000	    121058 ns/op	   64118 B/op	     871 allocs/op
Benchmark__Cbor_______Decode-8         	    9770	    123299 ns/op	   64117 B/op	     871 allocs/op
Benchmark__Json_______Decode-8         	    4191	    284946 ns/op	   87511 B/op	    1002 allocs/op
Benchmark__Msgp_______Decode-8         	   16860	     71120 ns/op	   65878 B/op	     889 allocs/op
Benchmark__Easyjson___Decode-8         	    4208	    286718 ns/op	   96882 B/op	    1000 allocs/op
Benchmark__Ffjson_____Decode-8         	    2922	    413792 ns/op	   92225 B/op	    1202 allocs/op

Code Generation vs Runtime Reflection

Here, we take libraries which support both runtime reflection and code generation.

The following libraries fit this bill:

  1. go-codec: with all its supported formats (msgpack, cbor, binc, json)

This has been done before. More information is available in the article on codecgen.
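The comparison can also be read directly off the raw data above. As a small sketch, the program below parses two of the go-codec msgpack result lines (copied from the runtime introspection and code generation tables) and computes the ratio of their ns/op values. The result type and the parseLine and speedup helpers are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// result holds the columns of one line of `go test -bench` output.
type result struct {
	Name        string
	N           int
	NsPerOp     float64
	BytesPerOp  int64
	AllocsPerOp int64
}

// parseLine parses a line such as:
//   Benchmark__Msgpack____Encode-8  14409  83038 ns/op  3192 B/op  44 allocs/op
func parseLine(line string) (result, error) {
	f := strings.Fields(line)
	if len(f) != 8 || f[3] != "ns/op" || f[5] != "B/op" || f[7] != "allocs/op" {
		return result{}, fmt.Errorf("unexpected benchmark line: %q", line)
	}
	n, err := strconv.Atoi(f[1])
	if err != nil {
		return result{}, err
	}
	ns, err := strconv.ParseFloat(f[2], 64)
	if err != nil {
		return result{}, err
	}
	b, err := strconv.ParseInt(f[4], 10, 64)
	if err != nil {
		return result{}, err
	}
	a, err := strconv.ParseInt(f[6], 10, 64)
	if err != nil {
		return result{}, err
	}
	return result{Name: f[0], N: n, NsPerOp: ns, BytesPerOp: b, AllocsPerOp: a}, nil
}

// speedup reports how many times faster fast is than slow.
func speedup(slow, fast result) float64 {
	return slow.NsPerOp / fast.NsPerOp
}

func main() {
	runtime, err := parseLine("Benchmark__Msgpack____Encode-8   14409   83038 ns/op   3192 B/op   44 allocs/op")
	if err != nil {
		panic(err)
	}
	generated, err := parseLine("Benchmark__Msgpack____Encode-8   29084   40229 ns/op   288 B/op   2 allocs/op")
	if err != nil {
		panic(err)
	}
	fmt.Printf("msgpack encode: codecgen is %.2fx faster, %d fewer allocs/op\n",
		speedup(runtime, generated), runtime.AllocsPerOp-generated.AllocsPerOp)
}
```

For these two lines, code generation makes msgpack encoding roughly twice as fast and removes 42 of the 44 allocations per operation.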

© Ugorji Nwoke