Arow Serializer Code Generator
Arow Serializer Overview
Adaptive Row (Arow) is a code generator that produces C# code for serializing and deserializing object
instances derived from TypedRow. The Arow serializer is very efficient as it avoids boxing and
making transitive copies. The Arow serializer does not support polymorphism and only allows the
following types:
- CLR Primitives
- Nullable CLR Primitives
- Guid, GDID, Amount NFX structs
- Types derived directly or indirectly from TypedRow
- Lists or arrays of the aforementioned types
The Arow serializer only processes properties decorated with the [Field] attribute that has
isArow: true set. The backendName has to be 1-8 ASCII-only characters, so that field names can be
written in a compressed ULONG format for efficiency (strings are much less efficient because they
are reference types and require GC). Arow binds row fields by name; consequently, Arow is a
version-tolerant serializer.
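To illustrate why short ASCII names are cheap to write, here is a minimal sketch of packing a 1-8 character name into a single ulong, one byte per character. FieldNameCodec, Pack and Unpack are hypothetical names made up for this illustration; the byte order and exact encoding that Arow uses internally are not described here and may differ.

```csharp
using System;

// Hypothetical helper for illustration only; not part of NFX.
public static class FieldNameCodec
{
    // Packs a 1-8 character ASCII name into a ulong.
    // Assumption: first character goes into the least significant byte.
    public static ulong Pack(string name)
    {
        if (string.IsNullOrEmpty(name) || name.Length > 8)
            throw new ArgumentException("Name must be 1-8 characters long", nameof(name));

        ulong result = 0;
        for (var i = 0; i < name.Length; i++)
        {
            var c = name[i];
            if (c == 0 || c > 0x7F)
                throw new ArgumentException("Name must be non-zero ASCII only", nameof(name));

            result |= (ulong)c << (i * 8); // one byte per character
        }
        return result;
    }

    // Restores the name by reading bytes until the remaining value is zero.
    public static string Unpack(ulong packed)
    {
        var chars = new char[8];
        var length = 0;
        while (packed != 0 && length < 8)
        {
            chars[length++] = (char)(packed & 0xFF);
            packed >>= 8;
        }
        return new string(chars, 0, length);
    }
}
```

With this layout, FieldNameCodec.Pack("id") yields 0x6469, and FieldNameCodec.Unpack returns the original name. Comparing two such values is a single integer comparison and allocates nothing on the heap, which is the kind of saving the paragraph above refers to.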
An example class:
    [Arow]
    public class SimplePersonRow : TypedRow
    {
      [Field(backendName: "id",   isArow: true)] public GDID     ID     {get; set;}
      [Field(backendName: "name", isArow: true)] public string   Name   {get; set;}
      [Field(backendName: "age",  isArow: true)] public int      Age    {get; set;}
      [Field(backendName: "b1",   isArow: true)] public bool     Bool1  {get; set;}
      [Field(backendName: "s1",   isArow: true)] public string   Str1   {get; set;}
      [Field(backendName: "s2",   isArow: true)] public string   Str2   {get; set;}
      [Field(backendName: "d1",   isArow: true)] public DateTime Date   {get; set;}
      [Field(backendName: "slr",  isArow: true)] public double   Salary {get; set;}
    }
It can be serialized like so (taken from a unit test):
    ..............
    ArowSerializer.RegisterTypeSerializationCores( Assembly.GetExecutingAssembly() );
    ..............
    var row1 = new SimplePersonRow
    {
      Age = 123, Bool1 = true, ID = new GDID(12,234), Name = "Jacques Anthony", Salary = 143098,
      Str1 = "Some String", Date = new DateTime(1980, 08, 12, 13, 45, 11)
    };

    var writer = SlimFormat.Instance.GetWritingStreamer();
    var reader = SlimFormat.Instance.GetReadingStreamer();

    using(var ms = new MemoryStream())
    {
      writer.BindStream(ms);
      ArowSerializer.Serialize(row1, writer);
      writer.UnbindStream();

      ms.Position = 0;

      var row2 = new SimplePersonRow();
      reader.BindStream(ms);
      ArowSerializer.Deserialize(row2, reader); //deserialize
      reader.UnbindStream();

      Aver.AreEqual(row1.ID, row2.ID);
      ...................................
    }
Performance
The performance of Arow is akin to that of the fastest binary serializers in CLR (.NET Core or full). This is because C# code is emitted for every type, and because boxing and string instances for field names are avoided (a similar approach is used in Protobuf et al.).
The following benchmark output compares the performance of Arow vs. Slim (the general-purpose CLR serializer à la BinaryFormatter, only faster; see Serbench).
Test machine: Intel Core i7-3930K @ 3.2 GHz (6 physical cores), Windows 7 64-bit, .NET 4.5
Single-threaded SimplePerson (a flat linear structure described above):
Serialization:
NFX.NUnit.Serialization.ARowBenchmarking.Serialize_SimplePerson_Arow
Arow did 250,000 in 166 ms at 1,506,024 ops/sec. Stream Size is: 86 bytes
NFX.NUnit.Serialization.ARowBenchmarking.Serialize_SimplePerson_Slim
Slim did 250,000 in 191 ms at 1,308,901 ops/sec. Stream Size is: 59 bytes
Deserialization:
NFX.NUnit.Serialization.ARowBenchmarking.Deserialize_SimplePerson_Arow
Arow did 250,000 in 190 ms at 1,315,789 ops/sec. Stream Size is: 86 bytes
NFX.NUnit.Serialization.ARowBenchmarking.Deserialize_SimplePerson_Slim
Slim did 250,000 in 283 ms at 883,392 ops/sec. Stream Size is: 59 bytes
Multi-threaded SimplePerson (a flat linear structure described above):
Serialization:
NFX.NUnit.Serialization.ARowBenchmarkingParallel.Serialize_SimplePerson_Arow(250000,12)
Arow did 3,000,000 in 362 ms at 8,287,293 ops/sec
NFX.NUnit.Serialization.ARowBenchmarkingParallel.Serialize_SimplePerson_Slim(250000,12)
Slim did 3,000,000 in 443 ms at 6,772,009 ops/sec
Deserialization:
NFX.NUnit.Serialization.ARowBenchmarkingParallel.Deserialize_SimplePerson_Arow(250000,12)
Arow did 3,000,000 in 191 ms at 15,706,806 ops/sec
The CLI tool
The serialization/deserialization cores are generated by the command-line tool arow.
A recommended pattern for project/solution layout with Arow:
- Create TypedRow-derived business/domain objects (e.g. MyBusiness.dll)
- Create another project with the same name as the one that contains the objects, adding an "Arow" suffix
  (in the example above, MyBusiness.Arow.dll)
- Add a post-build step to the first project like so (use macros as appropriate):
    arow "MyBusiness.dll" "$(SolutionDir)\Source\MyBusiness.Arow.dll"
- Add the generated *.cs files to the *.Arow project and cloak them from source control (as they are auto-generated)
Usage:
    arow assembly out-path [/h | /? | /help]
         [/c | /code FilePerNamespace|FilePerType|AllInOne]

    assembly - source assembly file (with path)
    out-path - directory path where to create files - must exist

Options:
    /c | /code - how to organize gen code in files

Examples:
    arow "mytypes.dll" "..\source\ArowSer\"