SERIALIZATION
Serialization is a key part of the .NET framework. The remoting infrastructure, including Web Services
and SOAP, depends on serialization, which is the process of reducing an object instance to a
transportable format that faithfully represents the object. What makes this
process interesting is that you may also take the serialized representation, transport it to another context
such as a different machine, and rebuild your original object. Given an effective serialization
framework, objects may be persisted to storage by simply serializing them to disk.
Likewise, objects may be remoted by simply serializing an object to a
stream of bytes stored in memory, and transmitting the stream to a cooperating machine that
understands your serialization format.
In .NET, serialization is often used with streams, which are the abstractions used to read and write to
sources such as files, network endpoints, and memory sinks.
How Do You Use Serialization?
Serialization is handled primarily by classes and interfaces in the System.Runtime.Serialization
namespace. To serialize an object, you need to create two things:
• a stream to contain the serialized objects
• a formatter to serialize the objects into the stream
The code required to perform serialization in .NET is very simple. Most serialization code is similar to
the boilerplate code shown below, which serializes an object into a file stream using the
BinaryFormatter class:
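// Requires: using System.IO;
//           using System.Runtime.Serialization.Formatters.Binary;
public static void WriteToFile(BaseballPlayer bp, String filename)
{
    // File.Create truncates an existing file; File.OpenWrite would leave
    // stale bytes behind if the old file was longer than the new data.
    using (Stream str = File.Create(filename))
    {
        BinaryFormatter formatter = new BinaryFormatter();
        formatter.Serialize(str, bp);
    }
}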
Using the BinaryFormatter for serialization results in a compact representation on disk, although it is
not a form that is easily read using a text editor. If you would like a more human-friendly
representation, you can use the SOAP formatter, as shown below:
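// Requires: using System.IO;
//           using System.Runtime.Serialization.Formatters.Soap;
public static void WriteToFile(SerialCircle shape, String filename)
{
    // File.Create truncates an existing file before writing.
    using (Stream str = File.Create(filename))
    {
        SoapFormatter formatter = new SoapFormatter();
        formatter.Serialize(str, shape);
    }
}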
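Both helpers above only write. The reverse direction, rebuilding the object from the file, might look like the sketch below. It assumes the .NET Framework, where BinaryFormatter is available (later .NET releases deprecate it). The BaseballPlayer fields, the PlayerFile class, and the ReadFromFile helper are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical stand-in for the article's BaseballPlayer class.
[Serializable]
public class BaseballPlayer
{
    public string Name;
    public int Number;
}

public static class PlayerFile
{
    public static void WriteToFile(BaseballPlayer bp, string filename)
    {
        using (Stream str = File.Create(filename))
        {
            new BinaryFormatter().Serialize(str, bp);
        }
    }

    // Hypothetical counterpart to WriteToFile: rebuilds the object from disk.
    public static BaseballPlayer ReadFromFile(string filename)
    {
        using (Stream str = File.OpenRead(filename))
        {
            // Deserialize returns object, so the caller must cast.
            return (BaseballPlayer)new BinaryFormatter().Deserialize(str);
        }
    }
}
```

The same formatter type must be used for reading and writing; a file written by one formatter cannot be read by a different one.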
Simplified Serialization Using Attributes
The simplest way to make your classes eligible for serialization is to use the Serializable attribute. By
decorating your class with this attribute as shown below, your classes are immediately made
serializable:
[Serializable]
public class BaseballPlayer
{
[...]
}
By default, all fields of a class are serialized, including private ones. To reduce the amount of
data serialized to the stream, you can inhibit serialization of fields that are not required to
reconstitute the class by attaching the NonSerialized attribute to those fields:
[NonSerialized]
private Decimal _salary;
The NonSerialized attribute is useful for those member variables that represent calculated values or
contain information that is transient rather than persistent, or data that should be hidden and not
persisted to a storage medium.
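To see the effect of NonSerialized, the following sketch round-trips an object through a memory stream. It assumes the .NET Framework with BinaryFormatter available; the Name field, the Salary property, and the NonSerializedDemo class are hypothetical additions for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class BaseballPlayer
{
    public string Name;

    [NonSerialized]
    private decimal _salary;   // skipped by the formatter

    public decimal Salary
    {
        get { return _salary; }
        set { _salary = value; }
    }
}

public static class NonSerializedDemo
{
    // Serializes the object into memory and immediately reads it back.
    public static BaseballPlayer RoundTrip(BaseballPlayer original)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            BinaryFormatter formatter = new BinaryFormatter();
            formatter.Serialize(ms, original);
            ms.Position = 0;   // rewind before reading
            return (BaseballPlayer)formatter.Deserialize(ms);
        }
    }
}
```

In the copy returned by RoundTrip, Name carries over but Salary comes back as the default value for its type, because the field was never written to the stream.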
Serialization and Private Members
In order for serialization to work effectively, the serialization mechanism must be able to capture
enough information about the state of an object to allow it to properly recreate a true copy of the
original object at a later time. This often requires information not available to public clients of a class,
but it is a necessary side-effect of the work that the serializer must perform. Improperly serialized
objects cannot be deserialized properly—it's as simple as that. An example of failed serialization can be
seen in the movie Galaxy Quest, where the transporter mechanism fails its test, effectively deserializing
a menacing beast inside-out, with explosive (and messy) results.
Is Exposure Required?
In the MFC class library, objects were responsible for serializing themselves. While this did prevent the
sharing of internal class details with formatters, it did require all class authors to write correct
serialization code for all classes that might ever require serialization. This requirement leads to the
following problems:
• Many developers who can write usable classes cannot write good serialization code. This leads to
problems similar to the exploding space pig in Galaxy Quest.
• Testing can detect the likelihood that a specific class will result in explosions, but such tests
increase the testing effort required for each class that implements serialization support.
• Embedded serialization code adds to the costs of maintainability and adds to the risk of updating
components. Any changes to a class that supports serialization in MFC must be properly reflected
in the serialization code. Errors in this code may be, as noted previously, just as catastrophic as
errors in the code that performs the "real work" of the class.
• There is a simple way around this mechanism: you can simply elect to not participate in MFC
serialization, which may limit the usefulness of your class. Note that even though your class does
not need serialization today, it may need it tomorrow, and software developers are notoriously bad
at foretelling the future.
In .NET, all types are self-describing, and the serialization architecture
simply leverages the self-describing nature of .NET objects to perform serialization, without all of
the problems inherent in the MFC approach. What do you get out of the improved serialization
in .NET?
• In most cases, you need to write very little serialization code. In this column, the code examples
are covering special cases, but much of the time you'll need to write zero (or very little) code.
• You don't need to maintain the serialization code you don't write; there's somebody at Microsoft
doing that for you.
• You can version your classes as needed; the serialization will occur correctly, as the serialization
architecture adapts to changes to your classes.
• The serialization framework provides several places where you can customize portions of the
framework to suit your needs. For example, you can write your own formatter class if you need to
format your serialization output in a specific way, using ROT-13 encoded XML, for example.
• All of your classes can participate in serialization, with no work (other than an attribute tag)
required by you.
How Does Serialization Work in .NET?
As discussed earlier, .NET objects are serialized to streams, which are covered in a separate article by
Richard Grimes. To summarize and review, when serializing objects to a stream, you must use a .NET
formatter class to control the serialization of the object to and from the stream. In addition to the
serialized data, the serialization stream carries information about the object's type, including its
assembly name, culture, and version.
The Role of Formatters in .NET Serialization
A formatter is used to determine the serialized format for objects. All formatters expose the IFormatter
interface, and two formatters are provided as part of the .NET framework:
• BinaryFormatter provides binary encoding for compact serialization to storage, or for socket-based
network streams. The BinaryFormatter class is generally not appropriate when data must be
passed through a firewall.
• SoapFormatter provides formatting that can be used to enable objects to be serialized using the
SOAP protocol. The SoapFormatter class is primarily used for serialization through firewalls or
among diverse systems.
The .NET framework also includes the abstract Formatter class that may
be used as a base class for custom formatters. This class implements the IFormatter interface, and
all IFormatter properties and methods are kept abstract, but you do get the benefit of a number of
helper methods that are provided for you.
When implementing a formatter, you'll need to make use of the FormatterServices and ObjectManager
classes. The FormatterServices class provides basic functionality that a formatter requires, such as
retrieving the set of serializable members of an object, discovering their types, and retrieving their values.
The ObjectManager class is used during deserialization to assist with recovering objects from the
stream. When a type is encountered in the stream, it is sometimes a forward reference, which requires
special handling by the ObjectManager class.
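As a rough illustration of what FormatterServices offers a formatter author, the sketch below collects the serializable members of an object and their current values. Point3 and MemberDump are hypothetical names; the FormatterServices calls used are GetSerializableMembers and GetObjectData. The results are put in a dictionary because the order of the returned members should not be relied on.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.Serialization;

// Hypothetical type with one private, one public, and one excluded field.
[Serializable]
public class Point3
{
    private int _x = 1;
    public int Y = 2;

    [NonSerialized]
    private int _cache = 99;
}

public static class MemberDump
{
    // Maps each serializable member name to its current value.
    public static Dictionary<string, object> Snapshot(object obj)
    {
        // Includes private fields, excludes fields marked [NonSerialized].
        MemberInfo[] members = FormatterServices.GetSerializableMembers(obj.GetType());

        // Values are returned in the same order as the members array.
        object[] values = FormatterServices.GetObjectData(obj, members);

        Dictionary<string, object> result = new Dictionary<string, object>();
        for (int i = 0; i < members.Length; i++)
        {
            result[members[i].Name] = values[i];
        }
        return result;
    }
}
```

A custom formatter would write these name and value pairs to its stream in whatever format it chooses.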
Taking Control of Serialization with the ISerializable Interface
While the [Serializable] attribute is fine for classes that don't require fine-grained control of their object
state, occasionally you may require a more flexible serialization mechanism. Classes that require more
control over the serialization process can implement the ISerializable interface.
When implementing the ISerializable interface, a class must provide the GetObjectData method that is
included in the interface, as well as a special constructor that accepts two
parameters: an instance of SerializationInfo, and an instance of StreamingContext. The constructor is
not part of the interface itself; the formatter locates it by convention during deserialization. A minimal
class that implements ISerializable is shown below:
using System;
using System.Runtime.Serialization;

[Serializable]
public class SerialCircle : ISerializable
{
    public SerialCircle(double radius)
    {
        Console.WriteLine("Normal constructor");
        ConfigureCircleFromRadius(radius);
    }

    // Invoked by the formatter when an instance is deserialized.
    private SerialCircle(SerializationInfo info, StreamingContext context)
    {
        Console.WriteLine("Deserialization constructor via ISerializable");
        double radius = info.GetDouble("_radius");
        ConfigureCircleFromRadius(radius);
    }

    // Invoked by the formatter when an instance is serialized.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        Console.WriteLine("Serialization via ISerializable.GetObjectData");
        info.AddValue("_radius", _radius);
    }

    // Derives the calculated members from the radius.
    private void ConfigureCircleFromRadius(double radius)
    {
        _radius = radius;
        _circumference = 2 * Math.PI * radius;
        _area = Math.PI * radius * radius;
    }

    public double Circumference { get { return _circumference; } }
    public double Radius { get { return _radius; } }
    public double Area { get { return _area; } }

    private double _radius;
    private double _area;
    private double _circumference;
}
A version of SerialCircle that uses default serialization serializes the values of each member variable to
the stream. However, the _area and _circumference members can be calculated based on the value of
_radius. The SerialCircle class implements the ISerializable interface so that it can control which class
members are serialized. It serializes only the _radius member, and calculates the values for the other
members when deserialized.
The GetObjectData function is used to serialize the object, and the specialized constructor is used to
deserialize the object. The constructor and GetObjectData are passed the same parameters: an instance
of the SerializationInfo class and an instance of the StreamingContext structure.
The framework calls GetObjectData to notify an object that a serialization is in progress. The object is
expected to serialize itself into the SerializationInfo object that is passed as a parameter to
GetObjectData:
public void GetObjectData(SerializationInfo info, StreamingContext context)
{
    info.AddValue("_radius", _radius);
}
SerializationInfo is a final class that holds the serialized representation of an object. During
serialization, an instance of this class is populated with data about the serialized object using the
AddValue method. During deserialization, an instance of SerializationInfo is used to construct a new
instance of the serialized class, by calling one of the GetXxxx methods to extract data from the
SerializationInfo object:
private SerialCircle(SerializationInfo info, StreamingContext context)
{
    double radius = info.GetDouble("_radius");
    ConfigureCircleFromRadius(radius);
}
The SerializationInfo.AddValue method is overloaded to provide versions for any type that you
can serialize. AddValue creates a name-value pair that is serialized to the stream.
During deserialization the name is used to retrieve the value, using one of the GetXxxx methods, where
Xxxx is replaced by the type to be recovered. In the example above, GetDouble is used to return a
double; there are similar versions for strings, integers, and the other primitive .NET types, and a
general GetValue method covers everything else.
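The name-value behavior can be tried without a formatter by creating a SerializationInfo directly. The sketch below is illustrative only: InfoDemo and its methods are hypothetical, and constructing a SerializationInfo by hand with a FormatterConverter is something a formatter normally does for you.

```csharp
using System;
using System.Runtime.Serialization;

public static class InfoDemo
{
    // Stores two values under names, then reads one back by name.
    public static double StoreAndFetch(double radius)
    {
        SerializationInfo info =
            new SerializationInfo(typeof(object), new FormatterConverter());

        info.AddValue("_radius", radius);    // double overload
        info.AddValue("_label", "circle");   // object overload, receives a string

        return info.GetDouble("_radius");
    }

    // Asking for a name that was never added raises SerializationException.
    public static bool MissingNameThrows()
    {
        SerializationInfo info =
            new SerializationInfo(typeof(object), new FormatterConverter());
        try
        {
            info.GetInt32("missing");
            return false;
        }
        catch (SerializationException)
        {
            return true;
        }
    }
}
```

The second method shows why the names passed to AddValue and GetXxxx must match exactly: a typo surfaces only at deserialization time.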
Using the StreamingContext Structure
The StreamingContext structure is used to indicate how a serialized object will be used. Classes that
implement ISerializable may optionally use this information to determine which fields are relevant for
serialization. For example, some objects may use a more compact representation when serialized to
disk, with the intention of recreating internal structures when the object is recovered. When serialized into
a memory stream in the same process, perhaps to clone an object, the serialization policy may
preserve internal structures for runtime efficiency. Will every class need to use this sort of advanced
serialization Kung Fu? No, but it's there if you need it.
There are two properties that are exposed by the StreamingContext structure:
• State is a value from the StreamingContextStates enumeration, which is discussed below. This
property gives you a hint about the reason for the serialization request.
• Context is an object that is associated with this instance of StreamingContext. This value is
generally not used unless you have associated an interesting value with the StreamingContext as
part of the serialization process.
StreamingContextStates is an enumeration that provides a clue about the type of serialization that is
occurring. This information is sometimes useful. For example, when a client is on a remote machine,
information about process handles should be serialized completely, but if the serialization is occurring
within a process, a reference to the handle may be sufficient. The enumeration values for
StreamingContextStates are shown in the table below.
Value           Meaning
All             The serialized data may be used or sent from any context
Clone           The serialized data targets the same process
CrossAppDomain  The serialized data is for a different AppDomain
CrossMachine    The serialized data is for a different computer
CrossProcess    The serialized data is for a different process on the current computer
File            The serialized data is read or written to a file
Other           The context is not known
Persistence     The serialized data is stored in a database, file, or other persistent store
Remoting        The serialized data is for a remote context, which may be a different computer
Of the possible values for StreamingContextStates, you should pay special attention to the File and
Persistence states. These two values indicate that the object is being serialized into a stream that may
be long-lived, and is likely to require special handling. For example, the object may be deserialized
days, weeks or years from now, so serializing values that are short-lived may not be required.
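One way a class might act on the File and Persistence states is sketched below. The Session class, its fields, and the ContextDemo helper are hypothetical; the sketch calls GetObjectData directly with a hand-built StreamingContext so that the effect of the state can be observed without a formatter.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical type: _handle is only meaningful while the process is alive.
[Serializable]
public class Session : ISerializable
{
    private string _user;
    private int _handle;

    public Session(string user, int handle)
    {
        _user = user;
        _handle = handle;
    }

    private Session(SerializationInfo info, StreamingContext context)
    {
        _user = info.GetString("_user");
        // The handle was not written for long-lived streams, so do not read it.
        _handle = IsLongLived(context) ? 0 : info.GetInt32("_handle");
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("_user", _user);
        if (!IsLongLived(context))
        {
            info.AddValue("_handle", _handle);
        }
    }

    // StreamingContextStates is a flags enumeration, so test individual bits.
    // All has every bit set and therefore also counts as long-lived here.
    private static bool IsLongLived(StreamingContext context)
    {
        StreamingContextStates longLived =
            StreamingContextStates.File | StreamingContextStates.Persistence;
        return (context.State & longLived) != 0;
    }
}

public static class ContextDemo
{
    // Returns how many name-value pairs the object writes for a given state.
    public static int CountEntries(Session s, StreamingContextStates state)
    {
        SerializationInfo info =
            new SerializationInfo(typeof(Session), new FormatterConverter());
        s.GetObjectData(info, new StreamingContext(state));
        return info.MemberCount;
    }
}
```

With this policy the object writes two entries for a CrossProcess context and only one for a File context. The same test has to appear in both GetObjectData and the deserialization constructor, otherwise the constructor would ask for a name that was never written.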
In the SerialCircle example, ISerializable is implemented in order to remove two fields from the
serialization stream; however, you could just as easily add information into the stream, such as
authentication hints or optimization instructions. Once you take control of the serialization process for
your class, you can manage serialization however you see fit.
When is Deserialization Complete?
Serializing simple objects that have no dependencies on other objects is a simple matter; even a
computer book author can do it. In real life, objects are often serialized together, with some objects in
the serialization stream depending on other objects. This presents a problem, as during deserialization
there is no guarantee about the order in which specific objects are reconstituted. If you find that instances of
your class depend on other objects that are being deserialized, you can receive a notification when all
deserialization is complete by implementing the IDeserializationCallback interface.
IDeserializationCallback has one method: OnDeserialization. This method is implemented by
serializable classes, and is invoked by the framework when all objects have been deserialized.
Continuing with the SerialCircle example presented earlier, initialization of the circle can be deferred
until deserialization is complete by waiting until OnDeserialization is called:
private SerialCircle(SerializationInfo info, StreamingContext context)
{
    _radius = info.GetDouble("_radius");
}

public void OnDeserialization(Object sender)
{
    ConfigureCircleFromRadius(_radius);
}
In the code fragment above, we have changed the deserialization constructor so that it only initializes
the _radius member variable. When the framework invokes OnDeserialization through the
IDeserializationCallback interface, the initialization of the object is completed by calling
ConfigureCircleFromRadius. The SerialCircle project included with this article includes the
OnDeserialization code.
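Put together, a version of the class that defers its initialization might look like the following sketch. It is a condensed reconstruction rather than the project code mentioned above, it assumes the .NET Framework with BinaryFormatter available, and the CircleDemo helper is hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class SerialCircle : ISerializable, IDeserializationCallback
{
    private double _radius;
    private double _area;
    private double _circumference;

    public SerialCircle(double radius)
    {
        ConfigureCircleFromRadius(radius);
    }

    // Only restores the stored value; derived members are filled in later.
    private SerialCircle(SerializationInfo info, StreamingContext context)
    {
        _radius = info.GetDouble("_radius");
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("_radius", _radius);
    }

    // Called by the framework once the whole object graph has been rebuilt.
    public void OnDeserialization(object sender)
    {
        ConfigureCircleFromRadius(_radius);
    }

    private void ConfigureCircleFromRadius(double radius)
    {
        _radius = radius;
        _circumference = 2 * Math.PI * radius;
        _area = Math.PI * radius * radius;
    }

    public double Radius { get { return _radius; } }
    public double Area { get { return _area; } }
    public double Circumference { get { return _circumference; } }
}

public static class CircleDemo
{
    // Hypothetical helper: serializes to memory and reads the copy back.
    public static SerialCircle RoundTrip(SerialCircle circle)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            BinaryFormatter formatter = new BinaryFormatter();
            formatter.Serialize(ms, circle);
            ms.Position = 0;
            return (SerialCircle)formatter.Deserialize(ms);
        }
    }
}
```

For a lone circle the deferral makes no visible difference; it matters when the callback needs to consult other objects from the same stream, which are only guaranteed to be complete at that point.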
A Word About Final Classes
The .NET framework allows classes such as SerializationInfo and SerializableAttribute to be declared
as final, meaning that they cannot be subclassed. Although the framework uses the term final, as in
"this is the final form of this type," each language in the runtime seems to use a different term:
• Visual Basic programmers use the NotInheritable keyword
• C# programmers use the sealed keyword
• If you're using the managed C++ compiler, look for __sealed
• Eiffel users have the frozen keyword
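In C#, whether a type is final can be checked at run time through reflection. The short sketch below uses a hypothetical Token class alongside the two framework types named above.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical class declared final with the C# keyword.
public sealed class Token
{
}

public static class SealedDemo
{
    // Type.IsSealed reports whether a type can be subclassed.
    public static bool AllSealed()
    {
        return typeof(Token).IsSealed
            && typeof(SerializationInfo).IsSealed
            && typeof(SerializableAttribute).IsSealed;
    }
}
```

Attempting to derive a class from Token, or from SerializationInfo, is rejected by the compiler.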
XML SERIALIZATION
Using SOAP and Binary Serialization
SOAP and binary serialization are essential if you are planning to transport objects across a network.
The SOAP formatter is ideal for sending the object via HTTP, while the binary formatter is the better
choice when a lightweight, compact representation matters more. The XML serializer cannot prepare an
object for transportation by itself. It also ignores private member fields and properties.
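The point about private members can be seen with a small XmlSerializer sketch. The Reading class and the XmlDemo helper are hypothetical; XmlSerializer needs a public type with a public parameterless constructor, which Reading gets by default.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type with one public and one private field.
public class Reading
{
    public string Name = "";
    private int _secret = 42;

    public int GetSecret() { return _secret; }
}

public static class XmlDemo
{
    // Serializes the object to an XML string held in memory.
    public static string ToXml(Reading reading)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Reading));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, reading);
            return writer.ToString();
        }
    }
}
```

The XML produced for a Reading contains a Name element but no trace of _secret, since only public fields and public read/write properties are written.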
XML Serialization Sample Code
By simply adding the [Serializable] attribute to the top of the sample class above, we now can use the
SOAP or binary formatter to serialize our object to the respective format. The following code
demonstrates using the SOAP formatter.
TestData obj = new TestData();
obj.Name = "testing";
obj.IgnoreMe = "ignore";
IFormatter formatter =
    new System.Runtime.Serialization.Formatters.Soap.SoapFormatter();
// The using block closes the stream even if Serialize throws.
using (Stream stream = new FileStream(@"c:\MyFile.xml", FileMode.Create,
    FileAccess.Write, FileShare.None))
{
    formatter.Serialize(stream, obj);
}
Resulting SOAP
It is important to notice how the SoapFormatter does not pay any attention to any of the XML attributes
we had previously assigned our class above.
<SOAP-ENV:Envelope
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:clr="http://schemas.microsoft.com/soap/encoding/clr/1.0"
SOAP-ENV:encodingStyle=
"http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<a1:TestData id="ref-1"
xmlns:a1="http://schemas.microsoft.com/clr/nsassem/
CodeGuru.Serialization/
CodeGuru.Serialization%2C%20Version%3D1.0.1404.42352%2C%20
Culture%3Dneutral%2C%20PublicKeyToken%3Dnull">
<_Identity>0</_Identity>
<_Name id="ref-3">testing</_Name>
<_IgnoreMe id="ref-4">ignore</_IgnoreMe>
</a1:TestData>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Implementing the ISerializable Interface
Putting the [Serializable] attribute on top of a class is the simplest way to make an object
serializable. This alone works great if you have basic needs when serializing an object. What happens
when you need control over the serialization process and what is ultimately put into the serialized
format? This is the purpose of the ISerializable interface. It gives you
complete flexibility in the items contained within the serialized format. Implementing it involves two
pieces. The first is a constructor that recreates an instance of the object from a serialized version of
the data, also known as deserialization; an interface cannot declare a constructor, so the formatter looks
for it by convention. The second is the GetObjectData method, which is
responsible for controlling the actual values put into the serialized version of the object.
ISerializable Interface Sample Code
The following code refines our sample class defined earlier. It now implements an additional
constructor used to deserialize an object, and the GetObjectData method is used to control the
serialization process. Now, when the SOAP or binary formatter objects are used to serialize an object,
they produce the version controlled by the GetObjectData method.
using System;
using System.Runtime.Serialization;
using System.Xml.Serialization;

namespace CodeGuru.Serialization
{
    [Serializable]
    public class TestData : ISerializable
    {
        private int _Identity = 0;

        private string _Name = "";
        public string Name
        {
            get { return this._Name; }
            set { this._Name = value; }
        }

        private string _IgnoreMe = "";
        public string IgnoreMe
        {
            get { return this._IgnoreMe; }
            set { this._IgnoreMe = value; }
        }

        public TestData()
        {
        }

        // Deserialization constructor: restores only the stored values.
        protected TestData(SerializationInfo info,
            StreamingContext context)
        {
            this._Identity = info.GetInt32("_Identity");
            this._Name = info.GetString("_Name");
        }

        // Controls what is written; _IgnoreMe is deliberately left out.
        void ISerializable.GetObjectData(SerializationInfo info,
            StreamingContext context)
        {
            info.AddValue("_Identity", this._Identity);
            info.AddValue("_Name", this._Name);
        }
    }
}
Resulting SOAP
<SOAP-ENV:Envelope
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:clr="http://schemas.microsoft.com/soap/encoding/clr/1.0"
SOAP-ENV:encodingStyle=
"http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<a1:TestData id="ref-1" xmlns:a1=
"http://schemas.microsoft.com/clr/nsassem/CodeGuru.Serialization/
CodeGuru.Serialization%2C%20Version%3D1.0.1404.42999%2C%20
Culture%3Dneutral%2C%20PublicKeyToken%3Dnull">
<_Identity>0</_Identity>
<_Name id="ref-3">testing</_Name>
</a1:TestData>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
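The list of entries in the SOAP output can also be confirmed in code, without writing a file, by calling GetObjectData directly and enumerating what it added. The sketch below uses a condensed copy of TestData; the EntryLister helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Condensed copy of the article's TestData class.
[Serializable]
public class TestData : ISerializable
{
    private int _Identity = 0;
    private string _Name = "";
    private string _IgnoreMe = "";

    public string Name
    {
        get { return _Name; }
        set { _Name = value; }
    }

    public string IgnoreMe
    {
        get { return _IgnoreMe; }
        set { _IgnoreMe = value; }
    }

    public TestData() { }

    protected TestData(SerializationInfo info, StreamingContext context)
    {
        _Identity = info.GetInt32("_Identity");
        _Name = info.GetString("_Name");
    }

    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("_Identity", _Identity);
        info.AddValue("_Name", _Name);
    }
}

public static class EntryLister
{
    // Returns the names an object adds to its SerializationInfo, in order.
    public static List<string> NamesWrittenBy(ISerializable obj)
    {
        SerializationInfo info =
            new SerializationInfo(obj.GetType(), new FormatterConverter());
        obj.GetObjectData(info, new StreamingContext(StreamingContextStates.All));

        List<string> names = new List<string>();
        foreach (SerializationEntry entry in info)
        {
            names.Add(entry.Name);
        }
        return names;
    }
}
```

For a TestData instance the helper returns _Identity and _Name only, matching the elements in the SOAP body above. An object deserialized from that output therefore comes back with IgnoreMe set to its initial empty string.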
Summary
You have now seen various ways in which objects can be serialized. This will allow you to store
objects in a file, a database, or ASP.NET session state and then deserialize them back into their original
form.