How Not to Implement Serialization in C++
I spent most of the day knocking out a nice PowerPoint slide deck to walk a customer's developers through a large swath of code I had just checked into their Subversion repository. I had hammered out a framework that would improve productivity and implementation. With a little luck, the project would be back on schedule.
The framework included serialization, compliments of the Boost serialization library. However, the principal architect balked at using Boost serialization: I should have used the persistence mechanism that was coded by their developers. I argued, but when the person who signs my timesheets agreed with the architect, the battle was over. The customer is always right.
With that, I needed to revamp my code to use the persistence mechanism (which used blocks of memory that were flushed to disk or flash memory), but make it more usable. Hundreds of lines of "if (p) p->write(sizeof(x), &x, 1)" are not acceptable to me. There has to be a better, more developer-friendly way.
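As a rough illustration of what "more developer-friendly" could mean here, the sketch below wraps the repeated null check and sizeof bookkeeping in a small stream-style class. Everything in it is an assumption: BlockStore is a hypothetical stand-in for the customer's persistence mechanism (only the call shape write(size, ptr, count) comes from the line quoted above), and Writer is a made-up name, not the code that was eventually written.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the block-of-memory store. Only the call shape
// write(size, ptr, count) is taken from the post; the rest is assumed.
class BlockStore {
public:
    void write(std::size_t size, const void* data, std::size_t count)
    {
        const unsigned char* bytes = static_cast<const unsigned char*>(data);
        buffer_.insert(buffer_.end(), bytes, bytes + size * count);
    }
    const std::vector<unsigned char>& bytes() const { return buffer_; }

private:
    std::vector<unsigned char> buffer_;
};

// Hypothetical wrapper that hides "if (p) p->write(sizeof(x), &x, 1)".
class Writer {
public:
    explicit Writer(BlockStore* store) : store_(store) {}

    template <typename T>
    Writer& operator<<(const T& value)
    {
        // Refuse types whose bytes cannot be copied meaningfully
        // (for example, anything holding a std::string).
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable values can be written raw");
        if (store_) {
            store_->write(sizeof(T), &value, 1);
        }
        return *this;
    }

private:
    BlockStore* store_;  // may be null, in which case writes are skipped
};
```

With something like this, a caller writes "writer << x << y;" instead of repeating the guarded call for every field. The bytes written are still raw, so the size and byte-order caveats of the underlying store still apply.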
I immediately turned to Google for help. Surely a developer out there had a similar interest in rolling their own serialization scheme. Amazingly, I found several "tutorials" on serialization that had the exact same theme.
The number one serialization tutorial was the functionx C++ Object Serialization tutorial (http://www.functionx.com/cpp/articles/serialization.htm). It should be entitled, "How NOT to Implement Object Serialization."
Take the following example:
#include <cstring>   // strcpy
#include <fstream>
#include <iostream>
using namespace std;

class Student
{
public:
    char FullName[40];
    char CompleteAddress[120];
    char Gender;
    double Age;
    bool LivesInASingleParentHome;
};

int main()
{
    Student one;
    strcpy(one.FullName, "Ernestine Waller");
    strcpy(one.CompleteAddress, "824 Larson Drv, Silver Spring, MD 20910");
    one.Gender = 'F';
    one.Age = 16.50;
    one.LivesInASingleParentHome = true;
    ofstream ofs("fifthgrade.ros", ios::binary);
    // Dumps the object's raw in-memory bytes, padding included.
    ofs.write((char *)&one, sizeof(one));
    return 0;
}
So what is wrong with doing this? It is an extremely poor solution that works reliably only when the same compiler on the same platform both writes and reads the file. The raw bytes include whatever padding the compiler inserts between members, and the size and byte order of types such as double and bool vary across platforms.
Also, a pointer to the object one doesn't necessarily point to the first data item, public or private, within the class. I spent several days tracking down and squashing a bug in an embedded system because of assumptions like that. In my case, the original developer cast the class to a char *, then wrote the class to a socket. He assumed that the pointer to the class would point to the private data and did not consider that this might be compiler dependent. After looking at network traces I quickly figured out the problem: one compiler was injecting some extra bytes.
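To make the layout problem concrete, here is a minimal sketch (mine, not the tutorial's) that adds up the sizes of the Student members so the total can be compared with sizeof(Student), and then writes each field explicitly instead of dumping the whole object. The names memberBytes, writeField and serialize are hypothetical. The sketch removes the dependence on padding only; it still assumes that reader and writer agree on sizeof(double), sizeof(bool) and byte order.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

struct Student {
    char FullName[40];
    char CompleteAddress[120];
    char Gender;
    double Age;
    bool LivesInASingleParentHome;
};

// Sum of the members' own sizes, ignoring any padding the compiler adds.
constexpr std::size_t memberBytes()
{
    return sizeof(Student::FullName) + sizeof(Student::CompleteAddress) +
           sizeof(Student::Gender) + sizeof(Student::Age) +
           sizeof(Student::LivesInASingleParentHome);
}

// Write exactly one field's bytes; no padding from the enclosing class leaks out.
template <typename T>
void writeField(std::ostream& out, const T& value)
{
    out.write(reinterpret_cast<const char*>(&value), sizeof(value));
}

// Field-by-field serialization: the output length does not depend on how the
// compiler chose to lay out Student in memory.
std::string serialize(const Student& s)
{
    std::ostringstream out;
    writeField(out, s.FullName);
    writeField(out, s.CompleteAddress);
    writeField(out, s.Gender);
    writeField(out, s.Age);
    writeField(out, s.LivesInASingleParentHome);
    return out.str();
}
```

On most compilers sizeof(Student) comes out larger than memberBytes(), because padding is inserted before Age and at the end of the class. Those extra bytes are exactly what a raw write of the object puts into the file or onto the wire.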
Next, what happens when you save a class that contains pointers to other classes? Kaboom.
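A small sketch of the pointer problem, under assumed names (Course, Enrollment and Loaded are all hypothetical): dumping Enrollment byte for byte would store the address held in course, which means nothing once the process exits. One common alternative is to follow the pointer and write the pointed-to data with a length prefix, as below. The sketch assumes the same byte order on both sides.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

struct Course {
    std::string title;
};

struct Enrollment {
    int studentId;
    const Course* course;  // a raw dump would save this address, not the Course
};

// What comes back from disk: the Course is owned by value, not pointed to.
struct Loaded {
    int studentId;
    Course course;
};

std::string serialize(const Enrollment& e)
{
    std::ostringstream out;
    const std::int32_t id = e.studentId;
    out.write(reinterpret_cast<const char*>(&id), sizeof(id));

    // Follow the pointer and write the data it refers to: length, then bytes.
    const std::string title = e.course ? e.course->title : std::string();
    const std::uint32_t len = static_cast<std::uint32_t>(title.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(title.data(), len);
    return out.str();
}

Loaded deserialize(const std::string& bytes)
{
    std::istringstream in(bytes);
    std::int32_t id = 0;
    std::uint32_t len = 0;
    in.read(reinterpret_cast<char*>(&id), sizeof(id));
    in.read(reinterpret_cast<char*>(&len), sizeof(len));

    std::string title(len, '\0');
    in.read(&title[0], len);
    return Loaded{static_cast<int>(id), Course{title}};
}
```

This handles a single owned object only. Shared or cyclic pointers need object tracking, which is the kind of bookkeeping a library such as Boost serialization does for you.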
Labels: C++, Programming
Ironically, almost a year later, boost is *still* not workable on some of the smaller embedded platforms and the defined storage interface you balked at has been implemented and is functional on no less than four different and distinct systems, three of which have only 128KB of RAM total to work with...
:-)
Posted by The architect in question | August 5, 2008 2:51 PM
LOL pwned :)
Posted by Anonymous | November 6, 2008 4:22 AM