First, a little back story. I've got a Property class that provides generic access to an object's property value. To provide this, the Property class must know the data type of the property that it encapsulates. So, I've also got a DataType class that encapsulates a data type and provides generic access to values of that type. This DataType class uses standard polymorphic class design: the abstract base DataType class is implemented for each data type that we need to support (e.g., DataType_int or DataType_MyClass). So, my Property class has a reference (pointer) to a DataType object, which provides it with generic access to that type's value. This is also an example of the Strategy pattern, which allows the Property class to change its behavior (its DataType) at runtime, and an example of design by composition (Property HAS a DataType) rather than inheritance (Property is subclassed for each DataType it must support). So far, I think that I'm on the right path.

The problem arises when I make a couple of
DataType subclasses and begin trying to assign them to Property. Since Property has a reference to a DataType object, that object must exist somewhere. So, I have a couple of options. I can have Singleton instances of each DataType subclass and let Property objects reference those Singletons. Or I can dynamically allocate an instance of a DataType class and let the Property class manage that object's memory. The latter would result in many small allocations, which would be slow and could fragment the heap, so it isn't desirable. And I prefer not to keep globals around if at all possible, so the Singleton solution, while not terrible, was not ideal.

I started thinking of using a structure of function pointers to encapsulate the many behaviors required to support a given type. However, I quickly realized that this would result in huge objects when I really only wanted a single reference to a class of functionality that the group of functions would define. At this point, I realized (as I'm sure you also have) that what I needed was a class. The class provides each instance of it with a group of functions accessed via a single reference, the v-table.

Following this train of thought, I began to think of an object as a reference to a group of functions (methods). If I just copied this reference, then I could change the functionality of my object (exactly the way that my
Property class can change its functionality by changing its DataType reference). This is the standard Strategy design pattern.

Code

The solution that I arrived at looks like this (I'll explain below):

#include <cstring> // for memcpy
// Base DataType class
class DataType {
public:
    // Construction
    DataType() {}
    DataType(const DataType &newType) { setType(newType); }
    // Set the polymorphic behavior of this DataType object
    void setType(const DataType &newType) {
        memcpy(this, &newType, sizeof(DataType));
    }
    // Polymorphic behavior example
    protected: virtual int _getSizeOfType() const { return -1; }
    public: inline int getSizeOfType() const { return _getSizeOfType(); }
    // Polymorphic behavior example
    protected: virtual const char *_getTypeName() const { return NULL; }
    public: inline const char *getTypeName() const { return _getTypeName(); }
};

// Implementation of DataType for 'int'
class DataType_int : public DataType {
public:
    // Construction
    DataType_int() {}
    DataType_int(const DataType &newType) : DataType(newType) {}
    // Polymorphic behavior example
    protected: virtual int _getSizeOfType() const { return sizeof(int); }
    // Polymorphic behavior example
    protected: virtual const char *_getTypeName() const { return "int"; }
};

// Implementation of DataType for 'float'
class DataType_float : public DataType {
public:
    // Construction
    DataType_float() {}
    DataType_float(const DataType &newType) : DataType(newType) {}
    // Polymorphic behavior example
    protected: virtual int _getSizeOfType() const { return sizeof(float); }
    // Polymorphic behavior example
    protected: virtual const char *_getTypeName() const { return "float"; }
};

// Example (wrapped in main so that the statements compile)
int main() {
    DataType myType = DataType_int();
    const char *typeName = myType.getTypeName(); // returns "int"
    int typeSize = myType.getSizeOfType(); // returns sizeof(int)
    myType.setType(DataType_float());
    typeName = myType.getTypeName(); // returns "float"
    return 0;
}

As you can see, when we set the type, we are simply using memcpy to make the object's v-table pointer point to the v-table of the object that gets passed in. This changes myType's polymorphic behavior to that of the new type!

And we no longer need pointers or singletons or dynamic memory allocations! We have an object that is the size of a v-table pointer and that is all! If you prefer a bit of a speedup here, you could just use

*((void**)this) = *((void**)&newType);

to copy directly, assuming that your DataType class has no members and that the v-table pointer is the first thing stored in the object (thanks to Dezhi Zhao for pointing that out in his comments below).

Please keep in mind that this technique is not standards compliant, as the standard doesn't say anything about v-tables or v-ptrs (thank you to all of the commenters below who pointed this out). If a compiler implements virtual methods in a way that doesn't store lookup information within an object's memory space, this technique will fail completely. However, I have never heard of a C++ compiler that doesn't work this way.
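
For comparison, here is a minimal sketch of a portable way to get a similar effect: a value-type object that holds a single pointer to a hand-written table of functions. The names TypeOps, intOps, floatOps, and PortableDataType are hypothetical and are not part of the code above, and the sketch assumes C++11 or later for static_assert. Note the trade-off: the tables are function-local statics, so this variant gives up the goal of avoiding shared instances in exchange for not depending on how the compiler lays out v-tables.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical hand-written "v-table": one table of functions per type.
struct TypeOps {
    int (*sizeOfType)();
    const char *(*typeName)();
};

// Free functions that implement the behavior for each supported type.
inline int sizeOfInt() { return static_cast<int>(sizeof(int)); }
inline const char *nameOfInt() { return "int"; }
inline int sizeOfFloat() { return static_cast<int>(sizeof(float)); }
inline const char *nameOfFloat() { return "float"; }

// One shared, immutable table per type (created on first use).
inline const TypeOps &intOps() {
    static const TypeOps ops = { &sizeOfInt, &nameOfInt };
    return ops;
}
inline const TypeOps &floatOps() {
    static const TypeOps ops = { &sizeOfFloat, &nameOfFloat };
    return ops;
}

// Value type whose only member is a pointer to the current table.
class PortableDataType {
public:
    explicit PortableDataType(const TypeOps &ops) : ops_(&ops) {}
    // Change behavior at run time by pointing at a different table.
    void setType(const TypeOps &ops) { ops_ = &ops; }
    int getSizeOfType() const { return ops_->sizeOfType(); }
    const char *getTypeName() const { return ops_->typeName(); }
private:
    const TypeOps *ops_;
};

// The object is exactly one pointer wide.
static_assert(sizeof(PortableDataType) == sizeof(const TypeOps *),
              "PortableDataType should hold a single pointer");

// Small usage check mirroring the example above.
inline void checkPortableDataType() {
    PortableDataType t(intOps());
    assert(std::strcmp(t.getTypeName(), "int") == 0);
    t.setType(floatOps());
    assert(std::strcmp(t.getTypeName(), "float") == 0);
}
```

Copying a PortableDataType copies only the table pointer, which is the same operation the memcpy performs on the v-table pointer, but spelled out in ordinary C++.
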
Also, you can see that we can easily change the type of myType at any point during runtime. This gives you the flexibility of having an uninitialized array of DataType objects and initializing them whenever you like later. For the performance minded out there, Dezhi Zhao also pointed out below that this will most likely cause the processor's branch prediction to fail for the getTypeName() call immediately after changing the type. This will only happen for the DataType_float version above, however, as the prediction can only fail if the processor has already made a prediction.

One curiosity that you may have noticed is the use of public proxy methods (getSizeOfType) that call protected virtual methods (_getSizeOfType). We need to do this because the compiler may skip the v-table lookup when it knows the actual type of an object (as opposed to pointers or references, where it doesn't). This is perfectly reasonable, but breaks our setup.

Inside the proxies, though, the call is made through the this pointer, so the compiler normally performs the v-table lookup. An optimizer that inlines the proxy and can prove the object's type may still skip the lookup, so test with your compiler's optimizations enabled. And because the proxies are inline, all they really do is make the compiler look up the correct method in the v-table and call that one. Remember, however, that we are NOT removing the virtual method lookup. This setup will not speed up virtual method calls in any way. In fact, we depend on the compiler looking up our virtual method for this to work.
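
The proxy arrangement is essentially what is often called the non-virtual interface idiom. The following sketch shows it on its own, in a standards-compliant setting without the memcpy trick. The names Base, Derived, and nameViaReference are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical base class: a public non-virtual proxy forwards to a
// protected virtual method, as getTypeName() does in the article.
class Base {
public:
    virtual ~Base() {}
    // The call below is made through `this`, whose static type is Base.
    const char *name() const { return doName(); }
protected:
    virtual const char *doName() const { return "base"; }
};

// Hypothetical subclass that overrides only the protected virtual method.
class Derived : public Base {
protected:
    virtual const char *doName() const { return "derived"; }
};

// Through a reference the static type is Base, so the override that
// runs is chosen by the dynamic type of the object.
inline const char *nameViaReference(const Base &b) { return b.name(); }

// Small usage check.
inline void checkDispatch() {
    Derived d;
    Base b;
    assert(std::strcmp(nameViaReference(d), "derived") == 0);
    assert(std::strcmp(nameViaReference(b), "base") == 0);
}
```

Here the dynamic type never changes behind the compiler's back, so the result is the same whether or not the optimizer devirtualizes the call. The article's setup is different precisely because it changes the v-table pointer in a way the compiler is not told about.
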
Members

One important thing to note about this setup is the absence of any member variables in DataType. Since we are doing a memcpy expecting that both objects have the same size (sizeof(DataType)), none of DataType's subclasses may add any member variables. You could add member variables to DataType with no problem, but you are NOT able to add any member variables to subclasses. Since I didn't need any member variables for DataType, this didn't present a problem for me. However, it is not impossible to give subclasses their own member data. You just need to use memory that was provided in the base class as the memory where your members live. For example:

#include <cstring> // for memcpy
// Base DataType class
class DataType {
public:
    // Construction
    DataType() {}
    DataType(const DataType &newType) { setType(newType); }
    // Set the polymorphic behavior of this DataType object
    void setType(const DataType &newType) {
        memcpy(this, &newType, sizeof(DataType));
    }
protected:
    // Member data
    enum { kMemberDataBufferSize = 256, kMemberDataSize = 0 };
    char memberDataBuffer[kMemberDataBufferSize];
};

// My Data Type class
class DataType_MyType : public DataType {
public:
    // Construction
    DataType_MyType() {}
    DataType_MyType(const DataType &newType) : DataType(newType) {}
    // Access myData
    inline int getExampleMember() const {
        return _getMemberData().exampleMember;
    }
    inline void setExampleMember(int newExampleMember) {
        _getMemberData().exampleMember = newExampleMember;
    }
protected:
    // Member Data
    struct SMemberData {
        int exampleMember;
    };
    // Amount of member data buffer that we use (this class' member data +
    // all base class' member data)
    enum { kMemberDataSize = sizeof(SMemberData) + DataType::kMemberDataSize };
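    // Make sure that we don't run out of data buffer: the array size becomes
    // -1 (a compile error) when the condition is false. With C++11 or later,
    // static_assert does the same job.
    #define compileTimeAssert(x) typedef char compileTimeAssertFailed[(x) ? 1 : -1]
    compileTimeAssert(kMemberDataSize <= kMemberDataBufferSize);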
    // Access member data
    inline SMemberData &_getMemberData() {
        return *((SMemberData*) memberDataBuffer);
    }
    inline const SMemberData &_getMemberData() const {
        return *((const SMemberData*) memberDataBuffer);
    }
};

As you can see, the DataType base class simply provides a buffer of data, which the subclasses may use to store whatever member data they like. While this setup is a bit messy, it works without too many hoops to jump through.
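
The casts in the listing above assume that memberDataBuffer is suitably aligned for SMemberData and that the compiler tolerates access through the cast pointer. As a hedged alternative, assuming C++11 or later, the following sketch aligns the buffer explicitly, moves the data in and out with memcpy, and checks both size constraints at compile time. The names BufferBase and IntHolder are hypothetical stand-ins for DataType and DataType_MyType, and the sketch leaves out the setType machinery to stay focused on the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the buffer-carrying base class.
class BufferBase {
protected:
    enum { kBufferSize = 256 };
    // Aligned so that any ordinary member struct can live in it.
    alignas(std::max_align_t) unsigned char buffer[kBufferSize];
};

// Hypothetical subclass that keeps its "members" in the base buffer.
class IntHolder : public BufferBase {
public:
    IntHolder() { setValue(0); }
    // Copy the member data out of the buffer, then read the field.
    int getValue() const {
        Data d;
        std::memcpy(&d, buffer, sizeof d);
        return d.value;
    }
    // Build the member data locally, then copy it into the buffer.
    void setValue(int v) {
        Data d;
        d.value = v;
        std::memcpy(buffer, &d, sizeof d);
    }
private:
    struct Data { int value; };
    // Fails to compile if the member data outgrows the buffer.
    static_assert(sizeof(Data) <= kBufferSize, "member data does not fit");
};

// Fails to compile if the subclass adds real member variables.
static_assert(sizeof(IntHolder) == sizeof(BufferBase),
              "subclass must not add member variables");

// Small usage check.
inline void checkIntHolder() {
    IntHolder h;
    h.setValue(42);
    assert(h.getValue() == 42);
}
```

The second static_assert enforces the rule from this section directly: any subclass that grows beyond the size of the base class is rejected at compile time instead of being silently truncated by the memcpy.
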