C equivalent to python pickle (object serialization)?


5

What would be the C equivalent of this Python code? Thanks.

import struct
import cPickle

data = gather_me_some_data()
# where data = [ (metric, datapoints), ... ]
# and datapoints = [ (timestamp, value), ... ]

serialized_data = cPickle.dumps(data, protocol=-1)
length_prefix = struct.pack("!L", len(serialized_data))
message = length_prefix + serialized_data
2012-04-04 03:42
by Bill
I hope you're aware that Python's pickle and cPickle modules should never be used with untrusted data - Adam Rosenfield 2012-04-04 03:51
If this is a learning exercise, then good on ya. But if not, you should be aware that there are a million things like this out there (Google Protocol Buffers, Thrift, etc.) that are very well tested - Chris Eberle 2012-04-04 03:52
I am aware of the security concerns, and yes, this wasn't intended purely as a learning exercise, but after spending all day on Google and writing the code above, it has certainly become one. :) I haven't found anything that points me in the right direction. I'm not sure about two things with the above code: 1) Am I doing "pickle_dump" correctly? 2) I think I need to serialize the data before sending it to the server (similar to Perl's pack() function)? Links to example C code that does what I'm looking for would be a tremendous help. - Bill 2012-04-04 04:07
Voting to close as "not a question" because you haven't actually asked a question in the, err, question. Additionally, the questions in your comment seem fairly broad and discussion-oriented, which StackOverflow tries to avoid - David Wolever 2012-04-04 05:03
Define "equivalent". Will the message be unpickled in Python? Do you want to do it without using the cPickle module from C? - Janne Karila 2012-04-04 05:56
Yes, the message will be unpickled using Python - Bill 2012-04-04 17:03


7

C has no built-in serialization mechanism, because it provides no type information at run time. You have to attach some type information yourself and then reconstruct the required object from it. So define all your possible structs:

typedef struct {
  int myInt;
  float myFloat;
  unsigned char myData[MY_DATA_SIZE];
} MyStruct_1;

typedef struct {
  unsigned char myUnsignedChar;
  double myDouble;
} MyStruct_2;

Then define an enum that lists all the struct types you have:

typedef enum {
  ST_MYSTRUCT_1,
  ST_MYSTRUCT_2
} MyStructType;

Define a helper function that returns the size of any struct:

int GetStructSize(MyStructType structType) {
      switch (structType) {
          case ST_MYSTRUCT_1:
              return sizeof(MyStruct_1);
          case ST_MYSTRUCT_2:
              return sizeof(MyStruct_2);
          default:
              // OOPS no such struct in our pocket
              return 0;
      }
}

Then define the serialize function:

void BinarySerialize(
    MyStructType structType,
    void * structPointer,
    unsigned char * serializedData) {

  int structSize = GetStructSize(structType);

  if (structSize != 0) {
    // copy struct metadata to serialized bytes
    memcpy(serializedData, &structType, sizeof(structType));
    // copy struct itself
    memcpy(serializedData+sizeof(structType), structPointer, structSize);
  }
}

And the de-serialization function:

void BinaryDeserialize(
    MyStructType structTypeDestination,
    void ** structPointer,
    unsigned char * serializedData)
{
    // get source struct type
    MyStructType structTypeSource;
    memcpy(&structTypeSource, serializedData, sizeof(structTypeSource));

    // get source struct size
    int structSize = GetStructSize(structTypeSource);

    if (structTypeSource == structTypeDestination && structSize != 0) {
      *structPointer = malloc(structSize);
      // copy the payload only if the allocation succeeded
      if (*structPointer != NULL) {
        memcpy(*structPointer, serializedData+sizeof(structTypeSource), structSize);
      }
    }
}

Serialization usage example:

MyStruct_2 structInput = {0x69, 0.1};
MyStruct_1 * structOutput_1 = NULL;
MyStruct_2 * structOutput_2 = NULL;
unsigned char testSerializedData[SERIALIZED_DATA_MAX_SIZE] = {0};

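// BinaryDeserialize takes a void **, so receive the result through a void *
void * output = NULL;

// serialize structInput
BinarySerialize(ST_MYSTRUCT_2, &structInput, testSerializedData);
// try to de-serialize to something
BinaryDeserialize(ST_MYSTRUCT_1, &output, testSerializedData);
structOutput_1 = output;
output = NULL;
BinaryDeserialize(ST_MYSTRUCT_2, &output, testSerializedData);
structOutput_2 = output;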
// determine which object was de-serialized
// (plus you will get code-completion support about object members from IDE)
if (structOutput_1 != NULL) {
   // do something with structOutput_1 
   free(structOutput_1);
}
else if (structOutput_2 != NULL) {
   // do something with structOutput_2
   free(structOutput_2);
}
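
The serialize function above trusts that the destination buffer is large enough and does not report how many bytes it wrote, which the caller needs in order to send the data. A minimal sketch of a bounds-checked variant follows; the name BinarySerializeChecked and its extra size parameters are hypothetical additions, and the typedefs simply repeat the ones defined earlier so the snippet compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/* Same definitions as in the answer, repeated so the sketch is self-contained. */
typedef enum {
  ST_MYSTRUCT_1,
  ST_MYSTRUCT_2
} MyStructType;

typedef struct {
  unsigned char myUnsignedChar;
  double myDouble;
} MyStruct_2;

/* Hypothetical variant: returns the number of bytes written,
   or 0 if an argument is invalid or the output buffer is too small. */
size_t BinarySerializeChecked(
    MyStructType structType,
    const void * structPointer,
    size_t structSize,
    unsigned char * serializedData,
    size_t serializedCapacity) {

  size_t total = sizeof(structType) + structSize;

  if (structPointer == NULL || serializedData == NULL ||
      structSize == 0 || serializedCapacity < total) {
    return 0;
  }
  /* type tag first, then the raw struct bytes */
  memcpy(serializedData, &structType, sizeof(structType));
  memcpy(serializedData + sizeof(structType), structPointer, structSize);
  return total;
}
```

A caller would pass sizeof(MyStruct_2) and the size of its buffer, then treat a return value of 0 as failure.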

I think this is the simplest serialization approach in C, but it has some problems:

  • The structs must not contain pointers, because the serializer cannot know how much memory a pointer refers to, nor where or how to restore the pointed-to data when de-serializing.
  • This example is sensitive to system endianness: you need to know whether data is stored in memory in big-endian or little-endian fashion, and reverse the bytes if needed [when casting char * to an integral type such as an enum] (...or refactor the code to be more portable).
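
One way to sidestep the endianness issue is to write integers byte by byte in a fixed order instead of copying them from memory. The sketch below does this for a 32-bit length prefix in network (big-endian) order, which is the layout Python's struct.pack("!L", ...) produces in the question. The names put_u32_be, get_u32_be and frame_message are hypothetical helpers, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store v in big-endian order, independent of the host's byte order. */
void put_u32_be(unsigned char *out, uint32_t v) {
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Read a big-endian 32-bit value back. */
uint32_t get_u32_be(const unsigned char *in) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Build "length prefix + payload" in out.
   Returns the total number of bytes written, or 0 if out is too small. */
size_t frame_message(const unsigned char *payload, uint32_t len,
                     unsigned char *out, size_t cap) {
    if (cap < 4 || cap - 4 < len) {
        return 0;
    }
    put_u32_be(out, len);
    memcpy(out + 4, payload, len);
    return 4 + (size_t)len;
}
```

The same shift-and-mask pattern can be applied to each integer field of a struct, at the cost of serializing field by field rather than with a single memcpy.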
2012-04-04 10:48
by Agnius Vasiliauskas
The question was not how to serialize an object in C, it was how to present a C structure in pickle format. This does not answer the question - Matthew Lundberg 2014-04-11 16:59
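
For the case the comment above raises, where the bytes must be readable by Python's pickle.loads, the pickle opcodes can be written by hand. The following is only a sketch under several assumptions: it emits pickle protocol 2 (not whatever protocol=-1 selects in the question), handles a single metric, requires timestamps that fit in 32 bits and names shorter than 256 bytes, and assumes double is a 64-bit IEEE 754 value. PickleBuf, DataPoint and the pk_* and pickle_one_metric functions are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical append-only buffer; ok becomes 0 once a write would overflow. */
typedef struct { unsigned char *buf; size_t cap, len; int ok; } PickleBuf;

typedef struct { int32_t timestamp; double value; } DataPoint;

static void pk_bytes(PickleBuf *p, const void *src, size_t n) {
    if (!p->ok || p->cap - p->len < n) { p->ok = 0; return; }
    memcpy(p->buf + p->len, src, n);
    p->len += n;
}

static void pk_op(PickleBuf *p, unsigned char op) { pk_bytes(p, &op, 1); }

/* BININT 'J': 4-byte little-endian signed integer */
static void pk_int32(PickleBuf *p, int32_t v) {
    uint32_t u = (uint32_t)v;
    unsigned char b[4];
    int i;
    for (i = 0; i < 4; i++) b[i] = (unsigned char)(u >> (8 * i));
    pk_op(p, 'J');
    pk_bytes(p, b, 4);
}

/* BINFLOAT 'G': 8-byte big-endian IEEE 754 double */
static void pk_double(PickleBuf *p, double d) {
    uint64_t bits;
    unsigned char b[8];
    int i;
    memcpy(&bits, &d, sizeof bits);
    for (i = 0; i < 8; i++) b[i] = (unsigned char)(bits >> (56 - 8 * i));
    pk_op(p, 'G');
    pk_bytes(p, b, 8);
}

/* SHORT_BINSTRING 'U': 1-byte length followed by the bytes */
static void pk_short_str(PickleBuf *p, const char *s) {
    size_t n = strlen(s);
    unsigned char len;
    if (n > 255) { p->ok = 0; return; }
    len = (unsigned char)n;
    pk_op(p, 'U');
    pk_bytes(p, &len, 1);
    pk_bytes(p, s, n);
}

/* Pickle [(metric, [(timestamp, value), ...])].
   Returns the number of bytes written, or 0 if out is too small. */
size_t pickle_one_metric(const char *metric, const DataPoint *pts, size_t n,
                         unsigned char *out, size_t cap) {
    PickleBuf p = { out, cap, 0, 1 };
    size_t i;
    pk_op(&p, 0x80); pk_op(&p, 2);    /* PROTO 2 */
    pk_op(&p, ']');  pk_op(&p, '(');  /* EMPTY_LIST (outer), MARK */
    pk_short_str(&p, metric);
    pk_op(&p, ']');  pk_op(&p, '(');  /* EMPTY_LIST (datapoints), MARK */
    for (i = 0; i < n; i++) {
        pk_int32(&p, pts[i].timestamp);
        pk_double(&p, pts[i].value);
        pk_op(&p, 0x86);              /* TUPLE2: (timestamp, value) */
    }
    pk_op(&p, 'e');                   /* APPENDS into the datapoints list */
    pk_op(&p, 0x86);                  /* TUPLE2: (metric, datapoints) */
    pk_op(&p, 'e');                   /* APPENDS into the outer list */
    pk_op(&p, '.');                   /* STOP */
    return p.ok ? p.len : 0;
}
```

For the metric "cpu" with the single datapoint (1, 0.5) this writes 30 bytes, which Python should unpickle to [('cpu', [(1, 0.5)])]. The result can then be sent with a 4-byte big-endian length prefix, as in the question's Python code.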


3

If you can use C++, there is the PicklingTools library

2012-04-04 06:51
by Janne Karila