Monday, December 10, 2012

Message / Packet Parsing

A common problem in designing and writing distributed systems is the handling of the wire protocol. To solve it, many programmers go it alone and write their own serialization tools, while others trust third-party tools to ease their development.  After doing a little bit of both, I'm not convinced I prefer one approach over the other.

The example:


First let me provide an example of a message:

SampleMessage {
  int record; // unique id
  Type type; // some enumed field
  bytes message; // The data
  bytes signature; // Integrity and authenticity of this "SampleMessage"
};

Custom made serializer:

Using our own serializer, we could do the following assuming a SampleMessage msg:
bytes buffer;


buffer.append(msg.record());
buffer.append(msg.type());
buffer.append(msg.message().length());
buffer.append(msg.message());
buffer.append(msg.signature().length());
buffer.append(msg.signature());

And then on the parsing side:
bytes buffer;

msg.set_record(buffer.read_int(0));
msg.set_type(buffer.read_int(4));
int length = buffer.read_int(8); 

msg.set_message(buffer.mid(12, length));
int signature_length = buffer.read_int(12 + length); 
msg.set_signature(buffer.mid(12 + length + 4, signature_length));

So the major caveats are the following: what is the size of Type, and is it uniform across all platforms? (The offsets above also assume that an int is always 4 bytes.) Also, we're making a lot of potentially unnecessary copies for what might be large datagrams.
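One way to sidestep the size question is to pin every integer to a fixed width and byte order on the wire. Below is a minimal sketch of that idea in plain C++, not the code from any real project: the `Type` values, the `SampleMessage` struct, and the helpers `append_u32`, `read_u32`, `serialize`, and `parse` are all hypothetical names, and the choice of 32-bit big-endian fields is an assumption. Note that it still copies the message and signature out of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical stand-ins for the types in the example above.
enum class Type : std::uint32_t { Data = 0, Control = 1 };

struct SampleMessage {
  std::uint32_t record;
  Type type;
  std::string message;
  std::string signature;
};

// Append a 32-bit value in big-endian (network) byte order.
void append_u32(std::string &buf, std::uint32_t v) {
  for (int shift = 24; shift >= 0; shift -= 8)
    buf.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a big-endian 32-bit value at offset; false if the buffer is too short.
bool read_u32(const std::string &buf, std::size_t offset, std::uint32_t &out) {
  if (offset > buf.size() || buf.size() - offset < 4) return false;
  std::uint32_t v = 0;
  for (std::size_t i = 0; i < 4; ++i)
    v = (v << 8) | static_cast<unsigned char>(buf[offset + i]);
  out = v;
  return true;
}

// Layout: record(4) type(4) message_length(4) message signature_length(4) signature
std::string serialize(const SampleMessage &msg) {
  // Lengths must fit in the 32-bit length prefix.
  assert(msg.message.size() <= std::numeric_limits<std::uint32_t>::max());
  assert(msg.signature.size() <= std::numeric_limits<std::uint32_t>::max());
  std::string buf;
  append_u32(buf, msg.record);
  append_u32(buf, static_cast<std::uint32_t>(msg.type));
  append_u32(buf, static_cast<std::uint32_t>(msg.message.size()));
  buf += msg.message;
  append_u32(buf, static_cast<std::uint32_t>(msg.signature.size()));
  buf += msg.signature;
  return buf;
}

// Returns false on truncated input instead of reading past the end.
bool parse(const std::string &buf, SampleMessage &msg) {
  std::uint32_t record = 0, type = 0, length = 0, sig_len = 0;
  if (!read_u32(buf, 0, record) || !read_u32(buf, 4, type) ||
      !read_u32(buf, 8, length))
    return false;
  if (buf.size() - 12 < length) return false;
  const std::size_t sig_off = 12 + static_cast<std::size_t>(length);
  if (!read_u32(buf, sig_off, sig_len)) return false;
  if (buf.size() - sig_off - 4 < sig_len) return false;
  msg.record = record;
  msg.type = static_cast<Type>(type);  // unknown values are not rejected here
  msg.message = buf.substr(12, length);              // copies the payload
  msg.signature = buf.substr(sig_off + 4, sig_len);  // copies the signature
  return true;
}
```

With this layout the size of Type is whatever the wire format says it is (4 bytes here), regardless of what the compiler picks for the enum in memory.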

Third-party serializers (without message definition):

Alternatively, let's assume we have something like QDataStream:
QByteArray data;
QDataStream stream(&data, QIODevice::WriteOnly);
stream << msg.record() << msg.type() << msg.message() << msg.signature();
// or maybe even
stream << msg;

For parsing:
QDataStream stream(data);
stream >> msg;
// or maybe not...
int record;
Type type;
QByteArray message;
QByteArray signature;
stream >> record >> type >> message >> signature;
msg.set_record(record);
...

In this case, we just have to check that our output is sane or perhaps look at the QDataStream and ensure that it is still in good working order (Status != ReadPastEnd), but how do we check the signature matches the tuple (record, type, message) in any efficient manner?

Third-party serializers (with message definition):

A completely different kind of serializer, protobufs, would work as such:
std::stringstream stream(std::stringstream::out);
msg.SerializeToOstream(&stream);
std::string output = stream.str();

And on the return:
std::stringstream stream(output, std::stringstream::in);
msg.ParseFromIstream(&stream);

Protobuf doesn't make the signature issue any easier, and it requires both an external compiler and a library.

Thus far...

Protobufs would be great if we could encapsulate a Message within a SignedMessage, then we *should* be able to get the original character array used for constructing the Message and verify that the signature is correct.  Unfortunately that does not happen.

QByteArray does allow for constructing a QByteArray from another without doing a copy of the underlying array.  However, QDataStream does not give us the access we need to know which range of the QByteArray holds the base (unsigned) message.

Using our own method allows us to have this fine-grained control, but at the cost of writing more code and more debugging routines.
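To make the "fine-grained control" concrete, here is a rough sketch of how a home-grown format can hand back the exact signed bytes without copying them. It assumes the layout from the custom serializer above with 4-byte big-endian integers, so the signed portion is the first 12 + length bytes of the buffer. `ByteView`, `read_u32`, and `split_signed` are hypothetical names, and the actual signature check is left out because it depends on whichever crypto library is in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Non-owning view into a buffer; only valid while the buffer is alive
// and unmodified.
struct ByteView {
  const char *data;
  std::size_t size;
};

// Read a big-endian 32-bit value at offset; false if the buffer is too short.
bool read_u32(const std::string &buf, std::size_t offset, std::uint32_t &out) {
  if (offset > buf.size() || buf.size() - offset < 4) return false;
  std::uint32_t v = 0;
  for (std::size_t i = 0; i < 4; ++i)
    v = (v << 8) | static_cast<unsigned char>(buf[offset + i]);
  out = v;
  return true;
}

// Assumed layout:
//   record(4) type(4) message_length(4) message signature_length(4) signature
// On success, signed_part covers (record, type, length, message) exactly as
// received, and signature covers the trailing signature bytes. Nothing is
// copied; both views point into buf.
bool split_signed(const std::string &buf, ByteView &signed_part,
                  ByteView &signature) {
  std::uint32_t length = 0, sig_len = 0;
  if (!read_u32(buf, 8, length)) return false;
  if (buf.size() - 12 < length) return false;
  const std::size_t signed_size = 12 + static_cast<std::size_t>(length);
  if (!read_u32(buf, signed_size, sig_len)) return false;
  if (buf.size() - signed_size - 4 < sig_len) return false;
  // Invariant established by the checks above.
  assert(signed_size + 4 + sig_len <= buf.size());
  signed_part = ByteView{buf.data(), signed_size};
  signature = ByteView{buf.data() + signed_size + 4,
                       static_cast<std::size_t>(sig_len)};
  return true;
}
```

The two views can then be passed straight to a verification routine, which is the access that neither QDataStream nor protobuf gave us in the examples above.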

Similar Packets

Ideally we want to reduce our packet parsing as much as possible, so we may want to embed multiple packets in the same path.  Using something like protobuf, where we must define the data we expect to be pushing around, complicates this kind of arbitrary behavior: either we embed packets of one type as bytes in another, or the lower-level packet has to know about higher-layer packets, which breaks modularity.  The same could be said about QDataStream, but then again it allows us to avoid unnecessary copies.  Either way, both approaches feel unnatural.  If we want our home-grown packets to have these features, the code will start feeling bloated and potentially complex; welcome to a whole new world of coding bugs.
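For the home-grown route, one possible shape for this is a small envelope that carries a type tag and an opaque payload, plus a dispatcher that higher layers register with. The following is only a sketch under assumed conventions (4-byte big-endian tag and length); `wrap`, `Dispatcher`, and `Handler` are made-up names. The lower layer never learns what the payload means, and the handler receives a pointer into the original buffer rather than a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <map>
#include <string>
#include <utility>

// Append a 32-bit value in big-endian (network) byte order.
void append_u32(std::string &buf, std::uint32_t v) {
  for (int shift = 24; shift >= 0; shift -= 8)
    buf.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a big-endian 32-bit value at offset; false if the buffer is too short.
bool read_u32(const std::string &buf, std::size_t offset, std::uint32_t &out) {
  if (offset > buf.size() || buf.size() - offset < 4) return false;
  std::uint32_t v = 0;
  for (std::size_t i = 0; i < 4; ++i)
    v = (v << 8) | static_cast<unsigned char>(buf[offset + i]);
  out = v;
  return true;
}

// Assumed envelope layout: tag(4) length(4) payload.
std::string wrap(std::uint32_t tag, const std::string &payload) {
  assert(payload.size() <= std::numeric_limits<std::uint32_t>::max());
  std::string buf;
  append_u32(buf, tag);
  append_u32(buf, static_cast<std::uint32_t>(payload.size()));
  buf += payload;
  return buf;
}

// A handler sees only the payload bytes, as a pointer into the caller's buffer.
using Handler = std::function<void(const char *payload, std::size_t size)>;

class Dispatcher {
 public:
  // Higher layers register themselves; the envelope code knows only tags.
  void on(std::uint32_t tag, Handler handler) {
    handlers_[tag] = std::move(handler);
  }

  // Returns false if the envelope is malformed or no handler is registered.
  bool dispatch(const std::string &buf) const {
    std::uint32_t tag = 0, length = 0;
    if (!read_u32(buf, 0, tag) || !read_u32(buf, 4, length)) return false;
    if (buf.size() - 8 < length) return false;
    auto it = handlers_.find(tag);
    if (it == handlers_.end()) return false;
    it->second(buf.data() + 8, length);
    return true;
  }

 private:
  std::map<std::uint32_t, Handler> handlers_;
};
```

This keeps the layers decoupled, but it is exactly the kind of extra machinery (tag registries, bounds checks, lifetime rules for the payload pointer) that makes the home-grown approach grow.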

I'm still brainstorming on my conclusion, and hopefully I'll update when I'm satisfied.  Until then...

Tuesday, July 3, 2012

Social Keys

Back in my days at Florida, I worked on a project called SocialVPN with the intent of using Facebook as a medium for exchanging public keys amongst friends.  Now I am revisiting this with another project called Dissent, in which we want to use public keys from a group of users who may not be friends.  Ignoring the why for each of these applications, let me describe the way things were, where they have gone, what I would like to see, and what could be minimally done.

Back with SocialVPN, we created something called a "desktop application" to store the keys in Facebook.  An application in this sense uses Facebook in some way to enhance the user's experience in another domain, such as a game, organizational things, searches, or identity... anything that can benefit from social identity or the contents of a Facebook profile.  A desktop application, unlike a Web application, was a standalone application that did not require anything besides Facebook.  Unfortunately, this flexibility is probably why desktop applications did not live very long.  Using a Facebook application, we could store the keys within an application-specific store accessible only to the application.  Unfortunately, desktop applications required that both the Application ID and Secret be embedded within the application, and thus a hacker could retrieve both of them, muck around in the application's data store, and perform a man-in-the-middle attack.  I suspect this was one of the many reasons why Facebook moved to Web Applications and away from Desktop Applications.

One day, in the midst of success, this suddenly stopped working.  It was horrible!  We quickly created a Web application to do the same thing.  This time, I suppose it was actually "safe" (assuming we did not introduce any accidental bugs), but unfortunately, this meant the clients had to trust us.  I did not want that!  Anyway, we got fed up with all this nonsense and began investigating GoogleTalk and XMPP.  Wow, that was wonderful.  Long story short, we found everything we needed to exchange keys between friends in a reasonable fashion without relying on third-party services (besides Google, of course).

Fast-forward several years, and again, we are considering the same task of inserting keys into Facebook.  I had hoped that the years would have been good to developers and Facebook would have refined their APIs somehow or another.  Taking a step back, let me first explain what we really want.  Ideally, a public key would be a first-class citizen in the social network (SN) realm, such that under your info, there would be a special field where your key is stored.  Perhaps the key would actually be signed by Facebook so it could be easily redistributed outside of Facebook, yet still trusted.  In terms of API, perhaps the Facebook Graph API could be extended as such: https://graph.facebook.com/$userid?fields=public_key which would allow applications to retrieve a base64-encoded key.  Furthermore, this interface should be open to the public, or as much as the profile is, so that the user can authenticate to members who are not friends but have some association, such as a common group.  Unfortunately, this does not exist, nor have I seen anything like it coming from SNs.  I will admit that there was a position paper 2 years after our SocialVPN papers clamoring for SocialKeys; I should read it and update this entry...

So where are we now?  While Facebook's APIs have matured, so have their privacy measures.  I suppose the APIs are largely the same from where we left off; we just never envisioned using Facebook's own internal data structures to store the key.  Well, we did at one point consider using steganography to embed it within the user profile picture, but I think that can be challenging since the picture probably has been internally manipulated by Facebook, which would probably destroy traces of the public key.  Other fields can also be populated, such as notes or picture albums, which can be searched by their subject or title, respectively, using FQL.  Unfortunately, notes do not allow an application to change the privacy setting and use the default user setting for notes.  Picture albums could potentially be used, but the privacy setting cannot be looser than what the application is set to.  By default applications are set to the profile as well; thus the only Facebook-only option that would be guaranteed to work would require user interaction.  Furthermore, an application must be granted the privilege of creating, editing, or deleting content.  Once this privilege has been granted by a user, it must be manually deleted through the applications menu, which is not a very user-friendly interface for an application.  Tin foil hat wearers, and many other users, should be wary of an application that needs constant write access for a one-time (or occasional) operation.

A potential solution that should always work would be for us to have an application that reads the key, prints a base64 version for the user to copy into a note, and then verifies that the user has done so correctly.  This may in fact be a completely reasonable approach, but it does require greater user interaction than most casual users are interested in, which would certainly limit our application's appeal.

Yet another issue that may come up in working with a profile is that of "hidden users," or users who show up in the group or have some presence, but whose profile is only accessible to friends.  If an application needs uniform access to profiles, it should be able to detect these types of users, warn them that their profile is too restrictive, and prevent the accidental use of their credentials from impacting the rest of the group.  For example, a friend of these users may see a different set of public keys than someone who is not.

So thinking rather practically, what is one way we can accomplish this without any "additional" user interaction?  Why not let an external app manage the data, like we did back with SocialVPN?  These days Facebook has teamed with Heroku to provide a free service, so devs do not even need to manage or pay for the service.  While this seems like a great solution, it changes ownership of the key from the profile to the application.  In the earlier iterations discussed above, the profile owned the key, independent of the application.  In this system, the application owns the key, but links it to a profile.  Granted, this is probably more tin foil hat wearing than anything else, but because the key does not appear in the user's profile, another user must trust the server that it is the correct public key.  I also believe this is a bit stronger of an assumption than trusting the social network, which is inherent in the design anyway.  A third-party developer is a stopgap until social networks embrace social keys; perhaps I need to come up with persuasive arguments for why they should.

So maybe I can get some feedback on this problem.  That would be great.  Alternatively, we may just go with the external application, since it is guaranteed to work, but also discuss the "safer" alternatives.