Protocol Buffers

Shawn Lin


  • flexible, efficient, automated mechanism for serializing structured data.
  • think XML, but smaller, faster, and simpler.
  • use special generated source code to easily write and read your structured data.
  • update your data structure without breaking deployed programs that are compiled against the "old" format.

Why not just use XML?
Protocol buffers have many advantages over XML for serializing structured data. Protocol buffers:

* are simpler
* are 3 to 10 times smaller
* are 20 to 100 times faster
* are less ambiguous
* generate data access classes that are easier to use programmatically
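
To give a feel for where the size difference comes from, here is a minimal sketch of how the protocol buffer wire format stores a small integer field as base-128 varints. The function names AppendVarint and EncodeVarintField are hypothetical helpers written for this illustration, not part of the protobuf library, and the sketch covers only non-negative integer fields.

```cpp
#include <cstdint>
#include <string>

// Append `value` to `out` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last.
void AppendVarint(std::uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Encode one integer field: a tag varint (field number << 3 | wire type 0)
// followed by the value varint. Hypothetical helper for illustration only.
std::string EncodeVarintField(std::uint32_t field_number, std::uint64_t value) {
  std::string out;
  AppendVarint((static_cast<std::uint64_t>(field_number) << 3) | 0, &out);
  AppendVarint(value, &out);
  return out;
}
```

Under these assumptions, EncodeVarintField(2, 150) yields the three bytes 0x10 0x96 0x01, while the XML text <id>150</id> takes 12 bytes. The field name never appears on the wire, only its number.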

How do they work?

  • You specify how you want the information you're serializing to be structured by defining protocol buffer message types in .proto files.
  • Each protocol buffer message is a small logical record of information, containing a series of name-value pairs.

package tutorial;

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

message AddressBook {
  repeated Person person = 1;
}

Three field rules

  • required: a value for the field must be provided, otherwise the message will be considered "uninitialized"
  • optional: the field may or may not be set. If an optional field value isn't set, a default value is used.
  • repeated: the field may be repeated any number of times (including zero).
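
The three rules can be pictured with a hand-written stand-in in plain C++17. This is only a sketch of the semantics: PersonSketch and email_or_default are hypothetical names, the phone field is simplified to strings, and the code protoc actually generates looks different.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hand-written stand-in for illustration only; not what protoc generates.
struct PersonSketch {
  std::optional<std::string> name;   // required string name = 1;
  std::optional<std::int32_t> id;    // required int32 id = 2;
  std::optional<std::string> email;  // optional string email = 3;
  std::vector<std::string> phone;    // repeated: zero or more entries

  // Mirrors the idea behind IsInitialized(): every required field is set.
  bool IsInitialized() const { return name.has_value() && id.has_value(); }

  // An unset optional field reads as its default (an empty string here).
  std::string email_or_default() const { return email.value_or(""); }
};
```

A default-constructed PersonSketch is "uninitialized" until both name and id are assigned. Its email reads as the default and its phone list is empty, which is still a valid state.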

Start working

  • Run the compiler on your .proto file: protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/addressbook.proto
  • This generates the following files in your specified destination directory:
  • addressbook.pb.h, the header which declares your generated classes.
  • addressbook.pb.cc, which contains the implementation of your classes.

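// Writing a message (field values are examples)
Person person;
person.set_name("John Doe");
person.set_id(1234);  // id is a required field, so it must be set
person.set_email("jdoe@example.com");
fstream output("myfile", ios::out | ios::binary);
person.SerializeToOstream(&output);

// Reading the message back (a separate snippet)
fstream input("myfile", ios::in | ios::binary);
Person person;
person.ParseFromIstream(&input);
cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;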

Each message class also has methods that operate on the entire message, including:

  • bool IsInitialized() const;: checks if all the required fields have been set.
  • string DebugString() const;: returns a human-readable representation of the message, particularly useful for debugging.
  • void CopyFrom(const Person& from);: overwrites the message with the given message's values.
  • void Clear();: clears all the elements back to the empty state.
  • bool SerializeToString(string* output) const;: serializes the message and stores the bytes in the given string. Note that the bytes are binary, not text; we only use the string class as a convenient container.
  • bool ParseFromString(const string& data);: parses a message from the given string.
  • bool SerializeToOstream(ostream* output) const;: writes the message to the given C++ ostream.
  • bool ParseFromIstream(istream* input);: parses a message from the given C++ istream.