Protocol design question - Journal of Omnifarious

Jan. 27th, 2007

11:58 am - Protocol design question


So, I'm doing some protocol design for CAKE, and I'm wondering about an issue involving very long messages, verification, and MACs.

I have a message format that would allow gigantically huge messages, on the order of a tebibyte in size. It's highly unlikely that there will ever be any messages that large, but I feel that for limits of this nature, it's better to go for extreme overkill.

But I also recognize that since messages might be fairly large, they may be generated on the fly without any idea of how long they will eventually be. HTTP solves this problem with chunked transfer encoding, and I have a very similar idea. But it means that the length of the message can't be specified fully at the start.

In previous renditions of the protocol, I decided that the last chunk of a message was the first chunk less than 1000 bytes long. I also had a MAC at the end of each chunk that covered the chunk length field along with the data. This prevented chunks from being maliciously reordered or modified, and it prevented the message from being maliciously shortened. The receiver could easily detect any of these manipulations, reject the message, and request that the sender send it again in the hope of getting a good copy.
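A minimal sketch of what that per-chunk scheme could look like, in Python. The framing here is invented for illustration (the field widths, HMAC-SHA-256, and the explicit sequence number are assumptions, not the actual CAKE format):

    import hmac
    import hashlib
    import struct

    END_THRESHOLD = 1000  # a chunk shorter than this ends the message

    def frame_chunk(key: bytes, seq: int, payload: bytes) -> bytes:
        # The MAC covers the sequence number and length field as well as the
        # data, so reordering, modification, and truncation are detectable.
        header = struct.pack(">QI", seq, len(payload))
        mac = hmac.new(key, header + payload, hashlib.sha256).digest()
        return header + payload + mac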

But this time I would also like parties in the middle to be able to collect chunks together. Basically, if the message consists of a 50000 byte chunk, a 3000 byte chunk, and a 5 byte chunk, I would like someone in the middle to be able to rewrite the message as a single 53005 byte chunk.

But I was also thinking it would be nice to keep the little along-the-way checks that the original version's end-of-chunk MACs provided. The only problem is that if the length of the chunk isn't included in the MAC, someone in the middle could maliciously shorten the message without the receiver being able to detect it.

I can't think of a good way to preserve both properties, and I'm not sure which is more important. So that's my question to all of you. Is there a good way to preserve both? If there isn't, which matters more: that the receiver can check each chunk as it arrives, so it has a good running idea of the health of the message as it comes in, or that intermediaries (who may be of the store-it-for-minutes-to-days variety) can coalesce chunks?

Current Mood: contemplative

Comments:

From: srlamb
Date: January 27th, 2007 10:01 pm (UTC)

What's the goal of coalescing chunks?

It sounds like by allowing intermediaries to coalesce chunks, you're hoping to save bandwidth and/or storage costs. If that's true, let's determine the overhead of the worst case: infinite message size (so per-message overhead approaches zero), generously large headers (say, 5 bytes of offset (good for the full tebibyte at one-byte granularity), 5 bytes of length (likewise), and a 64-byte SHA-512 MAC), and stingy chunk sizes (say, 4,096 bytes including headers). 1-(4096-5-5-64)/4096 = 1.8% overhead. That's not so bad.

If it's not good enough...what's a typical MAC, 8 bytes? And if you said that all non-final chunks had to be 16,384 bytes, then you could do something like using 5 bytes for both the offset and the length - mask off the top 26 bits for the offset, mask off the bottom 14 bits for the length. I guess that's not quite right, as you'd have trouble terminating messages that are a multiple of the chunk size; you might have to steal another bit from the MAC for a 'final' flag (leaving a 63-bit MAC). That's 13 bytes of headers in a 16,384 byte chunk. 0.08% overhead.

Actually, you could do even better if you always check MACs serially. Have the MAC be a running one over the whole message so far, not just the one chunk. A 64-bit MAC, 15-bit length, and 1-bit terminal flag means 10 bytes of overhead. At a maximum chunk size of 32,768 bytes, that's 0.03%.
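A sketch of that running-MAC framing under the stated assumptions (HMAC-SHA-256 truncated to 64 bits; storing the length minus one so a full 32,768-byte chunk fits in 15 bits is my tweak, not part of the proposal):

    import hmac
    import hashlib
    import struct

    MAX_CHUNK = 32768  # 2**15

    def frame_running(state, payload: bytes, final: bool) -> bytes:
        # 15-bit length, 1-bit final flag, 64-bit truncated running MAC:
        # 10 bytes of overhead per chunk, matching the estimate above.
        assert 1 <= len(payload) <= MAX_CHUNK
        word = ((len(payload) - 1) & 0x7FFF) | (0x8000 if final else 0)
        header = struct.pack(">H", word)
        state.update(header + payload)     # the MAC runs over everything so far
        mac64 = state.copy().digest()[:8]  # truncated to 64 bits
        return header + payload + mac64

    # Usage: state = hmac.new(key, b"", hashlib.sha256), then one call per chunk.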

You could have a stingy MAC on each chunk and a really strong one at the end.
From: omnifarious
Date: January 28th, 2007 02:26 am (UTC)

Re: What's the goal of coalescing chunks?


I'm using the count type, so the per-message overhead for the length field is very small. The MAC, OTOH, is much larger, but your other idea of using a much smaller MAC for the intermediate checks is a really interesting one. It would solve the problem of the MAC being used to make the message seem to end early, since the final MAC would be required to be a full 256 bits.

It isn't exactly to allow intermediaries to save overhead. It's more to allow intermediaries more flexibility in dealing with messages in whatever way is efficient for them.

And to answer your other question, my main concern is a DoS attack on the recipient. But really, nothing I can do will make that even a tiny bit harder. It's just better for a recipient to advertise a maximum message size that it can deal with without using too many resources.

From: srlamb
Date: January 28th, 2007 02:49 am (UTC)

Re: What's the goal of coalescing chunks?

Hmm, I still don't understand what you mean by flexibility. Without changing the structure of the chunks, it seems like they should be able to store uncoalesced chunks concatenated together, multiple headers and all.
From: omnifarious' OpenID account [omnifarious.org]
Date: January 29th, 2007 12:17 am (UTC)

Re: What's the goal of coalescing chunks?


Yeah, you're right. *thumps head* Sometimes you just get caught in stupid thought traps.

From: omnifarious
Date: January 30th, 2007 11:27 pm (UTC)

Re: What's the goal of coalescing chunks?


Oh, I just remembered the real reason... :-) I don't want people who use the protocol to be able to make assumptions about chunk lengths being preserved. If the chunk lengths are forced to be preserved by the nature of the protocol, then it would be very tempting to use that fact.

I think I will be using your solution with short MACs. I'll just use a 16-bit MAC as a health check along the way, and a full MAC at the end. The MACs will not include message lengths, but truncation will be prevented because an intermediary will not be able to extend a short MAC into a full MAC to use at the end.

I don't know whether or not the intermediate health checks are useful. But I feel better putting them in, and they shouldn't be that big an efficiency burden given that the message header will typically be more than a hundred bytes anyway.
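A rough sketch of that scheme (the framing is illustrative; HMAC-SHA-256 and appending the tag directly to each chunk are assumptions): every chunk carries a 16-bit truncated running MAC except the last, which carries the full 256 bits, so a cut-short message can never end in a valid full-length MAC.

    import hmac
    import hashlib

    def mac_chunks(key: bytes, chunks: list) -> list:
        # Intermediate chunks get a 16-bit truncated tag as a cheap running
        # health check; the final chunk gets the full 256-bit MAC.
        running = hmac.new(key, b"", hashlib.sha256)
        framed = []
        for i, chunk in enumerate(chunks):
            running.update(chunk)
            tag = running.copy().digest()
            framed.append(chunk + (tag if i == len(chunks) - 1 else tag[:2]))
        return framed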

From: srlamb
Date: January 28th, 2007 07:46 am (UTC)

Re: What's the goal of coalescing chunks?

By the way, why your variable-length non-negative integer encoding over the elegant one used by Scarab's binary format, or the older, widely used ASN.1 BER? If I understand them all, the number of bytes used goes something like this:

input range     Scarab BER bytes   ASN.1 BER bytes   Cake count bytes
[0, 2^7)        1                  1                 1
[2^7, 222)      2                  3                 2
[222, 8415)     2                  3                 2
[8415, 2^14)    2                  3                 3
[2^14, 2^16)    3                  3                 3
[2^16, 2^21)    3                  4                 4
[2^21, 2^24)    4                  4                 4
[2^24, 2^28)    4                  5                 5
[2^28, 2^32)    5                  5                 5
[2^32, 2^35)    5                  6                 6
[2^35, 2^40)    6                  6                 6
[2^40, 2^42)    6                  7                 7
[2^42, 2^48)    7                  7                 7
[2^48, 2^49)    7                  8                 8
[2^49, 2^56)    8                  8                 8
[2^56, 2^63)    9                  9                 9
[2^63, 2^64)    10                 9                 9


I'm not sure which is more efficient. Seems like if you came up with an idea of the distribution of numbers used as input, you could calculate the expected value of the length. My coding theory is not good, but Benford's law suggests maybe a power-law distribution like the Pareto?

Efficiency aside, I like Scarab's elegance (simpler, no length limit, no values with two representations), and it's apparently common enough that Perl's pack and unpack support it natively. All three formats screw over the alignment that makes simple and fast hardware possible, but I guess that's just the nature of the game.
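For reference, a sketch of that base-128 scheme (the BER compressed integer Perl's pack/unpack expose as "w", if I'm reading it right): seven value bits per byte, most significant group first, with the high bit set on every byte except the last.

    def varint_encode(n: int) -> bytes:
        # Base-128, big-endian; the high bit marks "more bytes follow".
        assert n >= 0
        out = [n & 0x7F]
        n >>= 7
        while n:
            out.append((n & 0x7F) | 0x80)
            n >>= 7
        return bytes(reversed(out))

    def varint_decode(data: bytes) -> int:
        n = 0
        for b in data:
            n = (n << 7) | (b & 0x7F)
        return n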

Apparently there's also ASN.1 PER, supposedly more efficient and complex, with an option to not even align to byte boundaries.
From: srlamb
Date: January 28th, 2007 09:23 pm (UTC)

Re: What's the goal of coalescing chunks?

Hmm, typo in the second row: the Cake value should be 1.

A quick Python program calculates expected lengths for some distributions I found interesting:

$ ./lengthtest.py 
Expected lengths (calculated with sample size 50000):
Distribution                    Scarab  BER     Cake
paretovariate(alpha=1)          1.01    2.00    1.00
paretovariate(alpha=1/2)        1.10    2.07    1.08
paretovariate(alpha=1/5)        1.61    2.49    1.66
paretovariate(alpha=1/10)       2.59    3.34    2.75
expovariate(lambd=1)            1.00    2.00    1.00
expovariate(lambd=1/10)         1.00    2.00    1.00
expovariate(lambd=1/100)        1.28    2.08    1.11
expovariate(lambd=1/1000)       1.88    2.77    1.80
expovariate(lambd=1/10000)      2.18    2.98    2.41
lognormvariate(mu=0,sigma=1)    1.00    2.00    1.00
lognormvariate(mu=0,sigma=10)   1.59    2.49    1.67
lognormvariate(mu=0,sigma=100)  9.01    8.98    8.44
lognormvariate(mu=1,sigma=1)    1.00    2.00    1.00
lognormvariate(mu=1,sigma=10)   1.68    2.56    1.78
lognormvariate(mu=1,sigma=100)  9.11    9.06    8.53
lognormvariate(mu=2,sigma=1)    1.00    2.00    1.00
lognormvariate(mu=2,sigma=10)   1.76    2.64    1.88
lognormvariate(mu=2,sigma=100)  9.19    9.13    8.61
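The program itself isn't shown, but a harness along these lines would produce that kind of table. This sketch covers only the Scarab column; the BER and Cake length functions would be written analogously, and the sampling works the same way:

    import random

    def varint_len(n: int) -> int:
        # Bytes needed to encode n as a base-128 varint (the Scarab column).
        length = 1
        while n >= 128:
            n >>= 7
            length += 1
        return length

    def expected_len(sample, encoded_len, trials: int = 50000) -> float:
        # Monte Carlo estimate of the expected encoded length.
        return sum(encoded_len(int(sample())) for _ in range(trials)) / trials

    print(expected_len(lambda: random.paretovariate(0.5), varint_len))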


From: omnifarious
Date: January 29th, 2007 12:12 am (UTC)

Re: What's the goal of coalescing chunks?


I'm not overly concerned with efficiency beyond a certain point. And it looks like a CAKE count has a slight advantage there in most situations anyway.

ASN.1 has a reputation for being horribly obscure and painful to use. I don't know enough about Scarab to make a judgement. I would like to use something relatively nice and commonly accepted, especially if lots of parsers exist for it already. Thinking about it, this makes D-BUS vaguely attractive, though that seems to have more baggage than I want.

OTOH, I don't need the data format to be self-describing. Having type information encoded in the data isn't a negative if it doesn't impact efficiency, but I think it's largely superfluous for CAKE.

From: omnifarious
Date: January 28th, 2007 02:27 am (UTC)

Re: What's the goal of coalescing chunks?


Also, your two responses were really helpful. :-) How did you find this post? How did you find my journal at all? :-)

From: srlamb
Date: January 28th, 2007 02:46 am (UTC)

Re: What's the goal of coalescing chunks?

Glad I could help. I saw your post in advogato's recentlog. I'm slamb there.
(Reply) (Parent) (Thread)
From:srlamb
Date:January 27th, 2007 10:06 pm (UTC)

What's the goal of running checks?

And the opposite question: what are you trying to accomplish with running health checks? Is something likely to be corrupted in storage or transit by chance, or does a lower level take care of that for you? I'm guessing the latter; TCP or a standard filesystem will do this for you. So you're just trying to detect an attack before sending the entire message? I guess that'd be some sort of DoS attack: corrupt a large message somehow and make you transmit it over and over trying to get it through?
From: hattifattener
Date: January 27th, 2007 10:40 pm (UTC)
It seems to me that there's another desirable property, of allowing intermediate nodes to fragment a chunk without losing integrity checking of the newly-subdivided chunks. That seems really difficult. I bet there's been some academic research into it, though.

One approach would be to just not allow coalescing of MACs, even if you allow coalescing of chunks. The originator would generate a MAC for every N bytes of the message, with the MAC being computed over the data plus a header (start-offset and length of this chunk). This means the message could end up with arbitrarily many MACs, but the overhead can be kept below a fixed fraction of the total message size, so it still scales well. This is kind of how Bittorrent does it.

Intermediaries can still coalesce chunks, as long as the coalesced chunk still carries along with it the multiple MACs associated with it. Intermediaries can subdivide chunks, but the subdivided chunk is not checkable until the rest of that MAC's data has been collected.
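A sketch of that fixed-interval scheme (the exact header layout and HMAC-SHA-256 are illustrative choices): one MAC per N-byte interval, each computed over (start offset, length, data), with the MAC list carried alongside the message however intermediaries regroup the bytes.

    import hmac
    import hashlib
    import struct

    def interval_macs(key: bytes, message: bytes, n: int = 16384) -> list:
        # One MAC per n-byte interval; coalescing or splitting chunks doesn't
        # invalidate the list, since each MAC names its own offset and length.
        macs = []
        for off in range(0, len(message), n):
            piece = message[off:off + n]
            header = struct.pack(">QQ", off, len(piece))
            macs.append(hmac.new(key, header + piece, hashlib.sha256).digest())
        return macs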

For whole-message authentication, the originator could generate a final MAC over the whole message, or over the collection of all the message's MACs (or, more generally, over a pyramidal tree of M(AC)^n s), or the MACs could be chained (the implicit header hashed into each MAC would include the previous chunk's MAC as well as a flag indicating whether this is the last chunk). The tradeoff here is between allowing parallelization of message generation (maybe the message is a response from a distributed data store, and it would be nice to be able to generate each chunk independently, followed by computation of the final MAC) and allowing message streaming with O(1) storage (my CAKE-enabled webcam, for example, may not have an arbitrary amount of storage available to it).

On the third hand, maybe the right answer is to try to disentangle fragmentation-and-reassembly from the CAKE layer entirely (or at least from the integrity-protection mechanisms). CAKE messages could be nonfragmentable and limited to less than 2^32 bytes, but CAKE could define a standard protocol for communicating longer messages, much like IP datagrams are limited to 2^16 bytes but almost everybody uses TCP to handle longer messages.

From: omnifarious' OpenID account [omnifarious.org]
Date: January 28th, 2007 02:36 am (UTC)

Allowing messages to be transported in a bittorrenty sort of way is an interesting idea. I think the operative construct here would be a Merkle Hash Tree (http://open-content.net/specs/draft-jchapweske-thex-02.html#anchor2). This allows you to pick a fixed, small block size and then permit arbitrary coalescing and splitting of message data on block-size boundaries. I believe that what a .torrent file largely consists of is all the hash data for just such a construct with a 64 kibibyte block size.
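A minimal sketch of such a tree along the lines of the linked THEX draft (THEX separates leaves from internal nodes with 0x00/0x01 prefixes; the hash function and block size here are illustrative, not THEX's actual choices):

    import hashlib

    def merkle_root(data: bytes, block: int = 65536) -> bytes:
        # Leaf hashes get a 0x00 prefix, internal nodes a 0x01 prefix, and an
        # odd node is promoted unchanged to the next level.
        h = hashlib.sha256
        level = [h(b"\x00" + data[i:i + block]).digest()
                 for i in range(0, len(data), block)] or [h(b"\x00").digest()]
        while len(level) > 1:
            level = [h(b"\x01" + level[i] + level[i + 1]).digest()
                     if i + 1 < len(level) else level[i]
                     for i in range(0, len(level), 2)]
        return level[0]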

It will be possible for recipients to advertise a maximum message size to the world. So forcing messages to be small and then be split up isn't really necessary. A recipient can make this happen by advertising a maximum message size that's fairly small.

But your ideas give me some other interesting ideas, which I'll try to detail in another reply. I want to catch a bus that is coming soon though.

From: hattifattener
Date: January 28th, 2007 03:27 am (UTC)
Yeah, that's what I was referring to with my comment about pyramidal tree of MACs.

Bittorrent doesn't use a hash tree. The file is divided into chunks (of arbitrary size, but it has to be uniform for any given .torrent) and the hashes of all the chunks are in the .torrent file.
From: omnifarious
Date: January 28th, 2007 11:42 pm (UTC)

Oh, it doesn't? I'm mildly disappointed. Those seemed so nice. :-)

There is a way to handle a hash tree where only the top-level hash needs to be handed out centrally. Then any other node can send you the data for a block plus at most log_b n hashes (where b is the block size and n is the total file size) in order for you to verify the block it sent. And usually it's a lot less.
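A sketch of that verification step (matching the illustrative tree above, with its 0x01 internal-node prefix): a block's hash plus its sibling hashes up the tree is enough to recompute the root and compare.

    import hashlib

    def verify_block(leaf_hash: bytes, siblings: list, root: bytes) -> bool:
        # siblings: (hash, is_left) pairs from the leaf up to the root; there
        # are only logarithmically many of them.
        h = leaf_hash
        for sibling, sibling_is_left in siblings:
            pair = sibling + h if sibling_is_left else h + sibling
            h = hashlib.sha256(b"\x01" + pair).digest()
        return h == root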

From: omnifarious
Date: January 28th, 2007 02:39 am (UTC)

OpenID and links


I hate what LJ does to links posted by OpenID users.

Merkle Hash Trees

From: omnifarious
Date: January 28th, 2007 02:41 am (UTC)

Even better link


Hash tree from Wikipedia.

From: omnifarious
Date: January 29th, 2007 12:41 am (UTC)

My other, related idea is to implement certain services in CAKE using a peculiar underlying model.

Anytime static data is sent back, it's never sent back within the CAKE exchange that requested it. All that's sent back is the hash of the data that was requested. Then a separate step is done to fetch the data the hash represents. This allows all kinds of arbitrary caching mechanisms to be created to handle things. It solves a couple of what I consider major problems with the web: first, that a server always has to be reachable; second, that the server has to have the bandwidth to send everybody the data they're asking for.

Always using the root hash of a Merkle tree hash allows a LOT more flexibility in how caching schemes might be implemented.
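A toy sketch of that two-step model (the caches here are stand-in dicts keyed by digest; CAKE's actual fetch mechanism isn't specified in this thread): the original response carries only a hash, and the data can then come from any cache, since the hash makes the copy verifiable.

    import hashlib

    def fetch(digest: bytes, caches: list) -> bytes:
        # Any reachable cache will do; integrity comes from the hash itself.
        for cache in caches:
            data = cache.get(digest)
            if data is not None and hashlib.sha256(data).digest() == digest:
                return data
        raise KeyError("no reachable cache holds this object")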

From: hattifattener
Date: January 27th, 2007 10:45 pm (UTC)
And on a random tangent, are you aware of Gale? It seems kind of CAKEy.
From: omnifarious
Date: January 29th, 2007 12:43 am (UTC)

Hmm, at first glance it's missing one of the more important ideas, which is to identify recipients first and foremost by a hash of their public key, and to use various other names for them as either locations or pet names.

From: procyon112
Date: January 30th, 2007 01:17 am (UTC)
From: omnifarious
Date: January 30th, 2007 11:18 pm (UTC)

The messages are datagrams, but it's unlikely that they will ever be sent directly over Ethernet, so the frame size of the medium they're sent over isn't very important. Each message would likely be a 'transactional unit' for whatever protocol is being implemented. For example, in the case of email, each message would likely be a complete email message.

Now, message chunks are something completely different. For example, in HTTP a web server will run a script whose output length isn't known in advance. So the web server captures the output of the script and periodically sends the output it has collected so far to the client, prefixing each chunk with a length. A special tag (in HTTP's case, a zero-length chunk) indicates that no more chunks are to follow.
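A sketch of that framing (the fixed 4-byte length prefix is just for illustration; HTTP itself writes lengths as hex ASCII):

    import struct

    def write_chunks(out, pieces) -> None:
        # Each chunk is length-prefixed; a zero-length chunk ends the message.
        for piece in pieces:
            out.write(struct.pack(">I", len(piece)) + piece)
        out.write(struct.pack(">I", 0))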

As someone else pointed out, there is no reason for intermediate nodes to be able to combine or split message chunks. It buys you very little in the way of storage or bandwidth efficiency, especially if you use a very compact encoding for the chunk lengths and intermediate hashes.

So, having all the lengths as trailers defeats the purpose in several ways. First, it requires an EOM delimiter that's outside the protocol. Second, it's still vulnerable to message truncation, since a router can just drop trailing message chunks without invalidating anything. Lastly, it would be nice if the thing generating the message didn't have to keep track of all the chunks it has sent just so it can send a trailer with all that data.

Also, I will be using MACs, not signatures. This is because MACs can be forged by the recipient, and I explicitly want this. If you sign every single message you send, there is a log by which the person who gets your messages can later prove to the whole world that you sent them. One of the nice things about having a conversation that supposedly includes you outed on the Internet is that you can claim you didn't necessarily say all that, and that someone edited the conversation. With signatures instead of MACs, that claim can be proven to be a lie.
