danger to HTTPS, doom to SPDY

Since the BREACH attack, it seems that there is no way to transport content securely in the HTTP world.

The BREACH attack is an HTTP version of CRIME, which recovers encrypted messages by analyzing how well different content compresses. It is well known that an observer can tell pictures from text by the compression ratio; however, before CRIME, there was no easy way to work out exactly what the information was from the ratio alone. But the leak has always existed. The words “faster” and “sunoru” have the same length. However, the entropy (in bits per character) of “faster” is 2.58496, and the entropy of “sunoru” is 2.25163. So, if you know the original length (6) of the words and can also measure their entropy, you can learn a good deal from the results. With a “perfect” compression algorithm and an observe-only attacker, you learn how many times each letter appears in each word, which is generally not very useful (though it should not be public even so). But real-world compression algorithms are NOT perfect, and a real-world attacker is NOT limited to observing. You can send messages to the server to determine which compression method it is using, and you can extract much more information from the simple ratio if multiple requests are made, as in the CRIME attack.
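The two entropy figures quoted for “faster” and “sunoru” can be reproduced with a few lines of Python. This is only a sketch: it assumes “entropy” here means Shannon entropy over the letter frequencies of each word, measured in bits per character, and the function name `shannon_entropy` is made up for the illustration.

```python
import math
from collections import Counter


def shannon_entropy(word: str) -> float:
    """Shannon entropy of a word in bits per character, from letter counts."""
    counts = Counter(word)
    total = len(word)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


for word in ("faster", "sunoru"):
    # Prints 2.58496 for "faster" and 2.25163 for "sunoru".
    print(word, round(shannon_entropy(word), 5))
```

“faster” has six distinct letters, so it reaches the maximum of log2(6) bits; “sunoru” repeats the letter u, which lowers the figure.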

For HTTPS, it represents a danger for web pages that carry simple information. For example, some banks in China use a number rendered in a picture to show how much money you have; when the picture is compressed, it is fairly easy to recover the number it shows from the compression ratio. By using a precomputed table, you can decode millions of those “money pictures” per second on a MacBook Air. So if you find that your bank is sending your balance as a picture, you should be aware that it may as well be a deliberate way to publish that information to the whole net.

However, with SPDY, your app may be cracked even without such a special setup. SPDY’s speed relies on compressed headers, which include the URL, the cookies, and the authorization token. As the client sends these headers whenever the user visits the same site, you just need to use XSS to point the client at a static page (e.g., a 404 page), and then you can recover all the information in the headers without any painful struggle. And once you have the headers, you have the URL (so the complete browsing history is public), the cookie and authorization token (so the user’s login session), and with them all the content of the page. So it is just as if you were visiting the page over HTTP without the S.
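The core of the header attack can be sketched with Python’s `zlib` module. Everything concrete below is an assumption made for illustration: the cookie value, the request layout, and the helper name `observed_length` are hypothetical, and plain zlib stands in for SPDY’s actual header compression. The point is only that an attacker who controls part of a request, and who can see the compressed size, learns whether the guess matches the secret compressed next to it.

```python
import zlib

# Hypothetical secret that the victim's client attaches to every request.
SECRET_HEADER = b"Cookie: session=4f9a1c7be2d0583a"


def observed_length(injected: bytes) -> int:
    """Size of the compressed request, as an eavesdropper would measure it.

    `injected` is attacker-controlled (for example, part of a URL the victim
    is tricked into requesting); SECRET_HEADER is added by the client and is
    compressed in the same stream.
    """
    request = b"GET /?q=" + injected + b" HTTP/1.1\r\n" + SECRET_HEADER + b"\r\n"
    return len(zlib.compress(request, 9))


# Two guesses of equal length: one equals the secret, one does not.
right = observed_length(b"Cookie: session=4f9a1c7be2d0583a")
wrong = observed_length(b"Cookie: session=a3850d2eb7c1a9f4")
print(right, wrong)  # the matching guess compresses to fewer bytes
```

A real attack refines the guess a character at a time over many requests rather than testing whole tokens, and the size differences it measures are much smaller, so this sketch only shows the direction of the leak.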

Not only HTTPS and SPDY are affected; Tor, which uses gzip as its compression algorithm, is affected as well. But Tor may not be so easy to crack, as it reuses TCP tunnels… SSH with compression enabled can also be attacked this way. However, it takes some skill and luck to do the gzip guessing, as you cannot easily make the user resend things.
In conclusion, SPDY is just like clear text for a careful attacker, and HTTPS is not so secure anymore…

The good news is that the network working group has finally recognized the danger in compression and decided to drop compression support in TLS 1.3 draft-02. Did I say that is good news? It does not seem like a pleasant change for those who only have limited network bandwidth…


SNI stands for Server Name Indication, a TLS extension that lets the server know which domain the client is connecting to and return the corresponding certificate, which makes it possible for a single IP to serve multiple HTTPS sites. It is defined in RFC 6066, section 3.

The extension changes the TLS handshake. The client should include a list of the DNS names of the server it wants to connect to. If the server has a matching certificate, the handshake goes on normally. If not, the server should either send a fatal-level alert and drop the connection, or just go on as if nothing happened (and present the default certificate).
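To make the “list of DNS names” concrete, here is a rough sketch of how the extension bytes are laid out, following the `ServerNameList` structure in RFC 6066, section 3. The function name `build_sni_extension` is made up for this example, and it encodes only a single `host_name` entry; a real TLS library builds this inside the ClientHello for you.

```python
import struct


def build_sni_extension(hostname: str) -> bytes:
    """Encode a server_name extension carrying one host_name entry."""
    name = hostname.encode("ascii")
    # ServerName: name_type (1 byte, 0 = host_name), then the HostName as a
    # 2-byte length followed by the name bytes.
    entry = struct.pack("!BH", 0, len(name)) + name
    # ServerNameList: 2-byte length prefix followed by the entries.
    name_list = struct.pack("!H", len(entry)) + entry
    # Extension header: type (0 = server_name), then the 2-byte length of
    # the extension data.
    return struct.pack("!HH", 0, len(name_list)) + name_list


print(build_sni_extension("example.com").hex())
```

The host name travels as plain ASCII inside the ClientHello, which is why anyone watching the connection can read it.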

The extension also affects the session cache of the TLS server. A TLS server that supports the extension will never resume a session for the client if the server_name does not match, even if everything else the client presents checks out.
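The resumption rule can be pictured as one extra comparison in the cache lookup. The sketch below is a toy in-memory cache, not how any particular TLS stack stores sessions; the class `SessionCache` and its methods are hypothetical names chosen for the illustration.

```python
from typing import Dict, Optional, Tuple


class SessionCache:
    """Toy server-side cache mapping session_id -> (server_name, secret)."""

    def __init__(self) -> None:
        self._sessions: Dict[bytes, Tuple[str, bytes]] = {}

    def store(self, session_id: bytes, server_name: str, secret: bytes) -> None:
        # Remember which server_name the session was negotiated for.
        self._sessions[session_id] = (server_name.lower(), secret)

    def resume(self, session_id: bytes, server_name: str) -> Optional[bytes]:
        """Return the cached secret, or None to force a full handshake."""
        cached = self._sessions.get(session_id)
        if cached is None:
            return None
        cached_name, secret = cached
        if cached_name != server_name.lower():
            # Same session ID but a different server_name: refuse to resume.
            return None
        return secret


cache = SessionCache()
cache.store(b"\x01\x02", "shop.example", b"master-secret")
print(cache.resume(b"\x01\x02", "shop.example"))   # b'master-secret'
print(cache.resume(b"\x01\x02", "other.example"))  # None
```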

Some people think that SNI adds security risk because the client sends the server name in cleartext. However, if a site is a TLS site (without SNI), anyone can find out who the client is talking to by connecting to the server themselves. Essentially, the IP of a traditional TLS server already gives away the domain. Sending the domain does not add security risk to the protocol.

In fact, as the extension provides one more check on the session cache, it actually reduces the risk (though this already seems impossible and useless against a traditional TLS server) that the server uses the wrong TLS session, one opened by an attacker, to send messages to the user.

SQL’s not end

Today, in a distributed cloud environment, there is no good DB that offers both ACID and SQL support while keeping performance at scale. So, there is NoSQL.

NoSQL means Not Only SQL. It does not mean NO SQL. MongoDB-like applications have set a bad example for their followers. A NoSQL DBMS might not use SQL as its base query language, but it should at least support SQL as a higher-layer query language, just like what FoundationDB tries to do. Of course, the problem is combining ACID with the flexibility of the SQL language, which is rather difficult for a sharded DB, a design that is becoming more and more common as cloud computing conquers the server world. The SQL language itself is not difficult to implement; ACID with the ability to support complex SQL is.

But ACID is a must-have feature in many apps. Not only bankers need ACID; every app developer who wants to build a robust app needs transactions as one of their basic tools. No one can accept an app that only behaves correctly when the user is lucky. Those who want to throw away ACID will only find themselves implementing their own ACID solutions later.

There are many ways to overcome the problem of implementing ACID. In the cloud, locking is unacceptable unless there is some way to lock and sync at a very small, precise granularity, which is not an easy job across 100+ shard servers. Log and check (MVCC) is another way, which is easier than locking and is implemented in many DB solutions.
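The “log and check” idea can be sketched in a few dozen lines. This is a single-process toy and not any real database’s implementation: the names `Store`, `Transaction`, and `ConflictError` are hypothetical, and it assumes the simplest rule, where every write is logged as a new version and a commit is rejected if another transaction committed to the same key after this one started.

```python
class ConflictError(Exception):
    """Raised when a key was committed by someone else after we started."""


class Store:
    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # timestamp of the latest commit

    def begin(self):
        return Transaction(self, self._clock)

    def read_at(self, key, ts):
        # Newest version that was already committed at snapshot time `ts`.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None

    def commit(self, txn):
        # Check: has any key we wrote been committed since our snapshot?
        for key in txn.writes:
            versions = self._versions.get(key)
            if versions and versions[-1][0] > txn.start_ts:
                raise ConflictError(key)
        # Log: append new versions instead of overwriting the old ones.
        self._clock += 1
        for key, value in txn.writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))


class Transaction:
    def __init__(self, store, start_ts):
        self.store, self.start_ts, self.writes = store, start_ts, {}

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def put(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        self.store.commit(self)


store = Store()
setup = store.begin()
setup.put("balance", 100)
setup.commit()

t1, t2 = store.begin(), store.begin()  # both see balance == 100
t1.put("balance", t1.get("balance") - 30)
t1.commit()
t2.put("balance", t2.get("balance") + 50)
try:
    t2.commit()
except ConflictError:
    print("t2 conflicts and must retry")
print(store.begin().get("balance"))  # 70
```

Readers never block writers here because old versions stay in the log. The hard part that the sketch leaves out is doing the commit check consistently across many shards.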

If ACID is possible in the cloud, SQL will be, too. But it may exist as a layer on top of an ACID system built on some simpler API. In any case, SQL will not be ended by NoSQL and the cloud; it will still be used in many places by whoever wants to keep data updates easy (or even possible). Maybe one day there will be only copies and no references in the DB world, but I think that day has not come yet.