07 April 2012

The birthday paradox

According to the mathematical theory, if we put 23 people in a room there is around a 50% chance that 2 of them have the same birthday.If instead of 23 there are 75 people the probability rises  to 99.9%. Be aware that we are not talking about someone having the same birthday as us (in this case we would be imposing a condition of a specific date which is a stronger constraint), but any two people agree on her birthday (may be any date).

The mathematics behind this can be complex if you have not studied anything about probability, but we will try to explain them. To begin we must know how many possible pairings can occur in a room of 
For example, in a room with 23 people there are 253 possible pairs.
The probability that one partner has the same birthday is:
Its complement is the probability that one partner does NOT have the same birthday:

The probability that there is no pair with a matching birthday is:
Since the base is less than 1 is likely to decrease with the increasing of the number of people in the sample. Said in another way: the more people the harder it is not to find a coincidence with the birthday. Known that, the probability (P) that we DO have a match is complementary to the above, ie:

If we give the value P 0.5 (limit beyond which the chances of finding a match are placed on our side) and solve for n, we get exactly 23.

So far the theory, however it should be noted (for those who would excel at a party) that the practice is quite different since our birth dates for quite some time ceased to be equally likely because doctors increasingly scheduled births more often and not scheduled for the weekend, breaking the equiprobability!.

But the interest of the result we have obtained is not limited to the story in a meeting but we will see if we generalize the formula that has interesting applications in the field of computer security, among others. Thus, if we call p the probability that one partner will pass something and p (add a top bar) to its complement, the probability that this couple don't do what we are evaluating, then we have that  the above generalized formula is:Thus P is the probability of an event when the number of elements in the sample base is n.If we solve the formula n is as follows:For the birthday example we would like to find the number n of people from which P>0.5. To find out that we would sustituteP by 0.5, and p (with bar above) for 364/365 and would see that n is actually 23.

But lets see a more interesting example taken from the world of computer security. Suppose we are at job and a seller comes to tell us the wonders of his facial recognition system. The stuff would be placed at the entrance of the building and would replace the classic sign machine, which would eliminate the queues when the 5,000 employees of our organization arrived each morning. The commercial tells us quite self confident that the machine itself has a probability of confusing two people of about 1 in 1,000,000 (ie a probability of success of 0.999999), so he would be astonished when we reject his system for insecure . Why is it unsafe?, Because if we apply the above data to formula (taking into account that p (bar) is 0.999999) we see that the chances of error are higher than 0.5 once it has scanned 1178 employees.Put another way, there would be no way to end the morning without the little machine to make a mistake.

An useful tip is, when dealing with large numbers we can approximate the general formula with:

Where N is the inverse of p (probability of occurrence with a single pair of elements). In the case of N would be 365 birthday, so that the approximation would n = 19.10 and in the case of face recognition N would be 1,000,000 and would obtain an approximation of n = 1000.

This approach is widely used in the world of cryptography in which the birthday paradox is very much in mind. For example, suppose we have a table with an encrypted password hash algorithm of 64-bit and we're calm because we believe that a dictionary attack would have to do 2 ^ 64 attempts before giving our password ... but no, nothing is further from reality as the birthday paradox shows that the attacker will surely our password when you've done little more than 2 ^ 32 attempts.

As you can see from the above, the birthday paradox transcends the merely anecdotal becoming an important tool for any security engineer.

06 April 2012


man-in-the-middle attack occurs when a third is located between two principals, whether individuals or some type of equipment, and deceives them into believing that each of them he is the other principal. It acts as an interface for modifying the flow of information exchanged. The only solution to this type of attack is the encryption of communications using a protocol that involves a key exchange through a secure channel, so that even if an attacker could intervene communication would be unable to understand or to modify it without being detected by integrity checks. 

So much for theory. The practice supports several variations, which have been made throughout history. One of the most spectacular I have found during the reading of Security Engineering , of Ross Anderson. To illustrate the dissertation on man-in-the-middle attacks, the author tells the disaster suffered by the South African troops because of an attack variant that the time has come calling MIG-in-the-middle. 

Article's author note: As Zooko commented, Ross Anderson admitted in the second edition of his book that he recently discovered this history is unfounded. However it could have happened (change the actors) in other wars in other places, so I think it's still interesting.

In the late 80, South Africa was involved in a war in northern Namibia and southern Angola. His goal was to keep the white government of Namibia and to the puppet government of UNITA in Angola. As the South African army was composed exclusively of white recruits and these did not represent a very large population segment, it was essential to maintain a casualtie rate as low as possible.So the South African army preferred to concentrate on maintaining order in Namibia and support with its air force to the UNITA troops in Angola. To prevent the Angolan rebel troops and their Cuban allies to counter strike against Namibia, South Africa deployed a powerful barrier of anti-aircraft batteries. To prevent the so-called "friendly fire," South African aircraft were equipped with IFF (Identify-friend-or-foe), common today, which responded according to a predetermined algorithm to the signals from anti-aircraft batteries so that these identify themselves as friends to the South African aircraft.

The defensive system seemed unbeatable, and yet one day a squad of Cuban MiGs crossed the line of defense and bombed one of the main camps of Namibia. South African anti-aircraft batteries didn't shoot a single missile. Casualties were terrible and led the South African government to decide to withdraw troops from Namibia, which eventually led to the fall the white government of that country. 

Time passed until the discovery (and confession) of the causes of that monumental failure in the South African defense. It appears that the Angolan intelligence learned that the South Africans, perhaps due to overconfidence, left switched on their IFF devices while conducting missions over Angolan territory. What they did was to situate a MIG waiting near the border of Namibia to a South African squadron (SAAF in the figure) headed to bomb Angola.When South Africans were present on Angola to carry out their respective missions, the MIGs entered the territory of Namibia. Of course, the South African air defenses detected the MIGs and launched the IFF signals to determine whether they were friend or foe aircrafts. The MIGs lacked the necessary keys to correctly answer the IFF signals, so their plan was to forward the signals to their radio stations in Angola and make them send that signals to the South African squadron which was flying over that country at that time. The plan worked because South African airplanes had turned on their IFF and when Angolan radio stations broadcast the signals sent by the MIGs, the South Africans began to reply to them automatically without knowing that these responses were being forwarded to the MIG squadron so that they can serve it to  the South African batteries in Namibia. Thus, South African air defenses in Namibia thought that those MIGs were friendly aircrafts, letting them pass. 

That attack forced them to rethink deeply technology, protocols and operating procedures used for the IFF. To begin with, at the operational level South African prohibited pilots to let their IFFs on when they flew over enemy territory. At the protocol level the solution was much harder to find and certainly was not definitive, as they began include the aircraft identification in the response returned by the IFF so that they could correlate the data offered by control towers to find differences. Another method tried was to measure the delays of the responses so that if any came "later" than usual they could suspect that the IFF was generated by a plane that was farther away than the plane detected. Keep in mind that at this particular attack the encryption of communications is not useful as MiGs did not need to break the confidenciality or integrity of IFF signals, since they needed only to "reflect them" for their radio stations . All these answers are far from perfect and much less viable in battle areas crowded with air traffic of the two sides, which is why IFF techniques is a field in constant development to avoid disasters like that caused by the attack MIG- in-the-middle.

Dissapearing Cryptography

The problem with cryptography books is that either require a degree in mathematics to understand them, or stay in a series of algorithmic recipes for programmers. Finding a work like this book, who knows how to please both those who seek a solid theoretical basis as those seeking a more practical approach is extremely difficult.

Author has managed to structure a book in layers that allow a progressive approach to the subject without having to delve into the arcane details to complete the work with a clear vision of the foundations of cryptography to hide or steganography .Thus, each chapter begins with an overview of the subject matter going after gradually in mathematics. The reader can skip this part at any time to go to mathematical algorithms describing the possible forms of practical benefit to all this. It is so educational and affordable, that even without prior knowledge of steganography, a reader with some agility in a programming languages can do his own data hiding tools after finishing the book.

It is also quite extensive, playing a wide range of information hiding techniques. For example, it has an extremely interesting chapter on steganography using grammars and tree-structured code so that seemingly innocuous texts contain hidden messages completely different. The example used after an alleged sports broadcasting is very revealing ... Of course, it is also the techniques (most popular), hiding in images and sounds coming well in advanced materials, such as manipulation of the coefficients of the FFT to hide information in image compression format, or different techniques for watermarking resistant to suppression or manipulation. It touchs other topics that may not come under the heading of steganography from the orthodox view, but certainly interesting for anyone studying the subject. This is the anonymity of email services, or browsing through remailers and anonymous networks respectively.

Annexes deserve special mention because in them the author delves into the practical side of book.  In one of them offers a practical example in Java of some of the issues.  In other uses LISP to illustrate the chapter of steganography over grammars.  And in another place an intensive review of steganographic tools available online, as well as to deepen the available literature on the subject. 

In short, a highly recommended book for both experts and cryptography and steganography as for those looking to start in this discipline.

05 March 2012

Chryptographic filesystems

At this point it is redundant to explain the reasons for using cryptographic techniques. Throughout history we have seen many examples which are summarized in the premise that someone wants to keep secret something that another would like to know.

At the time that computers and mass storage devices began to be "transportable" came immediately the need to find mechanisms to protect the information contained in these devices on the road. Imagine the damage that a public agency or a large company could suffer if one of their employees have his work laptop stealt at a gas station or an airport. Therefore, it was needed to find an environment that ensured the confidentiality of the information of the portable devices in a comfortable way that do not penalize their use.

On the other hand, it exists the possibility that the owner of the data suffers pressure to decrypt the data with its key. Imagine if an employee who works on military bussines or a political dissident is tortured or interrogated to find out how to enter his laptop to get a nes future fighter blueprints or opinion pieces that could lead him to a firing squad. In such cases it is very useful that mechanisms exist to provide deniability of information. These mechanisms would help deciphering "lure" files of little importance or even outright false which would mislead the interrogators, who would have no way of knowing if that is the whole information encrypted or if instead there is more. 
To address these problems the military has been developing long ciphers able to protect data in case of mass storage devices fall into enemy hands.

At civil level, one of the earliest examples is found in the file system StegFS. This file system allows not only encrypt confidential data but apply steganographic methods that gives deniability of that information. StegFS development is stopped and is not currently recommended for use in modern infrastructure (in fact only supports kernels 2.2.x branch), however their website has several papers, highly recommended reading, explaining the theoretical of such mechanisms.

Another option, the other in constant development and has come to pick up the baton from the previous system, is TrueCrypt . This tool is cross-platform (runs on Windows, Mac and Linux) and allows you to create encrypted virtual disks that can be mounted as standard units of the system, performing the encryption-decryption in real time (on-the-fly), so that operation on encrypted files is completely transparent to the user. When removing these virtual disks TrueCrypt encrypts data in a single file so there is no way for an attacker to know how many encrypted files are inside the main one. For all this the user uses a password of his invention and one of the symmetric encryption algorithms offered by the tool, such as AES-256, Serpent or Twofish, among others.

The usefulness of this tool is obvious, since we can encrypt personal data that we have in our USB memory (for example, our certificate of the FNMT, lists of passwords, network maps or personal photos) or even an entire partition, including boot, as TrueCrypt allows for authentication during startup, so that if any steals the laptop will be virtually impossible to him to extract any information from the hard drive.

TrueCrypt provides two levels of deniability. To begin with: without the decryption key, encrypted TrueCrypt volumes are indistinguishable from any other type of binary data files. Files containing encrypted TrueCrypt volumes lack any type of signature that identifies them as such. This makes it easy to hide it by changing its extension such as. Iso,. Raw,. Img or what comes to mind ... to TrueCrypt will not be affected by extension because when mounting the volume Truecrypt accepts anyone. The second level of deniability corresponds to hidden volumes that consists in an encrypted volume inside another so that the user has two passwords: one that opens the main encrypted volume, which will have decoy files, and other that opens a secondary encripted volume (the one in the main) with the really important files. The TrueCrypt developers say there is no way to tell if an encrypted volume contains another hidden. 
Besides the above, Truecrypt is really easy to use by relying on a truly intuitive windows interface. To make things better the documentation of its website is very well organized and is very educational explaining in a clear and easily way the fundamentals of the functionality offered by the tool.

With this, there is no excuse for not wearing our data safely on our portable devices, completely safe from leaks because of theft or espionage.