Frequency Analysis

In this worksheet, we do a little frequency analysis. Here is the frequency distribution of a typical long English text (the first chapter of the first Hercule Poirot novel, The Mysterious Affair at Styles).
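As a rough sketch of how such a distribution can be computed, here is a version in plain Python (the language Sage is built on). The helper name letter_frequencies is hypothetical, and the sample string is a short placeholder, not the text of the novel. Only the letters A to Z are counted; spaces and punctuation are ignored.

```python
from collections import Counter
import string

def letter_frequencies(text):
    """Return a dict mapping each letter A-Z to its relative frequency in text."""
    # Keep only the 26 letters, ignoring case, spaces and punctuation.
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        # No letters at all: report zero for every letter.
        return {letter: 0.0 for letter in string.ascii_uppercase}
    return {letter: counts[letter] / total for letter in string.ascii_uppercase}

sample = "The quick brown fox jumps over the lazy dog"  # placeholder text
freqs = letter_frequencies(sample)

# Show the five most frequent letters, most frequent first.
for letter, f in sorted(freqs.items(), key=lambda kv: -kv[1])[:5]:
    print(letter, round(f, 3))
```

A long text gives a far more reliable distribution than a single sentence like this one.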


That’s not a very nice way to display the data: it isn’t even in order! Here’s a way to display it graphically, with the letters of the alphabet in order along the x-axis.
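For readers without a plotting library to hand, a text-only stand-in for the picture might look like the sketch below. It prints one row per letter, in alphabetical order, with a bar scaled to the most frequent letter. The function name text_bar_chart and the width parameter are assumptions made for this example.

```python
import string
from collections import Counter

def text_bar_chart(text, width=40):
    """Return one line per letter A-Z with a bar proportional to its count."""
    counts = Counter(c for c in text.upper() if c in string.ascii_uppercase)
    peak = max(counts.values(), default=0)  # count of the most frequent letter
    lines = []
    for letter in string.ascii_uppercase:
        # Scale so that the most frequent letter gets a bar of length `width`.
        bar_len = round(width * counts[letter] / peak) if peak else 0
        lines.append(f"{letter} {'#' * bar_len}")
    return lines

print("\n".join(text_bar_chart("attack at dawn")))
```

Because the rows are always in the order A to Z, two such charts can be compared by eye, which is what the rest of the worksheet relies on.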


Here’s a coded message, coded with a Caesar shift.
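For reference, a Caesar shift can be sketched in a few lines of Python. The function name caesar_encrypt and the example plaintext are illustrative assumptions; this is not the message used in this worksheet.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_encrypt(plaintext, key):
    """Shift each letter forward by `key` positions; leave other characters alone."""
    out = []
    for c in plaintext.upper():
        if c in ALPHABET:
            # Wrap around the end of the alphabet with arithmetic mod 26.
            out.append(ALPHABET[(ALPHABET.index(c) + key) % 26])
        else:
            out.append(c)
    return "".join(out)

print(caesar_encrypt("attack at dawn", 3))  # DWWDFN DW GDZQ
```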


We can ask Sage to compute a frequency distribution for the letters in our message.


Now compare these two pictures. One should be a shift of the other. Make keyGuess your guess for the key, and try deciphering to see if you get English:
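The comparison can also be automated, at least roughly. The sketch below assumes that the most frequent letter of the ciphertext stands for E, which is usually true for long English messages but can easily fail on short ones. The names guess_key and caesar_decrypt are hypothetical, and the ciphertext is a made-up example, not the worksheet's message.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def caesar_decrypt(ciphertext, key):
    """Shift each letter back by `key` positions; leave other characters alone."""
    out = []
    for c in ciphertext.upper():
        if c in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(c) - key) % 26])
        else:
            out.append(c)
    return "".join(out)

def guess_key(ciphertext, common_letter="E"):
    """Guess the key by assuming the top ciphertext letter encodes `common_letter`."""
    counts = Counter(c for c in ciphertext.upper() if c in ALPHABET)
    if not counts:
        return 0  # no letters to analyse
    top_cipher_letter = counts.most_common(1)[0][0]
    return (ALPHABET.index(top_cipher_letter) - ALPHABET.index(common_letter)) % 26

# "MEET ME AT THE THEATRE ENTRANCE" shifted by 3 (example data only).
ciphertext = "PHHW PH DW WKH WKHDWUH HQWUDQFH"
keyGuess = guess_key(ciphertext)
print(keyGuess, caesar_decrypt(ciphertext, keyGuess))
```

If the output is not English, try aligning the top ciphertext letter with another common letter such as T or A.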


This (the message you just decrypted, taken from a news article) is an example of security through obscurity. By contrast, Kerckhoffs's principle, or Shannon's maxim, states that the enemy should be assumed to know the system, just not the key. The shift cipher is wholly useless, even against someone armed only with pen and paper; it offered security to Caesar only through obscurity (his contemporaries were not practiced cryptanalysts).
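To see how little protection the key gives once the system is known, note that there are only 26 possible shifts. A minimal sketch that simply tries all of them is shown here; the names shift and all_decryptions, and the example ciphertext, are assumptions made for illustration.

```python
import string

ALPHABET = string.ascii_uppercase

def shift(text, key):
    """Shift every letter of text by key positions (key may be negative)."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + key) % 26] if c in ALPHABET else c
        for c in text.upper()
    )

def all_decryptions(ciphertext):
    """Return a list of (key, candidate plaintext) pairs for every possible key."""
    return [(key, shift(ciphertext, -key)) for key in range(26)]

# A reader only has to scan 26 lines to find the one that is English.
for key, candidate in all_decryptions("DWWDFN DW GDZQ"):
    print(key, candidate)
```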

Here’s a sandbox if you want to play around: