Frequency Analysis

Submitted on: 2/10/2015 7:46:00 AM
By: Daniel M (from psc cd)  
Level: Intermediate
Compatibility: VB 3.0, VB 4.0 (16-bit), VB 4.0 (32-bit), VB 5.0, VB 6.0, VB Script, ASP (Active Server Pages) , VBA MS Access, VBA MS Excel
 
Learn how to break ciphertext with Frequency Analysis.

This article has accompanying files
 
After reading "The Code Book" by Simon Singh, I was inspired to take up cryptanalysis, and I have been attempting to write strong encryption algorithms and then break them. This article focuses on how to use Frequency Analysis: the method of working out substituted characters by counting how often each character occurs in the ciphertext and comparing those frequencies to standard English.

First, I will give a brief overview of the steps taken to use Frequency Analysis.

1. The first step of Frequency Analysis is to count the frequency of each character in the ciphertext. I have included a .zip which does this for you automatically. There should be about five letters with a frequency of less than 1%, and they are most likely the letters j, k, q, x, and z. One of the letters should have a frequency greater than 10%, and it probably represents the letter "e". This generalization holds only if the plaintext follows the frequency chart for its language, which in this case is English.

2. If the ciphertext's frequency chart follows the English frequency chart but decipherment is still not possible, the next step is to focus on pairs of repeated letters. In English, the most commonly doubled letters are ss, ee, tt, ff, ll, mm, and oo. If the ciphertext contains any doubled characters, they most likely stand for one of these pairs.

3. If the ciphertext has spaces between words, try to decipher the words that are shorter than four letters. Here is a list of the most common one- to three-letter words that can be tried when deciphering:
1 letter: a, I
2 letters: of, to, in, it, is, be, as, at, so, we, he, by, or, on, do, if, me, my, up, an, go, no, us, am
3 letters: the, and

4. If possible, find English texts that are similar to what the ciphertext is expected to contain and use those to build your frequency chart, so that the chart is as accurate as possible. For instance, here is an excerpt from "The Code Book":

"military messages tend to omit pronouns and articles, and the loss of words such as I, he, a and the will reduce the frequency of some of the commonest letters. If you know you are tackling a military message, you should use a frequency table generated from other military messages."

5. A skill commonly used in frequency analysis is the ability to identify words or whole phrases based on experience or guesswork. For example, if the military sends an encrypted weather report at 6:00 PM every day, you might assume that the first word of the ciphertext is "Weather", which you could then use to help break the rest of the ciphertext. Such guessed words are known as cribs.

6. If the two frequency charts (the ciphertext's and standard English) seem to match letter for letter, but the ciphertext is not readable, this suggests that the text is not a substitution cipher but a transposition cipher.

7. Further methods of frequency analysis are more complicated, but they can help a person break a cipher. These include gathering statistics on the relationships between letters: how often a letter is seen next to another letter, or how often a letter begins or ends a word. Frequency analysis is a powerful tool for deciphering text if you follow the correct steps.
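
The neighbor statistics from step 7, together with the doubled letters from step 2, can be gathered with a short script. The following is a minimal Python sketch rather than part of the accompanying .zip; the function name neighbor_stats and the sample sentence are made up for illustration, and pairs are only counted inside words so that they never span a space.

```python
from collections import Counter
import re


def neighbor_stats(ciphertext):
    """Gather doubled-letter, digram, and word-boundary counts from a text.

    Hypothetical helper for illustration; assumes words are separated by
    spaces or punctuation and that only the letters a-z matter.
    """
    # Split into runs of letters so pairs never span a space or punctuation.
    words = re.findall(r"[a-z]+", ciphertext.lower())

    digrams = Counter()  # every adjacent pair of letters inside a word
    doubles = Counter()  # adjacent pairs made of the same letter (step 2)
    first = Counter()    # letters that begin a word
    last = Counter()     # letters that end a word

    for word in words:
        first[word[0]] += 1
        last[word[-1]] += 1
        for a, b in zip(word, word[1:]):
            digrams[a + b] += 1
            if a == b:
                doubles[a + b] += 1

    return {"digrams": digrams, "doubles": doubles, "first": first, "last": last}


# Plain English is used here only so the counts are easy to check by eye.
# On real ciphertext the same counts are compared against English habits.
stats = neighbor_stats("all the bees see the tall trees")
print(stats["doubles"].most_common(2))
print(stats["first"].most_common(1))
```

On a substitution ciphertext, the most common entries in "doubles" are candidates for ss, ee, tt, ff, ll, mm, and oo, while the "first" and "last" counts only make sense when the ciphertext keeps its word spacing.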

The .zip I have included offers a frequency chart generator that can export its results to a .txt file. I plan to extend this project into a fully functional Frequency Analysis Decrypter utility that will take the user step by step through decrypting the text. Thank you for reading.
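
For readers who want to experiment without the accompanying files, a chart generator of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the .zip: the names frequency_chart, format_chart, and export_chart, the tab-separated layout, and the output file name are all hypothetical.

```python
from collections import Counter
import string


def frequency_chart(text):
    """Return (letter, count, percent) rows for a-z, most frequent first."""
    # Keep only the letters a-z; case, spaces, and punctuation are ignored.
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    rows = []
    for letter in string.ascii_lowercase:
        count = counts[letter]
        percent = 100.0 * count / total if total else 0.0
        rows.append((letter, count, percent))
    # Sort by descending count; the sort is stable, so ties stay alphabetical.
    rows.sort(key=lambda row: -row[1])
    return rows


def format_chart(rows):
    """Render rows as tab-separated text, one letter per line."""
    return "".join(
        f"{letter}\t{count}\t{percent:.2f}%\n" for letter, count, percent in rows
    )


def export_chart(rows, path):
    """Write the rendered chart to a plain-text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_chart(rows))


# Sample ciphertext: "the quick brown fox jumps over the lazy dog",
# with every letter shifted three places forward in the alphabet.
sample = "Wkh txlfn eurzq ira mxpsv ryhu wkh odcb grj"
chart = frequency_chart(sample)
print(format_chart(chart[:5]), end="")
# To save the full chart (the file name is a made-up example):
# export_chart(chart, "frequency_chart.txt")
```

In this sample the most frequent ciphertext letter is r, which stands for "o" rather than "e". With only 35 letters, the text is too short for the percentages in step 1 to be trusted, which is why longer ciphertexts give better results.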


Other User Comments

4/1/2017 3:17:03 PM, David

Interesting article, and code. Having read a lot on encryption, I downloaded it out of interest. Good start for anyone wanting to dig deeper, for personal encryption etc.

There was an undeclared variable error in Function GenerateFrequencyChart(), so you need to declare blnProceed As Boolean to avoid it.