Home Cryptography - Breaking Single Byte XOR Encryption
Post
Cancel

Cryptography - Breaking Single Byte XOR Encryption

In this post we’ll cover how to decrypt messages that have been XOR encrypted using a single byte key, such as b7. While this might not sound that useful, it’s a pre-cursor to breaking XOR encryption that uses a repeating key, such as 84 d2 7a 09 4c.

This post is also a solution to challenge 3 on the cryptopals website. This site is a great resource for hands on learning about cryptography. The technique used to complete this challenge is outlined on that page. This post will explain the same technique with a bit more detail and examples.

  1. How to Break it
  2. English Scoring
  3. Brute Forcing the Key
  4. What Next?

How to Break it

Since the key is only a single byte, the keyspace is very small, only 256 possible keys. As such, brute force is practical and reasonable solution.

We still need to determine which key is actually correct though. We’re assuming that the plain-text we are recovering is English text. After attempting decryption with a key, we’ll need a method to “score” the decrypted data based on how English it is.

English Scoring

Let’s start off by writing a function that can score a String based on how English it is.

We’ll give a String points for the following:

  • +2 Points for each occurrence of the 6 most common English characters (and space)
  • +1 Point for each occurrence of the next 6 most common English characters
  • +1 Point for each occurrence of the 20 most common English Bigrams
  • +1 Point for each occurrence of the 20 most common English Trigrams
  • +1 Point for each occurrence of the 298 most common English Words

The String’s score will be the sum of these points.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
// Source: https://preply.com/en/blog/300-most-common-english-words/
let commonEnglishWords = `is,be,do,go,am,he,an,up,so,no,of,to,in,on,at,by,it,he,we,my,me,us,as,or,if,was,are,had,can,use,has,see,did,get,may,say,set,put,ask,try,add,saw,got,run,eat,let,cut,way,day,man,boy,end,men,air,eye,car,sea,one,all,two,new,old,any,big,own,few,not,how,out,now,too,why,off,far,for,you,his,she,her,him,who,its,our,and,but,have,were,said,will,make,like,look,been,call,find,come,made,take,know,live,give,help,tell,came,want,show,does,must,went,read,need,move,play,keep,turn,seem,open,walk,grow,took,hear,stop,miss,talk,word,time,part,work,year,back,name,line,farm,land,home,hand,page,food,tree,city,head,life,side,feet,mile,book,idea,face,girl,list,song,each,many,some,more,long,most,good,mean,same,even,such,kind,high,left,next,hard,both,four,when,then,only,very,just,much,also,well,here,away,near,last,once,soon,with,from,into,down,over,that,they,this,what,your,them,than,it’s,would,write,could,think,spell,found,study,learn,start,might,close,begin,began,carry,watch,being,leave,water,sound,place,thing,house,point,world,plant,earth,story,paper,group,night,river,state,other,great,right,three,small,large,still,every,light,white,above,young,there,first,where,again,below,never,often,later,about,after,under,along,until,which,their,these,those,while,don’t,follow,change,should,number,people,animal,letter,mother,answer,school,father,Indian,family,little,second,enough,before,around,always,really,almost,thought,picture,America,country,example,another,through,between,without,because,sentence,children,mountain,together,different,important,sometimes,something`;
commonEnglishWords = commonEnglishWords.split(',').map(word => Buffer.from(word, 'utf-8'));

// Source: https://www3.nd.edu/~busiforc/handouts/cryptography/Letter%20Frequencies.html
let mostCommonEnglishLettersPrimary = new Map();
let mostCommonEnglishLettersSecondary = new Map();
'etaoin '.split('').forEach(char => mostCommonEnglishLettersPrimary.set(char.charCodeAt(), true));
'shrdlu'.split('').forEach(char => mostCommonEnglishLettersSecondary.set(char.charCodeAt(), true));

// Source: https://www3.nd.edu/~busiforc/handouts/cryptography/Letter%20Frequencies.html
let commonEnglishBigrams = `th,he,in,er,an,re,nd,on,en,at,ou,ed,ha,to,or,it,is,hi,es,ng`;
let commonEnglishTrigrams = `the,and,ing,her,hat,his,tha,ere,for,ent,ion,ter,was,you,ith,ver,all,wit,thi,tio`;
commonEnglishBigrams = commonEnglishBigrams.split(',').map(word => Buffer.from(word, 'utf-8'));
commonEnglishTrigrams = commonEnglishTrigrams.split(',').map(word => Buffer.from(word, 'utf-8'));

// Returns an positive Integer score based on how English the input Buffer is
const scoreDecryptedBuffer = (decipheredBuffer) => {
	let commonWordMatches = 0;
	let commonBigramMatches = 0;
	let commonTrigramMatches = 0;
	let commonLetterMatches = 0;

	// Detect how many occurrences of the most common English Words there are.
	commonEnglishWords.forEach(commonWord => {
		commonWordMatches += countOccurrences(decipheredBuffer, commonWord);
	});

	// Detect how many occurrences of the most common English Bigrams there are.
	commonEnglishBigrams.forEach(commonBigram => {
		commonBigramMatches += countOccurrences(decipheredBuffer, commonBigram);
	});

	// Detect how many occurrences of the most common English Trigrams there are.
	commonEnglishTrigrams.forEach(commonTrigram => {
		commonTrigramMatches += countOccurrences(decipheredBuffer, commonTrigram);
	});

	// Detect how many occurrences of the most common English letters there are.
	// The 6 most common letters (and space) get +2 score, the next 6 most common get +1 score.
	decipheredBuffer.forEach(byte => {
		if (mostCommonEnglishLettersPrimary.has(byte)) commonLetterMatches += 2;
		else if (mostCommonEnglishLettersSecondary.has(byte)) commonLetterMatches += 1;
	});

	return commonWordMatches + commonBigramMatches + commonTrigramMatches + commonLetterMatches;
}

// Count how many occurrences of a Buffer occur within a target Buffer
const countOccurrences = (targetBuffer, toSearchForBuffer) => {
	let occurrences = 0;
	let currentOccurrenceIndex = targetBuffer.indexOf(toSearchForBuffer);

	while (currentOccurrenceIndex !== -1) {
		occurrences++;
		currentOccurrenceIndex = targetBuffer.indexOf(toSearchForBuffer, currentOccurrenceIndex+1);
	}

	return occurrences;
}

Scoring Example

1
2
3
4
5
6
7
8
9
let englishText = Buffer.from('This is a test string that will be scored on how english it is', 'utf-8');
let randomBytes = [];

for (let i = 0; i < 64; ++i) {
	randomBytes.push(Math.floor(Math.random()*256));
}

console.log('Score for English Text:', scoreDecryptedBuffer(englishText));
console.log('Score for Random Bytes:', scoreDecryptedBuffer(Buffer.from(randomBytes)));
1
2
Score for English Text: 129
Score for Random Bytes: 4


Brute Forcing the Key

Now that we can score a given String on how English it is, we can brute force the key.

As the key is only one byte, there is 256 possible keys to check. For each key we will decrypt the cipher-text with it, and score the decrypted string. The key that produces the best score is most likely to be the correct key.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Perform single byte XOR decryption on a given Cipher-Text Buffer
const singleXorKeyDecryption = (cipherTextBuffer) => {
	let bestScore = { score: 0, key: -1 };
	let decryptionBuffer = Buffer.alloc(cipherTextBuffer.length);
	let tempScore;

	// Brute force all 256 possible keys
	for (decryptionKey = 0; decryptionKey < 256; ++decryptionKey) {
		// Decrypt the Cipher-Text using the current decryption key and store results in the decryption Buffer
		for (let i = 0; i < cipherTextBuffer.length; ++i) {
			decryptionBuffer[i] = cipherTextBuffer[i] ^ decryptionKey;
		}

		// Score the decrypted Cipher-Text
		tempScore = scoreDecryptedBuffer(decryptionBuffer);

		// If this key scored best, store it
		if (tempScore > bestScore.score) {
			bestScore.score = tempScore;
			bestScore.key = decryptionKey;
		}
	}

	// Decrypt the Cipher-Text using the best scoring decryption key and store results in the decryption Buffer
	for (let i = 0; i < cipherTextBuffer.length; ++i) {
		decryptionBuffer[i] = cipherTextBuffer[i] ^ bestScore.key;
	}

	// Return the best key, and the Cipher-Text decrypted with that key
	return {
		data: decryptionBuffer,
		key: bestScore.key,
		score: bestScore.score
	};
}

With this latest function we can call it with our cipher-text as input. The cipher-text used is from here.

1
2
3
4
5
6
7
8
9
10
let cipherText = Buffer.from('1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736', 'hex');

let decryptionResults = singleXorKeyDecryption(cipherText);

console.log('\nMost Likely Decryption Key:');
console.log(Buffer.from([decryptionResults.key]));

console.log('\nDecrypted Data:');
console.log(decryptionResults.data);
console.log(decryptionResults.data.toString('utf-8'));

Putting everything together, we can now decrypt our cipher-text:

1
2
3
4
5
6
Most Likely Decryption Key:
<Buffer 58>

Decrypted Data:
<Buffer 43 6f 6f 6b 69 6e 67 20 4d 43 27 73 20 6c 69 6b 65 20 61 20 70 6f 75 6e 64 20 6f 66 20 62 61 63 6f 6e>
Cooking MC's like a pound of bacon


What Next?

While this function might not seem that useful at first, it will be vital when we want to break repeating key XOR encryption.

If you want to learn how to do that, you can read how to here.

This post is licensed under CC BY 4.0 by the author.