Apply information theory to measure the complexity of encoded messages found within early modern cryptography and occult ciphers.

Measuring the Ghost in the Code: Applying Information Theory to Early Modern Cryptography

Introduction

For centuries, the boundary between statecraft and the occult was blurred by the art of secret writing. From the shorthand of Elizabethan spymasters like Sir Francis Walsingham to the esoteric, “angelic” alphabets found in the grimoires of John Dee and Edward Kelley, early modern cryptography served as a vessel for both political survival and spiritual transcendence. But how do we distinguish a sophisticated cipher from mere scribbles or decorative mysticism? The answer lies in information theory—a field that allows us to quantify the “surprise” and “density” of an encoded message, effectively stripping away the shroud of history to reveal the mathematical structure underneath.

Applying Claude Shannon’s principles of entropy to historical documents allows modern researchers to move beyond speculative decryption. By measuring the information density of these occult and political ciphers, we can determine whether a script possesses the statistical hallmarks of a functional language or if it is a product of idiosyncratic ritual. This article explores how to apply these rigorous metrics to the mysterious, encoded artifacts of the 16th and 17th centuries.

Key Concepts

To analyze early modern ciphers, one must first grasp the core pillars of information theory as they apply to linguistics and cryptanalysis:

  • Shannon Entropy (H): This is the measure of uncertainty or randomness in a set of data. A monoalphabetic substitution only relabels symbols, so its ciphertext keeps the single-symbol entropy of the plaintext language. Entropy below that level often points to a repetitive structure (like a prayer or a ritual formula), while entropy approaching the maximum for the symbol set suggests a more complex, polyalphabetic, or compressed encoding system.
  • Redundancy: Natural languages have high redundancy (e.g., the letter ‘u’ almost always follows ‘q’ in English). A cipher that retains the linguistic redundancy of the plaintext is vulnerable to frequency analysis. A “strong” cipher removes this redundancy to make the ciphertext appear as uniform noise.
  • Information Density: This measures how much “meaning” is packed into each symbol. In the early modern period, occult ciphers often sacrificed density for visual complexity, creating a signature “fingerprint” that separates them from state-level intelligence ciphers designed for efficiency.
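
The first two concepts can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the Latin sample string is only a placeholder, and redundancy is computed against the number of symbols actually observed, which is an assumption (a more careful analysis would use the full symbol inventory of the script).

```python
import math
import string
from collections import Counter

def shannon_entropy(symbols):
    """Single-symbol Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def redundancy(symbols):
    """1 - H / log2(k), with k = number of distinct symbols observed (assumption)."""
    k = len(set(symbols))
    if k < 2:
        return 1.0  # zero or one distinct symbol carries no information
    return 1.0 - shannon_entropy(symbols) / math.log2(k)

# Placeholder plaintext and a shift-by-3 monoalphabetic substitution.
plain = "paternosterquiesincaelis"
abc = string.ascii_lowercase
cipher = plain.translate(str.maketrans(abc, abc[3:] + abc[:3]))

print(shannon_entropy(plain), shannon_entropy(cipher))
print(redundancy(plain))
```

The two printed entropies are equal, because a monoalphabetic substitution relabels symbols without changing how often each one occurs.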

Step-by-Step Guide

To evaluate the complexity of an encoded manuscript, follow this standardized analytical workflow:

  1. Digital Transcription: Convert the historical document into a machine-readable format. If the document is visual (such as a sigil or a symbolic cipher), map the symbols to a standardized alphanumeric index. Do not attempt to guess the meaning yet; treat the symbols as raw data.
  2. Frequency Distribution Analysis: Calculate the frequency of each unique symbol. Plot these on a histogram. Natural-language text produces a strongly skewed distribution in which a few symbols dominate (Zipf’s Law describes the analogous rank-frequency pattern for words). If the distribution is flat, you are likely dealing with a polyalphabetic or homophonic cipher or a non-linguistic code; a transposition cipher, by contrast, leaves symbol frequencies unchanged.
  3. Calculate Shannon Entropy: Use the formula H = -Σ p(x) log2 p(x), where p(x) is the probability of a symbol appearing. Compare the entropy of the document against the known entropy of the target language (e.g., Latin, English, or French). If the entropy matches the target language, the cipher is likely a simple monoalphabetic substitution or a transposition, since both preserve the plaintext’s symbol frequencies.
  4. Measure N-gram Variance: Look for the frequency of symbol pairs (bigrams) and triplets (trigrams). In real languages, specific patterns repeat. If these patterns are absent, the “cipher” may be a symbolic obfuscation designed for aesthetic or ritual effect rather than secure communication.
  5. Cross-Reference with Occult Taxonomy: If the mathematical entropy is low, but the visual complexity is high, cross-reference the symbols against known cryptographic manuals of the period (such as those by Johannes Trithemius). This helps determine if the “cipher” is a standardized occult system.
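
Steps 1, 2, and 4 of the workflow above can be sketched as follows. The glyph names are invented placeholders rather than a real transcription, and the helper names are hypothetical; the point is only to show symbols being treated as raw data before any meaning is assigned.

```python
from collections import Counter

def transcribe(glyphs):
    """Step 1: map each distinct glyph to an integer ID in order of first appearance."""
    index = {}
    ids = []
    for glyph in glyphs:
        if glyph not in index:
            index[glyph] = len(index)
        ids.append(index[glyph])
    return ids, index

def frequency_table(ids):
    """Step 2: relative frequency of each symbol, most frequent first."""
    counts = Counter(ids)
    total = len(ids)
    return [(symbol, n / total) for symbol, n in counts.most_common()]

def ngram_counts(ids, n=2):
    """Step 4: count overlapping n-grams (bigrams by default)."""
    return Counter(tuple(ids[i:i + n]) for i in range(len(ids) - n + 1))

# Placeholder glyph sequence, not taken from any manuscript.
glyphs = ["sun", "moon", "sun", "moon", "star", "sun", "moon"]
ids, index = transcribe(glyphs)
print(ids)                               # [0, 1, 0, 1, 2, 0, 1]
print(frequency_table(ids))
print(ngram_counts(ids).most_common(1))  # [((0, 1), 3)]
```

Keeping the glyph-to-ID index separate from the analysis makes it easy to revise the transcription later without touching the statistics.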

Examples and Case Studies

The Voynich Manuscript (The Benchmark of Complexity)

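Although the Voynich Manuscript predates the early modern period (its vellum has been radiocarbon-dated to the early 15th century), it serves as the ultimate test case for information theory. Analyses have shown that the manuscript possesses word-length patterns and Zipf-like word distributions, which suggests it is not pure gibberish. Yet it lacks the typical n-gram distribution of a natural language: its character-level conditional entropy is markedly lower than that of typical European languages, meaning each glyph is unusually predictable from the one before it. Information theory here reveals a “middle ground”: the structure appears too regular to be random filler, yet it defies standard linguistic mapping.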

John Dee’s “Enochian” Scripts

John Dee’s Angelic (Enochian) language provides a fascinating study in entropy. When subjected to frequency analysis, Enochian actually displays the characteristics of a synthetic language. Its entropy levels are consistent with a structured phonetic system. This suggests that the “occult” messages were not merely random noise but were constructed as a cohesive system, likely influenced by the phonetics of the languages Dee was already fluent in (Latin and English).

Common Mistakes

  • Ignoring Contextual Encoding: Researchers often assume a cipher is a standard substitution. However, early modern writers often used “nomenclators”—a hybrid of codebooks (representing whole words) and substitution ciphers (representing letters). Failing to account for this hybrid approach leads to incorrect entropy calculations.
  • Over-Fitting the Data: If you force a historical text to fit a modern linguistic model, you may hallucinate patterns where none exist. Always perform a “null hypothesis” test by applying the same analysis to a known random string of characters to ensure your results are statistically significant.
  • Neglecting Scribal Error: Early modern manuscripts are prone to errors caused by fatigue, poor lighting, or dialectal variations. A sudden spike in entropy in a specific folio might be the result of a scribe losing focus, not a shift in the cryptographic algorithm.
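
One way to run the null-hypothesis check described above is a shuffle test. Note that this sketch swaps in shuffled copies of the same transcription in place of a separate random string, so the baseline keeps the original symbol frequencies and differs only in ordering. The statistic (number of distinct bigrams), the function names, and the trial count are all illustrative assumptions.

```python
import random

def distinct_bigrams(symbols):
    """Number of different adjacent symbol pairs; fewer means more repeated structure."""
    return len(set(zip(symbols, symbols[1:])))

def shuffle_test(symbols, trials=1000, seed=0):
    """Estimate how often a random reordering looks at least as structured as the text."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = distinct_bigrams(symbols)
    pool = list(symbols)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if distinct_bigrams(pool) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (trials + 1)

# Placeholder sequence with an obvious repeating pattern.
observed, p_value = shuffle_test("abcd" * 50)
print(observed, p_value)
```

A small p-value says the ordering of symbols is unlikely to arise by chance given their frequencies. It does not say what kind of structure is present, so it complements the entropy and n-gram measurements without replacing them.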

Advanced Tips

To push your analysis further, consider the role of “steganographic noise.” Many occult ciphers were embedded within illustrations or decorated text. When measuring complexity, calculate the entropy of the visual space of the page, not just the text. In a “pixel-entropy” analysis, a digitized image of the page is treated as a secondary layer of information: the entropy of ink intensity is measured region by region and compared across the page. This can help reveal whether the placement of decorative elements was intended to convey information to those initiated into a specific code.
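
A very rough sketch of the idea, assuming the page has already been digitized into a grid of grayscale values (0 to 255) held as plain Python lists. The tile size, the function names, and the tiny sample image are all hypothetical. A histogram-based entropy ignores where pixels sit inside a tile, so the tiling itself is what provides a coarse spatial map.

```python
import math
from collections import Counter

def intensity_entropy(pixels):
    """Shannon entropy (bits) of the intensity histogram of a flat list of pixels."""
    counts = Counter(pixels)
    total = len(pixels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def tile_entropies(image, tile=2):
    """Split a 2D grid of intensities into tile x tile blocks and score each block."""
    rows, cols = len(image), len(image[0])
    grid = []
    for r in range(0, rows, tile):
        grid_row = []
        for c in range(0, cols, tile):
            block = [image[y][x]
                     for y in range(r, min(r + tile, rows))
                     for x in range(c, min(c + tile, cols))]
            grid_row.append(intensity_entropy(block))
        grid.append(grid_row)
    return grid

# Placeholder 2 x 4 "page": blank on the left, mixed ink on the right.
page = [[0, 0, 0, 255],
        [0, 0, 255, 0]]
print(tile_entropies(page))  # [[0.0, 1.0]]
```

Tiles that score far above or below their neighbours are candidates for closer inspection, nothing more; ink bleed, staining, and scanner noise will also move these numbers.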

Furthermore, use Conditional Entropy to measure the predictability of symbols based on the symbols that came before them. If a symbol in an occult manuscript is highly predictable from its predecessor, it suggests the document relies on fixed, ritualized phrases. If the conditional entropy remains high, the message likely contains novel information, marking it as a genuine attempt at communication rather than a rote liturgical recitation. A well-enciphered or random text will also score high, however, so this measure is best read alongside the frequency and n-gram results.
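
As a minimal sketch, first-order conditional entropy can be estimated from adjacent symbol pairs as H(next | previous) = -Σ p(prev, next) log2 p(next | prev). The function name and sample strings are illustrative assumptions, and estimates from short texts tend to be biased low because many possible pairs are never observed.

```python
import math
from collections import Counter

def conditional_entropy(symbols):
    """Estimate H(next | previous) in bits from adjacent symbol pairs."""
    pairs = list(zip(symbols, symbols[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    prev_counts = Counter(prev for prev, _ in pairs)
    total = len(pairs)
    h = 0.0
    for (prev, _), n in pair_counts.items():
        # p(prev, next) = n / total; p(next | prev) = n / prev_counts[prev]
        h -= (n / total) * math.log2(n / prev_counts[prev])
    return h

print(conditional_entropy("abababab"))   # 0.0: each symbol fully determines the next
print(conditional_entropy("aabbaabba"))  # 1.0: after any symbol, two equally likely successors
```

The first string behaves like a fixed formula, while the second leaves one full bit of uncertainty at every step even though both use the same two symbols.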

Conclusion

Information theory acts as a bridge between the subjective world of occult history and the objective world of mathematics. By quantifying the complexity of encoded messages, we gain the ability to sort the functional from the performative. Whether you are investigating the espionage tactics of a royal court or the cryptic logs of a Renaissance alchemist, Shannon’s metrics provide the necessary scaffolding to peel back the layers of historical deception.

The primary takeaway is this: complexity is not synonymous with secrecy. Often, the most secure early modern messages were those that mimicked the statistical profile of natural language, hiding in plain sight. Conversely, the most complex-looking occult sigils often contain very little information, serving instead as psychological anchors for the practitioner. By applying these quantitative lenses, we add a precise, evidence-based instrument to the study of history.
