Comparative analysis

The Voynich in Context

How the Voynich Manuscript compares structurally to other undeciphered and poorly understood writing systems.

The Voynich Manuscript's script is not the only one that remains undeciphered. Linear A, the Indus Valley script, Rongorongo, and Proto-Elamite are all still unread. Each presents different structural challenges. Placing the Voynich within this comparative framework clarifies what kind of problem it represents and which strategies are most likely to succeed.

Comparative overview

Five Undeciphered Systems

| Script | Est. date | Script type | Bilingual text | Corpus size | Current status | BPE VMML |
|---|---|---|---|---|---|---|
| Voynich | 1404–1438 CE (carbon dated, UA 2009) | Unknown (disputed) | None | ~37,000 tokens (Beinecke MS 408) | Undeciphered. Active research. Multiple competing hypotheses. | 5.918 (above alphabetic ceiling) |
| Linear A | c. 1800–1450 BCE (Minoan Crete) | Syllabary (probable) | None known (Linear B related but distinct) | ~1,500 signs (fragmentary) | Undeciphered. Syllabic values partially inferred from Linear B. Language unknown. | ≈ 4.85 (preliminary, small corpus) |
| Indus Script | c. 2600–1900 BCE (Harappan Civilization) | Logographic/syllabic (debated) | None | ~4,000 inscriptions (mostly seals, very short) | Undeciphered. Contested whether it constitutes a full writing system. | Not computed (corpus too short for reliable BPE) |
| Rongorongo | Pre-1722 CE (Easter Island) | Possibly logographic (uncertain) | None | ~14,000 glyphs (25 surviving tablets) | Undeciphered. Relationship to oral tradition debated. Corpus severely limited. | ≈ 5.09 (highly uncertain, small corpus) |
| Proto-Elamite | c. 3200–2900 BCE (SW Iran) | Logographic (partial) | None (contemporary with proto-cuneiform) | ~5,000 tablets (large but mostly numeric) | Undeciphered. Accounting function partially understood; semantic content inaccessible. | Not computed (predominantly numeric, unsuitable) |

Structural analysis

What Makes the Voynich Structurally Unusual

Among undeciphered scripts, three quantitative properties distinguish the Voynich from everything else in the comparative set.

1. Very high VMML: exceeds all tested alphabetic scripts

A BPE Mean Morpheme Length (VMML) of 5.918 places the Voynich above the alphabetic ceiling established across 63 language corpora spanning 35 families. Among the undeciphered scripts where VMML is computable, Linear A reaches approximately 4.85 (preliminary, small corpus) and Rongorongo approximately 5.09 (highly uncertain). The Voynich value is an outlier in both the deciphered and undeciphered comparative sets.
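
The exact VMML procedure is not spelled out here. As a rough sketch of one plausible reading, the following learns byte-pair-encoding (BPE) merges on a list of tokens and reports the mean length, in characters, of the resulting subword units. The function names, the merge count, and the toy tokens are illustrative assumptions, not the pipeline or data behind the 5.918 figure.

```python
from collections import Counter

def learn_bpe(tokens, num_merges):
    """Learn BPE merges; return a Counter mapping each segmented token
    (a tuple of subword units) to its frequency."""
    # Start from single characters; identical tokens share one entry.
    vocab = Counter(tuple(tok) for tok in tokens)
    for _ in range(num_merges):
        # Count adjacent unit pairs, weighted by token frequency.
        pairs = Counter()
        for units, freq in vocab.items():
            for a, b in zip(units, units[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every token is already a single unit
        best = max(pairs, key=pairs.get)
        # Rewrite every token, fusing each occurrence of the best pair.
        merged = Counter()
        for units, freq in vocab.items():
            out, i = [], 0
            while i < len(units):
                if i + 1 < len(units) and (units[i], units[i + 1]) == best:
                    out.append(units[i] + units[i + 1])
                    i += 2
                else:
                    out.append(units[i])
                    i += 1
            merged[tuple(out)] += freq
        vocab = merged
    return vocab

def mean_unit_length(vocab):
    """Mean length in characters of the subword units, weighted by frequency."""
    total_chars = sum(len(u) * f for units, f in vocab.items() for u in units)
    total_units = sum(len(units) * f for units, f in vocab.items())
    return total_chars / total_units if total_units else 0.0

# Toy tokens in EVA-style transliteration, for illustration only.
tokens = ["daiin", "qokeedy", "chedy", "shedy", "qokedy", "daiin"]
vocab = learn_bpe(tokens, num_merges=10)
print(round(mean_unit_length(vocab), 3))
```

The number of merges strongly affects the result, so values are only comparable across corpora when that setting and the corpus size are held fixed, which is presumably why the small-corpus figures above are flagged as preliminary.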

2. Consistent internal grammar at exceptional coverage

A 47-morpheme EBNF grammar covers 92% of 37,025 tokens. No other undeciphered script has been shown to exhibit this level of internal grammatical regularity at this coverage rate. The grammar is compact, at 47 rules, yet leaves only about 8% of tokens unaccounted for. This reflects either a genuine generative grammar or an exceptionally consistent encoding procedure. Either interpretation is remarkable.
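
Coverage here means the share of tokens the grammar can derive in full; 92% of 37,025 tokens is roughly 34,000 tokens. A minimal sketch of how such a figure could be measured follows, assuming a grammar of the shape token = [prefix] root [suffix]. Both that slot structure and the morpheme inventory below are hypothetical placeholders, not the 47-morpheme grammar itself.

```python
import re

# Hypothetical morpheme inventory, for illustration only.
PREFIXES = ["qo", "o"]
ROOTS = ["ke", "kee", "che", "she", "dai"]
SUFFIXES = ["dy", "y", "in"]

def build_matcher(prefixes, roots, suffixes):
    """Compile token = [prefix] root [suffix] into one regular expression."""
    def alternation(items):
        # Longest alternatives first; backtracking still tries the others.
        ordered = sorted(items, key=len, reverse=True)
        return "|".join(re.escape(m) for m in ordered)
    pattern = (
        f"(?:{alternation(prefixes)})?"
        f"(?:{alternation(roots)})"
        f"(?:{alternation(suffixes)})?"
    )
    return re.compile(pattern)

def coverage(tokens, matcher):
    """Fraction of tokens that the grammar derives in full."""
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if matcher.fullmatch(t))
    return covered / len(tokens)

matcher = build_matcher(PREFIXES, ROOTS, SUFFIXES)
print(coverage(["qokeedy", "chedy", "daiin", "xyz"], matcher))
```

A coverage figure is only meaningful alongside the grammar's size: a grammar with one rule per token would reach 100% trivially, so the interesting quantity is high coverage from a small inventory.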

3. Edge-concentrated morpheme boundaries: unusually high BC

A Boundary Concentration (BC) of 0.361 exceeds that of any tested natural language from the Philippine, Malayo-Polynesian, or Basque families. High BC indicates that morpheme boundaries cluster at token edges, the signature of strong prefix/suffix morphology. For undeciphered scripts, where morphological structure is unknown by definition, this measurement is particularly significant: it suggests that morphological architecture can be detected without reading the script.
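
The BC formula is not given here, so the following is only a hypothetical illustration of the idea of edge concentration, not the measure behind the 0.361 figure. It assumes tokens have already been segmented into morphemes, places each internal boundary at a relative position between 0 and 1 within its token, and averages the distance from the token midpoint, rescaled so that 0 means every boundary sits at the centre and values approaching 1 mean boundaries sit close to the edges.

```python
def edge_score(segmented_tokens):
    """Hypothetical edge-concentration score (an assumption, not the BC formula).

    segmented_tokens: list of tokens, each a list of morpheme strings.
    Returns the mean of |2p - 1| over all internal boundaries, where p is
    the boundary's relative position inside its token.
    """
    scores = []
    for morphemes in segmented_tokens:
        length = sum(len(m) for m in morphemes)
        offset = 0
        # Every morpheme except the last ends at an internal boundary.
        for m in morphemes[:-1]:
            offset += len(m)
            p = offset / length
            scores.append(abs(2 * p - 1))
    return sum(scores) / len(scores) if scores else 0.0

# Toy segmentations, for illustration only.
print(edge_score([["qo", "kee", "dy"], ["che", "dy"], ["dai", "in"]]))
```

Short prefixes and suffixes attached to longer roots push boundaries toward the token edges and raise a score of this kind, whereas compounding of equal-length parts would pull it toward zero.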

Physical evidence

What Codicological Analysis Has Established

Vellum dating: 1404–1438 CE by carbon dating (University of Arizona, 2009). The vellum predates the text or is contemporary with it.
Scribal hands: At least 5 distinct hands identified (Davis 2020), correlating imperfectly with Currier A/B sections.
Ink composition: Iron gall ink, consistent with 15th-century European manuscript production.
Binding: Consistent with 15th-century European bookbinding practices. No anachronistic materials detected.
Pigments: Illustrations use azurite, verdigris, and minium, consistent with the medieval European palette and period.
UV examination: No overwriting or corrections detected under ultraviolet light, which is unusual for a genuine working manuscript.

The absence of corrections is significant. Working manuscripts — medical herbals, astronomical texts, recipe compilations — typically show evidence of revision, annotation, and correction. The Voynich shows none. This is consistent with either a clean copy (suggesting an exemplar existed) or a text produced without error because its content was not semantically meaningful to the scribe.

Historical context

Comparison with Known Ciphers of the Period

Cipher manuscripts exist as a documentary category. Two are particularly relevant to Voynich comparison, although both postdate it: the Rohonc Codex (Hungarian, probably 16th–19th century, long considered undeciphered) and the Copiale Cipher (German, 18th century, deciphered by Knight, Megyesi, and Schaefer in 2011 using statistical methods).

The Rohonc Codex was recently shown to exhibit a high degree of structural regularity that is measurably different from natural language: its statistics behave differently from both known languages and the Voynich. The Copiale, once deciphered, turned out to record the rituals of an oculist secret society; statistical methods succeeded because the encoding was a homophonic substitution over German plaintext.

The Voynich resists both approaches. Its structure is more language-like than known cipher manuscripts on multiple metrics, yet less decodable than known languages using any currently available method. It occupies a structurally distinct position in the comparative space — neither clearly a cipher nor clearly a natural language, but exhibiting properties that would be unusual in either category.

This is precisely why it remains unsolved after more than a century of serious cryptographic and linguistic attention.

"Researchers specializing in undeciphered scripts, codicology, or medieval manuscript analysis are particularly welcome to reach out. Comparative data from other scripts — especially Linear A and Rongorongo — would significantly extend this analysis."

contact@voynichlucidity.com →