About L.

The Work

The answer required building a comparison framework first. That framework — which segments corpora into byte-pair encoded tokens and computes morphological density, hapax ratio, token-type distribution, and affix boundary behavior across 29 language families — became the research itself.
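As a rough illustration of the kind of computation described above, the following is a minimal sketch of byte-pair segmentation followed by two of the named statistics. It is not the published pipeline: the function names (`learn_bpe`, `segment`, `hapax_ratio`, `type_token_ratio`), the toy word list, and the choice to define hapax ratio as hapax types divided by all types are assumptions made for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merges from a list of words (toy version)."""
    vocab = Counter(tuple(w) for w in words)  # symbol sequence -> frequency
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties broken lexicographically for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Apply learned merges, in learned order, to a single word."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


def hapax_ratio(tokens):
    """Share of token types that occur exactly once (assumed definition)."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


def type_token_ratio(tokens):
    """Distinct token types divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Toy usage: segment a tiny word list and compute both statistics.
words = ["banana", "bandana", "banana", "band"]
merges = learn_bpe(words, num_merges=5)
tokens = [t for w in words for t in segment(w, merges)]
stats = {"hapax_ratio": hapax_ratio(tokens), "ttr": type_token_ratio(tokens)}
```

Both statistics depend on corpus length, so any real comparison across corpora would presumably need size-matched samples. That is a general property of these measures, not a description of how the published framework handles it.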

The Voynich Manuscript is the test case that motivated the tool. The tool has outlasted any particular hypothesis about the manuscript, and can now be applied to other undeciphered scripts. The current focus is on validating the typological fingerprint methodology against known scripts before drawing firm conclusions about unknown ones.

Every dataset used in this research is available on Zenodo. Every claim on this site has gone through adversarial internal review before publication. The research is explicitly designed for replication — and L. would rather be refuted by good methodology than affirmed by credulous acceptance.

What L. does

+ Systematic BPE corpus comparison across language families
+ Adversarial self-criticism before any publication
+ Open data for all formal collaborators
+ Explicit uncertainty quantification in all claims
+ Zenodo archival of all datasets and pipeline versions
+ Responses to methodological criticism within one week

What L. doesn't do

− Propose decipherments or translations
− Accept findings without adversarial testing
− Publish without some form of peer review process
− Make claims beyond what the data supports
− Respond to "I think it says X" correspondence
− Treat statistical association as causal proof

The Approach

"Every finding goes through what we call 'hostile peer review' before publication. If a finding has a fatal flaw, it stays in the draft folder. The bar for going public is not confidence — it is surviving the strongest objection we can construct."

Adversarial self-criticism is not a quality-control step appended at the end of the process. It is the process. For each major finding, L. formulates the most rigorous, methodologically competent rebuttal possible before the finding is considered publishable. Not the most convenient rebuttal — the strongest one. The one that, if valid, would invalidate the result entirely.

This means findings take longer to appear here than they would in a less disciplined workflow. It also means that what does appear has been tested against its own failure modes. Draft folders accumulate. Publication lists move slowly. That asymmetry is intentional.

The practice emerged directly from observing how Voynich research fails. Proposals accumulate, peer circles form around them, and the proposals eventually collapse — not because the underlying ideas were necessarily wrong, but because they were not stress-tested before being invested in. L. runs the stress test first.

Research Philosophy

The typological approach used here is deliberately modest in its scope. Identifying a family-level structural signature is not the same as identifying a language. Identifying a language is not the same as reading the text. Identifying a statistical similarity is not the same as proving a causal relationship. These distinctions are maintained explicitly throughout the research and are not treated as rhetorical caveats — they are load-bearing methodological constraints.

The methodology is designed for falsifiability. Every metric has a defined threshold. Every threshold has an operational definition. Every claim maps to a specific dataset that can be downloaded, re-run, and challenged. If the methodology is wrong, it should be possible for another researcher to demonstrate that with the same tools. That kind of exposure is not a risk to be minimised — it is the condition of doing science rather than speculation.
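One way to picture the claim-to-threshold-to-dataset mapping is as a small record type. The sketch below is hypothetical: the `Claim` class, its field names, the example metric, the threshold value, and the dataset identifier string are all placeholders invented for illustration and do not correspond to any actual record or threshold used in the research.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Claim:
    """A claim tied to one metric, one threshold, and one archived dataset."""
    statement: str
    dataset_id: str                              # placeholder, not a real record
    metric: Callable[[Sequence[str]], float]     # operational definition of the metric
    threshold: float                             # value the metric must reach

    def holds(self, tokens: Sequence[str]) -> bool:
        """Re-run the metric on the data and check it against the threshold."""
        return self.metric(tokens) >= self.threshold


def type_token_ratio(tokens: Sequence[str]) -> float:
    """Example metric: distinct types divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Hypothetical claim; anyone holding the dataset can re-evaluate it.
claim = Claim(
    statement="Type-token ratio is at least 0.5",
    dataset_id="<dataset-identifier-placeholder>",
    metric=type_token_ratio,
    threshold=0.5,
)
supported = claim.holds(["a", "b", "a", "c"])
refuted = not claim.holds(["a", "a", "a", "a"])
```

Keeping the metric as an explicit function makes the operational definition inspectable: a critic can challenge either the function, the threshold, or the data, and each challenge is testable independently.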

L. would rather be refuted by good methodology than affirmed by credulous acceptance.

On Anonymity

The research should stand on its methodology and data, not on the credentials of the researcher. In a field where authority is frequently invoked as a substitute for evidence, anonymity is a small corrective — it forces engagement with the work itself.

L. is not anonymous for dramatic effect. The pseudonym reflects a genuine methodological commitment: conclusions should be evaluated on their internal logic, their data quality, and their testability — not on whether the person producing them has a professorship, a PhD, or a name that appears in other publications.

L. accepts correspondence from researchers who engage with the work on its merits. Requests to reveal identity are not entertained. Requests to discuss methodology are welcomed.

Published Work

All papers are deposited on Zenodo with full datasets. arXiv preprints are available where indicated.

Paper 7 · Typological Analysis

Structural Fingerprinting of the Voynich Manuscript: A BPE-Based Typological Comparison Across 29 Language Families

The core methodological paper. Applies byte-pair encoding segmentation to 55 corpora across 29 language families and computes VMML, Boundary Concentration, and CBMI scores against the Voynich text. Identifies a Philippine-branch morphological profile — the only zone in typological space partially overlapping with the Voynich Discriminant Zone. Explicit uncertainty quantification and adversarial counterarguments are embedded in the paper body.

2026 Zenodo · Paper 7 Details

Paper 8 · Austronesian Validation

Austronesian Focus-Morphology and the Voynich Morpheme Boundary: A Corpus-Based Assessment

Extension paper testing the Paper 7 hypothesis against a broader Austronesian corpus, with particular attention to the morphological density gradient distinguishing Philippine-branch languages from other Austronesian subfamilies. Introduces the cross-text VMML instability finding (Tagalog: Δ = 0.336 across Rizal novels). Includes an independent replication protocol for external researchers and a permutation test establishing that Voynich structural signatures are not an artifact of corpus size.

2026 Zenodo · Paper 8 Details
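For readers unfamiliar with permutation tests, here is a generic two-sample sketch. The summary above does not describe the design of the Paper 8 test, so this should not be read as its implementation: the function name `permutation_p_value`, the difference-of-means statistic, and the idea of feeding it per-chunk metric scores are assumptions for illustration only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    `a` and `b` are sequences of per-sample scores (for example, one metric
    value per equal-sized text chunk; that usage is an assumption here).
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(mean(perm_a) - mean(perm_b)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy usage with made-up scores: two clearly separated groups.
group_a = [0.0, 0.1, 0.2, 0.1, 0.0, 0.2, 0.1, 0.0]
group_b = [5.0, 5.1, 5.2, 5.1, 5.0, 5.2, 5.1, 5.0]
p_separated = permutation_p_value(group_a, group_b)
```

Because the labels are shuffled while group sizes stay fixed, the null distribution already reflects the sample sizes involved, which is one reason permutation tests are a natural fit for questions about size artifacts.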

Currently Testing

Corpus acquisition and validation in progress. Results will be added to the What It Isn't tracker as they meet the evidence threshold.

In progress

Formal literary Cebuano

Acquiring corpus equivalent to the Rizal novels used for Tagalog. Register bias in NLLB corpus identified; literary register test pending.

In progress

Kapampangan

High priority. Philippine-branch language with complex focus-morphology architecture. Corpus sourcing underway.

In progress

Hiligaynon

~9.3M speakers. Philippine voice system, distinct morphology. Would test the VMML gradient hypothesis against a third Philippine-branch data point.

In progress

Malagasy

Geographically isolated Austronesian language (Madagascar). VOS word order, morphologically distinct from the Philippine branch. Geographic outlier test for whether high VMML is a Philippine-specific or a broader Austronesian property.
