L. is an independent researcher focused on computational typology of historical manuscripts. The methodology developed here — BPE-based structural comparison across language families — emerged from a simple question: where does the Voynich Manuscript sit in the typological space of known writing systems?
The answer required building a comparison framework first. That framework — which segments corpora into byte-pair encoded tokens and computes morphological density, hapax ratio, token-type distribution, and affix boundary behavior across 29 language families — became the research itself.
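Two of the metrics named above can be illustrated with a minimal sketch. It assumes a corpus has already been segmented into a flat list of BPE tokens. The function names `hapax_ratio` and `type_token_ratio` are hypothetical, and the denominators chosen here (distinct types for the hapax ratio, total tokens for the type-token ratio) are one common convention, not necessarily the operational definitions used in the papers.

```python
from collections import Counter


def hapax_ratio(tokens):
    """Share of distinct token types that occur exactly once.

    Assumption: hapaxes are normalised by the number of types.
    Other work normalises by the total token count instead.
    """
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


def type_token_ratio(tokens):
    """Number of distinct types divided by the total number of tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Toy token sequence, invented purely for illustration (not real corpus data).
sample = ["qo", "kee", "dy", "qo", "dy", "ol"]
sample_hapax = hapax_ratio(sample)
sample_ttr = type_token_ratio(sample)
```

Both ratios are sensitive to corpus length, so comparisons between corpora are only meaningful when the samples are size-matched or the length effect is otherwise controlled.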
The Voynich Manuscript is the test case that motivated the tool. The tool has outlasted any particular hypothesis about the manuscript, and can now be applied to other undeciphered scripts. The current focus is on validating the typological fingerprint methodology against known scripts before drawing firm conclusions about unknown ones.
Every dataset used in this research is available on Zenodo. Every claim on this site has gone through adversarial internal review before publication. The research is explicitly designed for replication — and L. would rather be refuted by good methodology than affirmed by credulous acceptance.
"Every finding goes through what we call 'hostile peer review' before publication. If a finding has a fatal flaw, it stays in the draft folder. The bar for going public is not confidence — it is surviving the strongest objection we can construct."
Adversarial self-criticism is not a quality-control step appended at the end of the process. It is the process. For each major finding, L. formulates the most rigorous, methodologically competent rebuttal possible before the finding is considered publishable. Not the most convenient rebuttal — the strongest one. The one that, if valid, would invalidate the result entirely.
This means findings take longer to appear here than they would in a less disciplined workflow. It also means that what does appear has been tested against its own failure modes. Draft folders accumulate. Publication lists move slowly. That asymmetry is intentional.
The practice emerged directly from observing how Voynich research fails. Proposals accumulate, peer circles form around them, and the proposals eventually collapse. They collapse not because the underlying ideas were necessarily wrong, but because nobody stress-tested them before investing in them. L. runs the stress test first.
The typological approach used here is deliberately modest in its scope. Identifying a family-level structural signature is not the same as identifying a language. Identifying a language is not the same as reading the text. Identifying a statistical similarity is not the same as proving a causal relationship. These distinctions are maintained explicitly throughout the research and are not treated as rhetorical caveats — they are load-bearing methodological constraints.
The methodology is designed for falsifiability. Every metric has a defined threshold. Every threshold has an operational definition. Every claim maps to a specific dataset that can be downloaded, re-run, and challenged. If the methodology is wrong, it should be possible for another researcher to demonstrate that with the same tools. That kind of exposure is not a risk to be minimised — it is the condition of doing science rather than speculation.
The research should stand on its methodology and data, not on the credentials of the researcher. In a field where authority is frequently invoked as a substitute for evidence, anonymity is a small corrective — it forces engagement with the work itself.
L. is not anonymous for dramatic effect. The pseudonym reflects a genuine methodological commitment: conclusions should be evaluated on their internal logic, their data quality, and their testability — not on whether the person producing them has a professorship, a PhD, or a name that appears in other publications.
L. accepts correspondence from researchers who engage with the work on its merits. Requests to reveal identity are not entertained. Requests to discuss methodology are welcomed.
All papers are deposited on Zenodo with full datasets. arXiv preprints are available where indicated.
The core methodological paper. Applies byte-pair encoding segmentation to 55 corpora across 29 language families and computes VMML, Boundary Concentration, and CBMI scores against the Voynich text. Identifies a Philippine-branch morphological profile — the only zone in typological space partially overlapping with the Voynich Discriminant Zone. Explicit uncertainty quantification and adversarial counterarguments are embedded in the paper body.
Extension paper testing the Paper 7 hypothesis against a broader Austronesian corpus, with particular attention to the morphological density gradient distinguishing Philippine-branch languages from other Austronesian subfamilies. Introduces the cross-text VMML instability finding (Tagalog: Δ = 0.336 across Rizal novels). Includes an independent replication protocol for external researchers and a permutation test establishing that Voynich structural signatures are not an artifact of corpus size.
Corpus acquisition and validation in progress. Results will be added to the What It Isn't tracker as they meet the evidence threshold.
L. responds to technical correspondence within one week. Methodological questions, replication inquiries, and corpus contribution offers are welcome. Decipherment proposals and requests for identity disclosure are not.