Research papers, datasets, and reference materials — free to download for academic and personal use. All materials are licensed under CC BY 4.0.
Peer-reviewed preprints published on Zenodo. Cite using the DOI.
Typological validation of the Voynich discriminant zone. 55+ corpora, 30+ language families. Identifies Tagalog as the sole natural-language corpus entering the Voynich CI. Six extended analyses.
⬇ Download DOCX | Zenodo →
Implications for Voynich Manuscript discrimination. Tests whether Tagalog's elevated VMML generalises across texts. Austronesian family expansion. Character permutation test with CI95.
⬇ Download DOCX | Zenodo →
Raw metric results from our corpus analysis. Open for replication and extension.
VMML, BC, and CBMI values for all 9 corpora in the Paper 8 Austronesian expansion — including Tagalog (2 texts), Ilocano, Cebuano, Indonesian, Malay, Basque, and the Voynich canonical reference.
⬇ Download CSV
A single-page overview of the typological analysis: key metrics, selected results table, and main findings. Print-ready (Ctrl+P → Save as PDF). Suitable for academic sharing.
⬇ Open & Print
Use these references when citing our research.
Silva, F. J. F. da. (2026). BPE morpheme-length clustering across 55 writing system corpora: Typological validation, Voynich discriminant zone, and six extended analyses. Zenodo. https://doi.org/10.5281/zenodo.20386119
Silva, F. J. F. da. (2026). BPE VMML cross-text instability in Tagalog: Implications for Voynich Manuscript discrimination. Zenodo. https://doi.org/10.5281/zenodo.20467972