01 · Corpus Contributors

Corpus Contributors

Do you have access to natural language corpora we haven't tested? We are particularly seeking formal literary text in Philippine-branch and Austronesian languages with sufficient token density for typological analysis.

Languages currently prioritized: formal literary Cebuano, Kapampangan, Hiligaynon, Malagasy, Ilocano, Waray. Minimum threshold: n ≥ 30,000 tokens. Corpus should be pre-modern or formal register where possible.

Cebuano Kapampangan Hiligaynon Malagasy n ≥ 30k tokens
Contact us
02 · Linguistic Experts

Linguistic Experts

We need expert assessment of our morphological density gradient hypothesis — specifically the claim that the Voynich text's morpheme boundary behavior is consistent with Philippine focus-morphology rather than agglutinative or isolating typologies.

Specialists sought: Philippine focus-morphology, Austronesian typology, historical linguistics of insular Southeast Asia, morphological complexity metrics. We can provide the full dataset and pipeline for independent review.

Austronesian typology Focus morphology Historical linguistics Peer assessment
Contact us
03 · Computational Researchers

Computational Researchers

Researchers with experience in BPE tokenization, unsupervised morpheme segmentation, or corpus linguistics who want to replicate or extend the analysis. Replication is explicitly encouraged — we would rather be refuted by good methodology than confirmed by bad faith agreement.

Full data access provided: raw corpus, tokenization pipeline, segmentation outputs, comparison matrices. Code is available on request pending formal collaboration agreement. No institutional affiliation required.

BPE tokenization Corpus linguistics Replication Full data access
Contact us
04 · Institutional Partners

Institutional Partners

Academic institutions or libraries interested in formal collaboration. The typological fingerprinting methodology developed for the Voynich Manuscript is generalizable — it can be applied to any sufficiently large undeciphered script corpus.

We can provide the complete analysis pipeline for manuscript collections, run typological analysis on institutional corpora, and co-author publications under appropriate academic frameworks. We welcome inquiries from linguistics departments, digital humanities centers, and rare manuscript libraries.

Academic institutions Libraries Digital humanities Co-authorship
Contact us

What We Offer Collaborators

Authorship

Co-authorship

Substantive contributions to methodology or corpus are credited as co-authorship on relevant publications.

Data

Full data access

All raw corpora, segmentation outputs, comparison matrices, and intermediate analysis files available to formal collaborators.

Documentation

Methodology docs

Complete technical documentation of the BPE pipeline, structural fingerprinting algorithm, and typological comparison framework.

Attribution

Attribution

All contributions acknowledged publicly in paper acknowledgments, Zenodo dataset records, and this site.

What we do not accept: We do not accept decipherment proposals, linguistic guesses, or "I believe it says X" correspondence. We are looking for methodological collaboration, not solutions. If you have a proposed reading of the text, voynich.ninja is the appropriate forum — not this initiative. Our work is typological, not translational.

Get in Touch

Use this form to introduce yourself and describe the nature of your potential contribution. We read all correspondence and respond to those that match our current research needs within one week.

All inquiries are confidential. We respond to technical correspondence within one week.