As before, this page will first list a number of observations about the Voynich MS syntax or grammar that may be found in the earlier literature, and then concentrate on analyses that have been performed more recently. The latter include, among others, cluster analyses detecting variation of language 'usage' across the MS, and long-range correlation studies which indicate that the MS text seems to behave just like normal language.
Several sources list 'odd features' of the MS text. They are collected here with the aim of presenting a complete overview, though the collection is not yet complete.
(Note: Tiltman treats f as a variant form of k and p as a variant form of t. In the following, characters or sequences in parentheses represent such variant forms).
'Unattached' finals scattered throughout language 'B' texts in considerable profusion; generally much less noticeable in Language 'A'.
Several authors have remarked that the MS curiously lacks repeated phrases of 2 or more words. Such repeated expressions would typically be expected, for example, in the herbal section, as may be observed in editions of medieval herbal MSs (4). Various different mechanisms can be proposed that would lead to this phenomenon (e.g. word transposition or scrambling). It could also be an indication that the text is meaningless. Furthermore, it may be partly due to errors in the text and the fact that orthography was not yet standardised, though one may seriously doubt whether that is sufficient to explain it. This is something that could be tested numerically, by taking an edited text and introducing errors and non-standard orthography. I do not believe that this has been attempted so far. In any case, the reason for the missing long repeats is so far not understood, and it remains a key issue.
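The numerical test suggested above could be set up roughly as follows. This is only a minimal sketch: the function names `repeated_ngrams` and `corrupt`, the one-character substitution model of a 'spelling error', and the toy sentence are all assumptions made for illustration, not a method taken from any of the cited sources.

```python
import random
from collections import Counter

def repeated_ngrams(words, n=2):
    """Return the word n-grams that occur more than once, with their counts."""
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return {gram: count for gram, count in grams.items() if count > 1}

def corrupt(words, error_rate, rng, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Substitute one random character in roughly `error_rate` of the words.

    Assumed error model: a crude stand-in for scribal errors and
    non-standard orthography, chosen only for illustration.
    """
    out = []
    for word in words:
        if word and rng.random() < error_rate:
            i = rng.randrange(len(word))
            word = word[:i] + rng.choice(alphabet) + word[i + 1:]
        out.append(word)
    return out

# Toy stand-in for an edited herbal text (invented for this example).
text = "take the root of the herb and boil the root of the herb in water".split()

clean = repeated_ngrams(text, n=2)
noisy = repeated_ngrams(corrupt(text, 0.3, random.Random(0)), n=2)
print(len(clean), "repeated word pairs before corruption,", len(noisy), "after")
```

Running this on a real edited herbal text for increasing values of `error_rate` and `n` would show how much noise is needed before the repeated phrases disappear, which is the quantity of interest here.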
The principle of cluster analysis has been explained briefly in the introductory page. Typically, it is applied to the MS text after it has been split into pages, in order to analyse the correspondence between the texts of the different pages. Such analyses tend to confirm Currier's split between A and B languages, but also demonstrate that there are additional details. For the time being, I just list the following contributions, most of which may be read on-line.
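To give an idea of the kind of computation involved, the following sketch compares pages by the cosine similarity of their word-frequency profiles, which is one common starting point for a cluster analysis. It is an assumption made for illustration and not necessarily the measure used in any of the contributions listed here; the page names and word lists are invented toy data.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-frequency profiles (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical pages, each a list of transliterated words (toy data).
pages = {
    "p1": "daiin chol chor daiin shol".split(),
    "p2": "chol daiin chor chol".split(),
    "p3": "qokeedy chedy shedy qokeedy".split(),
    "p4": "chedy qokeedy shedy chedy".split(),
}
profiles = {name: Counter(words) for name, words in pages.items()}

def nearest(name):
    """Return the other page whose word profile is most similar to `name`."""
    others = (other for other in profiles if other != name)
    return max(others, key=lambda other: cosine(profiles[name], profiles[other]))

for name in pages:
    print(name, "is closest to", nearest(name))
```

A full cluster analysis would feed the complete matrix of pairwise similarities into a clustering algorithm; the nearest-neighbour lookup shown here is only the simplest way of making groups of similar pages visible.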
This technique, used by Brendan McKay and Mark Perakh, has been applied to a number of plain language texts and also the Voynich MS. While the principle behind the computed statistic is not fully understood, it clearly appears that meaningful texts generate a particular curve exhibiting one minimum, which is equally observed for the Voynich MS. If the words in the text are shuffled around arbitrarily, the curve changes into a flat line.
The tentative conclusion is that the text in the Voynich MS appears to represent something meaningful. The >>articles are available online.
In a 2001 paper in Cryptologia (6) Gabriel Landini comes to a very similar conclusion.
Amancio et al. published a paper in 2013 (7), based on the results of applying 'big data' analysis techniques to the Voynich MS text. This paper is >>available on-line. It does not concentrate only on the Voynich MS, but looks at it as an example. The main conclusion is that the text in the Voynich MS is significantly different from a scrambled version of the same text, i.e. it is not an arbitrary sequence of words.
The important 2007 paper by A. Schinner (8) deserves a thorough discussion, which will be added here shortly. It treats several anomalies of the Voynich MS text, such as vertical patterns, and anomalous 'random walk' behaviour, which led the author to conclude that the text is possibly meaningless, and the result of a hoax.
Torsten Timm wrote a paper in 2014, with the most recent revision in Dec. 2015, which is >>available on-line here, and in which he proposes a method by which the Voynich MS would have been created. This works along the lines of the vertical patterns mentioned above. Apart from the theory, it includes a large number of additional statistics.
In a study looking at different statistics of word pairs, where the pair of words in question may be adjacent in the text, or separated by any number of intermediate words, Fincher noted that the text in the Voynich MS does not at all behave in the same way as plain texts in several known languages. I don't know whether the paper is available on-line. The analysis is closely related to the problem of the lack of repeating strings, and also suffers from possibly inconsistent spelling and errors in the text or transliteration.
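As an illustration of what counting word pairs at a given separation involves, here is a minimal sketch. The functions `pair_counts` and `repeat_fraction` are hypothetical names introduced for this example, and the statistic shown is an assumption of mine; it is not the set of statistics used in the study mentioned above, which is not described here in detail.

```python
from collections import Counter

def pair_counts(words, distance):
    """Count ordered pairs (words[i], words[i + distance]).

    distance=1 gives adjacent pairs; distance=2 skips one intermediate
    word, and so on. `distance` is expected to be at least 1.
    """
    return Counter(zip(words, words[distance:]))

def repeat_fraction(words, distance):
    """Fraction of pair occurrences belonging to a pair seen more than once."""
    counts = pair_counts(words, distance)
    total = sum(counts.values())
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total if total else 0.0

# Toy sequence, invented for this example.
words = "a b a b a".split()
print(pair_counts(words, 1))      # adjacent pairs
print(pair_counts(words, 2))      # pairs with one word in between
print(repeat_fraction(words, 2))
```

Computing such a fraction for a range of distances, for the MS text and for plain texts in known languages, gives curves that can be compared directly. Because every spelling variant counts as a different word, errors in the transliteration lower the counts, which is the weakness noted above.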
Sravana Reddy and Kevin Knight produced a publication in 2011 (9) with a large selection of old and new statistics of the text of the Voynich MS, titled "What we know about the Voynich MS".
Some of the statistics and results of Reddy and Knight are revisited, and shown graphically, on >>this page by Sean Palmer.
The publication in 2013 by Montemurro and Zanette (10) received quite a bit of attention in the media because, contrary to the work of Rugg and Schinner, it appears to demonstrate that the Voynich MS contains a meaningful message. To be discussed in more detail.
In 2014 a group of students under Prof. Derek Abbott performed an extensive series of analyses, which are presented in an online wiki.