Browse 200 verse formula clusters from the Finnic runosong corpus (4.36M verses, 292,092 poems). Formulas are groups of closely similar verse lines that recur across many poems — the building blocks of oral poetry.
In runosong (regilaul / Kalevala-meter poetry), singers composed by combining stock verse formulas. A formula cluster groups verse lines that are near-identical variants of each other, found across many poems, places, and singers.
The system compared 4.36 million verse lines using four similarity measures (Jaccard wordform overlap, TF-IDF lemma cosine, cross-lingual translation pivot, and character bigram similarity), then combined the evidence from all four to identify clusters whose verses are similar by more than one measure. The 200 largest clusters are shown here.
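As a rough illustration of how two of these measures behave on a pair of verse lines, here is a minimal Python sketch. It is not the system's actual code: the function names, the thresholds, and the use of a Dice coefficient for bigram similarity are assumptions made for the example, and the TF-IDF lemma cosine and translation pivot are left out because they need a lemmatizer and a translation resource.

```python
from collections import Counter


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the sets of lowercased wordforms in two verses."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def char_bigrams(s: str) -> Counter:
    """Multiset of overlapping character bigrams, spaces included."""
    s = s.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def bigram_similarity(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (an assumed formulation)."""
    grams_a, grams_b = char_bigrams(a), char_bigrams(b)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        return 0.0
    shared = sum((grams_a & grams_b).values())  # multiset intersection
    return 2 * shared / total


def agreement_count(a: str, b: str,
                    min_jaccard: float = 0.4,
                    min_bigram: float = 0.8) -> int:
    """Count how many measures rate the pair as similar.

    The thresholds are hypothetical; a real pipeline would tune them
    and would include the other two measures as well.
    """
    return (int(jaccard(a, b) >= min_jaccard)
            + int(bigram_similarity(a, b) >= min_bigram))


# Two spellings of the same line differ in one wordform, so the word-level
# score drops while the character-level score stays high.
variant_a = "vaka vanha väinämöinen"
variant_b = "vaka vanha väinämöini"
print(jaccard(variant_a, variant_b))            # 0.5
print(bigram_similarity(variant_a, variant_b))
print(agreement_count(variant_a, variant_b))
```

The pair above shows why combining measures helps: a dialectal spelling change lowers word-level overlap but barely affects character bigrams, so a pair that looks borderline on one measure can still be supported by another.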
Members = total verse occurrences in the cluster. Places = distinct collection locations. The language bar shows the proportion of Estonian (blue) to Finnish (orange) verses. Bilingual clusters contain verses from both language traditions.
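The figures described above can be derived from a list of verse occurrences roughly as follows. This is a sketch under stated assumptions: the `Occurrence` record, its field names, the `"et"`/`"fi"` language codes, and the sample rows are all hypothetical and do not reflect the system's actual data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Occurrence:
    """One attested verse line in a cluster (hypothetical record layout)."""
    text: str
    place: str
    language: str  # assumed codes: "et" for Estonian, "fi" for Finnish


def summarize(occurrences: list[Occurrence]) -> dict:
    """Compute Members, Places, and the language split for one cluster."""
    members = len(occurrences)                    # every occurrence counts
    places = len({o.place for o in occurrences})  # distinct locations only
    by_language = Counter(o.language for o in occurrences)
    return {
        "members": members,
        "places": places,
        "et_share": by_language["et"] / members if members else 0.0,
        "fi_share": by_language["fi"] / members if members else 0.0,
        "bilingual": by_language["et"] > 0 and by_language["fi"] > 0,
    }


# Made-up sample rows for illustration only.
sample = [
    Occurrence("verse variant 1", "Place A", "et"),
    Occurrence("verse variant 2", "Place A", "et"),
    Occurrence("verse variant 3", "Place B", "et"),
    Occurrence("verse variant 4", "Place C", "fi"),
]
summary = summarize(sample)
print(summary)
```

Note that Members counts every occurrence, while Places counts each location once, so a cluster collected many times in one village can have many members but few places.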
When you select a formula, the system looks up its sample variant texts in the verse search index to find where they were collected. The locations shown therefore cover only a subset of the cluster's full membership (5 representative variants out of potentially thousands).