Abstract
We evaluate four count-based and predictive distributional semantic models of Ancient Greek against AGREE, a composite benchmark of human judgements, to assess their ability to capture semantic relatedness. Based on our analysis of the results, we design a procedure for a larger-scale intrinsic evaluation of count-based and predictive language models, including syntactic embeddings. We also propose possible ways of exploiting the different layers of the full AGREE benchmark (comprising both human- and machine-generated data) and different evaluation metrics.
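The abstract does not spell out the evaluation procedure. As a rough illustration of how intrinsic evaluation against human relatedness judgements typically works, the sketch below correlates model cosine similarities with human ratings using Spearman's rho. All names, vectors, and ratings in it are hypothetical placeholders and are not taken from AGREE or from the paper.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two dense word vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate_relatedness(vectors: dict[str, np.ndarray],
                         pairs: list[tuple[str, str, float]]) -> float:
    """Spearman correlation between model cosine similarities and human
    relatedness ratings, skipping out-of-vocabulary pairs."""
    model_scores, human_scores = [], []
    for w1, w2, rating in pairs:
        if w1 in vectors and w2 in vectors:
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            human_scores.append(rating)
    rho, _ = spearmanr(model_scores, human_scores)
    return rho

# Illustrative usage with made-up vectors and ratings (not AGREE data).
rng = np.random.default_rng(0)
toy_vectors = {w: rng.normal(size=100) for w in ("logos", "mythos", "polis")}
toy_pairs = [("logos", "mythos", 7.5), ("logos", "polis", 2.0),
             ("mythos", "polis", 3.0)]
print(evaluate_relatedness(toy_vectors, toy_pairs))
```

A full evaluation would repeat this for each model and each benchmark layer, possibly alongside other metrics mentioned in the paper.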
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings of the 1st International Workshop on Ancient Language Processing (ALP) at RANLP |
| Publisher | INCOMA Ltd., Shoumen, Bulgaria |
| Pages | 49–58 |
| Publication status | E-pub ahead of print - Sept 2023 |