Abstract: Probabilistic topic modeling algorithms like Latent Dirichlet Allocation (LDA) have become powerful tools for the analysis of large collections of documents (such as papers, projects, or funding applications) in science, technology an innovation (STI) policy design and monitoring. However, selecting an appropriate and stable topic model for a specific application (by adjusting the hyperparameters of the algorithm) is not a trivial problem. Common validation metrics like coherence or perplexity, which are focused on the quality of topics, are no...
(read more)
Topics: 
Information retrieval
Data mining
Machine learning