A survey on machine learning techniques in movie revenue prediction

SN Computer Science

With the growing body of literature on movie revenue prediction using machine learning techniques in recent years, a systematic review will help strengthen the understanding of this research domain. Therefore, this article aims to determine the sources of data, the techniques, the features, and the evaluation metrics used in movie revenue prediction. We selected 36 relevant articles based on defined inclusion and exclusion criteria. The review found that US cinema attracted the highest number of publications, followed by Chinese, Korean, and Indian cinema, in that order. We also found that regression, classification, and clustering data mining approaches were used in the reviewed articles, with regression and classification accounting for the largest share. Furthermore, we observed that cast, number of screens, and genre are the most widely used features in movie revenue prediction. We also identified multiple linear regression and support vector machines as the most commonly used prediction algorithms, while mean absolute percentage error, root-mean-square error, and average percentage hit rate are the most frequently used evaluation metrics. Our review also identified some problems and research directions in movie revenue prediction.
