From Computational Aesthetic Prediction for Images to Films and Online Videos

François Lemarchand


Over the last decade, creating and sharing videos online has become mainstream, to the point that some creators publish one personal video per day, a practice known as daily vlogging. Although robust solutions exist for recommending photographs based on aesthetic criteria, the growing number of online videos created and watched makes such recommendation systems more necessary than ever for videos. The main purpose of this paper is to transfer the skill of computational aesthetic classification from photographs to videos while developing new ways of investigating video creation. Trained on a dataset of photographs rated on aesthetic criteria by an internet community, and using recently developed feature extraction algorithms, the computational aesthetic classifier achieves state-of-the-art photograph classification based on aesthetic preferences learnt from people’s ratings. On a test set of YouTube videos, the same system then produces satisfactory aesthetic classification results when attempting to match the provided human aesthetic quality ratings. With this transfer of skill from photograph to video classification achieved, the classifier is used to analyze the evolution of aesthetics in feature films; this highlights the classifier’s visual preferences and reveals some interesting patterns related to filmmakers’ decisions. Aesthetic classification also makes it possible to observe how aesthetics evolve over the careers of daily content creators, thanks to their abundant and regular online video content. It can aid investigation of the impact of aesthetics on the popularity of online videos, using the available metadata about the internet audience's appreciation. It can also provide video content creators with a new tool to assess their work and help them produce content of higher aesthetic quality.
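As a rough illustration of how a photograph classifier might be reused on videos, the sketch below samples frames from a video, scores each frame with an image-level aesthetic scorer, and averages the scores into a video-level decision. The abstract does not describe the aggregation the paper actually uses, so the even sampling, the mean, the 0.5 threshold, and the names `sample_evenly`, `classify_video` and `score_frame` are all assumptions made for this example.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_evenly(frames: Sequence[Frame], n: int) -> List[Frame]:
    """Pick up to n frames at evenly spaced positions (assumed sampling scheme)."""
    if n <= 0 or not frames:
        return []
    if n >= len(frames):
        return list(frames)
    step = len(frames) / n
    return [frames[int(i * step)] for i in range(n)]


def classify_video(
    frames: Sequence[Frame],
    score_frame: Callable[[Frame], float],
    n_samples: int = 10,
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Average per-frame aesthetic scores and threshold the result.

    score_frame is a hypothetical stand-in for a photograph classifier that
    returns a score in [0, 1]; the mean and the threshold are assumptions,
    not the method reported in the paper.
    """
    sampled = sample_evenly(frames, n_samples)
    if not sampled:
        raise ValueError("video has no frames")
    video_score = mean(score_frame(frame) for frame in sampled)
    return video_score, video_score >= threshold


if __name__ == "__main__":
    # Toy "video": frame i is just the integer i, scored as i / 100.
    toy_frames = list(range(100))
    score, is_high_quality = classify_video(toy_frames, lambda f: f / 100)
    print(score, is_high_quality)
```

Any image-level scorer could be passed as `score_frame`; other aggregations, such as a majority vote over per-frame labels, would fit the same interface.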


Keywords: computational aesthetics; skill transferability; video classification; visual preferences.







Copyright (c) 2018 François Lemarchand