This workshop explores how convolutional and recurrent neural networks can be combined to generate effective descriptions of content within images and video clips.
Learn how to train a network, using TensorFlow and the Microsoft Common Objects in Context (COCO) dataset, to generate captions for images and video.
Upon completion, you’ll be able to solve deep learning problems that require multiple types of data inputs.
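As a rough illustration of the idea of pairing a convolutional encoder with a recurrent decoder, here is a minimal, dependency-free Python sketch. It is not the workshop's TensorFlow material: every name in it (`encode_image`, `rnn_step`, `greedy_caption`, `random_params`), the toy vocabulary, and the synthetic image are assumptions made for illustration, and the weights are random and untrained, so the generated words are meaningless. It shows only the data flow: image features set the initial recurrent state, and words are then emitted one at a time.

```python
import math
import random


def conv2d_valid(image, kernel):
    """Slide a small kernel over a 2-D image (no padding), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def encode_image(image, kernels):
    """Convolutional encoder: one globally average-pooled feature per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(image, kernel)
        feats.append(sum(map(sum, fmap)) / (len(fmap) * len(fmap[0])))
    return feats


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(h, x, w_hh, w_xh):
    """Plain recurrent update: h' = tanh(W_hh h + W_xh x)."""
    return [math.tanh(a + b)
            for a, b in zip(matvec(w_hh, h), matvec(w_xh, x))]


def greedy_caption(image, params, vocab, max_len=10):
    """Encode the image, then decode words greedily until <end> or max_len."""
    feats = encode_image(image, params["kernels"])
    # The image features condition the decoder through its initial state.
    h = [math.tanh(z) for z in matvec(params["w_init"], feats)]
    token = vocab.index("<start>")
    words = []
    for _ in range(max_len):
        h = rnn_step(h, params["embed"][token], params["w_hh"], params["w_xh"])
        scores = matvec(params["w_out"], h)
        token = max(range(len(vocab)), key=scores.__getitem__)
        if vocab[token] == "<end>":
            break
        words.append(vocab[token])
    return words


def random_params(vocab_size, n_kernels=4, hidden=8, embed_dim=6, seed=0):
    """Hypothetical, untrained weights; a real model would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "kernels": [mat(3, 3) for _ in range(n_kernels)],
        "w_init": mat(hidden, n_kernels),
        "embed": mat(vocab_size, embed_dim),
        "w_hh": mat(hidden, hidden),
        "w_xh": mat(hidden, embed_dim),
        "w_out": mat(vocab_size, hidden),
    }


# Toy inputs (assumed for illustration): a tiny vocabulary and an 8x8 image.
vocab = ["<start>", "<end>", "a", "dog", "on", "grass"]
image = [[((i * j) % 7) / 7.0 for j in range(8)] for i in range(8)]
params = random_params(len(vocab))
caption = greedy_caption(image, params, vocab)
print(caption)
```

In a trained system the encoder would typically be a deep convolutional network and the decoder a gated recurrent unit such as an LSTM, with all weights fitted on image and caption pairs; the loop structure above stays the same.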
Instructor: Luigi Troiano (troiano@unisannio.it)
Teaching Assistant: Elena Mejuto Villa (mejutovilla@unisannio.it)
Laboratorio Multimediale
Building B2
84084 Campus di Fisciano (Italy)

Palazzo Bosco Lucarelli
Primo Piano - Laboratorio Informatica
82100 Benevento (Italy)

Piazza dei Cavalieri, 7
56100 Pisa (Italy)