In this project, I've designed and trained a CNN-RNN (Convolutional Neural Network - Recurrent Neural Network) model for automatically generating image captions. The network is trained on the Microsoft Common Objects in COntext (MS COCO) dataset.
Examples of inference can be found in 3_Inference.ipynb.