Google wants to create accurate, automatic captions for complex photos, and ... take an image and directly produce a sentence that describes it. In Google's words, the model "combines ...