Why Photo Metadata Matters
For online news consumers, the article you see is only part of the story. Every piece of content that a media organisation publishes online carries a complex wrapper of embedded metadata: information that content creators and publishers use to manage workflows and to share content with each other and with their audiences.
Some of the most important developments happening today in news metadata are updates to photo standards being spearheaded by the International Press Telecommunications Council (IPTC). We recently caught up with Brendan Quinn, IPTC Managing Director, to discuss how changes in photo metadata standards are strengthening revenue streams for photo agencies and photographers, and even changing how consumers interact with images online.
Within the news business, photo metadata has been attracting a lot of attention recently. Briefly, what is photo metadata, and why is “standardised” metadata important?
There are a lot of different types of players in the photo industry – from staff photographers at news organisations to freelancers working with an agency. But everyone wants to make sure that their rights are respected and that the credit information – the equivalent of a byline – and caption information that they provide with an image is used all the way through the workflow.
Our IPTC Photo Metadata standard seeks to address this by embedding metadata fields into the actual image file. Some media organisations still use what’s called a “sidecar file,” where you have the image file and then a separate file that has to be emailed or FTPed every time you move the image around. With IPTC Photo Metadata, you can actually put that metadata into the image file itself – whether it’s a JPEG, GIF, or something else.
How could this affect the average person?
Google has started extracting embedded metadata from photos. Now, when you do a search on Google Images, Google surfaces rights information and copyright information showing who took that picture and who owns it. The reason: if someone wants to reuse an image for their own website – like a mom-and-pop business – they can look at the Google Images search results and use the rights information to find out who owns the image and get a license to reuse it.
This obviously protects the rights of the photographer or image owner, but can embedded metadata also protect against the falsification of images?
Not by itself, but some IPTC members are working on tools around that sort of thing. Some are looking at making “fingerprints” of images and then using those fingerprints to detect image reuse, which can be used to work out whether those instances of reuse are licensed or not. One of our members in China called Yuanben has a system that sends license reminders to people automatically if its software discovers websites using an image without an apparent license.
Finally, there are more and more people [working on tools to detect] falsification of images. We have some fields in the IPTC Photo Metadata standard around tracking edits of an image, which can help determine whether the image came from a camera or whether it came from some sort of manipulation tool. But it's always an arms race. The moment someone invents an algorithm for detecting that something's been faked, someone else is going to invent an algorithm for avoiding that algorithm.
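To make the "fingerprint" idea above concrete, here is a minimal sketch of one common approach, an average hash compared by Hamming distance. This is an assumption chosen for illustration; the interview does not say which fingerprinting technique IPTC members or Yuanben actually use. The function names and the toy pixel grids are hypothetical.

```python
def average_hash(pixels):
    """Return a bit-string fingerprint for a small grayscale image.

    `pixels` is a list of rows of brightness values (0-255). A real tool
    would first downscale the photo to a small fixed size such as 8x8;
    that step is assumed to have happened already.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    # Each bit records whether a pixel is brighter than the image average.
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length fingerprints differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("fingerprints must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Toy 4x4 "images" (illustrative data only).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# A uniformly brightened copy, standing in for a lightly edited reuse.
brightened = [[min(255, value + 30) for value in row] for row in original]
# A checkerboard, standing in for an unrelated picture.
unrelated = [
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
]

distance_copy = hamming_distance(average_hash(original), average_hash(brightened))
distance_other = hamming_distance(average_hash(original), average_hash(unrelated))
print(distance_copy, distance_other)
```

A small distance suggests that two files show the same picture even after light editing. Whether that reuse is licensed is a separate question that the fingerprint alone cannot answer.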
IPTC is also working on something called the Photo Metadata Crawler Project. What’s that?
The idea is to look at different news publishers around the world and see how well they embed metadata using IPTC metadata fields. What we would like – and what photographers, photo libraries and photo agencies would like – is for embedded photo metadata to be retained all the way out to the end publisher. In other words, when someone sees an image at The Telegraph, the New York Post, or wherever, they can look at the original metadata added to that photo and know where to go if they want to relicense it.
To do that, the publishers have to keep those metadata fields in the photos all the way through to when they publish them on their websites. Unfortunately, that doesn’t always happen. A few years ago, there was a big move to make websites as performant as possible by stripping them down to as few bytes as possible. Part of that meant removing embedded metadata from images to save space.
What we wanted to do was kind of call that out a bit more, to identify the people who are really good at keeping their metadata embedded, and the people who are less good. So, we are building a crawler to go out and check how well the top sites retain image metadata, and we will eventually create a kind of leaderboard.
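As a rough illustration of the kind of check such a crawler might run on each downloaded image, the sketch below walks the header segments of a JPEG file and reports which metadata containers are still present. It is not the IPTC crawler itself. The function name and the sample byte strings are hypothetical, fetching images from a site is left out, and a real tool would go on to parse the individual fields.

```python
import struct

EXIF_HEADER = b"Exif\x00\x00"
XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"
PHOTOSHOP_HEADER = b"Photoshop 3.0\x00"  # container that usually carries IPTC IIM data


def embedded_metadata_kinds(jpeg_bytes):
    """Return the set of metadata containers found before the image data."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    found = set()
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # malformed stream; stop scanning
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):
            break  # start of scan or end of image: no more header segments
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if length < 2:
            break  # the length field counts its own two bytes
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(EXIF_HEADER):
            found.add("exif")
        elif marker == 0xE1 and payload.startswith(XMP_HEADER):
            found.add("xmp")
        elif marker == 0xED and payload.startswith(PHOTOSHOP_HEADER):
            found.add("photoshop_irb")
        pos += 2 + length
    return found


def _segment(marker, payload):
    """Build one JPEG segment (test helper for the synthetic samples below)."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic header-only samples; they are not viewable images.
jfif = _segment(0xE0, b"JFIF\x00" + b"\x00" * 9)
with_metadata = (
    b"\xff\xd8"
    + jfif
    + _segment(0xE1, XMP_HEADER + b"<x:xmpmeta/>")
    + _segment(0xED, PHOTOSHOP_HEADER)
    + b"\xff\xd9"
)
stripped = b"\xff\xd8" + jfif + b"\xff\xd9"

print(sorted(embedded_metadata_kinds(with_metadata)))
print(sorted(embedded_metadata_kinds(stripped)))
```

An image whose result is an empty set has probably had its metadata removed somewhere between the photographer and the published page, which is the situation a leaderboard would want to surface.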
I understand why photo metadata is important for rights owners and for people looking to relicense images, but are there implications for news photo consumers?
IPTC Photo Metadata has lots of descriptive fields and location-based fields. In a new development that we just approved at our meeting in October, we now have a way of describing a particular region within an image. Before this, metadata was applied to the image as a whole. But now we can apply it to a particular spot – from this pixel to that pixel – and place metadata attributes against a square, a circle, or a polygon.
One reason why this is helpful is for cropping. Many automated cropping tools take a guess at the most interesting part of the photo, or simply crop the middle and chop off everything else. But if you can give a cropping hint to the software to say, ‘Well, actually, this is a bit of an arty picture and the area of interest is actually on the right-hand side,’ then the software can do the cropping based on that information.
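The cropping hint described above can be sketched as a small calculation: instead of cropping around the middle of the picture, the software centres the crop on the marked region and then shifts it just enough to stay inside the image. The function name, the pixel-based `(x, y, width, height)` region format and the sample numbers are assumptions for illustration, not part of the IPTC standard.

```python
def crop_around_region(image_w, image_h, region, target_w, target_h):
    """Return (left, top, right, bottom) for a crop centred on a region.

    `region` is (x, y, width, height) in pixels. The crop is moved, never
    resized, so that it stays fully inside the image.
    """
    if target_w > image_w or target_h > image_h:
        raise ValueError("crop is larger than the image")
    region_x, region_y, region_w, region_h = region
    centre_x = region_x + region_w / 2
    centre_y = region_y + region_h / 2
    left = round(centre_x - target_w / 2)
    top = round(centre_y - target_h / 2)
    # Clamp so the crop does not run past any edge of the image.
    left = max(0, min(left, image_w - target_w))
    top = max(0, min(top, image_h - target_h))
    return (left, top, left + target_w, top + target_h)


# A 4000x3000 photo whose area of interest sits on the right-hand side.
area_of_interest = (3000, 1000, 800, 800)
hinted_crop = crop_around_region(4000, 3000, area_of_interest, 1500, 1500)

# For comparison: treating the whole frame as the region gives a centre crop.
centre_crop = crop_around_region(4000, 3000, (0, 0, 4000, 3000), 1500, 1500)

print(hinted_crop)
print(centre_crop)
```

In this example the centre crop ends at x = 2750 and misses the area of interest entirely, while the hinted crop is pushed against the right edge and contains all of it.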
Another application could be name tagging. For instance, think about all those pictures that you see on newspaper websites with a group of people lined up in a row. Underneath the photo it might say, ‘From right to left is this person, this person, and this person.’ With embedded metadata on image regions, you could just put your mouse cursor over each face, and a text box pops up, showing who it is. That's totally doable when you've got a standardised agreement on how a photo is marked up, and how the news agency or the photo agency embeds all of that and sends it with the image all the way through the workflow.
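The mouse-over behaviour described above comes down to a hit test: given the cursor position, find which named regions contain it. The sketch below handles the three kinds of shape mentioned, treating a square as a special case of a rectangle. The dictionary keys, the placeholder names and the coordinates are hypothetical and do not follow the actual IPTC field names.

```python
import math


def point_in_polygon(x, y, vertices):
    """Ray-casting test: count how many edges lie to the right of the point."""
    inside = False
    count = len(vertices)
    for i in range(count):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % count]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def region_contains(region, x, y):
    """Return True if the point (x, y) falls inside the region."""
    shape = region["shape"]
    if shape == "rectangle":
        return (region["x"] <= x <= region["x"] + region["w"]
                and region["y"] <= y <= region["y"] + region["h"])
    if shape == "circle":
        return math.hypot(x - region["x"], y - region["y"]) <= region["r"]
    if shape == "polygon":
        return point_in_polygon(x, y, region["vertices"])
    raise ValueError(f"unknown shape: {shape}")


def names_at(regions, x, y):
    """Return the names of every region under the cursor position."""
    return [region["name"] for region in regions if region_contains(region, x, y)]


# Three people in a row, each marked with a different kind of region.
regions = [
    {"name": "Person A", "shape": "rectangle", "x": 100, "y": 50, "w": 80, "h": 100},
    {"name": "Person B", "shape": "circle", "x": 300, "y": 100, "r": 40},
    {"name": "Person C", "shape": "polygon",
     "vertices": [(400, 50), (480, 50), (440, 150)]},
]

print(names_at(regions, 120, 90))
print(names_at(regions, 310, 110))
print(names_at(regions, 440, 80))
print(names_at(regions, 10, 10))
```

A web page could call something like `names_at` on every mouse move and show the returned names in a text box next to the cursor.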