Investigating photographs
Introduction
Previous articles in this series looked at how newly received files have been filed, how to review their date range and file types, and the specifics of managing Word, Excel and PowerPoint documents. This article discusses the proliferation of photographs, and how to harness the hidden data contained within them (and why you would want to!).
During the construction of a major building in 2000-2001, photographs were taken but not catalogued, with the result that when we went to meet those involved, they were able to provide us with just two photographs. And that was only because they had used them on their office calendar! Eventually they found several thousand more, undeveloped, stored in a box and locked in a filing cabinet.
With the advent of digital photography, not just with digital cameras but also with mobile phones that can take photos, the number of pictures taken has gone up not just 10 times, but more like 100 or 1,000 times. However, this causes its own problems. If you have recently inherited all these images, how do you start to organise them? One way is through their metadata, also known as picture properties.
Picture properties
In traditional analogue photography, the properties of a picture were about the beauty of the set-up: whether the sunset was captured just right, or the pose of the people involved. Assessing them required looking at a picture and making a stylistic judgement.
Nowadays, digital photos keep a lot of information about how, and sometimes where, they were taken. Most cameras and mobile phones store inside each photo things such as:
- The make and model of the camera, sometimes down to the serial number,
- The time that the camera believes the picture was taken. Note that this can be quite different from the (probably wrong) date information shown in Windows Explorer, which may merely be the date the file was loaded onto the computer. It does require the camera to be set up with the correct date; we have found that not to be the case on two projects we have worked on.
- Technical information about how a picture was taken, such as aperture and focus. See this article about the hidden information in iPhone pictures.
- For GPS-enabled cameras and phones, you can even find out where a picture was taken. This can be plotted onto a map, or animated through time to show where photos were taken over the course of a project.
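As a rough illustration of working with these properties once they have been extracted as text, the sketch below parses a timestamp written in the colon-separated form used by EXIF metadata ("YYYY:MM:DD HH:MM:SS") and converts a GPS position from degrees, minutes and seconds into decimal degrees suitable for plotting. The function names and sample values are hypothetical, and reading the metadata out of the image files themselves is assumed to have been done already.

```python
from datetime import datetime


def parse_exif_datetime(value: str) -> datetime:
    """Parse an EXIF-style timestamp such as '2015:06:16 14:03:27'."""
    return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")


def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus an N/S/E/W reference to signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative by convention.
    return -decimal if ref.upper() in ("S", "W") else decimal


# Hypothetical sample values, for illustration only.
taken = parse_exif_datetime("2015:06:16 14:03:27")
latitude = dms_to_decimal(51, 30, 0, "N")
longitude = dms_to_decimal(0, 7, 30, "W")
```

Once the timestamp is a real date value rather than text, it can be compared directly with the date shown in Windows Explorer to see how far the two disagree.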
While stylistic judgements may be useful for individual photographs, the above metadata may be more useful for trying to organise thousands of them. But where to start?
Suggested actions
First of all, I would suggest cataloguing the photos in Filecats Professional or Filecats Metadata with all of the photo metadata showing. This way, you have a permanent record of the pictures, which you can then annotate or email to others, use for quick analyses, or use to open the photos from the spreadsheet.
Secondly, I would identify duplicate photos. It is often the case that an organisation will try to store photos in a logical filing structure (e.g. Date/Location), but other people may download those pictures and then:
- Save them to their own filing structure, and maybe
- Rename the photos, making comparison with filenames useless.
I would therefore suggest that you identify the photo metadata which people are unlikely to have tampered with, and use that to identify duplicates. For example, if two files have identical:
- File size,
- Camera make/model,
- Time taken, according to the camera,
- Dimensions (Width and Height), and
- GPS co-ordinates (if available),
then they almost certainly relate to the same photo, even if the filenames are different. These can be identified quickly by adding an extra column with a formula which combines these values in one cell, and then adding a COUNTIF to see whether the combination appears more than once in the spreadsheet.
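For readers who prefer a script to a spreadsheet, the same combine-and-count idea can be sketched in Python. This is only an illustration: the rows, field names and camera name below are hypothetical, and the metadata is assumed to have been exported from the catalogue already.

```python
from collections import Counter

# Hypothetical catalogue rows; in practice these would come from the spreadsheet.
photos = [
    {"path": r"C:\Photos\2015-06\Site\IMG_0001.jpg", "size": 2048576,
     "camera": "ExampleCam X1", "taken": "2015:06:16 14:03:27",
     "width": 4000, "height": 3000, "gps": "51.5,-0.125"},
    {"path": r"C:\Users\Sam\Pictures\crane lift.jpg", "size": 2048576,
     "camera": "ExampleCam X1", "taken": "2015:06:16 14:03:27",
     "width": 4000, "height": 3000, "gps": "51.5,-0.125"},
    {"path": r"C:\Photos\2015-06\Site\IMG_0002.jpg", "size": 1987654,
     "camera": "ExampleCam X1", "taken": "2015:06:16 14:05:10",
     "width": 4000, "height": 3000, "gps": "51.5,-0.125"},
]

# The properties people are unlikely to have changed.
KEY_FIELDS = ("size", "camera", "taken", "width", "height", "gps")


def combined_key(photo: dict) -> str:
    """Join the chosen properties into one string, like the extra spreadsheet column."""
    return "|".join(str(photo.get(field, "")) for field in KEY_FIELDS)


# Equivalent of COUNTIF: how many times does each combination occur?
counts = Counter(combined_key(p) for p in photos)
duplicates = [p["path"] for p in photos if counts[combined_key(p)] > 1]
```

Here `duplicates` lists both copies of the first photo, even though one of them has been renamed and moved.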
A variant can also be used to identify altered photographs. For example, if the camera is the same and the time taken is the same but the file size is different, then it is likely that one of those pictures has been altered (perhaps someone has added a caption, or annotated it).
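The altered-photograph check can be sketched along the same lines: group the files by camera and time taken, then flag any group containing more than one file size. The rows and field names are again hypothetical. Note that camera timestamps are usually recorded only to the second, so two genuinely different shots taken in quick succession could also be flagged; the result is a list to review rather than proof of alteration.

```python
from collections import defaultdict

# Hypothetical catalogue rows, for illustration only.
photos = [
    {"path": "IMG_0001.jpg", "camera": "ExampleCam X1",
     "taken": "2015:06:16 14:03:27", "size": 2048576},
    {"path": "IMG_0001 with caption.jpg", "camera": "ExampleCam X1",
     "taken": "2015:06:16 14:03:27", "size": 2101234},
    {"path": "IMG_0002.jpg", "camera": "ExampleCam X1",
     "taken": "2015:06:16 14:05:10", "size": 1987654},
]

# Collect the distinct file sizes seen for each (camera, time taken) pair.
sizes_by_shot = defaultdict(set)
for p in photos:
    sizes_by_shot[(p["camera"], p["taken"])].add(p["size"])

# Same camera and time but more than one file size: worth a closer look.
possibly_altered = [
    p["path"] for p in photos
    if len(sizes_by_shot[(p["camera"], p["taken"])]) > 1
]
```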
Identifying duplicates in this way on one project reduced the number of photographs to deal with from 16,000 to 5,800.
Thirdly, I would use the GPS data to identify non-relevant photos. For one project, I found nearly 600 photos which were taken in a different country! These can then be disregarded.
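One way to automate this check, sketched below, is to measure each photo's distance from the project site using the haversine (great-circle) formula and flag anything beyond a chosen radius. The site co-ordinates, the 50 km threshold and the photo list are all assumptions made for illustration; photos with no GPS data are left in for manual review rather than discarded.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in decimal degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Assumed values, for illustration only.
SITE = (51.5, -0.125)   # latitude, longitude of the project site
MAX_DISTANCE_KM = 50    # anything further away is treated as off-site

photos = [
    ("on_site.jpg", 51.501, -0.124),
    ("elsewhere.jpg", 48.8566, 2.3522),
    ("no_gps.jpg", None, None),
]

off_site = [
    name for name, lat, lon in photos
    if lat is not None and lon is not None
    and haversine_km(SITE[0], SITE[1], lat, lon) > MAX_DISTANCE_KM
]
```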
Finally, I would copy them all into one folder, altering each filename to show the photo date and the folder the photo came from. The commands can be built in a spreadsheet using the MS-DOS COPY command, e.g.
COPY "C:\dir1\dir2\myphoto.jpg" "c:\AllPhotos\mydate`dir1`dir2`myphoto.jpg"
This command can easily be generated by a formula in Excel, copied into a batch file, and then executed.
If the date that you have used (“mydate”) is in Japanese format (i.e. yymmdd, e.g. 150616), the filenames sort chronologically, so you should end up with a reduced number of relevant photographs, all in one folder and in date order, which should be easier to use. This folder can then of course be recopied.
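As an alternative to building the commands with an Excel formula, the following Python sketch generates the same style of COPY line for a given photo. The function name and the default target folder are assumptions; the backtick separator and the yymmdd date prefix follow the example above.

```python
from datetime import datetime
from pathlib import PureWindowsPath


def copy_command(source: str, taken: datetime, target_dir: str = r"C:\AllPhotos") -> str:
    """Build a COPY command that flattens the source path into a dated filename."""
    src = PureWindowsPath(source)
    # Everything after the drive root, e.g. ('dir1', 'dir2', 'myphoto.jpg').
    parts = src.parts[1:]
    # A yymmdd prefix makes the filenames sort in date order.
    new_name = "`".join([taken.strftime("%y%m%d"), *parts])
    return f'COPY "{src}" "{PureWindowsPath(target_dir) / new_name}"'


command = copy_command(r"C:\dir1\dir2\myphoto.jpg", datetime(2015, 6, 16, 14, 3, 27))
```

Writing one such line per photo to a text file with a .bat extension gives the batch file described above.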
Summary
Hopefully this article has given you some ideas about how to tame and use several thousand photographs. Good use of photographic document properties is key, and being able to swiftly get them into a spreadsheet is essential.
If you want to tame your photographs, why not download a free 7-day trial of Filecats Professional (for Excel) or Filecats Metadata (if you don’t have Excel)?
The last of this series of articles will show how to manage emails received in their native MSG format.