Metadata is information about data: the kind of information that makes the data easier to find and manage. Several metadata categories are commonly recognized: administrative, statistical, structural, and descriptive. Most digital files carry metadata, including PDF files, songs, videos, and documents.
Yes, PNG files have metadata. It is stored in what are called textual information chunks. Each chunk holds one piece of descriptive data about the image, such as the author, title, description, or copyright notice.
These textual chunks fall under the category of descriptive metadata. Since each type of metadata has its own characteristics, it’s essential to know how PNG metadata presents itself. Once you are familiar with the function of each chunk and its keywords, you will be able to read and identify the metadata of any PNG.
A PNG (Portable Network Graphics) file is an image file format designed for raster, or bitmap, graphics. Its defining characteristics include portability, freedom from legal restrictions, and lossless compression.
A PNG file always begins with the same 8-byte signature, written here in decimal: 137 80 78 71 13 10 26 10. This signature tells you that the rest of the file is a PNG image made up of a series of chunks, starting with IHDR and ending with IEND. Let’s break down these chunks, including the textual information chunks.
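As a minimal sketch of the signature check, the snippet below compares the first eight bytes of a byte string against the decimal values listed above. The names `PNG_SIGNATURE` and `has_png_signature` are illustrative and not part of any library.

```python
# The 8-byte PNG signature, written with the decimal values from the text.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def has_png_signature(data: bytes) -> bool:
    """Return True if the byte string starts with the PNG signature."""
    return data[:8] == PNG_SIGNATURE


print(PNG_SIGNATURE)  # b'\x89PNG\r\n\x1a\n'
print(has_png_signature(PNG_SIGNATURE + b"rest of file"))  # True
print(has_png_signature(b"GIF89a"))  # False
```

To check a real file, you could read its first eight bytes in binary mode and pass them to the same function.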
- The Parts in Each Chunk
- Chunk Types (Critical and Ancillary)
- Textual Information Chunk Keywords
Delving into each of these classifications will give us a full picture of PNG metadata.
A chunk is simply the unit of storage for the data, and every chunk, whether or not it holds metadata, comprises four parts.
- Length (4 Bytes): The number of bytes in the chunk’s data field only; it does not count the length, chunk type, or CRC fields. The length is treated as an unsigned integer, though its value cannot exceed 2^31 - 1 bytes.
- Chunk Type (4 Bytes): A four-letter code consisting of upper- and lower-case ASCII letters. The code is case-sensitive, and the case of each letter encodes a different property of the chunk.
- Chunk Data (Length Bytes): The bytes of data associated with the chunk type. A chunk may carry no data, in which case the length is zero.
- CRC (4 Bytes): A Cyclic Redundancy Check computed over the bytes of the chunk type code and chunk data only, not the length field. The CRC is always present, even for chunks containing no data.
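The four-part layout can be sketched in a few lines of Python. This is an illustrative reader, not a full decoder: `make_chunk` and `iter_chunks` are hypothetical helper names, and the sketch assumes the input is the bytes that follow the 8-byte signature. It relies on `zlib.crc32`, which computes the same CRC-32 that PNG uses.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Serialize one chunk as length + type + data + CRC (big-endian)."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF  # CRC covers type and data only
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(body: bytes):
    """Yield (chunk_type, data, crc_ok) for each chunk in `body`."""
    offset = 0
    while offset < len(body):
        if offset + 12 > len(body):
            raise ValueError("truncated chunk header")
        (length,) = struct.unpack_from(">I", body, offset)
        end = offset + 12 + length  # 4 length + 4 type + data + 4 CRC
        if end > len(body):
            raise ValueError("truncated chunk data")
        chunk_type = body[offset + 4 : offset + 8]
        data = body[offset + 8 : offset + 8 + length]
        (stored_crc,) = struct.unpack_from(">I", body, end - 4)
        crc_ok = stored_crc == (zlib.crc32(chunk_type + data) & 0xFFFFFFFF)
        yield chunk_type, data, crc_ok
        offset = end


# Build two chunks in memory and read them back.
body = make_chunk(b"tEXt", b"Title\x00Demo") + make_chunk(b"IEND")
for chunk_type, data, crc_ok in iter_chunks(body):
    print(chunk_type, len(data), crc_ok)
```

Note that the IEND chunk is 12 bytes long even though its data field is empty, because the length, type, and CRC fields are always present.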
The case of each letter in the four-letter chunk type code carries its own meaning.
- The first letter case indicates the chunk’s subcategory, critical or ancillary. If it is uppercase, it is critical, and if it is lowercase, it is ancillary.
- The second letter case stipulates if the chunk is “public” or “private.” If it is uppercase, it is public, and if it is lowercase, it is private.
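- The third letter case is reserved: it must be uppercase to conform to the current PNG specification. The meaning of a lowercase third letter is not currently defined and is reserved for future use.
- The fourth letter case tells editors that do not recognize the chunk whether it is safe to copy into a modified file. If it is lowercase, the chunk is safe to copy, and if it is uppercase, it is unsafe to copy.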
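The letter-case rules can be expressed as a small lookup. The sketch below is illustrative, with a hypothetical function name `chunk_properties`. It assumes the PNG specification's convention that a lowercase fourth letter marks a chunk as safe to copy and an uppercase one as unsafe.

```python
def chunk_properties(chunk_type: str) -> dict:
    """Decode the four property bits carried by the case of each letter."""
    if len(chunk_type) != 4 or not (chunk_type.isascii() and chunk_type.isalpha()):
        raise ValueError("chunk type must be four ASCII letters")
    return {
        "critical": chunk_type[0].isupper(),      # uppercase = critical, lowercase = ancillary
        "public": chunk_type[1].isupper(),        # uppercase = public, lowercase = private
        "reserved_ok": chunk_type[2].isupper(),   # must be uppercase in current PNG files
        "safe_to_copy": chunk_type[3].islower(),  # lowercase = safe to copy (assumed from the spec)
    }


for name in ("IHDR", "tEXt"):
    print(name, chunk_properties(name))
```

For example, IHDR decodes as critical and public, while tEXt decodes as ancillary, public, and safe to copy.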
Chunks are divided into two main subcategories, critical and ancillary. A critical chunk is required to display the image successfully. Ancillary chunks carry elements such as the background color, the aspect ratio of the image, and the gamma value, none of which are necessary to display the essential content of the image.
The textual information of the image, which is the primary metadata for PNG files, falls under the ancillary subcategory. There are four critical chunks, and they appear in a specific order. You can refer to the four-letter chunk type code to better understand each one.
- IHDR: The image header, which holds the width, height, and bit depth of the image, along with its color type, compression method, filter method, and interlace method.
- PLTE: The list of colors, or the palette. It is required only for indexed-color images.
- IDAT: The data of the image. The data can be split across multiple IDAT chunks, which slightly increases the file size but allows the PNG to be produced incrementally.
- IEND: The end of the image. The data field of this chunk is empty.
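As an illustration of what the IHDR chunk contains, the sketch below unpacks its 13-byte data field with the standard `struct` module. The `ImageHeader` type and `parse_ihdr` function are hypothetical names; the field order (width, height, bit depth, color type, compression method, filter method, interlace method) follows the PNG specification.

```python
import struct
from typing import NamedTuple


class ImageHeader(NamedTuple):
    width: int
    height: int
    bit_depth: int
    color_type: int
    compression_method: int
    filter_method: int
    interlace_method: int


def parse_ihdr(data: bytes) -> ImageHeader:
    """Unpack the 13-byte IHDR data field (all values big-endian)."""
    if len(data) != 13:
        raise ValueError("IHDR data must be exactly 13 bytes")
    # Two 4-byte unsigned integers followed by five single bytes.
    return ImageHeader(*struct.unpack(">IIBBBBB", data))


# A made-up header for a 640x480, 8-bit truecolor (color type 2) image.
sample = struct.pack(">IIBBBBB", 640, 480, 8, 2, 0, 0, 0)
print(parse_ihdr(sample))
```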
If a decoder comes across a critical chunk that it does not recognize, it must stop processing and warn the user, because it cannot safely interpret the image. If it encounters an unrecognized ancillary chunk, it can simply skip that chunk and continue reading the file.
The textual information chunks (tEXt, zTXt, and iTXt) are where you look for the metadata of a PNG file. Each one begins with a keyword that identifies the type of information in the text string that follows.
|Keyword|Description|
|---|---|
|Title|Name associated with the image; caption|
|Author|Name of the creator of the image|
|Description|A written description of the image|
|Copyright|Notice of the legal rights held by the creator or originator of the image|
|Creation Time|The time when the original image was created|
|Software|Software, program, or application used in the creation of the image|
|Disclaimer|Legal disclaimer|
|Warning|Notice of the potentially unpleasant nature of the content|
|Source|The device used in the creation of the image|
|Comment|Miscellaneous comment, or a comment carried over from a GIF conversion|
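To show how a keyword and its text are laid out, here is a minimal sketch that splits the data field of an uncompressed tEXt chunk. It assumes the tEXt layout from the PNG specification: a keyword of 1 to 79 bytes, a single null separator, then the text, both in Latin-1. The function name `parse_text_chunk` is hypothetical, and compressed (zTXt) or international (iTXt) chunks would need extra handling that is not shown.

```python
from typing import Tuple


def parse_text_chunk(data: bytes) -> Tuple[str, str]:
    """Split a tEXt data field into (keyword, text)."""
    keyword, separator, text = data.partition(b"\x00")
    if not separator:
        raise ValueError("missing null separator after keyword")
    if not 1 <= len(keyword) <= 79:
        raise ValueError("keyword must be 1 to 79 bytes long")
    # tEXt keywords and text are Latin-1 (ISO 8859-1) encoded.
    return keyword.decode("latin-1"), text.decode("latin-1")


print(parse_text_chunk(b"Title\x00Sunset over the bay"))
print(parse_text_chunk(b"Software\x00ExampleEditor 1.0"))
```

The keyword values in the example match entries in the table above; the text values are made up for illustration.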
If you convert a JPEG file to a PNG file, one piece of metadata may get lost, since the original PNG specification did not define a place for it. EXIF, or Exchangeable Image File Format, primarily pertains to photographs, and it includes information on the device used to take the picture, the geolocation of the picture, the date and time, and so on.
When you use a phone or digital camera to take a picture, it is typically saved as a JPEG, and the EXIF data is embedded in the file. Once that photo is converted to PNG, the EXIF data often does not carry over. However, you can preserve this metadata in another form before conversion, such as the Extensible Metadata Platform (XMP).
In short, PNG files do carry metadata in ancillary chunks. Although it is not crucial to the display of the image, the textual information chunks provide descriptive data that may be pertinent to anyone looking for it.