Ruminations: Test if uploaded file is JPEG, PNG or TIFF

Friday, April 17, 2009

Test if uploaded file is JPEG, PNG or TIFF

I've been looking at some of the uploads that went wrong on YayArt lately, and it turns out that people sometimes submit images with the wrong extension, e.g. "someimage.png" when it's really a JPEG. This confuses VIPS, the image backend we're using to process large images, so it reports back an error.

I did a bit of googling, and it seems the easiest way out is to simply check the first few bytes of the file for magic numbers. So here's a bit of Python code for checking whether the file data belongs to a JPEG, PNG or TIFF image:

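def is_jpg(data):
    # JPEG files start with the start-of-image marker FF D8
    return data[:2] == b'\xff\xd8'

def is_png(data):
    # the fixed 8-byte PNG signature
    return data[:8] == b'\x89PNG\x0d\x0a\x1a\x0a'

def is_tiff(data):
    # byte order mark ("MM" big-endian or "II" little-endian), then the number 42
    return data[:4] == b'MM\x00\x2a' or data[:4] == b'II\x2a\x00'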

If the file is already on disk, you can grab the first few bytes with

f = open("somefile.jpg", 'rb')  # binary mode, so the bytes come back unmodified
data = f.read(11)
f.close()
if is_jpg(data):
    ext = ".jpg"
elif is_png(data):
    ext = ".png"

Of course this won't test that the whole file is valid. But it's easier to do that afterwards with an image library once the extension is correct.
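If the upload is still a file-like object rather than a file on disk, the same check can be done by peeking at the start of the stream and rewinding. The sketch below assumes a seekable binary stream; the helper name `peek_header` is made up for illustration and `io.BytesIO` stands in for a real upload.

```python
import io

def peek_header(stream, size=8):
    """Read the first `size` bytes of a seekable binary stream, then rewind."""
    position = stream.tell()
    header = stream.read(size)
    stream.seek(position)  # leave the stream where we found it
    return header

# io.BytesIO is a stand-in for an uploaded file object (assumption).
upload = io.BytesIO(b'\x89PNG\x0d\x0a\x1a\x0a' + b'rest of the file')
header = peek_header(upload)
print(header == b'\x89PNG\x0d\x0a\x1a\x0a')  # True
print(upload.tell())  # 0
```

Because the stream is rewound, whatever saves or processes the upload afterwards still sees the whole file.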

The magic numbers are documented in the specifications for the formats. You can also find some help for other formats in the source code of the file command on Unix systems.

Update: I'm liking this so much that I ended up putting it in a separate file and making a convenience function for getting an extension like '.jpg'. Grab the Python file here. I also added support for GIF. Here's another easy reference for magic file numbers.
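A convenience function along those lines might look like the sketch below. This is not the linked file; the name `guess_extension` and the table layout are assumptions made for illustration. The GIF signatures (`GIF87a` and `GIF89a`) are the two version headers defined by the GIF format.

```python
# Hypothetical helper, not the author's actual module:
# a table of (magic number, extension) pairs checked in order.
MAGIC_NUMBERS = [
    (b'\xff\xd8', '.jpg'),
    (b'\x89PNG\x0d\x0a\x1a\x0a', '.png'),
    (b'MM\x00\x2a', '.tif'),
    (b'II\x2a\x00', '.tif'),
    (b'GIF87a', '.gif'),
    (b'GIF89a', '.gif'),
]

def guess_extension(data):
    """Return an extension such as '.jpg' for the leading bytes, or None."""
    for magic, ext in MAGIC_NUMBERS:
        if data.startswith(magic):
            return ext
    return None
```

Returning None for unknown data lets the caller decide whether to reject the upload or fall back to the extension the user supplied.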

Second update: I've updated the code; there was a bug detecting JPEGs from certain digital cameras that put Exif data in the first segment. It suffices to check the first two bytes of the JPEG, and then the problem does not occur.
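To illustrate the kind of bug described here: a check that also demands the `JFIF` identifier after the first marker rejects files whose first segment is Exif. The function `is_jpg_strict` below is only a guess at what such an over-strict check could look like, not the original code, and the two headers are hand-built samples.

```python
def is_jpg_strict(data):
    # Assumed shape of an over-strict check: SOI marker plus a JFIF identifier.
    return data[:2] == b'\xff\xd8' and data[6:11] == b'JFIF\x00'

def is_jpg(data):
    # Only the start-of-image marker, which every JPEG begins with.
    return data[:2] == b'\xff\xd8'

# Hand-built sample headers: SOI, segment marker, 2-byte length, identifier.
jfif_header = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
exif_header = b'\xff\xd8\xff\xe1\x00\x10Exif\x00\x00'

print(is_jpg_strict(jfif_header), is_jpg_strict(exif_header))  # True False
print(is_jpg(jfif_header), is_jpg(exif_header))                # True True
```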

Comments:

Malte Nuhn, June 4, 2009 at 12:23 PM
I guess one shouldn't reinvent the wheel... why not use the "file" command for this?

> file IMG_0019.JPG

IMG_0019.JPG: JPEG image data, EXIF standard

os.system("...") should do the work
Ole Laursen, June 5, 2009 at 12:45 PM
Well, file is neat and I did consider it, but it won't work unless you already have the file on disk (despite my example, I would like to get the name right before I write it) and it's harder to reason about (can file crash? what can go wrong when you use os.system?).

Also, using os.system in a web app is a bit scary: you have to double-check that no user-entered data can ever end up in the command, at least not unescaped.

So that's why. :)

I recently found out you can feed data in chunks to the Python Imaging Library, so another possibility is to feed it one chunk and see what happens.
Unknown, July 3, 2009 at 6:23 AM
Thank you, Ole, for a very useful piece of code! It'll be in the next version of sqlpython to allow browsing of image BLOBs straight from the database.

As for UNIX's `file`... that's nice, but I don't believe it exists on Windows, so it's no use for a cross-platform app!
Ole Laursen, July 3, 2009 at 11:54 AM
Cool! Glad you can use it. :)
rs238, August 11, 2009 at 2:10 AM
Just what I was looking for, thanks for sharing.
Toni Lähdekorpi, September 17, 2010 at 6:28 AM
You don't actually need to write the file to disk; you can just pipe it through `| file -`
Ole Laursen, September 17, 2010 at 3:49 PM
Toni: that's an interesting idea. Here's a little snippet for doing it in Python:

import subprocess
f = open("test.jpg", 'rb')  # binary mode
data = f.read(11)
f.close()
# stdout must be a pipe too, otherwise communicate() returns None for the output
p = subprocess.Popen(["/usr/bin/file", "--mime-type", "-b", "-"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
print(p.communicate(data)[0].decode().strip())
# outputs "image/jpeg"
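On newer Python versions the same idea could be wrapped up roughly as follows. This is only a sketch: the function name `mime_type_via_file` is made up, and it assumes a `file` binary on the PATH that accepts `--mime-type` and `-b`; if none is found it returns None.

```python
import shutil
import subprocess

def mime_type_via_file(data):
    """Pipe bytes to `file -` on stdin; return None if `file` is not installed."""
    file_cmd = shutil.which("file")  # assumption: find the binary on PATH
    if file_cmd is None:
        return None
    # An argument list (no shell) means the data is never parsed as a command.
    result = subprocess.run(
        [file_cmd, "--mime-type", "-b", "-"],
        input=data,
        stdout=subprocess.PIPE,
    )
    return result.stdout.decode().strip()
```

Since the image data only travels over stdin, nothing the user uploaded ends up in the command line itself.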
Joel Parker Henderson, December 3, 2010 at 1:18 AM
Translations for Ruby:

def jpg?(data)
  # compare byte values so the string encoding doesn't matter
  data[0, 2].bytes.to_a == [0xff, 0xd8]
end

To read a file from disk:

f = File.open(filename, 'rb') # read binary
data = f.read(11)
f.close
if jpg?(data)
  ext = ".jpg"
end

More magic numbers are listed at http://www.astro.keele.ac.uk/oldusers/rno/Computing/File_magic.html
