Smart File Type Detection Using PHP

Posted on 9th February 2009 with 28 Comments

In most web applications today, there is a need to allow users to upload images, audio and video files. Sometimes, we also need to restrict certain types of files from being uploaded, an executable file being an obvious example.

Security aside, one might also want to prevent users from misusing the upload facility, e.g. uploading copyrighted music files illegally and using the service to promote piracy! In this article, we’ll look into a few ways in which we can achieve this.

File type detection using extension and MIME types

I am not going to talk about this in too much detail as after all, this is what we normally do when we want to restrict certain files. We simply get the MIME type of the file using $_FILES['myFile']['type'] and check if it’s of a valid type.

Or we might check the extension at the end of the file name and reject files with a disallowed one. Unfortunately, these methods are hardly sufficient, as anyone can change the extension of a file to bypass the restriction. Furthermore, the MIME type in $_FILES is supplied by the browser, and most browsers, if not all, determine the MIME type from the file’s extension. Hence MIME types can be spoofed just as easily.
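To make the weakness concrete, here is a minimal sketch of the extension check described above. The helper name has_allowed_extension() and the list of extensions are assumptions made for illustration, not part of any library.

```php
<?php
// Hypothetical helper: the naive extension check described in the text.
function has_allowed_extension(string $filename, array $allowed): bool
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return in_array($ext, $allowed, true);
}

$allowed = ['jpg', 'jpeg', 'png', 'gif'];

var_dump(has_allowed_extension('holiday.JPG', $allowed));   // bool(true)
var_dump(has_allowed_extension('setup.exe', $allowed));     // bool(false)
// Only the name is inspected, so a renamed executable passes:
var_dump(has_allowed_extension('setup.exe.jpg', $allowed)); // bool(true)
```

The last call is the problem in a nutshell: the check never looks at the file's contents.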

Let’s now explore some other ways that are harder to fool.

Using Magic Bytes

The best way to determine the file type is by examining the first few bytes of a file, referred to as “magic bytes”. Magic bytes are essentially signatures, ranging from 2 to 40 bytes in length, found in the file header or at the end of a file. There are several hundred types of files, and quite a few of them have several file signatures associated with them. You can see a list of file signatures over here.
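As a quick illustration, the sketch below prints the first bytes of a file as hex so they can be compared against a signature list. The helper read_magic_bytes() is a name made up for this example, and the demo file is generated on the fly with the 8-byte PNG signature.

```php
<?php
// Hypothetical helper: return the first $length bytes of a file as uppercase hex pairs.
function read_magic_bytes(string $path, int $length = 8): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $bytes = (string) fread($handle, $length);
    fclose($handle);
    return strtoupper(implode(' ', str_split(bin2hex($bytes), 2)));
}

// Demo: write a file that starts with the PNG signature, then read it back.
$demo = tempnam(sys_get_temp_dir(), 'magic');
file_put_contents($demo, "\x89PNG\r\n\x1a\n" . 'rest of the file');
echo read_magic_bytes($demo), "\n"; // 89 50 4E 47 0D 0A 1A 0A
unlink($demo);
```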

Although file signatures are not perfectly consistent, examining them is our best bet for detecting file types reliably. This seemingly difficult task has been made really easy by a PECL extension called Fileinfo. As of PHP 5.3, Fileinfo is shipped with the main distribution and is enabled by default, so this is definitely a robust and simple way to detect and impose restrictions on the types of files uploaded.

Let’s now see how we can detect a file type using Fileinfo:

$file = "/path/to/file";

// procedural style:
// FILEINFO_MIME_TYPE (PHP 5.3+) returns only the type; FILEINFO_MIME may also append "; charset=..."
$fhandle = finfo_open(FILEINFO_MIME_TYPE);
$mime_type = finfo_file($fhandle, $file); // e.g. gives "image/jpeg"

// object oriented style:
$file_info = new finfo(FILEINFO_MIME_TYPE);
$mime_type = $file_info->file($file);  // e.g. gives "image/jpeg"

switch ($mime_type) {
	case "image/jpeg":
		// your actions go here...
		break;
	default:
		// reject everything you have not explicitly allowed
}
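Depending on the PHP version and the flag used, Fileinfo can return a string such as "application/pdf; charset=binary", which breaks an exact comparison. Here is a minimal sketch of one way to cope with that: strip the parameters, then check against a list of allowed types. The helpers normalize_mime() and is_allowed_type() are hypothetical names used only for this example.

```php
<?php
// Hypothetical helper: drop any "; charset=..." parameter from a MIME string.
function normalize_mime(string $mime): string
{
    return strtolower(trim(explode(';', $mime)[0]));
}

// Hypothetical helper: detect the type from the file contents and check an allowlist.
function is_allowed_type(string $path, array $allowed): bool
{
    $finfo = new finfo(FILEINFO_MIME); // may yield e.g. "application/pdf; charset=binary"
    $detected = $finfo->file($path);
    return $detected !== false
        && in_array(normalize_mime($detected), $allowed, true);
}

echo normalize_mime('application/pdf; charset=binary'), "\n"; // application/pdf
```

In an upload handler this could be called as is_allowed_type($_FILES['myFile']['tmp_name'], array('image/jpeg', 'application/pdf')). Listing what you accept is usually safer than listing what you reject.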

Handling image uploads

If you intend to allow only image uploads, then you can use the built-in getimagesize() function to check that the user is actually uploading a valid image file. This function returns false if the file is not a valid image file.

// Let's assume that the name attribute of the file input field you have used is "myFile"

$tempFile = $_FILES['myFile']['tmp_name'];  // path of the temp file created by PHP during upload
$imginfo_array = getimagesize($tempFile);   // returns false if not a valid image file

if ($imginfo_array !== false) {
    $mime_type = $imginfo_array['mime'];
    switch ($mime_type) {
        case "image/jpeg":
            // your actions go here...
            break;
    }
}
else {
    echo "This is not a valid image file";
}
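The same idea can be wrapped in a small function that accepts only a fixed set of image formats. This is a sketch under assumptions: detect_allowed_image() is a hypothetical helper, and the three accepted formats are just an example. It relies on index 2 of the array returned by getimagesize(), which holds one of PHP's IMAGETYPE_* constants.

```php
<?php
// Hypothetical helper: return the detected image MIME type, or null if the
// file is not one of the accepted image formats.
function detect_allowed_image(string $path): ?string
{
    $info = @getimagesize($path); // false when the header is not a known image format
    if ($info === false) {
        return null;
    }
    $allowed = [IMAGETYPE_JPEG, IMAGETYPE_PNG, IMAGETYPE_GIF];
    return in_array($info[2], $allowed, true) ? $info['mime'] : null;
}

// In an upload handler:
// $mime_type = detect_allowed_image($_FILES['myFile']['tmp_name']);
// if ($mime_type === null) { echo "This is not a valid image file"; }
```

Keep in mind that getimagesize() only inspects the image header, so a file that passes can still carry other data after it.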

Reading and interpreting magic bytes manually

If for some reason you are not able to install Fileinfo, you can still determine the file type manually by reading the first few bytes of a file and comparing them with the known magic bytes for the file types you want to accept. This process definitely has an element of trial and error, because legitimate file formats may use a few undocumented magic bytes, and valid files could then be rejected by your system.

It’s not impossible, though. A couple of years back, I was asked to work on a script that allowed only genuine mp3 files to be uploaded, and since we could not use Fileinfo, we resorted to this manual scanning. It took me a while to account for some of the undocumented magic bytes for mp3, but pretty soon I had a stable upload script running.
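To give a feel for what such manual scanning looks like, here is a simplified sketch for mp3. It is not the script mentioned above. It assumes a file counts as mp3 if it starts with an ID3v2 tag (the bytes 49 44 33, i.e. "ID3") or with an MPEG audio frame sync (eleven set bits). The function names are hypothetical, and a heuristic this short will both miss some valid files and accept some invalid ones (UTF-16 text starting with FF FE, for example).

```php
<?php
// Hypothetical helper: simplified mp3 check on the first bytes of a file.
function looks_like_mp3(string $header): bool
{
    // ID3v2 tag at the start of the file.
    if (strncmp($header, 'ID3', 3) === 0) {
        return true;
    }
    // MPEG audio frame sync: 0xFF followed by a byte whose top three bits are set.
    return strlen($header) >= 2
        && ord($header[0]) === 0xFF
        && (ord($header[1]) & 0xE0) === 0xE0;
}

// Hypothetical helper: read only the first four bytes from disk and test them.
function file_looks_like_mp3(string $path): bool
{
    $header = file_get_contents($path, false, null, 0, 4);
    return $header !== false && looks_like_mp3($header);
}

var_dump(looks_like_mp3("ID3\x03\x00"));  // bool(true)  ID3v2 tag
var_dump(looks_like_mp3("\xFF\xFB\x90")); // bool(true)  frame sync
var_dump(looks_like_mp3("MZ\x90\x00"));   // bool(false) Windows executable header
```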

Before I end, I would just like to part with a general word of caution: Make sure that you never call an include() with a file that was uploaded, as PHP code can very well be hidden as part of the picture, and the picture would pass your tests for file validation just fine, only to cause havoc when executed by the server.
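One way to reduce that risk is to never reuse the client's file name and to store uploads under a server-chosen name with a fixed, harmless extension. The sketch below assumes PHP 7 or later (for random_bytes()). The helper safe_upload_name(), the extension map and the /var/uploads/ path are illustrative assumptions only.

```php
<?php
// Hypothetical helper: build a server-chosen file name for an upload whose
// content type has already been detected. Returns null for types not on the list.
function safe_upload_name(string $mimeType): ?string
{
    $extensions = [
        'image/jpeg' => 'jpg',
        'image/png'  => 'png',
        'image/gif'  => 'gif',
    ];
    if (!isset($extensions[$mimeType])) {
        return null;
    }
    // The random name discards whatever the client called the file.
    return bin2hex(random_bytes(16)) . '.' . $extensions[$mimeType];
}

// In an upload handler (the target directory is an assumption):
// $name = safe_upload_name($mime_type);
// if ($name !== null) {
//     move_uploaded_file($_FILES['myFile']['tmp_name'], '/var/uploads/' . $name);
// }
```

This does not replace a sensible server configuration. The upload directory should still be one from which the server never executes scripts.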

Comments & Discussion

28 Comments

  • http://www.kavoir.com Yang Yang

    For images, a simpler way might be using getimagesize() which returns an array containing the mime type of the image.

  • Andy

    Useful tutorial…nice and clear with code thanks.

  • http://www.pliggs.com Pliggs

    This is great, I am looking for a snippet like this to ensure that the uploads are images for an upcoming project, this may do the trick nicely.

    Thanks.

  • Paul-H

    Useful tutorial in fact. Thank’s for all :)

  • Developer

    Really nice article..
    It shows the depth of the knowledge..

    Thanks a lot

  • http://alexwoz.com Alex

    Not sure if this is typo or not (not the most knowledgeable w/ PHP), but…

    $tmpFile = $_FILES['myFile']['tmp_name']; // path of the temp file created by PHP during upload
    $imginfo_array = getimagesize($tempFile); // returns a false if not a valid image file

    getimagesize uses “tempFile” as its argument, while “tmpFile” was initialized above. Shouldn’t getimagesize use “tmpFile”?

  • http://www.kishorelive.com Kishore Nallan

    @Alex: Thanks for that, typo fixed!

  • http://slidex.co.il Itay

    Great post, just what I was looking for regarding file types.

    Thanks.

  • Alejandro

    mime_type returns

    “application/pdf; charset=binary”

    so the comparison won’t work.

    how can I fix this?

  • brandon

    failed:

    $file_info = new finfo(FILEINFO_MIME); // object oriented approach!
    $mime_type = $file_info->buffer(file_get_contents($file));

    yields:

    Fatal error: Class ‘finfo’ not found

  • aris

    Same problem as brandon.
    Where can we find this finfo class and include it?

  • milly

    Thank you SO much for this. I’ve been messing with mime_content_type() and a bunch of other finicky methods, but this worked perfectly. Just what I needed.

  • http://www.yahoo.com/ Destrey

    That’s 2 cveler by half and 2×2 clever 4 me. Thanks!

  • Dejan

    Hi

    Nice tutorial.
    I wonder, if possible, to post the code regarding magic bytes for mp3, since I’m facing the same problem.
    I can see from one of the links in your post that the magic number for mp3 is ’49 44 33′, but still want to know if this is a reliable way to get the mime type.

    Regards

  • http://www.wesayhowhigh.com Jump

    Love the OOP way of finding the mime, never used that before. Cheers!

  • http://ihacklog.com 荒野无灯

    Good article, I’ve translated it on my blog: http://ihacklog.com/?p=4693

  • Radiation

    Bah! Don’t use getimagesize() to validate images! It is very easy to get php code through getimagesize(). The best way to ensure that images are safe is to place them in a directory that doesn’t have executable permissions. And don’t ever! include() them.

  • http://www.shareguay.com sebastian

    Great :)

  • fkabeer

    I renamed an exe file with a jpg extension, then tried your code, and it gave me jpg. The extension cannot be trusted because any user can cheat with it. Please do not consider my words discouraging; they should make you work even better.

  • Aneeq

    The code to upload a file in PHP is very simple, but we need to understand the flow, which is as below.

    1. Browse the file from a local system
    2. Upload to server
    3. Server keeps it on a temporary path
    4. Copy from temporary to permanent path

    Create a file upload form

    Filename:

    Note:

    * An enctype attribute of the form tag has been specified.
    * This attribute specifies which content-type to use when submitting the form.
    * We have used “multipart/form-data” so that binary data, like the contents of a file, can be uploaded.
    * If the proper enctype is not provided, the upload will not work.
    * File upload is a huge security risk, so you must check what type of files are being uploaded.

    Create a file upload script (upload-file.php)

    This will upload the file to the specified path.

    Note:

    * The default file upload size limit is usually 2MB, so files larger than this may not upload. You will have to alter the file upload setting on the server.
    * You need to set write permission on the folder where the file is to be uploaded.
    * In our case, the “uploads” folder needs to have a 777 permission on a linux/unix server.

    Source:

    http://phphelp.co/2012/05/18/how-to-upload-a-file-in-php/

    OR

    http://addr.pk/a478

  • Mike M

    Uploading a .doc file returns application/msword, whereas a .docx file returns type application/zip.

  • Ajit

    Good for images. But what to do for pdf? I can upload an exe file by renaming it as pdf.

  • Bob

    OK, so say you do upload an exe file by renaming it as .pdf. It’s going to show .pdf on the server, so if you try to access it from a browser, it won’t work because it’s not a valid pdf – and it won’t execute the exe file because it’s not an exe file. So am I missing something here? I’m not sure I understand the concern and how the file would be dangerous with the wrong extension.

  • http://astaza.com/ العاب

    Thank you very much, but how do I detect whether the uploaded file is a video file?
    Thanks
