file-type
Detect the file type of a Buffer/Uint8Array/ArrayBuffer
The file type is detected by checking the magic number of the buffer.
This package is for detecting binary-based file formats, not text-based formats like .txt, .csv, .svg, etc.
We accept contributions for commonly used modern file formats, not historical or obscure ones. Open an issue first for discussion.
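To make the idea of a magic number concrete, here is a minimal sketch of signature matching. It is not how file-type is implemented: the `SIGNATURES` table and the `sniffMagicNumber` function are hypothetical names made up for this illustration, and the table covers only four well-known signatures.

```javascript
// Hypothetical, simplified signature table. The real package covers far more
// formats and handles many more edge cases.
const SIGNATURES = [
	{ext: 'png', mime: 'image/png', bytes: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]},
	{ext: 'jpg', mime: 'image/jpeg', bytes: [0xFF, 0xD8, 0xFF]},
	{ext: 'gif', mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38]}, // "GIF8"
	{ext: 'pdf', mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46]}, // "%PDF"
];

// Returns {ext, mime} for the first signature that matches the start of `data`,
// or undefined when none matches.
function sniffMagicNumber(data) {
	for (const {ext, mime, bytes} of SIGNATURES) {
		if (data.length >= bytes.length && bytes.every((byte, index) => data[index] === byte)) {
			return {ext, mime};
		}
	}

	return undefined;
}

const pngHeader = new Uint8Array([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00]);
console.log(sniffMagicNumber(pngHeader));
//=> {ext: 'png', mime: 'image/png'}
console.log(sniffMagicNumber(new TextEncoder().encode('hello')));
//=> undefined
```

Plain text has no such signature, which is why text-based formats fall outside the scope of this kind of detection.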
Install
npm install file-type
This package is an ESM package. Your project needs to be ESM too. Read more.
If you use it with Webpack, you need the latest Webpack version and must make sure it is configured correctly for ESM.
Usage
Node.js
Determine file type from a file:
import {fileTypeFromFile} from 'file-type';
console.log(await fileTypeFromFile('Unicorn.png'));
//=> {ext: 'png', mime: 'image/png'}
Determine file type from a Buffer, which may be a portion of the beginning of a file:
import {fileTypeFromBuffer} from 'file-type';
import {readChunk} from 'read-chunk';
const buffer = await readChunk('Unicorn.png', {length: 4100});
console.log(await fileTypeFromBuffer(buffer));
//=> {ext: 'png', mime: 'image/png'}
Determine file type from a stream:
import fs from 'node:fs';
import {fileTypeFromStream} from 'file-type';
const stream = fs.createReadStream('Unicorn.mp4');
console.log(await fileTypeFromStream(stream));
//=> {ext: 'mp4', mime: 'video/mp4'}
The stream method can also be used to read from a remote location:
import got from 'got';
import {fileTypeFromStream} from 'file-type';
const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';
const stream = got.stream(url);
console.log(await fileTypeFromStream(stream));
//=> {ext: 'jpg', mime: 'image/jpeg'}
Another stream example:
import fs from 'node:fs';
import crypto from 'node:crypto';
import {fileTypeStream} from 'file-type';
const read = fs.createReadStream('encrypted.enc');
// alg, key, and iv are placeholders for your own cipher parameters.
const decipher = crypto.createDecipheriv(alg, key, iv);
// pipe() returns the destination stream, so the decrypted output is what gets inspected.
const streamWithFileType = await fileTypeStream(read.pipe(decipher));
console.log(streamWithFileType.fileType);
//=> {ext: 'mov', mime: 'video/quicktime'}
const write = fs.createWriteStream(`decrypted.${streamWithFileType.fileType.ext}`);
streamWithFileType.pipe(write);
Browser
import {fileTypeFromStream} from 'file-type';
const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';
const response = await fetch(url);
const fileType = await fileTypeFromStream(response.body);
console.log(fileType);
//=> {ext: 'jpg', mime: 'image/jpeg'}
API
fileTypeFromBuffer(buffer)
Detect the file type of a Buffer, Uint8Array, or ArrayBuffer.
The file type is detected by checking the magic number of the buffer.
If file access is available, it is recommended to use fileTypeFromFile() instead.
Returns a Promise for an object with the detected file type and MIME type:
- ext - One of the supported file types
- mime - The MIME type
Or undefined when there is no match.
buffer
Type: Buffer | Uint8Array | ArrayBuffer
A buffer representing file data. It works best if the buffer contains the entire file, but it may also work with a smaller portion from the beginning of the file.
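The sketch below shows one way to treat the three accepted input types uniformly and to keep only a leading portion of the data before detection. `toUint8Array` and `leadingBytes` are hypothetical helpers written for this illustration and are not exports of this package. The default of 4100 bytes simply mirrors the chunk length used in the usage example above.

```javascript
// Hypothetical helper, not part of file-type: view any of the accepted input
// types as a Uint8Array without copying the underlying memory.
function toUint8Array(input) {
	if (input instanceof ArrayBuffer) {
		return new Uint8Array(input);
	}

	if (ArrayBuffer.isView(input)) {
		// Covers Uint8Array and Node.js Buffer (a Uint8Array subclass).
		return new Uint8Array(input.buffer, input.byteOffset, input.byteLength);
	}

	throw new TypeError('Expected a Buffer, Uint8Array, or ArrayBuffer');
}

// Keep at most `length` leading bytes of the input.
function leadingBytes(input, length = 4100) {
	return toUint8Array(input).subarray(0, length);
}

const wholeFile = new Uint8Array(10_000);
console.log(leadingBytes(wholeFile).length);
//=> 4100
console.log(leadingBytes(new ArrayBuffer(16)).length);
//=> 16
```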
fileTypeFromFile(filePath)
Detect the file type of a file path.
The file type is detected by checking the magic number of the buffer.
Returns a Promise for an object with the detected file type and MIME type:
- ext - One of the supported file types
- mime - The MIME type
Or undefined when there is no match.
filePath
Type: string
The file path to parse.
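Because the promise resolves to undefined when nothing matches, callers usually want a fallback. The following is a small sketch of that pattern. `contentTypeFor` and the `application/octet-stream` default are illustrative choices, not part of this package, and `result` stands in for whatever a detection call resolved to.

```javascript
// `result` is either an {ext, mime} object or undefined.
// The fallback value is an assumption made for this example.
function contentTypeFor(result, fallback = 'application/octet-stream') {
	return result?.mime ?? fallback;
}

console.log(contentTypeFor({ext: 'png', mime: 'image/png'}));
//=> 'image/png'
console.log(contentTypeFor(undefined));
//=> 'application/octet-stream'
```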
fileTypeFromStream(stream)
Detect the file type of a Node.js readable stream.
The file type is detected by checking the magic number of the buffer.
Returns a Promise for an object with the detected file type and MIME type:
- ext - One of the supported file types
- mime - The MIME type
Or undefined when there is no match.
stream
Type: stream.Readable
A readable stream representing file data.
fileTypeFromBlob(blob)
Detect the file type of a Blob.
The file type is detected by checking the magic number of the buffer.
Returns a Promise for an object with the detected file type and MIME type:
- ext - One of the supported file types
- mime - The MIME type
Or undefined when there is no match.
import {fileTypeFromBlob} from 'file-type';
const blob = new Blob(['<?xml version="1.0" encoding="ISO-8859-1" ?>'], {
	type: 'plain/text',
	endings: 'native'
});
console.log(await fileTypeFromBlob(blob));
//=> {ext: 'xml', mime: 'application/xml'}
fileTypeFromTokenizer(tokenizer)
Detect the file type from an ITokenizer source.
This method is used internally, but can also be used for a special "tokenizer" reader.
A tokenizer exposes the internal read functions, which allows alternative transport mechanisms for accessing files to be implemented and used.
Returns a Promise for an object with the detected file type and MIME type:
- ext - One of the supported file types
- mime - The MIME type
Or undefined when there is no match.
An example is @tokenizer/http, which requests data using HTTP range requests. Unlike a conventional stream, a tokenizer can skip over (seek or fast-forward through) parts of the data. For example, you may only need to read the first 6 bytes and the last 128 bytes, which can be an advantage when reading the entire file would take longer.
import {makeTokenizer} from '@tokenizer/http';
import {fileTypeFromTokenizer} from 'file-type';
const audioTrackUrl = 'https://test-audio.netlify.com/Various%20Artists%20-%202009%20-%20netBloc%20Vol%2024_%20tiuqottigeloot%20%5BMP3-V2%5D/01%20-%20Diablo%20Swing%20Orchestra%20-%20Heroines.mp3';
const httpTokenizer = await makeTokenizer(audioTrackUrl);
const fileType = await fileTypeFromTokenizer(httpTokenizer);
console.log(fileType);
//=> {ext: 'mp3', mime: 'audio/mpeg'}
Or use @tokenizer/s3 to determine the file type of a file stored on Amazon S3:
import S3 from 'aws-sdk/clients/s3';
import {makeTokenizer} from '@tokenizer/s3';
import {fileTypeFromTokenizer} from 'file-type';
// Initialize the S3 client
const s3 = new S3();
// Initialize the S3 tokenizer.
const s3Tokenizer = await makeTokenizer(s3, {
	Bucket: 'affectlab',
	Key: '1min_35sec.mp4'
});
// Figure out what kind of file it is.
const fileType = await fileTypeFromTokenizer(s3Tokenizer);
console.log(fileType);
Note that only the minimum amount of data required to determine the file type is read (okay, just a bit extra to prevent too many fragmented reads).
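The benefit of random access can be sketched with a tiny in-memory stand-in. This is not the ITokenizer interface: `RangeSource` and `inspectEnds` are hypothetical names for this illustration. The example reads only the first 6 and the last 128 bytes of a 1 MB buffer, using the ID3 tags of MP3 files as the motivating case (an ID3v2 tag starts with "ID3" and an ID3v1 tag occupies the final 128 bytes, starting with "TAG").

```javascript
// Hypothetical random-access source over an in-memory byte array. A real
// tokenizer would fetch each range from its transport (for example with HTTP
// range requests); this stand-in only counts how many bytes were requested.
class RangeSource {
	constructor(bytes) {
		this.bytes = bytes;
		this.bytesRead = 0;
	}

	get size() {
		return this.bytes.length;
	}

	read(position, length) {
		const chunk = this.bytes.subarray(position, position + length);
		this.bytesRead += chunk.length;
		return chunk;
	}
}

// Inspect only the first 6 and the last 128 bytes, skipping everything between.
function inspectEnds(source) {
	const head = source.read(0, 6);
	const tail = source.read(Math.max(0, source.size - 128), 128);
	const ascii = bytes => String.fromCharCode(...bytes);
	return {
		startsWithId3v2: ascii(head.subarray(0, 3)) === 'ID3',
		endsWithId3v1: ascii(tail.subarray(0, 3)) === 'TAG',
	};
}

// Build a fake 1 MB "file" with an ID3v2 header and an ID3v1 trailer.
const fakeMp3 = new Uint8Array(1_000_000);
fakeMp3.set(new TextEncoder().encode('ID3'), 0);
fakeMp3.set(new TextEncoder().encode('TAG'), fakeMp3.length - 128);

const source = new RangeSource(fakeMp3);
console.log(inspectEnds(source));
//=> {startsWithId3v2: true, endsWithId3v1: true}
console.log(source.bytesRead);
//=> 134
```

A sequential stream would have to consume the whole megabyte to reach the trailer, while the ranged reads touch 134 bytes.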
tokenizer
Type: ITokenizer
A file source implementing the tokenizer interface.
fileTypeStream(readableStream, options?)
Returns a Promise which resolves to the original readable stream argument, but with an added fileType property, which is an object like the one returned from fileTypeFromFile().
This method is handy for inserting into the middle of a stream pipeline, but it comes at a price.
Internally, fileTypeStream() builds up a buffer of sampleSize bytes, which is used as a sample to determine the file type.
The sample size impacts the file detection resolution.
A smaller sample size lowers the probability of detecting the correct file type.
Note: This method is only available when using Node.js.
Note: Requires Node.js 14 or later.
readableStream
Type: stream.Readable
The input stream.
options
Type: object
sampleSize
Type: number
Default: 4100
The sample size in bytes.
Example
import got from 'got';
import {fileTypeStream} from 'file-type';
const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';
const stream1 = got.stream(url);
const stream2 = await fileTypeStream(stream1, {sampleSize: 1024});
if (stream2.fileType?.mime === 'image/jpeg') {
	// stream2 can be used to stream the JPEG image (from the very beginning of the stream)
}
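To see why the sample size matters, consider a format whose signature does not sit at the very start of the file. The sketch below is an illustration and not this package's implementation: `looksLikeTar` is a hypothetical function that relies on the fact that a POSIX tar archive carries the ASCII string "ustar" at byte offset 257 of its first header.

```javascript
// Illustrative check: look for "ustar" at offset 257 of the sample.
const USTAR_OFFSET = 257;
const USTAR_MAGIC = new TextEncoder().encode('ustar');

function looksLikeTar(sample) {
	if (sample.length < USTAR_OFFSET + USTAR_MAGIC.length) {
		return false; // The sample ends before the signature could be read.
	}

	return USTAR_MAGIC.every((byte, index) => sample[USTAR_OFFSET + index] === byte);
}

// A fake 512-byte tar header that contains only the magic string.
const tarHeader = new Uint8Array(512);
tarHeader.set(USTAR_MAGIC, USTAR_OFFSET);

console.log(looksLikeTar(tarHeader.subarray(0, 4100))); // Default-sized sample
//=> true
console.log(looksLikeTar(tarHeader.subarray(0, 256))); // Sample is too small
//=> false
```

The same data yields a match or no match depending only on how many leading bytes were sampled.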
supportedExtensions
Returns a Set<string> of supported file extensions.
supportedMimeTypes
Returns a Set<string> of supported MIME types.
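One possible use of such a set is to decide up front whether content detection can say anything about a given file name. The sketch below uses a small stand-in set so that it runs on its own. `supportedExtensionsSample` and `isDetectableExtension` are hypothetical names, and in real code the stand-in would be replaced by the exported supportedExtensions.

```javascript
// Stand-in for the exported set, reduced to a few entries for this example.
const supportedExtensionsSample = new Set(['png', 'jpg', 'pdf', 'zip']);

// Hypothetical helper: is the extension of `fileName` one of the supported ones?
function isDetectableExtension(fileName, supported = supportedExtensionsSample) {
	const dotIndex = fileName.lastIndexOf('.');
	if (dotIndex <= 0) {
		return false; // No extension, or a dotfile such as ".gitignore".
	}

	const extension = fileName.slice(dotIndex + 1);
	// Try the exact spelling first, because the list contains a case-sensitive
	// entry (Z), then fall back to the lowercase form.
	return supported.has(extension) || supported.has(extension.toLowerCase());
}

console.log(isDetectableExtension('Unicorn.PNG'));
//=> true
console.log(isDetectableExtension('notes.txt'));
//=> false
```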
Supported file types
- 3g2 - Multimedia container format defined by the 3GPP2 for 3G CDMA2000 multimedia services
- 3gp - Multimedia container format defined by the Third Generation Partnership Project (3GPP) for 3G UMTS multimedia services
- 3mf - 3D Manufacturing Format
- 7z - 7-Zip archive
- Z - Unix Compressed File
- aac - Advanced Audio Coding
- ac3 - ATSC A/52 Audio File
- ace - ACE archive
- ai - Adobe Illustrator Artwork
- aif - Audio Interchange file
- alias - macOS Alias file
- amr - Adaptive Multi-Rate audio codec
- ape - Monkey's Audio
- apng - Animated Portable Network Graphics
- ar - Archive file
- arj - Archive file
- arrow - Columnar format for tables of data
- arw - Sony Alpha Raw image file
- asar - Archive format primarily used to enclose Electron applications
- asf - Advanced Systems Format
- avi - Audio Video Interleave file
- avif - AV1 Image File Format
- avro - Object container file developed by Apache Avro
- blend - Blender project
- bmp - Bitmap image file
- bpg - Better Portable Graphics file
- bz2 - Archive file
- cab - Cabinet file
- cfb - Compound File Binary Format
- chm - Microsoft Compiled HTML Help
- class - Java class file
- cpio - Cpio archive
- cr2 - Canon Raw image file (v2)
- cr3 - Canon Raw image file (v3)
- crx - Google Chrome extension
- cur - Icon file
- dcm - DICOM Image File
- deb - Debian package
- dmg - Apple Disk Image
- dng - Adobe Digital Negative image file
- docx - Microsoft Word
- dsf - Sony DSD Stream File (DSF)
- dwg - Autodesk CAD file
- elf - Unix Executable and Linkable Format
- eot - Embedded OpenType font
- eps - Encapsulated PostScript
- epub - E-book file
- exe - Executable file
- f4a - Audio-only ISO base media file format used by Adobe Flash Player
- f4b - Audiobook and podcast ISO base media file format used by Adobe Flash Player
- f4p - ISO base media file format protected by Adobe Access DRM used by Adobe Flash Player
- f4v - ISO base media file format used by Adobe Flash Player
- flac - Free Lossless Audio Codec
- flif - Free Lossless Image Format
- flv - Flash video
- gif - Graphics Interchange Format
- glb - GL Transmission Format
- gz - Archive file
- heic - High Efficiency Image File Format
- icc - ICC Profile
- icns - Apple Icon image
- ico - Windows icon file
- ics - iCalendar
- indd - Adobe InDesign document
- it - Audio module format: Impulse Tracker
- j2c - JPEG 2000
- jls - Lossless/near-lossless compression standard for continuous-tone images
- jp2 - JPEG 2000
- jpg - Joint Photographic Experts Group image
- jpm - JPEG 2000
- jpx - JPEG 2000
- jxl - JPEG XL image format
- jxr - Joint Photographic Experts Group extended range
- ktx - OpenGL and OpenGL ES textures
- lnk - Microsoft Windows file shortcut
- lz - Archive file
- lzh - LZH archive
- m4a - Audio-only MPEG-4 files
- m4b - Audiobook and podcast MPEG-4 files, which also contain metadata including chapter markers, images, and hyperlinks
- m4p - MPEG-4 files with audio streams encrypted by FairPlay Digital Rights Management as were sold through the iTunes Store
- m4v - Video container format developed by Apple, which is very similar to the MP4 format
- mid - Musical Instrument Digital Interface file
- mie - Dedicated meta information format which supports storage of binary as well as textual meta information
- mj2 - Motion JPEG 2000
- mkv - Matroska video file
- mobi - Mobipocket
- mov - QuickTime video file
- mp1 - MPEG-1 Audio Layer I
- mp2 - MPEG-1 Audio Layer II
- mp3 - Audio file
- mp4 - MPEG-4 Part 14 video file
- mpc - Musepack (SV7 & SV8)
- mpg - MPEG-1 file
- mts - MPEG-2 Transport Stream, both raw and Blu-ray Disc Audio-Video (BDAV) versions
- mxf - Material Exchange Format
- nef - Nikon Electronic Format image file
- nes - Nintendo NES ROM
- odp - OpenDocument for presentations
- ods - OpenDocument for spreadsheets
- odt - OpenDocument for word processing
- oga - Audio file
- ogg - Audio file
- ogm - Video file
- ogv - Video file
- ogx - Audio file
- opus - Audio file
- orf - Olympus Raw image file
- otf - OpenType font
- parquet - Apache Parquet
- pcap - Libpcap File Format
- pdf - Portable Document Format
- pgp - Pretty Good Privacy
- png - Portable Network Graphics
- pptx - Microsoft PowerPoint
- ps - PostScript
- psd - Adobe Photoshop document
- pst - Personal Storage Table file
- qcp - Tagged and chunked data
- raf - Fujifilm RAW image file
- rar - Archive file
- rpm - Red Hat Package Manager file
- rtf - Rich Text Format
- rw2 - Panasonic RAW image file
- s3m - Audio module format: ScreamTracker 3
- shp - Geospatial vector data format
- skp - SketchUp
- spx - Audio file
- sqlite - SQLite file
- stl - Standard Tessellated Geometry File Format (ASCII only)
- swf - Adobe Flash Player file
- tar - Tarball archive file
- tif - Tagged Image file
- ttf - TrueType font
- vcf - vCard
- voc - Creative Voice File
- wasm - WebAssembly intermediate compiled format
- wav - Waveform Audio file
- webm - Web video file
- webp - Web Picture format
- woff - Web Open Font Format
- woff2 - Web Open Font Format
- wv - WavPack
- xcf - eXperimental Computing Facility
- xlsx - Microsoft Excel
- xm - Audio module format: FastTracker 2
- xml - eXtensible Markup Language
- xpi - XPInstall file
- xz - Compressed file
- zip - Archive file
- zst - Archive file
Pull requests are welcome for additional commonly used file types.
The following file types will not be accepted:
- MS-CFB: Microsoft Compound File Binary File Format based formats, too old and difficult to parse:
  - .doc - Microsoft Word 97-2003 Document
  - .xls - Microsoft Excel 97-2003 Document
  - .ppt - Microsoft PowerPoint 97-2003 Document
  - .msi - Microsoft Windows Installer
- .csv - Reason.
- .svg - Detecting it requires a full-blown parser. Check out is-svg for something that mostly works.
Related
- file-type-cli - CLI for this module
Maintainers
Former