h264-interp-utils

H.264 bitstreams are tricky to handle. This JavaScript package helps.

It handles the creation and parsing of H.264's codec-private data. This codec-private data is stored in 'avcC' atoms in MPEG-4 streams, and in Tracks/Track/CodecPrivate elements in some Matroska streams (aka webm or EBML files). It is sometimes necessary to re-create this codec-private data from elements in a compressed video bitstream.

It handles the parsing of sequences of H.264 Network Abstraction Layer Units (NALUs), formatted either in packet-transport format or in streaming Annex B format.

It offers functions for reading H.264's variable-length Exponential Golomb codes from its bitstream. With those functions it handles the parsing of Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) NALUs.

Install

Install with npm:

$ npm install --save h264-interp-utils

Installation with other package managers works similarly.

Why this package

The original reason for developing this package was to allow the reconstruction of 'avcC' atom data from MediaRecorder-emitted data. When using MediaRecorder with a MIME type like video/webm; codecs="avc1.42C01E", it generates a data stream without placing codec-private data in Tracks/Track/CodecPrivate elements. However, the experimental WebCodecs browser API requires that data to be passed to it in a config.description element. Hence the need to reconstruct it.

MediaRecorder-emitted video streams repeat the H.264 SPS and PPS NALUs at the beginning of the data for each intraframe. In Matroska parlance, these are keyframes. In H.264 parlance they are I frames. Each intraframe in simple low-latency MediaRecorder-emitted video streams also happens to be an Instantaneous Decoder Refresh (IDR) frame; decoding can begin at that point in the video stream without reference to any previous data.

The AvcC class in this package reconstructs the codec-private data from MediaRecorder's intraframe data stream, by interpreting the SPS and PPS NALUs in that datastream.

Usage

Start by including the module in your program.

const H264Util = require('h264-interp-utils')

Bitstream

The Bitstream object allows its user to retrieve data bit-by-bit from arrays of data. It's used to parse NALUs, and supports the H.264 variable-length exponential Golomb coding for signed and unsigned integers.

You give the constructor an array containing a single NALU, without any leading NALU delimiter.

const bitstream = new H264Util.Bitstream(array)
const aBit = bitstream.u_1()
const nextBit = bitstream.u_1()
const twoBits = bitstream.u_2()
const fiveBits = bitstream.u(5)
const aByte = bitstream.u_8()

/* variable-length exponential-Golomb coded integers */
const unsignedInt = bitstream.ue_v()
const signedInt = bitstream.se_v()

You may retrieve the number of remaining bits in your array with const bitCount = bitstream.remaining. Likewise, you may retrieve the number of bits already consumed with const bitsUsed = bitstream.consumed.

For debugging convenience, the bitstream.peek16 getter shows, in a text string, the next 16 bits.
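To make the ue_v() and se_v() reads above concrete, here is a minimal standalone sketch of Exponential Golomb decoding. It does not use this package: the BitReader class and its method names are hypothetical, and it assumes emulation prevention bytes have already been removed from the input.

```javascript
// Hypothetical minimal bit reader; not part of h264-interp-utils.
class BitReader {
  constructor (bytes) {
    this.bytes = bytes
    this.pos = 0 // current position, counted in bits
  }

  // Read one bit, most significant bit of each byte first.
  bit () {
    if (this.pos >= this.bytes.length * 8) throw new RangeError('out of data')
    const byte = this.bytes[this.pos >> 3]
    const value = (byte >> (7 - (this.pos & 7))) & 1
    this.pos++
    return value
  }

  // Read n bits as an unsigned integer.
  bits (n) {
    let value = 0
    for (let i = 0; i < n; i++) value = value * 2 + this.bit()
    return value
  }

  // Unsigned Exp-Golomb: count leading zeros, skip the 1, read that many bits.
  ue () {
    let zeros = 0
    while (this.bit() === 0) zeros++
    return 2 ** zeros - 1 + this.bits(zeros)
  }

  // Signed Exp-Golomb: code numbers 1, 2, 3, 4 ... map to +1, -1, +2, -2 ...
  se () {
    const codeNum = this.ue()
    const magnitude = Math.ceil(codeNum / 2)
    // "0 - magnitude" avoids returning negative zero for codeNum 0.
    return codeNum % 2 === 0 ? 0 - magnitude : magnitude
  }
}

// The bit string holds the codes "1", "010", "011", "00100", then padding.
const reader = new BitReader(Uint8Array.of(0b10100110, 0b01000000))
const decoded = [reader.ue(), reader.ue(), reader.se(), reader.ue()]
// decoded is [0, 1, -1, 3]
```

Short codes stand for small values, which is why H.264 uses this coding for fields that are usually small.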

Bitstream's constructor removes emulation prevention bytes from the array.

You may retrieve the stream, with emulation prevention bytes, with

const originalStream = bitstream.stream
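For readers unfamiliar with emulation prevention: an H.264 encoder inserts a 0x03 byte after any two consecutive 0x00 bytes inside a NALU, so that the payload never looks like a start code. A parser has to drop those bytes before reading bits. The following is a minimal sketch of that removal, written independently of this package; the function name removeEmulationPrevention is hypothetical.

```javascript
// Hypothetical helper; a sketch of the technique, not this package's code.
// Drops each 0x03 byte that directly follows two 0x00 bytes.
function removeEmulationPrevention (nalu) {
  const out = []
  let zeros = 0 // consecutive 0x00 bytes seen so far
  for (const byte of nalu) {
    if (zeros >= 2 && byte === 0x03) {
      zeros = 0 // skip the emulation prevention byte
      continue
    }
    out.push(byte)
    zeros = byte === 0x00 ? zeros + 1 : 0
  }
  return Uint8Array.from(out)
}

const escaped = Uint8Array.of(0x67, 0x00, 0x00, 0x03, 0x01, 0x42)
const unescaped = removeEmulationPrevention(escaped)
// unescaped holds 0x67 0x00 0x00 0x01 0x42
```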

You may write bits into the stream:

const dest = new H264Util.Bitstream(4096)
dest.put_u_1(1)           /* one bit */
dest.put_u_1(0)           /* another bit */
const threebits = 6
dest.put_u(threebits, 3)  /* three bits from a number */
const val = 7
let bitcount = dest.put_ue_v(val) /* exponential Golomb coded value, unsigned */
bitcount = dest.put_se_v(-val)    /* exponential Golomb coded value, signed */

bitcount = dest.put_complete()    /* done writing to Bitstream, tie it off */
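As an illustration of what an Exponential Golomb writer emits for values like the 7 and -7 used above, here is a standalone sketch that returns the code as a string of '0' and '1' characters. The helper names encodeUE and encodeSE are hypothetical and are not part of this package.

```javascript
// Hypothetical helpers, independent of h264-interp-utils.
// Unsigned: write (value + 1) in binary, preceded by one zero for every
// binary digit after the first.
function encodeUE (value) {
  if (!Number.isInteger(value) || value < 0) {
    throw new RangeError('need a non-negative integer')
  }
  const binary = (value + 1).toString(2)
  return '0'.repeat(binary.length - 1) + binary
}

// Signed: +1, -1, +2, -2 ... map to code numbers 1, 2, 3, 4 ...
function encodeSE (value) {
  const codeNum = value > 0 ? 2 * value - 1 : -2 * value
  return encodeUE(codeNum)
}

const unsignedBits = encodeUE(7) // '0001000'
const signedBits = encodeSE(-7) // '0001111'
```

Both codes here are seven bits long: three leading zeros, a one, and three value bits.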

You may copy bits from some other stream into the stream with copyBits():

const source = new H264Util.Bitstream(nalu)
/* empty bitstream, same size as source */
const dest = new H264Util.Bitstream(source.remaining)
const startAtBit = 0
const copyBitCount = source.remaining
dest.copyBits(source, startAtBit, copyBitCount)

copyBits() is written to be reasonably efficient even for randomly bit-aligned copies.
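To show what a randomly bit-aligned copy involves, here is a deliberately naive standalone sketch that moves one bit per loop iteration. It is not this package's implementation, which is described as more efficient; the function name copyBitsNaive and its parameter order are hypothetical.

```javascript
// Hypothetical helper: copies `count` bits from `src` (starting at bit
// `srcBit`) into `dst` (starting at bit `dstBit`), most significant bit first.
// One bit per iteration: simple rather than fast.
function copyBitsNaive (src, srcBit, dst, dstBit, count) {
  if (srcBit + count > src.length * 8 || dstBit + count > dst.length * 8) {
    throw new RangeError('bit range out of bounds')
  }
  for (let i = 0; i < count; i++) {
    const s = srcBit + i
    const d = dstBit + i
    const bit = (src[s >> 3] >> (7 - (s & 7))) & 1
    const mask = 0x80 >> (d & 7)
    if (bit) dst[d >> 3] |= mask
    else dst[d >> 3] &= ~mask
  }
  return dst
}

const src = Uint8Array.of(0b00001111, 0b11110000)
const dst = new Uint8Array(1)
copyBitsNaive(src, 4, dst, 0, 8)
// dst[0] is now 0b11111111: the copy straddled a byte boundary in src
```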

NALUStream

NALUStream accepts a buffer containing a sequence of NALUs. They may be in

  • packet format, separated by 4, 3, or 2-byte NALU lengths
  • Annex B stream format, separated by four-byte 00 00 00 01 or three-byte 00 00 01 delimiters.
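As a rough illustration of the Annex B framing, the following standalone sketch splits a buffer at its start codes. It is not how NALUStream is implemented; the function name splitAnnexB is hypothetical, and the sketch assumes the buffer begins with a start code.

```javascript
// Hypothetical helper: splits an Annex B buffer into NALUs by scanning for
// 00 00 01 start codes. A four-byte 00 00 00 01 start code is the same
// pattern preceded by one extra zero byte, which is trimmed from the end of
// the previous NALU.
function splitAnnexB (buffer) {
  const starts = [] // index of the first payload byte of each NALU
  for (let i = 0; i + 2 < buffer.length; i++) {
    if (buffer[i] === 0 && buffer[i + 1] === 0 && buffer[i + 2] === 1) {
      starts.push(i + 3)
      i += 2 // jump past the start code
    }
  }
  return starts.map((start, n) => {
    let end = n + 1 < starts.length ? starts[n + 1] - 3 : buffer.length
    // Drop trailing zero bytes; they belong to the next start code.
    while (end > start && buffer[end - 1] === 0) end--
    return buffer.subarray(start, end)
  })
}

const annexB = Uint8Array.of(
  0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0x00, 0x1e, // four-byte start code
  0x00, 0x00, 0x01, 0x68, 0xce, 0x3c, 0x80 // three-byte start code
)
const pieces = splitAnnexB(annexB)
// pieces[0] holds 67 42 00 1e, pieces[1] holds 68 ce 3c 80
```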

NALUStream's constructor takes both an array and an options object. The options object may contain any of these properties:

  • type, if present, has the value 'packet' or 'annexB'. Use it to declare the format of your sequence of NALUs. If you omit type, NALUStream attempts to determine the format by examining the first few NALUs. Declare type explicitly whenever you can rather than relying on that detection.

  • boxSize, if present, can have values 4, 3, or 2 for 'packet' streams, and 4 or 3 for 'annexB' streams. If you omit boxSize, NALUStream attempts to determine the boxSize by examining the first few NALUs. Declare boxSize explicitly whenever you can rather than relying on that detection.

  • boxSizeMinusOne can be provided in place of boxSize for compatibility with 'avcC' atoms.

  • strict, if true, makes NALUStream throw more errors when it detects anomalous data.

Constructing a NALUStream might look like this. It's wise to catch errors thrown by the constructor.

let nalus
try {
  nalus = new H264Util.NALUStream(array, { type: 'annexB', boxSize: 4, strict: true })
} catch (error) {
  console.error(error)
}

You may iterate over the NALUs in a NALUStream like this:

for (const nalu of nalus) {                           
  /* handle each NALU */
}

or this, if you want both the NALUs and their raw counterparts with the leading delimiters still present

for (const n of nalus.nalus()) {
  const { nalu, rawNalu } = n
  /* handle each nalu */
}

Somewhat less efficiently you can iterate like this:

const naluArray = nalus.packets
for (let i = 0; i < naluArray.length; i++) {
  /* handle each naluArray[i] */
}

Some decoders (for example the VideoDecoder in WebCodecs) require their NALUs in packet format. You can convert a NALUStream to packet format like this. Notice that it changes the contents of the array passed in the constructor.

decoder.decode(nalus.convertToPacket())
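For reference, packet format means each NALU is preceded by its length as a big-endian integer. The standalone sketch below builds such a buffer from a list of NALUs. It is not this package's convertToPacket(): it allocates a new buffer instead of changing its input, and the function name toPacketFormat is hypothetical.

```javascript
// Hypothetical helper: joins NALUs into packet format, each one preceded by
// its length as a big-endian integer occupying `boxSize` bytes.
function toPacketFormat (nalus, boxSize = 4) {
  const total = nalus.reduce((sum, nalu) => sum + boxSize + nalu.length, 0)
  const out = new Uint8Array(total)
  let offset = 0
  for (const nalu of nalus) {
    let length = nalu.length
    // Fill the length box from its last byte back to its first.
    for (let i = boxSize - 1; i >= 0; i--) {
      out[offset + i] = length & 0xff
      length >>>= 8
    }
    if (length !== 0) throw new RangeError('NALU too long for boxSize')
    offset += boxSize
    out.set(nalu, offset)
    offset += nalu.length
  }
  return out
}

const packet = toPacketFormat([Uint8Array.of(0x67, 0x42), Uint8Array.of(0x68)])
// packet holds 00 00 00 02 67 42 00 00 00 01 68
```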

NALUStream objects have type, boxSize, and boxSizeMinusOne properties. If you use the constructor to guess what sort of array you gave it, you can retrieve its guesses with those properties.

NALUStream objects have the packetCount property indicating how many NALUs are in the array.
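When iterating over NALUs it is often useful to tell them apart before handing them to the classes described below, which throw on the wrong NALU type. The type is held in the low five bits of the first byte of each NALU. This standalone sketch decodes that header byte; the function name parseNaluHeader and the property names it returns are hypothetical.

```javascript
// Hypothetical helper, independent of h264-interp-utils.
// Splits the one-byte NALU header into its three fields.
function parseNaluHeader (nalu) {
  if (nalu.length < 1) throw new RangeError('empty NALU')
  const first = nalu[0]
  return {
    forbiddenZeroBit: first >> 7, // 0 in a valid stream
    nalRefIdc: (first >> 5) & 0x03, // 0 for NALUs not used as a reference
    nalUnitType: first & 0x1f // 1 non-IDR slice, 5 IDR slice, 7 SPS, 8 PPS
  }
}

const spsHeader = parseNaluHeader(Uint8Array.of(0x67, 0x42, 0xc0, 0x1e))
// spsHeader.nalUnitType is 7, so this NALU is an SPS
```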

SPS

SPS accepts a Sequence Parameter Set NALU, and offers properties describing it. To construct an SPS object, give it an array containing a NALU. (It throws an error when you give it a NALU that's not an SPS, or that's garbled in a way that makes it impossible to decode.)

const sps = new H264Util.SPS(nalu)

Some of its useful properties are:

  • MIME: the codec string for the video stream's MIME type, a value like 'avc1.640029'.

  • profileName: a human-readable value like 'BASELINE' or 'EXTENDED' indicating the codec profile.

  • profile_idc: the profile indicator. 66 means baseline, 77 means main, and 88 means extended.

  • profile_compatibility: the constraints.

  • level_idc: the level indicator for the codec level.

  • picWidth, picHeight: the width and height of the pictures in the video stream.

  • cropRect: a rectangle object with x, y, width, height. In the cases where the pictures in the video stream have margins without imagery in them, the cropRect defines the useful area.

    Because H.264 streams have sizes that are multiples of 16x16 macroblocks, it can be necessary to crop the pictures when rendering them.

It has what can only be described as a mess of properties defined by the H.264 standard and needed by the H.264 decoder to make sense of the stream.
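The codec string shown for the MIME property follows directly from three of the properties listed above: it is 'avc1.' followed by profile_idc, profile_compatibility, and level_idc, each written as two hexadecimal digits. The sketch below shows that formatting on its own. The function name avcCodecString is hypothetical, and the uppercase hex digits are an assumption taken from the 'avc1.42C01E' example earlier in this document.

```javascript
// Hypothetical helper: formats the three bytes that follow the SPS NALU
// header as an 'avc1.PPCCLL' codec string.
function avcCodecString (profileIdc, profileCompatibility, levelIdc) {
  const hex = (byte) => byte.toString(16).toUpperCase().padStart(2, '0')
  return 'avc1.' + hex(profileIdc) + hex(profileCompatibility) + hex(levelIdc)
}

const highProfile = avcCodecString(100, 0x00, 41) // 'avc1.640029'
const baseline = avcCodecString(66, 0xc0, 30) // 'avc1.42C01E'
```

Here 100 is the High profile at level 4.1, and 66 is Baseline at level 3.0 with the first two constraint flags set.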

PPS

PPS accepts a Picture Parameter Set NALU, and offers properties describing it. To construct a PPS object, give it an array containing a NALU. (It throws an error when you give it a NALU that's not a PPS, or that's garbled in a way that makes it impossible to decode.)

const pps = new H264Util.PPS(nalu)

Most PPS properties describe the format of the pictures in the video stream in a format useful to the H.264 decoder.

Two of its more useful properties are:

  • entropy_coding_mode_flag: 0 for CAVLC Huffman-style entropy coding, and 1 for CABAC Arps-style arithmetic coding.
  • entropyCodingMode: 'CAVLC' or 'CABAC', a human-readable description of the entropy coding.

It has what can only be described as a mess of properties defined by the H.264 standard and needed by the H.264 decoder to make sense of the stream.

AvcC

H.264 defines a set of codec-private data describing the data stream. Not all video data streams have a distinct set of codec-private data: it's optional in Matroska / webm / .mkv video streams.

It contains, embedded in it, one or more SPS and PPS elements. It can be reconstructed by parsing an SPS and including a PPS.

The AvcC class parses and reconstructs the codec-private data.
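To give a feel for what the codec-private data contains, here is a standalone sketch that assembles the record from one SPS and one PPS. It is a simplified illustration, not the AvcC class: the function name buildAvcC is hypothetical, and the sketch assumes a Baseline, Main, or Extended profile stream. High profiles add extra trailing fields that are omitted here.

```javascript
// Hypothetical helper: builds the 'avcC' record from one SPS NALU and one
// PPS NALU, both passed without start codes or length prefixes.
// Assumption: profile_idc is 66, 77, or 88. High profiles need more fields.
function buildAvcC (sps, pps, boxSize = 4) {
  const out = new Uint8Array(11 + sps.length + pps.length)
  let i = 0
  out[i++] = 1 // configurationVersion
  out[i++] = sps[1] // profile_idc
  out[i++] = sps[2] // profile_compatibility (constraint flags)
  out[i++] = sps[3] // level_idc
  out[i++] = 0xfc | (boxSize - 1) // six reserved 1 bits, then boxSizeMinusOne
  out[i++] = 0xe0 | 1 // three reserved 1 bits, then the SPS count
  out[i++] = sps.length >> 8 // SPS length, big-endian, two bytes
  out[i++] = sps.length & 0xff
  out.set(sps, i)
  i += sps.length
  out[i++] = 1 // PPS count
  out[i++] = pps.length >> 8 // PPS length, big-endian, two bytes
  out[i++] = pps.length & 0xff
  out.set(pps, i)
  return out
}

const demoSps = Uint8Array.of(0x67, 0x42, 0xc0, 0x1e, 0xaa) // made-up payload
const demoPps = Uint8Array.of(0x68, 0xce, 0x3c, 0x80)
const record = buildAvcC(demoSps, demoPps)
// record starts 01 42 c0 1e ff e1 00 05, followed by the SPS bytes
```

The boxSizeMinusOne value stored in the fifth byte is the same quantity that NALUStream accepts as an option.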

A typical use case is, given arrays containing SPS and PPS NALUs, create an avcC object.

const avcCObject = new H264Util.AvcC({pps:ppsArray, sps:spsArray})
const mime = avcCObject.MIME
/* this is the binary array to put into the `'avcC'` atom. */
const codecPrivateDataArray = avcCObject.avcC

Another typical use case is, given a key frame payload from a Matroska SimpleBlock, create the codec-private data.

const avcCObject = new H264Util.AvcC({bitstream: payload})
const mime = avcCObject.MIME
/* this is the binary array to put into the `'avcC'` atom. */
const codecPrivateDataArray = avcCObject.avcC
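As mentioned under "Why this package", WebCodecs wants the codec-private data in the description member of its configuration object. The sketch below only shapes the two values into such an object. The function name makeDecoderConfig is hypothetical, and it assumes the MIME property holds a bare codec string such as 'avc1.42C01E'.

```javascript
// Hypothetical helper: builds a plain object in the shape of a WebCodecs
// VideoDecoderConfig. Assumption: `codec` is a bare codec string.
function makeDecoderConfig (codec, codecPrivateData) {
  return {
    codec, // for example 'avc1.42C01E'
    description: codecPrivateData // the 'avcC' bytes
  }
}

const demoConfig = makeDecoderConfig('avc1.42C01E', Uint8Array.of(0x01, 0x42, 0xc0, 0x1e))
// In a browser, an object like this would be passed to VideoDecoder's configure() method.
```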

Slice

Slice accepts I-frame (type 5) and P-frame (type 1) NALUs, and offers a few properties from decoding them.

To construct a Slice object, give it an array containing a NALU, and optionally an AvcC object created from the same data stream.

(It throws an error when you give it a NALU that's not type 1 or 5.)

const slice = new H264Util.Slice(nalu, avcC)

Some of its useful properties are:

  • first_mb_in_slice: the number of the first macroblock in the present slice. This is zero for the first slice of a new frame.
  • slice_type: the type of this slice. 0, 5: P slice; 1, 6: B slice; 2, 7: I slice; 3, 8: SP slice; 4, 9: SI slice.
  • frame_num: The number of this frame in sequence after the most recent I-frame. This is only available if you provide an avcC object.
  • pic_parameter_set_id: the index of the PPS describing this slice
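Because each slice type appears twice in the slice_type numbering, a name lookup only needs the value modulo 5. Values 5 through 9 carry the extra meaning that every slice in the same picture has that type. A minimal standalone sketch, with the hypothetical function name sliceTypeName:

```javascript
// Hypothetical helper, independent of h264-interp-utils.
const SLICE_TYPE_NAMES = ['P', 'B', 'I', 'SP', 'SI']

function sliceTypeName (sliceType) {
  if (!Number.isInteger(sliceType) || sliceType < 0 || sliceType > 9) {
    throw new RangeError('slice_type must be an integer from 0 to 9')
  }
  return SLICE_TYPE_NAMES[sliceType % 5]
}

const names = [2, 7, 0, 5].map(sliceTypeName)
// names is ['I', 'I', 'P', 'P']
```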

It is sometimes necessary to change a slice's pic_parameter_set_id. (A bug in Chrome's encoder makes it use multiple different values.)

const fixedNalu = slice.setPPSId(0)
const fixedSlice = new H264Util.Slice(fixedNalu, avcC)
const ppsId = fixedSlice.pic_parameter_set_id /* will be 0 */

Still to do

  • Rework Bitstream and NALUStream to handle JavaScript streams, not just static arrays of data.

Credits

Version

1.1.1

License

MIT

Collaborators

  • olliejones