Indiscripts :: ByteStream: Your Swiss Army Knife for Binary Data in InDesign Scripts

What is ByteStream?

Imagine you're a digital archaeologist, but instead of digging through ancient ruins, you're excavating the hidden treasures buried deep within binary files. The new ByteStream class in IdExtenso is your trusty pickaxe, chisel, and magnifying glass all rolled into one elegant tool.

Whether you're reverse-engineering a proprietary file format, building a custom export pipeline, or just need to peek inside that mysterious binary blob that landed on your desk, ByteStream transforms what used to be a painful byte-by-byte slog into an almost magical experience.

Before diving into the magic, here's how to set up ByteStream in your IdExtenso project:

// Include IdExtenso framework
#include 'path/to/$$.jsxinc'
 
// Use the ByteStream class
#include 'path/to/etc/$$.ByteStream.jsxlib'
 
// Load the framework
$$.load();
 
// Your ByteStream code here...
 
// Don't forget to clean up!
$$.unload();

The Magic of Format Strings

The heart of ByteStream's power lies in its format string syntax. Think of it as a recipe that tells the class exactly what ingredients to extract from your binary soup:

// Read a file header in one line
var myHeader = $$.ByteStream(myBinaryData);
var ret = {};
 
myHeader.read(
    "TAG:signature ASC*4:version U32:size U16*2:dimensions",
    ret);
 
// `ret` now contains:
// ret.signature, ret.version, ret.size, ret.dimensions[]

ByteStream speaks fluent binary with support for all the usual suspects:

• Integers: I08, U08, I16, U16, I24, U24, I32, U32 (signed/unsigned, 8 to 32 bits)
• Floating Point: F32, F64 (IEEE 754 floats and doubles)
• Fixed Point: FXP, UFX, F2D (for when precision matters)
• Strings: STR, ASC, LAT, HEX (raw, ASCII, Latin, hexadecimal)
• Special: TAG (those 4-character Adobe/OTF identifiers)
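
To make the fixed-point family less abstract, here is a minimal plain-JavaScript sketch of what decoding such values involves. It does not use ByteStream and is not the class's implementation: fixed1616 and f2dot14 are hypothetical helper names, and the sketch assumes that FXP is a signed 16.16 number and F2D a signed 2.14 number (the OpenType F2DOT14 layout), both stored big-endian.

```javascript
// Hypothetical helpers, not part of IdExtenso.
// Assumption: FXP = signed 16.16, F2D = signed 2.14, big-endian bytes.

function fixed1616(b0, b1, b2, b3)
{
    // Combine the four bytes into an unsigned 32-bit number.
    var u = ((b0 * 256 + b1) * 256 + b2) * 256 + b3;
    // Reinterpret as two's complement, then scale by 2^16.
    if (u >= 0x80000000) u -= 0x100000000;
    return u / 65536;
}

function f2dot14(b0, b1)
{
    // Same idea on 16 bits, scaled by 2^14.
    var u = b0 * 256 + b1;
    if (u >= 0x8000) u -= 0x10000;
    return u / 16384;
}

var fxpSample = fixed1616(0x01, 0xFF, 0x3F, 0xFF); // about 511.249984741211
var fxpHalf   = fixed1616(0xFF, 0xFF, 0x80, 0x00); // -0.5
var f2dSample = f2dot14(0x70, 0x00);               // 1.75
```

With ByteStream, a single FXP or F2D item in the format string is meant to spare you this arithmetic.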

Endianness Made Easy

Remember the dark days of manually handling big-endian vs little-endian byte order? ByteStream makes it as simple as passing a flag to a shortcut method, or adding a > (big-endian) or < (little-endian) to an item of your format string:

// Read two consecutive U32 values, one per byte order (each read advances)
var myBE = myStream.readU32();     // Default: big-endian
var myLE = myStream.readU32(true); // Explicit: little-endian
 
// Or use format strings for multiple values
var myData = {};
myStream.read("U32>:bigValue U32<:littleValue", myData);
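
For the curious, this is roughly the byte shuffling you no longer write by hand. A minimal plain-JavaScript sketch, independent of ByteStream; u32FromBytes is a hypothetical helper, not an IdExtenso function.

```javascript
// Hypothetical helper: read an unsigned 32-bit integer from an array of
// byte values, starting at index `pos`, in either byte order.
function u32FromBytes(bytes, pos, littleEndian)
{
    var b0 = bytes[pos], b1 = bytes[pos + 1],
        b2 = bytes[pos + 2], b3 = bytes[pos + 3];

    // Multiplication (rather than <<) keeps the result unsigned.
    return littleEndian
        ? ((b3 * 256 + b2) * 256 + b1) * 256 + b0
        : ((b0 * 256 + b1) * 256 + b2) * 256 + b3;
}

var sampleBytes = [0x12, 0x34, 0x56, 0x78];
var asBigEndian    = u32FromBytes(sampleBytes, 0, false); // 0x12345678
var asLittleEndian = u32FromBytes(sampleBytes, 0, true);  // 0x78563412
```

The same four bytes yield two very different numbers, which is why a wrong byte order usually shows up as absurdly large values.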

Real-World Magic: Font File Surgery

Let's say you want to extract the font name from an OpenType font file. Here's how ByteStream turns a complex task into child's play:

// Read that mysterious .otf file
var myFontStream = $$.ByteStream($$.File.readBinary("./myfont.otf"));
 
// Skip to the 'name' table (simplified for demo)
const NAME_TB_START = 416; // <= Use the actual index!
myFontStream.jump(NAME_TB_START);
 
// Read the name table header
var myHead = {};
myFontStream.read("U16:format U16:count U16:storageOffset", myHead);
 
// Create a substream for the string storage area
var mySubStream = myFontStream.copy(NAME_TB_START + myHead.storageOffset);
 
// Loop through name records to find the font name (nameID = 1 or 4)
var myFontName="", rec={}, i;
for( i=-1 ; ++i < myHead.count && !myFontName ; )
{
    myFontStream.read(
      "U16:pID U16:encID U16:langID U16:nameID U16:length U16:offset",
      rec);
 
    // Look for font family name (nameID 1) or full name (nameID 4)
    if (rec.nameID === 1 || rec.nameID === 4)
    {
        // Jump to the string in the storage substream
        mySubStream.jump(rec.offset);
        myFontName = mySubStream.read("STR*" + rec.length);
        break;
    }
}
 
alert( "Font name: " + myFontName );

Note. — In a real-world implementation, the NAME_TB_START offset would be dynamically discovered by first parsing the OpenType font header and directory table to locate the name table entry. This involves reading the initial sfnt header and iterating through the table directory records until you find the one with tag 'name', then using its offset value. The hardcoded offset shown here is just for demonstration purposes.
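
The lookup described in this note can be sketched in plain JavaScript on an array of byte values. This is an illustration only, not IdExtenso code: findSfntTable and its helpers are hypothetical names, and the sketch assumes a single-font file (not a .ttc collection) whose table directory starts right after the 12-byte sfnt header, with 16-byte records (tag, checksum, offset, length).

```javascript
// Big-endian readers on an array of byte values (hypothetical helpers).
function be16(bytes, pos) { return bytes[pos] * 256 + bytes[pos + 1]; }
function be32(bytes, pos) { return be16(bytes, pos) * 65536 + be16(bytes, pos + 2); }
function tagAt(bytes, pos)
{
    return String.fromCharCode(
        bytes[pos], bytes[pos + 1], bytes[pos + 2], bytes[pos + 3]);
}

// Return { offset, length } for the table whose tag matches, or null.
function findSfntTable(bytes, tag)
{
    var numTables = be16(bytes, 4); // follows the 4-byte sfntVersion
    var rec = 12, i;                // first record follows the 12-byte header
    for (i = 0; i < numTables; ++i, rec += 16)
    {
        if (tagAt(bytes, rec) === tag)
        {
            return { offset: be32(bytes, rec + 8), length: be32(bytes, rec + 12) };
        }
    }
    return null;
}

// A fake two-table directory, just enough to exercise the lookup.
var fakeFont = [
    0x4F,0x54,0x54,0x4F,  0,2,  0,32,  0,1,  0,0,          // 'OTTO', numTables = 2
    0x68,0x65,0x61,0x64,  0,0,0,0,  0,0,0,44,   0,0,0,54,  // 'head' record
    0x6E,0x61,0x6D,0x65,  0,0,0,0,  0,0,1,0xA0, 0,0,1,0    // 'name' record
];
var nameTable = findSfntTable(fakeFont, "name");
// nameTable.offset is 416 (0x01A0), nameTable.length is 256
```

The offset found this way is the value you would pass to jump() in place of the hardcoded NAME_TB_START.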

Input vs Output Streams: Two Sides of the Same Coin

ByteStream is actually two classes disguised as one:

Input Streams (IStreams) - The Readers

// Create from binary data
var myReader = $$.ByteStream(myBinaryArray); // or string
myReader.peek("F32*3"); // Look ahead without moving
myReader.read("F32*3"); // Read and advance
myReader.backup();      // Save position
myReader.restore();     // Go back
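
The difference between peeking and reading comes down to whether the cursor moves. Here is a toy model in plain JavaScript, assuming (this is a guess about the semantics, not the class's actual code) that backup() remembers a single position and restore() returns to it. ToyReader is a hypothetical name.

```javascript
// Toy model of a read cursor; NOT ByteStream's implementation.
function ToyReader(bytes)
{
    this.bytes = bytes;
    this.pos = 0;    // current read position
    this.saved = 0;  // position remembered by backup()
}
ToyReader.prototype.peekU08 = function(){ return this.bytes[this.pos]; };   // cursor stays
ToyReader.prototype.readU08 = function(){ return this.bytes[this.pos++]; }; // cursor advances
ToyReader.prototype.backup  = function(){ this.saved = this.pos; return this; };
ToyReader.prototype.restore = function(){ this.pos = this.saved; return this; };

var toy = new ToyReader([10, 20, 30]);
var peeked = toy.peekU08();  // 10, position still 0
var first  = toy.readU08();  // 10, position now 1
toy.backup();                // remember position 1
toy.readU08();               // 20
toy.readU08();               // 30
toy.restore();               // back to position 1
var again  = toy.readU08();  // 20 once more
```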

Output Streams (OStreams) - The Writers

// Create (empty) for writing
var myWriter = $$.ByteStream();
myWriter.write("F32*3", [1.0, 2.5, 3.14]);
myWriter.writeU16(42);
var myBytes = myWriter.getBytes(); // Get final byte array

Note. — The new operator is not required (i.e. implicit) when creating a $$.ByteStream instance.

Advanced Sorcery: Structured Data

The real magic happens when you start using keys and counts (*N) together:

// Parse a complex structure in one go
var myImageData = {};
myStream.read(
  "TAG:signature STR*4:version U32:width U32:height U08*768:palette",
  myImageData);
 
// Now myImageData contains:
//   myImageData.signature = "PNG "
//   myImageData.version = "1.0 "
//   myImageData.width = 1920
//   myImageData.height = 1080
//   myImageData.palette = [r1,g1,b1, r2,g2,b2, ...]
// REM: 256 colors × 3 bytes = 768 bytes
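
Since a counted item such as U08*768 comes back as one flat array, you may want to regroup it afterwards. A small plain-JavaScript sketch; groupTriplets is a hypothetical helper, not part of ByteStream.

```javascript
// Hypothetical helper: turn [r1,g1,b1, r2,g2,b2, ...] into [[r1,g1,b1], ...].
// Trailing values that do not form a full triplet are ignored.
function groupTriplets(flat)
{
    var out = [], i;
    for (i = 0; i + 2 < flat.length; i += 3)
    {
        out.push([flat[i], flat[i + 1], flat[i + 2]]);
    }
    return out;
}

var twoColors = groupTriplets([255, 0, 0,  0, 255, 0]);
// twoColors is [[255,0,0], [0,255,0]]
```

Applied to a 768-entry palette, this yields 256 RGB triplets.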

Some performance tips and tricks:

1. Use shortcuts for simple reads:
readU16() is faster than read("U16")

2. Batch your operations:
read("U16*10") beats ten separate readU16() calls

3. Copy vs Clone:
Use copy() for shared data, clone() for independence

4. Static encoding:
$$.ByteStream.encode(myData, "U32*2") for quick conversions

As for error handling, ByteStream doesn't leave you hanging when things go wrong:

// Check if your format is valid before using it
if( !$$.ByteStream.isFormat("U16*3 F32") )
{
    alert( "Houston, we have a problem!" );
}
 
// Calculate how many bytes you'll need
var myByteCount = $$.ByteStream.sizeOf("STR*20 U32*5");
// Returns 40
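
If you wonder where that 40 comes from, the arithmetic can be sketched in plain JavaScript. This is a simplified model, not the real sizeOf: sketchSizeOf and ITEM_WIDTHS are hypothetical names, the grammar is reduced to the TYPE*COUNT:key shape used in this article (with an optional > or < after the count), the 4-byte width of UFX and 2-byte width of F2D are assumptions, string-like items count one byte per unit, and returning -1 on an unknown item is an arbitrary choice.

```javascript
// Assumed byte width of one unit of each item type (see caveats above).
var ITEM_WIDTHS = {
    I08:1, U08:1, I16:2, U16:2, I24:3, U24:3, I32:4, U32:4,
    F32:4, F64:8, FXP:4, UFX:4, F2D:2, TAG:4,
    STR:1, ASC:1, LAT:1, HEX:1
};

// Hypothetical helper: total byte count described by a format string,
// or -1 if an item does not match the simplified grammar.
function sketchSizeOf(format)
{
    var items = format.split(/\s+/), total = 0, i, m;
    for (i = 0; i < items.length; ++i)
    {
        if (!items[i]) continue; // skip empty pieces from stray spaces
        m = /^([A-Z][A-Z0-9]{2})(?:\*(\d+))?[<>]?(?::\w+)?$/.exec(items[i]);
        if (!m || !ITEM_WIDTHS.hasOwnProperty(m[1])) return -1;
        total += ITEM_WIDTHS[m[1]] * (m[2] ? parseInt(m[2], 10) : 1);
    }
    return total;
}

var sizeA = sketchSizeOf("STR*20 U32*5"); // 20*1 + 5*4 = 40
var sizeB = sketchSizeOf("TAG:signature ASC*4:version U32:size U16*2:dimensions"); // 16
```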

Sometimes you need to work with hex data directly:

// Read 3 bytes as a 6-character hex string
var myColorHex = myStream.read("HEX*3"); // "FF0080" (a vivid pink)
 
// Write hex string as bytes
var myColorStream = $$.ByteStream();
myColorStream.write("HEX*3", "FF0080"); // Writes [0xFF, 0x00, 0x80]
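
The mapping between bytes and hex digits is simple enough to spell out. A plain-JavaScript sketch with two hypothetical helpers, bytesToHex and hexToBytes, shown only to illustrate the two-digits-per-byte convention; they are not ByteStream functions.

```javascript
// Hypothetical helper: [255, 0, 128] -> "FF0080" (two uppercase digits per byte).
function bytesToHex(bytes)
{
    var s = "", i, h;
    for (i = 0; i < bytes.length; ++i)
    {
        h = (bytes[i] & 0xFF).toString(16).toUpperCase();
        s += (h.length < 2 ? "0" : "") + h; // left-pad single digits
    }
    return s;
}

// Hypothetical helper: "FF0080" -> [255, 0, 128]. A trailing odd digit is ignored.
function hexToBytes(hex)
{
    var out = [], i;
    for (i = 0; i + 1 < hex.length; i += 2)
    {
        out.push(parseInt(hex.substring(i, i + 2), 16));
    }
    return out;
}

var hexOut   = bytesToHex([0xFF, 0x00, 0x80]); // "FF0080"
var bytesOut = hexToBytes("FF0080");           // [255, 0, 128]
```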

A Complete Real-World Example

Here's a practical example showing ByteStream in action with binary data manipulation:

try
{
    // Create some binary data: Fixed Point + string + byte + double
    var myData = '\x01\xFF\x3F\xFF' + 'abc' +
      '\x10' + String.fromBytes([64,9,33,251,84,68,45,24]);
 
    // Create an input stream
    var myIStream = $$.ByteStream(myData);
 
    // Parse structured data in one go
    var myResult = {};
    myIStream.read(
      "FXP:fixedValue STR*3:name U08:count F64:piValue",
      myResult);
 
    // Display results using IdExtenso's JSON formatter
    alert( "Parsed data:\r" + $$.JSON(myResult) );
    // Shows: {"fixedValue":511.249984741211,"name":"abc",
    //         "count":16,"piValue":3.14159265358979}
 
    // Create an output stream and write some data
    var myOStream = $$.ByteStream();
    myOStream.write("F64:piValue STR:name FXP:fixedValue", myResult);
 
    // Get the final bytes
    alert("Written bytes:\r" + myOStream.getBytes());
 
}
catch(e)
{
    $$.receiveError(e);
}
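
To convince yourself that the eight bytes passed to String.fromBytes above really encode pi, here is a plain-JavaScript sketch of big-endian IEEE 754 double decoding. f64FromBytes is a hypothetical helper written for illustration, not ByteStream's implementation.

```javascript
// Hypothetical helper: decode 8 big-endian bytes as an IEEE 754 double.
function f64FromBytes(b)
{
    var sign = (b[0] & 0x80) ? -1 : 1;
    var exponent = (b[0] & 0x7F) * 16 + (b[1] >> 4); // 11-bit biased exponent
    var mantissa = b[1] & 0x0F, i;                   // top 4 of the 52 fraction bits
    for (i = 2; i < 8; ++i) mantissa = mantissa * 256 + b[i];

    if (exponent === 0x7FF) return mantissa ? NaN : sign * Infinity;
    if (exponent === 0) return sign * mantissa * Math.pow(2, -1074); // zero/subnormal

    // Normal number: implicit leading 1, fraction scaled by 2^52.
    return sign * (1 + mantissa / 4503599627370496) * Math.pow(2, exponent - 1023);
}

var decodedPi = f64FromBytes([64, 9, 33, 251, 84, 68, 45, 24]);
// decodedPi equals Math.PI
```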

The ByteStream class transforms InDesign scripting from a world where binary data was your enemy into one where it's your best friend. Whether you're building the next great InDesign plugin, reverse-engineering file formats, or just trying to understand what's inside that binary file, ByteStream is your faithful companion.

Ready to dive deeper? The ByteStream class source code is available at github.com/indiscripts/IdExtenso. As always, the IdExtenso framework provides the path to advanced InDesign scripting.

GitHub Links:
→ ByteStream Notice
→ ByteStreamDemo.jsx (in /tests subfolder)
→ IdExtenso root page

