Sunday, 15 January 2023

Working With Byte Arrays

In this post I will show how to write values of different data types to a byte buffer and read them back. This should prove useful when working with block data or pages. Here, "pages" means database pages or similar block data mechanisms.


How To Handle Typed Values

The aim is to read and write values of different types from and to a byte array. In C++ one can achieve this with a union. In C# the equivalent is a struct (or class) with explicit layout. Explicit layout is the key here: it lets each field be placed at an exact byte offset, so several fields can overlap the same memory.

The basic idea is to declare a run of byte fields at offsets from zero up to the maximum number of bytes required. Then add the basic types, such as Int16, Int32, Int64, float, etc., all at offset zero. One can then set a float value and read off its four byte values. The same works for most basic types: set the value, then read the corresponding byte values. To read a typed value, populate the byte fields accordingly, then read the field of the basic type (Int16, Int32, float, etc.).

For those with a COM background this approach is similar to the VARIANT type.

The Union Value Type


using System;
using System.Runtime.InteropServices;
namespace Blog
{
  [StructLayout(LayoutKind.Explicit)]
  public struct Value
  {
    [FieldOffset(0)] public byte B0;
    [FieldOffset(1)] public byte B1;
    [FieldOffset(2)] public byte B2;
    [FieldOffset(3)] public byte B3;
    [FieldOffset(4)] public byte B4;
    [FieldOffset(5)] public byte B5;
    [FieldOffset(6)] public byte B6;
    [FieldOffset(7)] public byte B7;

    [FieldOffset(0)] public float Float;
    [FieldOffset(0)] public double Double;
    [FieldOffset(0)] public Int16 Int16;
    [FieldOffset(0)] public Int32 Int32;
    [FieldOffset(0)] public Int64 Int64;
  }
}

The struct declaration above provides eight bytes, B0-B7, for data transfer. Supporting the decimal type would require expanding this to 16 bytes. As can be seen, each of B0-B7 occupies a single byte, at offsets 0 through 7.

Note also, that basic types float, double, Int16, etc all start at offset zero. So, if one sets the Float field, the corresponding 4 bytes can be read from B0-B3. To reverse the action, set fields B0-B3, then read the float value.
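As a quick sanity check of the overlay, the sketch below sets a float and reads its four bytes, then compares them with what BitConverter.GetBytes returns. Both use the machine's native byte order, so they should agree. The names FloatBytes and UnionCheck are hypothetical and exist only for this illustration; FloatBytes is a trimmed copy of the struct above.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, trimmed copy of the union struct: four bytes overlaid by one float.
[StructLayout(LayoutKind.Explicit)]
public struct FloatBytes
{
  [FieldOffset(0)] public byte B0;
  [FieldOffset(1)] public byte B1;
  [FieldOffset(2)] public byte B2;
  [FieldOffset(3)] public byte B3;

  [FieldOffset(0)] public float Float;
}

public static class UnionCheck
{
  // Set the float field, then read its four bytes from B0-B3.
  public static byte[] ToBytes(float input)
  {
    var v = new FloatBytes();
    v.Float = input;
    return new[] { v.B0, v.B1, v.B2, v.B3 };
  }

  // Reverse direction: populate B0-B3, then read the float field.
  public static float FromBytes(byte[] bytes)
  {
    var v = new FloatBytes();
    v.B0 = bytes[0];
    v.B1 = bytes[1];
    v.B2 = bytes[2];
    v.B3 = bytes[3];
    return v.Float;
  }

  // True when the union and BitConverter produce the same four bytes.
  public static bool MatchesBitConverter(float input)
  {
    byte[] expected = BitConverter.GetBytes(input);
    byte[] actual = ToBytes(input);
    for (int i = 0; i < expected.Length; i++)
    {
      if (expected[i] != actual[i]) return false;
    }
    return true;
  }
}
```

Because the bytes come out in native order, a buffer written on a little-endian machine would need byte swapping before being read on a big-endian one.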

It is imperative to know how many bytes each typed value requires. The structure above should cover most data types bar decimal. DateTime values can also be stored by first converting them to a long (for example via the Ticks property), writing that through the Int64 field, and later recreating the DateTime from bytes B0-B7.
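The DateTime case might look like the following minimal sketch. It assumes the Ticks property is used for the conversion to long, which is one option among several. LongBytes and DateTimeBytes are hypothetical names introduced for this example only.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, trimmed copy of the union struct: eight bytes overlaid by one long.
[StructLayout(LayoutKind.Explicit)]
public struct LongBytes
{
  [FieldOffset(0)] public byte B0;
  [FieldOffset(1)] public byte B1;
  [FieldOffset(2)] public byte B2;
  [FieldOffset(3)] public byte B3;
  [FieldOffset(4)] public byte B4;
  [FieldOffset(5)] public byte B5;
  [FieldOffset(6)] public byte B6;
  [FieldOffset(7)] public byte B7;

  [FieldOffset(0)] public long Int64;
}

public static class DateTimeBytes
{
  // Convert the DateTime to ticks, then copy the eight bytes into the buffer.
  public static void Write(DateTime value, byte[] buffer, int offset)
  {
    var v = new LongBytes();
    v.Int64 = value.Ticks;
    buffer[offset + 0] = v.B0;
    buffer[offset + 1] = v.B1;
    buffer[offset + 2] = v.B2;
    buffer[offset + 3] = v.B3;
    buffer[offset + 4] = v.B4;
    buffer[offset + 5] = v.B5;
    buffer[offset + 6] = v.B6;
    buffer[offset + 7] = v.B7;
  }

  // Copy eight bytes back into the union, then rebuild the DateTime from ticks.
  public static DateTime Read(byte[] buffer, int offset)
  {
    var v = new LongBytes();
    v.B0 = buffer[offset + 0];
    v.B1 = buffer[offset + 1];
    v.B2 = buffer[offset + 2];
    v.B3 = buffer[offset + 3];
    v.B4 = buffer[offset + 4];
    v.B5 = buffer[offset + 5];
    v.B6 = buffer[offset + 6];
    v.B7 = buffer[offset + 7];
    return new DateTime(v.Int64);
  }
}
```

Note that rebuilding from ticks alone yields a DateTime whose Kind is Unspecified. If the Kind matters, DateTime.ToBinary and DateTime.FromBinary also produce and consume a long and preserve it.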

Testing The Union Value Type

The above can be tested using the following code...

static void RawTest()
{
  Value value = new Value();
  byte[] data = new byte[128];

  int cursor = 0;
  
  // *********************************************************************************************
  // Writing values to the data byte array.
  // *********************************************************************************************
  Int16 v1 = Int16.MaxValue - 1067;
  Int32 v2 = Int32.MaxValue - 10067;
  float v3 = float.MaxValue - 1.07896f;

  // Write an Int16 value to the data byte array.
  // First set the Value's Int16 field, then read two Value bytes.
  value.Int16 = v1;
  data[cursor++] = value.B0;
  data[cursor++] = value.B1;

  // Write an Int32 value to the data byte array.
  // First set the Value's Int32 field, then read four Value bytes.
  value.Int32 = v2;
  data[cursor++] = value.B0;
  data[cursor++] = value.B1;
  data[cursor++] = value.B2;
  data[cursor++] = value.B3;

  // Write a float value to the data byte array.
  // First set the Value's Float field, then read four Value bytes.
  value.Float = v3;
  data[cursor++] = value.B0;
  data[cursor++] = value.B1;
  data[cursor++] = value.B2;
  data[cursor++] = value.B3;

  
  // *********************************************************************************************
  // Reading values from the data byte array.
  // *********************************************************************************************
  cursor = 0;
  
  // Read an Int16 from the data byte array.
  // Set Value's first two byte fields, then read the Value's Int16 field.
  value.B0 = data[cursor++];
  value.B1 = data[cursor++];
  var v1Result = value.Int16;

  // Read an Int32 from the data byte array.
  // Set Value's first four byte fields, then read the Value's Int32 field.
  value.B0 = data[cursor++];
  value.B1 = data[cursor++];
  value.B2 = data[cursor++];
  value.B3 = data[cursor++];
  var v2Result = value.Int32;

  // Read a float from the data byte array.
  // Set Value's first four byte fields, then read the Value's Float field.
  value.B0 = data[cursor++];
  value.B1 = data[cursor++];
  value.B2 = data[cursor++];
  value.B3 = data[cursor++];
  var v3Result = value.Float;

  Console.WriteLine(v1);
  Console.WriteLine(v1Result);
  Console.WriteLine(v2);
  Console.WriteLine(v2Result);
  Console.WriteLine(v3);
  Console.WriteLine(v3Result);
}

Writing Values To A Byte Array

Figure 2 illustrates a byte array filled with the values from the previous test code. As the diagram shows, values are packed back to back, each occupying exactly as many bytes as its type requires: a 16-bit integer takes 2 bytes, and a float or 32-bit integer takes 4 bytes. This approach works well for data blocks made up of byte arrays. Simply create a byte array, write the values and save it to disk. Conversely, load a block from disk into a byte array, then proceed to read the actual values.
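The save and load steps can be sketched as follows. This is an illustration under assumptions rather than a prescribed design: BlockFile is a hypothetical helper, the 128-byte block size simply mirrors the test code, and BitConverter stands in for the union struct to keep the example short.

```csharp
using System;
using System.IO;

// Hypothetical helper that persists one fixed-size block as a file.
public static class BlockFile
{
  public const int BlockSize = 128; // same size as the array in the test code

  // Write the whole block to disk in one call.
  public static void Save(string path, byte[] block)
  {
    File.WriteAllBytes(path, block);
  }

  // Load the whole block back into memory.
  public static byte[] Load(string path)
  {
    return File.ReadAllBytes(path);
  }

  // Writes two values into a block, saves it, reloads it and checks the values.
  public static bool RoundTrip()
  {
    var block = new byte[BlockSize];

    // Stand-in for the union writes: an Int16 at byte 0, an Int32 at byte 2.
    BitConverter.GetBytes((short)31700).CopyTo(block, 0);
    BitConverter.GetBytes(2147473580).CopyTo(block, 2);

    string path = Path.GetTempFileName();
    try
    {
      Save(path, block);
      byte[] loaded = Load(path);
      return loaded.Length == BlockSize
        && BitConverter.ToInt16(loaded, 0) == 31700
        && BitConverter.ToInt32(loaded, 2) == 2147473580;
    }
    finally
    {
      File.Delete(path); // clean up the temporary file
    }
  }
}
```

For a file holding many blocks, a FileStream with Seek to blockIndex * BlockSize would be the more likely choice; the whole-file calls here just keep the sketch small.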

Improving The Value Interface

The current method of reading and writing values is somewhat verbose. One must track the position in the byte array (the cursor) and advance it by the right number of bytes after each read or write. In addition, explicitly reading and writing the Value's byte fields (B0...BN) is tiresome and error-prone. The following should help alleviate these problems...

  • Specify the byte array and initial offset.
  • Specify a cursor that is relative to the specified initial offset.
  • Update the cursor accordingly following a read or write operation.
  • Allow the cursor's position to be set manually. The cursor will always be relative to the specified offset.

The following class, ByteBuffer, implements the above features.

using System;

namespace Blog
{
  /// <summary>
  /// Allows values to be written to and read from a byte array.
  /// Specify a byte buffer and initial offset in the constructor.
  /// Data will be read/written at this offset.
  /// The class uses a cursor to indicate current read/write position.
  /// The cursor is always offset by the offset specified in the constructor.
  /// </summary>
  public class ByteBuffer
  {
    private Value _value = new Value();
    private int _internalCursor;
    private readonly int _offset;
    private readonly byte[] _buffer;

    public int Cursor => _internalCursor - _offset;

    public ByteBuffer(byte[] buffer, int offset)
    {
      _buffer = buffer;
      _offset = offset;
      _internalCursor = offset;
    }

    public ByteBuffer SetCursor(int position)
    {
      _internalCursor = _offset + position;
      return this;
    }

    public ByteBuffer Int16(Int16 value)
    {
      _value.Int16 = value;
      _buffer[_internalCursor++] = _value.B0;
      _buffer[_internalCursor++] = _value.B1;
      return this;
    }

    public ByteBuffer Int16(out Int16 result)
    {
      _value.B0 = _buffer[_internalCursor++];
      _value.B1 = _buffer[_internalCursor++];
      result = _value.Int16;
      return this;
    }

    public ByteBuffer Int32(Int32 value)
    {
      _value.Int32 = value;
      _buffer[_internalCursor++] = _value.B0;
      _buffer[_internalCursor++] = _value.B1;
      _buffer[_internalCursor++] = _value.B2;
      _buffer[_internalCursor++] = _value.B3;
      return this;
    }

    public ByteBuffer Int32(out Int32 result)
    {
      _value.B0 = _buffer[_internalCursor++];
      _value.B1 = _buffer[_internalCursor++];
      _value.B2 = _buffer[_internalCursor++];
      _value.B3 = _buffer[_internalCursor++];

      result = _value.Int32;
      return this;
    }

    public ByteBuffer Float(float value)
    {
      _value.Float = value;
      _buffer[_internalCursor++] = _value.B0;
      _buffer[_internalCursor++] = _value.B1;
      _buffer[_internalCursor++] = _value.B2;
      _buffer[_internalCursor++] = _value.B3;
      return this;
    }

    public ByteBuffer Float(out float result)
    {
      _value.B0 = _buffer[_internalCursor++];
      _value.B1 = _buffer[_internalCursor++];
      _value.B2 = _buffer[_internalCursor++];
      _value.B3 = _buffer[_internalCursor++];
      result = _value.Float;
      return this;
    }
  }
}
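ByteBuffer as listed covers Int16, Int32 and float. The same pattern should extend to the eight-byte fields of the union, as in the hedged sketch below. Value8 and WideBuffer are hypothetical names chosen so the example stands alone, and the private Put8/Get8 helpers are an assumption of this sketch, not part of the original class.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, trimmed copy of the union struct with only the 8-byte fields.
[StructLayout(LayoutKind.Explicit)]
public struct Value8
{
  [FieldOffset(0)] public byte B0;
  [FieldOffset(1)] public byte B1;
  [FieldOffset(2)] public byte B2;
  [FieldOffset(3)] public byte B3;
  [FieldOffset(4)] public byte B4;
  [FieldOffset(5)] public byte B5;
  [FieldOffset(6)] public byte B6;
  [FieldOffset(7)] public byte B7;

  [FieldOffset(0)] public double Double;
  [FieldOffset(0)] public long Int64;
}

// Same cursor model as ByteBuffer, restricted to 8-byte values.
public class WideBuffer
{
  private Value8 _value = new Value8();
  private readonly byte[] _buffer;
  private readonly int _offset;
  private int _internalCursor;

  // Cursor position relative to the offset given in the constructor.
  public int Cursor => _internalCursor - _offset;

  public WideBuffer(byte[] buffer, int offset)
  {
    _buffer = buffer;
    _offset = offset;
    _internalCursor = offset;
  }

  public WideBuffer SetCursor(int position)
  {
    _internalCursor = _offset + position;
    return this;
  }

  // Copy all eight union bytes into the buffer, advancing the cursor.
  private void Put8()
  {
    _buffer[_internalCursor++] = _value.B0;
    _buffer[_internalCursor++] = _value.B1;
    _buffer[_internalCursor++] = _value.B2;
    _buffer[_internalCursor++] = _value.B3;
    _buffer[_internalCursor++] = _value.B4;
    _buffer[_internalCursor++] = _value.B5;
    _buffer[_internalCursor++] = _value.B6;
    _buffer[_internalCursor++] = _value.B7;
  }

  // Copy eight buffer bytes into the union, advancing the cursor.
  private void Get8()
  {
    _value.B0 = _buffer[_internalCursor++];
    _value.B1 = _buffer[_internalCursor++];
    _value.B2 = _buffer[_internalCursor++];
    _value.B3 = _buffer[_internalCursor++];
    _value.B4 = _buffer[_internalCursor++];
    _value.B5 = _buffer[_internalCursor++];
    _value.B6 = _buffer[_internalCursor++];
    _value.B7 = _buffer[_internalCursor++];
  }

  public WideBuffer Int64(long value) { _value.Int64 = value; Put8(); return this; }
  public WideBuffer Int64(out long result) { Get8(); result = _value.Int64; return this; }
  public WideBuffer Double(double value) { _value.Double = value; Put8(); return this; }
  public WideBuffer Double(out double result) { Get8(); result = _value.Double; return this; }
}
```

Writing one long and one double advances the cursor by 16, so a chained call ending in .Cursor reports 16 regardless of the initial offset.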

Using The ByteBuffer Class

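Using the ByteBuffer class is fairly straightforward. Simply call the constructor with a byte array and an offset. Cursor positions are relative to that offset, so as long as the cursor is not set to a negative position, no data is read or written before the offset.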

A sample test/driver program now follows...

static void Main(string[] args)
{
  byte[] data = new byte[128];
  ByteBuffer buffer = new ByteBuffer(data, 5);

  Int16 v1 = Int16.MaxValue - 1067;
  Int32 v2 = Int32.MaxValue - 10067;
  float v3 = float.MaxValue - 1.07896f;

  int bytesWritten = buffer
    .Int16(v1)
    .Int32(v2)
    .Float(v3)
    .Cursor;
  Console.WriteLine($"{bytesWritten} bytes written to buffer.");

  int bytesRead = buffer
    .SetCursor(0)
    .Int16(out var v1Read)
    .Int32(out var v2Read)
    .Float(out var v3Read)
    .Cursor;

  Console.WriteLine($"{bytesRead} bytes read from buffer.");
  Console.WriteLine($"v1Write:{v1} - v1Read:{v1Read}");
  Console.WriteLine($"v2Write:{v2} - v2Read:{v2Read}");
  Console.WriteLine($"v3Write:{v3} - v3Read:{v3Read}");
}

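The above code writes the two byte counts to the console, followed by each written value alongside the value read back.

Notice how the test program uses an offset of 5 when constructing the ByteBuffer. The byte count returned for both the write and the read is still ten, because the cursor is relative to the offset. The first five bytes of the array in this example are left untouched and remain zero.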
