JPEG SOUL
Challenge Information
- Challenge Name: JPEG SOUL
- Category: Steganography
- Author: tanish_fr
- Hint: "My soul will guide you, even through what seems insignificant"
- File: jpeg-soul.jpg (733x859 JPEG, 286KB)
Solution
Initial Reconnaissance
The image is a reproduction of a Raja Ravi Varma painting ("Woman Holding a Fruit"). Standard steg tools (binwalk, strings, EXIF analysis, steghide/stegseek, jsteg) yielded no results. There was no data appended after the JPEG EOI (End of Image) marker, no hidden comments, and no embedded files.
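The check for appended data can be sketched in a few lines. This is a minimal illustration, not the exact tooling used during the challenge; `trailing_bytes` is a hypothetical helper name, and the sample bytes are synthetic. It assumes the last `FF D9` in the file is the real EOI marker, which can be wrong if appended data itself contains that byte pair.

```python
def trailing_bytes(data: bytes) -> bytes:
    """Return whatever follows the last EOI marker (FF D9), or b'' if none is found."""
    end = data.rfind(b"\xff\xd9")
    return b"" if end == -1 else data[end + 2:]

# Synthetic stand-ins for a clean file and a file with appended data.
clean = b"\xff\xd8" + b"\x00" * 16 + b"\xff\xd9"
padded = clean + b"PK\x03\x04"

print(len(trailing_bytes(clean)), len(trailing_bytes(padded)))
```

For the challenge file, the equivalent call would be `trailing_bytes(open('jpeg-soul.jpg', 'rb').read())`, which per the reconnaissance above returns nothing.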
Key Observation: 4 Quantization Tables
Parsing the JPEG structure revealed an unusual feature: 4 quantization tables (QT0, QT1, QT2, QT3) were defined, but only 2 were actually used by the image components:
- Component 1 (Y/Luminance): uses QT0
- Component 2 (Cb): uses QT1
- Component 3 (Cr): uses QT1
QT2 and QT3 are defined but never referenced by any component. The quantization tables are the "soul" of the JPEG that the hint refers to: they determine the image's quality.
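One way to spot unreferenced tables is to compare the table IDs defined in DQT segments with the table selectors listed in the frame header (SOF). The sketch below is an assumption-laden illustration: `jpeg_segments`, `table_usage`, and `make_segment` are hypothetical names, the parser only walks header segments up to Start of Scan, and the sample header is built in memory (with assumed sampling factors) rather than read from the challenge file.

```python
import struct

def jpeg_segments(data):
    """Yield (marker, payload) for each header segment, stopping at Start of Scan."""
    i = 2  # skip SOI (FF D8)
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            break  # not positioned on a marker; stop rather than guess
        marker = data[i + 1]
        length = struct.unpack(">H", data[i + 2:i + 4])[0]
        yield marker, data[i + 4:i + 2 + length]
        if marker == 0xDA:  # SOS: entropy-coded data follows
            break
        i += 2 + length

def table_usage(data):
    """Return (defined, used) sets of quantization table IDs."""
    defined, used = set(), set()
    for marker, payload in jpeg_segments(data):
        if marker == 0xDB:  # DQT: one or more tables per segment
            j = 0
            while j < len(payload):
                precision, table_id = payload[j] >> 4, payload[j] & 0x0F
                defined.add(table_id)
                j += 1 + (128 if precision else 64)
        elif marker in (0xC0, 0xC1, 0xC2):  # SOF0/1/2 frame header
            for c in range(payload[5]):  # payload[5] = number of components
                _comp_id, _sampling, tq = payload[6 + 3 * c:9 + 3 * c]
                used.add(tq)
    return defined, used

def make_segment(marker, payload):
    """Build a marker segment; the length field counts itself plus the payload."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload

# Synthetic header mirroring the layout described above: 4 tables, 3 components.
dqt = b"".join(bytes([tid]) + bytes([1] * 64) for tid in range(4))
sof = (bytes([8]) + struct.pack(">HH", 859, 733)
       + bytes([3, 1, 0x22, 0, 2, 0x11, 1, 3, 0x11, 1]))
sample = b"\xff\xd8" + make_segment(0xDB, dqt) + make_segment(0xC0, sof)

defined, used = table_usage(sample)
print("unused tables:", sorted(defined - used))
```

Running `table_usage` on the bytes of the real file should report the same two unused IDs if the structure is as described.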
Decoding the Hint
The hint says: "My soul will guide you, even through what seems insignificant"
- Soul = The quantization tables (the "soul" of a JPEG that determines its quality/character)
- Insignificant = Least Significant Bit (LSB)
The Technique: LSB Steganography in Quantization Tables
The flag is hidden in the least significant bits (LSBs) of the quantization table values, read in natural (row-major) order (not zigzag order as stored in the file).
JPEG files store quantization tables in zigzag scan order. The tables must first be converted to natural 8x8 matrix order (row by row, left to right, top to bottom) using the standard JPEG zigzag mapping.
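The zigzag mapping does not have to be typed in by hand; it can be generated by walking the anti-diagonals of the 8x8 grid in alternating directions. The following is a minimal sketch (the function names are illustrative, not from any library) that builds the mapping and applies it to a table stored in zigzag order.

```python
def zigzag_to_natural_indices(n=8):
    """Entry k is the row-major index of the k-th element in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col, one anti-diagonal at a time
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        for r in (reversed(rows) if s % 2 == 0 else rows):
            order.append(r * n + (s - r))
    return order

def to_natural(zigzag_values):
    """Reorder 64 values from zigzag (file) order into row-major order."""
    order = zigzag_to_natural_indices()
    natural = [0] * 64
    for k, value in enumerate(zigzag_values):
        natural[order[k]] = value
    return natural

order = zigzag_to_natural_indices()
print(order[:10])
```

The first entries come out as 0, 1, 8, 16, 9, 2, matching the start of the standard table.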
Extraction Process
- Parse all 4 DQT (Define Quantization Table) markers from the JPEG file
- Extract the 64 values from each quantization table (stored in zigzag order)
- Convert each table from zigzag order to natural (row-major) 8x8 order
- Extract the LSB (bit 0) of each value from all 4 tables in sequence: QT0, QT1, QT2, QT3
- Group the resulting 256 bits into bytes (MSB first)
- Read the ASCII text
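The bit-packing step above can be illustrated on a toy input. This is a small sketch with a hypothetical helper name, `lsbs_to_bytes`; the eight sample values are made up so that their LSBs spell one ASCII character.

```python
def lsbs_to_bytes(values):
    """Take the LSB of each value and pack the bits into bytes, MSB first."""
    bits = [v & 1 for v in values]
    out = bytearray()
    # Ignore any leftover bits that do not fill a whole byte.
    for k in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[k:k + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

# LSBs of these values are 0,1,0,0,0,1,0,1 = 0x45 = 'E'
toy = [16, 11, 10, 16, 24, 41, 50, 61]
print(lsbs_to_bytes(toy))
```

Applied to a full 64-value table in natural order, the same function yields 8 bytes per table.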
Result
The 256 LSBs (32 bytes) decode as:
| Byte Range | Source | Hex | ASCII |
|---|---|---|---|
| 0-7 | QT0 | fff9036e2b850548 | (non-printable) |
| 8-15 | QT1 | e014000000000000 | (non-printable) |
| 16-23 | QT2 | 454841587b6a7033 | EHAX{jp3 (flag data in unused table) |
| 24-31 | QT3 | 675f73336372747d | g_s3crt} (flag data in unused table) |
The flag, EHAX{jp3g_s3crt}, is encoded entirely in the LSBs of QT2 and QT3 (the two unused quantization tables). QT0 and QT1 (the tables actually used for image compression) contain non-flag data in their LSBs.
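The hex strings recovered from QT2 and QT3 can be checked by hand with a short snippet. It uses only the values listed in the table; the `is_printable` helper is an illustrative name.

```python
rows = {
    "QT0": "fff9036e2b850548",
    "QT1": "e014000000000000",
    "QT2": "454841587b6a7033",
    "QT3": "675f73336372747d",
}

def is_printable(raw: bytes) -> bool:
    """True if every byte is a printable ASCII character (0x20 to 0x7E)."""
    return all(0x20 <= b <= 0x7E for b in raw)

printable = {name: is_printable(bytes.fromhex(h)) for name, h in rows.items()}
flag = bytes.fromhex(rows["QT2"] + rows["QT3"]).decode("ascii")
print(printable)
print(flag)
```

Only the QT2 and QT3 rows pass the printable check, which is consistent with the flag living in the unused tables.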
Why Standard Tools Failed
- jsteg/stegseek/steghide: These tools look for data hidden in DCT coefficients, not in quantization table values
- Pixel LSB extraction: JPEG compression is lossy, so the LSBs of decoded pixels do not survive encoding and carry no payload here
- binwalk: Only looks for embedded file signatures, not steganographic data
- The technique is custom and exploits an unusual JPEG structural feature (unused quantization tables)
Key Insight
The critical realization was that:
- Having 4 quantization tables when only 2 are needed is suspicious
- The unused tables (QT2 and QT3) closely resemble standard JPEG luminance and chrominance tables with subtle modifications
- The LSBs of these table values, read in natural (not zigzag) order, encode ASCII text
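To see why the modified tables still look almost standard, here is a sketch of how such an embedding might be done. It is an assumption, not the author's confirmed method: it starts from the standard luminance table of Annex K of the JPEG specification (listed in natural order) and overwrites each LSB with one payload bit. The names `embed` and `extract` are hypothetical.

```python
# Standard JPEG luminance quantization table (Annex K), natural order.
STD_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]

def embed(table, payload):
    """Overwrite the LSB of each table value with one payload bit (MSB first)."""
    bits = [(byte >> (7 - n)) & 1 for byte in payload for n in range(8)]
    return [(v & ~1) | b for v, b in zip(table, bits)]

def extract(table):
    """Read the LSBs back and pack them into bytes, MSB first."""
    bits = [v & 1 for v in table]
    return bytes(
        sum(bit << (7 - n) for n, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )

stego = embed(STD_LUMA, b"EHAX{jp3")  # 8 bytes fill all 64 LSBs
max_change = max(abs(a - b) for a, b in zip(STD_LUMA, stego))
print(max_change, extract(stego))
```

No value moves by more than 1, so a table altered this way is hard to tell apart from the standard one at a glance. The table would still need to be written back in zigzag order to match the file layout.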
Verification Script
```python
#!/usr/bin/env python3
import struct

# zigzag_order[k] is the natural (row-major) index of the k-th zigzag entry.
zigzag_order = [
    0, 1, 8, 16, 9, 2, 3, 10, 17, 24, 32, 25, 18, 11, 4, 5,
    12, 19, 26, 33, 40, 48, 41, 34, 27, 20, 13, 6, 7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36, 29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46, 53, 60, 61, 54, 47, 55, 62, 63
]

with open('jpeg-soul.jpg', 'rb') as f:
    data = f.read()

# Parse all QT tables (assumes 8-bit precision: 1 ID byte + 64 values each)
qt_tables = {}
i = 0
while i < len(data) - 1:
    if data[i] == 0xFF and data[i+1] == 0xDB:
        # The length field counts its own 2 bytes but not the marker.
        length = struct.unpack('>H', data[i+2:i+4])[0]
        seg = data[i+4:i+2+length]
        j = 0
        while j < len(seg):
            table_id = seg[j] & 0xF
            qt_tables[table_id] = list(seg[j+1:j+65])
            j += 65
        i += 2 + length
    elif data[i] == 0xFF and data[i+1] not in (0x00, 0xFF):
        if data[i+1] == 0xD8:
            i += 2  # SOI has no length field
        elif data[i+1] == 0xDA:
            break  # SOS: entropy-coded data follows, so stop scanning
        elif i + 3 < len(data):
            i += 2 + struct.unpack('>H', data[i+2:i+4])[0]
        else:
            i += 1
    else:
        i += 1

# Convert zigzag to natural order and extract LSBs
all_lsbs = []
for table_id in range(4):
    zz = qt_tables[table_id]
    natural = [0] * 64
    for k in range(64):
        natural[zigzag_order[k]] = zz[k]
    all_lsbs.extend([v & 1 for v in natural])

# Convert to bytes (MSB first)
result = bytearray()
for k in range(0, len(all_lsbs), 8):
    byte = 0
    for j in range(8):
        byte = (byte << 1) | all_lsbs[k + j]
    result.append(byte)

# Bytes 0-15 come from QT0/QT1; bytes 16-31 (QT2/QT3) hold the flag
print(result.hex())
print(result[16:].decode('ascii'))
```