BusyBox Patch: Fix for Large Count/Seek/Skip Values

In BusyBox, what is the difference between XATOU_SFX and XATOOFF_SFX, and between xatoull_range() and xatoul_range_sfx()?
I'll fetch the current BusyBox source so I can look at the actual definitions and then give you a concise answer.
Definitions (from libbb.h and xatonum.c)
  • XATOU_SFX
    Calls xatoull_range_sfx() with the maximum value limited to ULONG_MAX.
    Therefore it accepts an optional “suffix” (k, m, g, …) and the result is returned as an unsigned long.
  • XATOOFF_SFX
    Calls xatoull_range_sfx() with the maximum value limited to OFF_T_MAX.
    It accepts the same suffixes, but the result is returned as an off_t (usually a 64-bit signed type), so it is meant for file-size/offset values.
  • xatoull_range_sfx()
    Generic worker that parses the string, accepts an optional suffix, applies the scaling factor, and then verifies the final value lies inside the caller-supplied range.
    It returns the value as an unsigned long long.
  • xatoul_range_sfx()
    Same as xatoull_range_sfx() but returns an unsigned long; it simply converts the unsigned long long result to unsigned long and performs one extra overflow check.
So:
  • XATOU_SFX vs XATOOFF_SFX – same parser, different result type (unsigned long vs off_t) and different upper-limit checks (ULONG_MAX vs OFF_T_MAX).
  • xatoull_range() vs xatoul_range_sfx() – the first parses an unsigned long long within a caller-supplied range and, having no _sfx in its name, accepts no suffix; the second accepts both a suffix and a caller-supplied range but returns an unsigned long.
Attachment: patch-dd-fix-overflow-for-very-large-count-seek-skip-values.txt (2.42 KB)

Please explain what this patch does and what kind of problem it fixes.

What the patch changes

diff
-#define XATOU_SFX xatoull_sfx   (or xatoul_sfx)
+#define XATOOFF_SFX(a,s) xatoul_range_sfx((a),0,LLONG_MAX,(s))
and every place in dd.c that read
c
    count = XATOU_SFX(val, cwbkMG_suffixes);
    seek  = XATOU_SFX(val, cwbkMG_suffixes);
    skip  = XATOU_SFX(val, cwbkMG_suffixes);
is changed to use the new macro XATOOFF_SFX.

What problem is being fixed?

The bug

On 32-bit systems (or any system where sizeof(long) < 8),
XATOU_SFX expanded to xatoul_sfx() which
  • returns an unsigned long (32-bit).
  • silently truncates any value larger than 4 GiB (2³²-1).
Consequently, a command such as
sh
dd if=/dev/zero of=/tmp/big bs=1M count=10000
would copy only 10 485 760 000 mod 2³² = 1 895 825 408 bytes (about 1.77 GiB) instead of the requested 10000 MiB (about 9.77 GiB), because the byte total of count × bs no longer fits in a 32-bit variable.

The fix

XATOOFF_SFX is defined to call xatoul_range_sfx() with the range 0 … LLONG_MAX, which
  • uses an unsigned long long (64-bit) temporary even on 32-bit machines,
  • therefore accepts values beyond 4 GiB,
  • finally stores the result in an off_t/uoff_t, which on systems with large-file support is 64-bit as well.

Summary in one sentence

The patch replaces the 32-bit-limited XATOU_SFX macro with the 64-bit-capable XATOOFF_SFX so that dd options count, seek, and skip can accept values larger than 4 GiB without overflowing.
About this: > may only be 32-bit on some platforms. Which systems are involved? Modern architectures are usually all 64-bit. Machine emulators such as QEMU and VirtualBox also run as 64-bit, not 32-bit.
The patch is not about “modern 64-bit desktops”.
It is about BusyBox being built for 32-bit targets that are still very common in the embedded world, and that is exactly where the bug was noticed.
Typical systems that still run a 32-bit user-land with BusyBox include
  • All armv7 / armhf boards (Raspberry Pi ≤ 3 in 32-bit distros, OpenWrt routers, many set-top boxes, automotive infotainment, etc.)
  • i386 builds that are still shipped by several small-footprint distros (Alpine i686, some Debian derivatives, buildroot images for legacy PCs)
  • mips32, powerpc 32-bit, arc, xtensa, riscv32 tool-chains used in deeply embedded products
On those targets sizeof(long) is 4 bytes, so xatoul_sfx() would overflow at 4 GiB.
The patch makes BusyBox behave correctly on exactly those 32-bit systems, while on native 64-bit hosts it is a no-op (because long is already 64-bit).
About this case, x86 (32-bit) with glibc, ARM 32-bit (armhf, armv7) and MIPS 32-bit: is there any forensic tool for dumping or analysing data that was ever produced under this limitation?
No widely-used forensic tool is going to flag this as a “special limitation” because, from the file-system’s point of view, there is nothing to flag.
What actually happened on 32-bit BusyBox builds that still contained the bug is:
  1. The user typed
    dd if=/dev/zero of=big.img bs=1M count=10000
    (expecting 10000 MiB, about 9.77 GiB).
  2. BusyBox kept only the low 32 bits of the final number (in this example 10000 × 1 048 576 = 10 485 760 000 → 0x271000000 → low 32 bits = 0x71000000 = 1 895 825 408 ≈ 1.77 GiB).
  3. dd created a perfectly valid 1.77 GiB file and exited with status 0.
    The file itself is not malformed; it is just smaller than the user asked for.
Hence:
  • The resulting file (or block-device image) is indistinguishable from one that was intentionally written at that size.
  • No forensic tool will see an “error” or “corruption” signature, because none exists.
  • If you have such an image and you want to know whether the truncation came from the old BusyBox bug, the only way is to:
    – check the build date / version of the BusyBox binary that produced it, or
    – correlate the expected transfer size recorded in shell history, logs, or job scripts with the actual on-disk size.
So, in short, there is no practical forensic artefact or tool that can retroactively detect this specific truncation; you have to rely on external evidence (build metadata, logs, or re-running the same command with a fixed BusyBox and comparing).
About this: > So the detection is indirect via image inconsistency, not a signature. The human operator can ignore or overlook this warning. Correct?
Yes – and that is exactly the point.
  • No warning is emitted by the buggy BusyBox at all.
    dd silently writes the truncated amount and exits with status 0, so the operator sees nothing on the console or in the shell return code.
  • If the operator later checks the resulting file size with ls -l, du, stat, etc., the mismatch could be noticed, but that requires an explicit verification step.
    Many scripts and users simply assume the command succeeded at the requested size, so the error is easily overlooked.
Therefore, without an external size check or version audit of the BusyBox binary, a human operator can (and often will) ignore the problem without ever realising the data is incomplete.