Fastest x64 Delphi ASM Tabifier

In Delphi, would this be sufficient to tabify a string? StringReplace(Text, StringOfChar(' ', TabWidth), #9, [rfReplaceAll])
No — that one-liner is not safe.
It blindly replaces every run of exactly TabWidth blanks with a single tab, regardless of the column at which the run starts. A tab only advances to the next tab stop, so a run that does not start on a multiple of TabWidth ends up narrower than the blanks it replaced.
The result is that alignment is destroyed as soon as the line already contains something before the blanks.
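Example (TabWidth = 4)
'ab    Hello'   // "ab" + 4 spaces + "Hello"; "Hello" starts in column 6 (0-based)
becomes
'ab'#9'Hello'   // "ab" + 1 tab + "Hello"
and the text now starts in column 4 instead of column 6, because a tab at column 2 only advances to the next tab stop at column 4.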
What you really want is:
  1. Split the string into lines.
  2. For every line, walk from left to right and convert only those sequences of blanks that reach (or cross) the next TabWidth-column boundary into the minimum number of tabs and spaces that keeps the following text in the same column.
  3. Re-assemble the lines.
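The column arithmetic behind step 2 can be sketched as follows. This is only an illustration: NextTabStop and RunReachesTabStop are hypothetical helper names (they are not RTL routines), columns are counted from 0, and TabWidth is assumed to be greater than 0.

```delphi
// Hypothetical helpers, not part of the RTL. Columns are 0-based.

// Column of the first tab stop strictly to the right of Col.
function NextTabStop(Col, TabWidth: Integer): Integer;
begin
  Result := (Col div TabWidth + 1) * TabWidth;
end;

// True if a run of SpaceCnt blanks starting at Col reaches or crosses
// the next tab stop, so the part up to that stop can become one tab.
function RunReachesTabStop(Col, SpaceCnt, TabWidth: Integer): Boolean;
begin
  Result := Col + SpaceCnt >= NextTabStop(Col, TabWidth);
end;
```

With TabWidth = 4, a run of 4 blanks starting at column 2 reaches the stop at column 4, so it can be written as one tab (columns 2 to 3) followed by 2 spaces (columns 4 to 5). A single blank at column 2 does not reach the stop and has to stay a space.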
The RTL does not contain a ready-made "tabify" routine, so you have to write it yourself, or adapt one from an editor component library such as SynEdit (check its source for the exact routine names, which vary between versions).
A minimal correct tabifier could look like this:
function TabifyLine(const ALine: string; ATabWidth: Integer): string;
var
  I, Col, RunStart: Integer;
  SpaceCnt: Integer;
begin
  Result := '';
  Col := 1;          // 1-based column
  I := 1;
  while I <= Length(ALine) do
  begin
    if ALine[I] = ' ' then
    begin
      RunStart := I;
      SpaceCnt := 0;
      while (I <= Length(ALine)) and (ALine[I] = ' ') do
      begin
        Inc(SpaceCnt);
        if (Col + SpaceCnt - 1) mod ATabWidth = 0 then
        begin
          // we have reached a tab stop -> emit a tab
          Result := Result + #9;
          Col := Col + SpaceCnt;
          SpaceCnt := 0;
          RunStart := I + 1;
        end;
        Inc(I);
      end;
      // blanks that did not reach a tab stop stay as spaces
      if SpaceCnt > 0 then
      begin
        Result := Result + Copy(ALine, RunStart, SpaceCnt);
        Col := Col + SpaceCnt;
      end;
    end
    else
    begin
      Result := Result + ALine[I];
      if ALine[I] = #9 then
        Col := ((Col - 1) div ATabWidth + 1) * ATabWidth + 1
      else
        Inc(Col);
      Inc(I);
    end;
  end;
end;

function TabifyText(const AText: string; ATabWidth: Integer): string;
var
  Lines: TStringList;
  I: Integer;
begin
  Lines := TStringList.Create;
  try
    Lines.Text := AText;
    for I := 0 to Lines.Count - 1 do
      Lines[I] := TabifyLine(Lines[I], ATabWidth);
    Result := Lines.Text;   // note: TStringList normalizes line breaks and appends a trailing one
  finally
    Lines.Free;
  end;
end;
So the short answer:
StringReplace(Text, StringOfChar(' ',TabWidth), #9, [rfReplaceAll]) is not sufficient; use a real tabifier instead.
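One way to check any tabifier is to expand the tabs again and compare the result with the expanded original. The helper below is a minimal sketch for that purpose; ExpandTabs is a hypothetical name (not an RTL function), it handles a single line without line breaks, and it assumes ATabWidth > 0.

```delphi
// Hypothetical test helper, not part of the RTL.
// Expands every tab in one line (no line breaks) to spaces.
function ExpandTabs(const ALine: string; ATabWidth: Integer): string;
var
  I, Pad: Integer;
begin
  Result := '';
  for I := 1 to Length(ALine) do
    if ALine[I] = #9 then
    begin
      // Length(Result) is the current 0-based column;
      // pad with spaces up to the next tab stop.
      Pad := ATabWidth - (Length(Result) mod ATabWidth);
      Result := Result + StringOfChar(' ', Pad);
    end
    else
      Result := Result + ALine[I];
end;
```

A tabified line is acceptable when ExpandTabs(TabifyLine(S, W), W) equals ExpandTabs(S, W). For instance, with a tab width of 4, 'ab'#9'Hello' expands to 'ab  Hello' (two spaces), so it cannot stand in for a line that had four spaces after 'ab'.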
Improve the code to be as small and as fast as possible
One pass, no heap allocation, no Copy, no TStringList. Division is limited to one mod per blank and one div per tab character.
We walk the buffer once, overwrite it in place, and return the new length.
The caller can do UniqueString(S) followed by SetLength(S, TabifyInPlace(PChar(S), Length(S), TabWidth)) when it needs a string.
// Overwrites Buf[0..Len-1] in place and returns the new length (never larger than Len).
// Buf must point to writable memory and TabWidth must be > 0.
function TabifyInPlace(Buf: PChar; Len, TabWidth: Integer): Integer;
var
  rd, wr, col, run: Integer;
  c: Char;
begin
  col := 0;          // 0-based column
  wr  := 0;          // write pointer
  run := 0;          // consecutive spaces

  for rd := 0 to Len-1 do
  begin
    c := Buf[rd];
    if c = ' ' then
    begin
      Inc(run);
      if (col + run) mod TabWidth = 0 then        // reached tab stop
      begin
        Buf[wr] := #9;                            // emit 1 tab
        Inc(wr);
        col := col + run;
        run := 0;
      end;
    end
    else
    begin
      // flush pending spaces
      while run > 0 do
      begin
        Buf[wr] := ' ';
        Inc(wr);
        Dec(run);
        Inc(col);
      end;

      Buf[wr] := c;
      Inc(wr);
      if c = #9 then
        col := ((col div TabWidth) + 1) * TabWidth
      else if (c = #10) or (c = #13) then
        col := 0                                  // line break: restart the column count
      else
        Inc(col);
    end;
  end;

  // trailing spaces
  while run > 0 do
  begin
    Buf[wr] := ' ';
    Inc(wr);
    Dec(run);
  end;

  Result := wr;
end;
Usage:
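var
  s: string;
begin
  s := '    hello   world      !';   // runs of 4, 3 and 6 blanks
  UniqueString(s);                   // a string literal is not writable; get a private copy first
  SetLength(s, TabifyInPlace(PChar(s), Length(s), 4));
  // with a tab width of 4, s is now #9'hello'#9'world'#9'   !'
end;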
  • O(n) time, O(1) extra memory, no RTL calls inside the loop.
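The loop above still executes one mod per blank and one div per tab. If those are a concern, the column can be replaced by a countdown to the next tab stop. The following is a sketch under the same assumptions (writable buffer, TabWidth > 0); TabifyCountdown and TabifyString are hypothetical names, and no performance measurement backs this variant. It also resets the countdown at line breaks so that multi-line text can be passed in one call.

```delphi
// Sketch only: same in-place contract as TabifyInPlace, but the distance to
// the next tab stop is kept in a countdown instead of computed with mod/div.
function TabifyCountdown(Buf: PChar; Len, TabWidth: Integer): Integer;
var
  rd, wr, run, left: Integer;
  c: Char;
begin
  wr   := 0;          // write index, never overtakes rd
  run  := 0;          // pending blanks not yet written
  left := TabWidth;   // columns remaining until the next tab stop
  for rd := 0 to Len - 1 do
  begin
    c := Buf[rd];
    if c = ' ' then
    begin
      Inc(run);
      Dec(left);
      if left = 0 then
      begin
        Buf[wr] := #9;       // pending blanks end exactly on a tab stop
        Inc(wr);
        run  := 0;
        left := TabWidth;
      end;
    end
    else
    begin
      while run > 0 do       // blanks that did not reach a stop stay spaces
      begin
        Buf[wr] := ' ';
        Inc(wr);
        Dec(run);
      end;
      Buf[wr] := c;
      Inc(wr);
      if (c = #9) or (c = #10) or (c = #13) then
        left := TabWidth     // tab lands on a stop; line break restarts the column
      else
      begin
        Dec(left);
        if left = 0 then
          left := TabWidth;
      end;
    end;
  end;
  while run > 0 do           // trailing blanks
  begin
    Buf[wr] := ' ';
    Inc(wr);
    Dec(run);
  end;
  Result := wr;
end;

// Convenience wrapper: works on a private copy, so literals are safe to pass.
function TabifyString(const S: string; TabWidth: Integer): string;
begin
  Result := S;
  UniqueString(Result);
  SetLength(Result, TabifyCountdown(PChar(Result), Length(Result), TabWidth));
end;
```

Like the version above, this replaces even a single blank with a tab when that blank ends on a tab stop; whether that is desirable depends on the coding style being targeted.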