How the string dispatching is implemented in AsmBB

0
#
(ツ) admin
Last edited: 02.03.2023 by admin
1698
03.12.2022

In order to process the requests, AsmBB has to analyze the URL and to dispatch the control to the respective subroutines.

This process consists of string comparison with a set of defined commands and when the string is identified the respective processing procedure is called.

The naive solution used in the earlier versions of AsmBB is to simply compare the words from the URL with all strings from the commands set and to branch to the respective handler. Something like this:

mov     ecx, UserAvatar   ; the address of the procedure.
stdcall StrCompNoCase, eax, txt "!avatar" ; Compare strings case insensitive. 
jc      .exec_command2    ; CF=1 if string matches.

mov     ecx, UserLogin    ; if not, proceed with the next keyword.
stdcall StrCompNoCase, eax, txt "!login"
jc      .exec_command

mov     ecx, UserLogout
stdcall StrCompNoCase, eax, txt "!logout"
jc      .exec_command

mov     ecx, RegisterNewUser
stdcall StrCompNoCase, eax, txt "!register"
jc      .exec_command

mov     ecx, ChangePassword
stdcall StrCompNoCase, eax, txt "!changepassword"
jc      .exec_command

The string comparison in StrLib is pretty fast, but nevertheless I wanted to have it even faster and what is even more important these long chains of comparisons are not very readable.

The old, ugly comparison chains can be seen in the old versions of the code. For example commands.asm:429..513 or 860c9871f2ba9737 or e50ed7321983c4ad as a part of major core refactoring process.

Pearson's hash function has been chosen, because it is pretty fast and what is even more important it allows adjusting of the hash function in order to provide perfect hashing without collisions. The hash is one byte long, that makes the hash tables to need only 256 elements.

As long as the hash tables are static, a macro has been created in order to build them in compile time. Every hash table element contains 8-byte structure:

struct TPHashItem
  .pKeyname    dd ?  ; pointer to the string with the key name. if 0 the table cell is empty.
  .procCommand dd ?  ; pointer to the procedure that handles this key.
ends

The macro that builds the hash table is PList and is defined in the file render2.asm:35..83

The usage of this macro is pretty straightforward:

PList tablePostCommands, tpl_func,                 \
      "!markread",        MarkThreadRead,          \
      "!post",            PostUserMessage,         \
      "!edit",            EditUserMessage,         \
      "!edit_thread",     EditThreadAttr,          \
      "!del",             DeletePost,              \
      "!by_id",           PostByID,                \
      "!history",         ShowHistory,             \
      "!restore",         RestorePost,             \
      "!echoevents",      EchoRealTime,            \    
      "!search",          ShowSearchResults2
end if

The first two arguments are the name of the created hash table and pointer to the used Pearson's hash function. (The hash function is simply a 256 bytes array full of numbers from 0..255, shuffled in random order - like this). The next arguments are a pairs of string keyword and address of procedure to be call for the respective command.

N.B. Actually the hash function is not exactly "randomly shuffled". The numbers are shuffled in a way that provides unique hashes for all strings we want to put in the hash table. This way, the hash collision resolving can be ommited and the performance of the hash table lookup increased.

In order to understand better how this macro works, I will simplify it by removing the debugging code, error handling and the characters low case conversion:

macro PList table, Pfunc, [key, proc] {
common

  table dd 256 dup(0,0)  ; this is the hash table itself 256 elements 2 dwords each.
                         ; initially the table if full of zeroes.

forward   ; for every pair of key/handler

  local ..keynm, ..len, ..hash, ..char

; define the string with the keyword.

  ..keynm db key
  ..len = $ - ..keynm
         db 0

; compute the hash for the given string

  ..hash = 0
  repeat ..len
    load ..char byte from ..keynm + % - 1   ; a char from the string.
    ..hash = ..hash xor ..char              ; 
    load ..hash byte from Pfunc + ..hash    ; Pearson's hash function
  end repeat

; Fill the respective cell in the hash table.
; Every cell is of type TPHashItem

  store dword ..keynm at table + ..hash * 8   ; the pointer to the string.
  store dword proc at table + ..hash * 8 + 4  ; the pointer to the handling procedure.
}

Currently in AsmBB four such hash tables are defined: two in the commands.asm:

tablePreCommands for the commands that stay at the beginning of the URL,

tablePostCommands for the commands that stay at the end of the URL,

And two in the render2.asm:

tableRenderCmd for the template commands and

tableSpecial for the sub-commands.

With the defined tables, searching of a particular string inside is pretty simple and very fast:

The hash H of the unknown string is computed. Then if HashTable[H].pKeyname == 0 then the string is not located in the table. If the cell is not 0, then the string is compared with HashTable[H].pKeyname. If the strings match then the string is located in the table and the respective HashTable[H].procCommand has to be call. If the strings does not match, this is hash collision and the string is not located in the table.

This algorithm is implemented in the procedure SearchInHashTable.

This procedure is defined as:

proc SearchInHashTable, .hName, .pTable

Where .hName is a handle or pointer to the searched string and .pTable is a pointer to the respective hash table.

The procedure returns CF=0 if the string was not found in the table and CF=1 if the string was found. In the latter case, pointer to the respective handling procedure (the .procCommand field) is returned in ECX.

Example of usage can be seen in commands.asm:461..462 or commands.asm:515..516.

Attached files:
FileSizeUploadedDownloadsMD5 hash
tutorial.png28472 bytes02.03.2023397d558f71491b62ca7346f483f504d31f2

How the string dispatching is implemented in AsmBB

0
#