asmBB Forum : How the string dispatching is implemented in AsmBB
<img src="Wh ÎÿsèÿD://asmbb.org/images/title.svg" alt="Title img">
<h1>AsmBB is fast and lightweight web forum engine</h1>
tag:asmbb.org,2022-11-19:Thread152022-12-03T04:25:07Zadmin on How the string dispatching is implemented in AsmBBtag:asmbb.org,2022-11-19:Post222022-12-03T04:25:07Z
<p>In order to process the requests, <strong>AsmBB</strong> has to analyze the URL and to dispatch the control to the respective subroutines.
</p>
<p>This process consists of string comparison with a set of defined commands and when the string is identified the respective processing procedure is called.
</p>
<p>The naive solution used in the earlier versions of AsmBB is to simply compare the words from the URL with all strings from the commands set and to branch to the respective handler. Something like this:
</p>
<pre><code class="">mov ecx, UserAvatar ; the address of the procedure.
stdcall StrCompNoCase, eax, txt "!avatar" ; Compare strings case insensitive.
jc .exec_command2 ; CF=1 if string matches.
mov ecx, UserLogin ; if not, proceed with the next keyword.
stdcall StrCompNoCase, eax, txt "!login"
jc .exec_command
mov ecx, UserLogout
stdcall StrCompNoCase, eax, txt "!logout"
jc .exec_command
mov ecx, RegisterNewUser
stdcall StrCompNoCase, eax, txt "!register"
jc .exec_command
mov ecx, ChangePassword
stdcall StrCompNoCase, eax, txt "!changepassword"
jc .exec_command
</code></pre>
<p>The string comparison in StrLib is pretty fast, but nevertheless I wanted to have it even faster and what is even more important these long chains of comparisons are not very readable.
</p>
<p>The old, ugly comparison chains can be seen in the old versions of the code. For example <a href="https://asm32.info/fossil/asmbb/artifact?ln=429..513&name=860c9871f2ba9737">commands.asm:429..513</a> or <a href="https://asm32.info/fossil/asmbb/artifact?ln=576-611&name=860c9871f2ba9737">860c9871f2ba9737</a> or <a href="https://asm32.info/fossil/asmbb/info/e50ed7321983c4ad">e50ed7321983c4ad </a> as a part of major core refactoring process.
</p>
<p><a href="https://en.wikipedia.org/wiki/Pearson_hashing">Pearson's hash</a> function has been chosen, because it is pretty fast and what is even more important it allows adjusting of the hash function in order to provide <a href="https://en.wikipedia.org/wiki/Perfect_hash_function">perfect hashing</a> without collisions. The hash is one byte long, that makes the hash tables to need only 256 elements.
</p>
<p>As long as the hash tables are static, a macro has been created in order to build them in compile time. Every hash table element contains 8-byte structure:
</p>
<pre><code class="">struct TPHashItem
.pKeyname dd ? ; pointer to the string with the key name. if 0 the table cell is empty.
.procCommand dd ? ; pointer to the procedure that handles this key.
ends
</code></pre>
<p>The macro that builds the hash table is PList and is defined in the file <a href="https://asm32.info/fossil/asmbb/artifact?ln=35..83&name=05177c019c7e4d29">render2.asm:35..83</a>
</p>
<p>The usage of this macro is pretty straightforward:
</p>
<pre><code class="">PList tablePostCommands, tpl_func, \
"!markread", MarkThreadRead, \
"!post", PostUserMessage, \
"!edit", EditUserMessage, \
"!edit_thread", EditThreadAttr, \
"!del", DeletePost, \
"!by_id", PostByID, \
"!history", ShowHistory, \
"!restore", RestorePost, \
"!echoevents", EchoRealTime, \
"!search", ShowSearchResults2
end if
</code></pre>
<p>The first two arguments are the name of the created hash table and pointer to the used Pearson's hash function. (The hash function is simply a 256 bytes array full of numbers from 0..255, shuffled in random order - like <a href="https://asm32.info/fossil/asmbb/artifact?ln=6..22&name=05177c019c7e4d29">this</a>). The next arguments are a pairs of string keyword and address of procedure to be call for the respective command.
</p>
<p><strong>N.B.</strong> Actually the hash function is not exactly "randomly shuffled". The numbers are shuffled in a way that provides unique hashes for all strings we want to put in the hash table. This way, the hash collision resolving can be ommited and the performance of the hash table lookup increased.
</p>
<p>In order to understand better how this macro works, I will simplify it by removing the debugging code, error handling and the characters low case conversion:
</p>
<pre><code class="">macro PList table, Pfunc, [key, proc] {
common
table dd 256 dup(0,0) ; this is the hash table itself 256 elements 2 dwords each.
; initially the table if full of zeroes.
forward ; for every pair of key/handler
local ..keynm, ..len, ..hash, ..char
; define the string with the keyword.
..keynm db key
..len = $ - ..keynm
db 0
; compute the hash for the given string
..hash = 0
repeat ..len
load ..char byte from ..keynm + % - 1 ; a char from the string.
..hash = ..hash xor ..char ;
load ..hash byte from Pfunc + ..hash ; Pearson's hash function
end repeat
; Fill the respective cell in the hash table.
; Every cell is of type TPHashItem
store dword ..keynm at table + ..hash * 8 ; the pointer to the string.
store dword proc at table + ..hash * 8 + 4 ; the pointer to the handling procedure.
}
</code></pre>
<p>Currently in AsmBB four such hash tables are defined: two in the <a href="https://asm32.info/fossil/asmbb/file?name=source/commands.asm&ci=tip">commands.asm</a>:
</p>
<p><a href="https://asm32.info/fossil/asmbb/artifact?ln=62..88&name=6be967c6298c1a8b">tablePreCommands</a> for the commands that stay at the beginning of the URL,
</p>
<p><a href="https://asm32.info/fossil/asmbb/artifact?ln=90..103&name=6be967c6298c1a8b">tablePostCommands</a> for the commands that stay at the end of the URL,
</p>
<p>And two in the <a href="https://asm32.info/fossil/asmbb/file?name=source/render2.asm&ci=tip">render2.asm</a>:
</p>
<p><a href="https://asm32.info/fossil/asmbb/artifact?ln=111..119&name=05177c019c7e4d29">tableRenderCmd</a> for the template commands and
</p>
<p><a href="https://asm32.info/fossil/asmbb/artifact?ln=122..157&name=05177c019c7e4d29">tableSpecial</a> for the sub-commands.
</p>
<p>With the defined tables, searching of a particular string inside is pretty simple and very fast:
</p>
<p>The hash H of the unknown string is computed. Then if <code>HashTable[H].pKeyname == 0</code> then the string is not located in the table. If the cell is not 0, then the string is compared with <code>HashTable[H].pKeyname.</code> If the strings match then the string is located in the table and the respective <code>HashTable[H].procCommand</code> has to be call. If the strings does not match, this is hash collision and the string is not located in the table.
</p>
<p>This algorithm is implemented in the procedure <a href="https://asm32.info/fossil/asmbb/artifact?ln=1709..1763&name=6be967c6298c1a8b">SearchInHashTable</a>.
</p>
<p>This procedure is defined as:
</p>
<pre><code class="">proc SearchInHashTable, .hName, .pTable
</code></pre>
<p>Where <code>.hName</code> is a handle or pointer to the searched string and <code>.pTable</code> is a pointer to the respective hash table.
</p>
<p>The procedure returns <strong>CF=0</strong> if the string was not found in the table and <strong>CF=1</strong> if the string was found. In the latter case, pointer to the respective handling procedure (the .procCommand field) is returned in <strong>ECX</strong>.
</p>
<p>Example of usage can be seen in <a href="https://asm32.info/fossil/asmbb/artifact?ln=461..462&name=6be967c6298c1a8b">commands.asm:461..462</a> or <a href="https://asm32.info/fossil/asmbb/artifact?ln=515..516&name=6be967c6298c1a8b">commands.asm:515..516</a>. </p>
admin