The hostmask/netmask system.
Copyright (C) 2001 by Andrew Miller (A1kmm) <a1kmm@mware.virtualave.net>

Contents
========
* Section 1: Motivation
* Section 2: Underlying mechanism
 - 2.1: General overview.
 - 2.2: IPv4 netmasks.
 - 2.3: IPv6 netmasks.
 - 2.4: Hostmasks.
* Section 3: Exposed abstraction layer
 - 3.1: Parsing masks.
 - 3.2: Adding configuration items.
 - 3.3: Initialising and rehashing.
 - 3.4: Finding IP/host confs.
 - 3.5: Deleting entries.
 - 3.6: Reporting entries.

Section 1: Motivation
=====================
Looking up configuration entries by hostname or IP address (such as for
I-lines and K-lines) needs to be efficient. A hash-based algorithm like
the one used here performs well in the average case, which is the case
that matters most. A profiling comparison with the mtrie code, using data
from a real network, confirmed that this algorithm performs much better.

Section 2: Underlying mechanism
===============================
2.1: General overview
---------------------
In short, a hash table with linked lists for buckets is used to locate
the correct hostname/netmask entries. To support CIDR IPs and wildcard
masks, the entire key cannot be hashed, so a lookup has to rehash
progressively shorter parts of the key. How much of the key is hashed is
decided differently for hostmasks and for IPv4/IPv6 netmasks.

2.2: IPv4 netmasks
------------------
To add an IPv4 netmask to the hash, the mask is first converted to a
32-bit address and a count of significant bits. All unused bits are set
to 0. The mask may take any of these forms:
1.2.3.4      => 1.2.3.4   32
1.2.3.*      => 1.2.3.0   24
1.2          => 1.2.0.0   16
1.2.3.64/26  => 1.2.3.64  26
The number of whole bytes is then calculated, and only those bytes are
hashed (e.g. 1.2.3.64/26 and 1.2.3.0/24 hash the same).
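The conversion described above could be sketched roughly as follows. This
is an illustration only, not the ircd's parser: toy_parse_ipv4_mask(),
toy_hashed_bytes() and toy_hash_key() are hypothetical names, addresses are
held in host byte order, and error handling is minimal.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not the ircd's parser: turn an IPv4 mask into a
 * 32-bit address (host byte order) and a count of significant bits.
 * Returns 0 on success, -1 if the text is not one of the forms above. */
int toy_parse_ipv4_mask(const char *text, uint32_t *addr, int *bits)
{
    uint32_t value = 0;
    int octets = 0, wildcard = 0, explicit_bits = -1;
    const char *p = text;

    while (*p != '\0' && *p != '/') {
        if (octets == 4 || wildcard)
            return -1;          /* too many octets, or text after '*' */
        if (*p == '*') {
            wildcard = 1;
            p++;
        } else {
            char *end;
            unsigned long octet;
            if (*p < '0' || *p > '9')
                return -1;
            octet = strtoul(p, &end, 10);
            if (octet > 255)
                return -1;
            value |= (uint32_t)octet << (24 - 8 * octets);
            octets++;
            p = end;
        }
        if (*p == '.')
            p++;
    }
    if (octets == 0 && !wildcard)
        return -1;
    if (*p == '/') {
        char *end;
        unsigned long n;
        if (wildcard || p[1] < '0' || p[1] > '9')
            return -1;
        n = strtoul(p + 1, &end, 10);
        if (*end != '\0' || n > 32)
            return -1;
        explicit_bits = (int)n;
    }
    *bits = explicit_bits >= 0 ? explicit_bits : 8 * octets;
    /* Clear every bit beyond the prefix length. */
    *addr = *bits == 0 ? 0 : value & (0xFFFFFFFFu << (32 - *bits));
    return 0;
}

/* Number of leading whole bytes that take part in the hash. */
int toy_hashed_bytes(int bits)
{
    assert(bits >= 0 && bits <= 32);
    return bits / 8;
}

/* The part of the address that is hashed: only the whole bytes covered
 * by the prefix, with everything else zeroed. */
uint32_t toy_hash_key(uint32_t addr, int bits)
{
    int bytes = toy_hashed_bytes(bits);
    return bytes == 0 ? 0 : addr & (0xFFFFFFFFu << (32 - 8 * bytes));
}
```

In this sketch, 1.2.3.64/26 and 1.2.3.0/24 both reduce to the key 1.2.3.0,
so they land in the same bucket, and any mask shorter than 8 bits reduces
to the all-zero key.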
When a complete IPv4 address is looked up, the entire IP address is
hashed first and looked up in the table. Then the most significant three
bytes are hashed, followed by the most significant two, then the most
significant one, and finally the 'identity hash' bucket is searched (to
match masks like 192/7).
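The lookup order above can be illustrated with a small sketch that lists
the keys probed for one address. toy_ipv4_probe_keys() is a hypothetical
name, the address is assumed to be in host byte order, and the all-zero
key stands in for the 'identity hash' bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: list the keys probed for one full IPv4 address
 * (host byte order), from most to least specific. Both arrays need room
 * for 5 entries; the return value is the number of entries written. */
size_t toy_ipv4_probe_keys(uint32_t addr, uint32_t keys[5], int bytes_used[5])
{
    size_t n = 0;

    assert(keys != NULL && bytes_used != NULL);
    for (int bytes = 4; bytes >= 0; bytes--) {
        /* Keep the leading `bytes` bytes and zero the rest. */
        uint32_t mask = bytes == 0 ? 0 : 0xFFFFFFFFu << (32 - 8 * bytes);
        keys[n] = addr & mask;
        bytes_used[n] = bytes;
        n++;
    }
    return n;
}
```

For 192.168.10.5 this yields 192.168.10.5, 192.168.10.0, 192.168.0.0,
192.0.0.0 and finally the all-zero key. A mask such as 192/7 covers no
whole byte, which is why it can only be found by that last probe.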

2.3: IPv6 netmasks
------------------
As per IPv4 netmasks, except that instead of rehashing with a one-byte
granularity, a 16-bit (two-byte) granularity is used. The 16 rehashes
that byte granularity would need are considered too great a fixed cost
to be justified by a (possible) slight reduction in hash collisions.
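As a rough illustration of the trade-off, the sketch below builds a probe
key from the leading 16-bit groups of an address and counts the probes
needed at each granularity. Both function names are hypothetical, and the
count assumes one probe per prefix length plus one for the identity bucket.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy a 16-byte IPv6 address, keeping only the
 * first `groups` 16-bit groups and zeroing the rest. */
void toy_ipv6_probe_key(const unsigned char addr[16], int groups,
                        unsigned char key[16])
{
    assert(groups >= 0 && groups <= 8);
    memset(key, 0, 16);
    memcpy(key, addr, (size_t)groups * 2);
}

/* Probes needed for one full address at a given step size in bytes:
 * one per non-empty prefix, plus one for the identity bucket. */
int toy_ipv6_probe_count(int step_bytes)
{
    assert(step_bytes == 1 || step_bytes == 2);
    return 16 / step_bytes + 1;
}
```

Under these assumptions a two-byte step needs 9 probes per lookup, against
17 for a one-byte step.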

2.4: Hostmasks
--------------
When a hostmask is added to the hash, everything to the right of the next
dot after the last wildcard character in the string is hashed. If the
hostmask contains no wildcards, the entire string is hashed.
When searching for a hostmask match, the entire hostname is hashed,
followed by the part of the hostname after the first dot, then the part
after the second dot, and so on. Finally, the 'identity' hash bucket is
checked, to catch hostmasks like *test*.
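The two rules above can be sketched as a pair of small helpers. The names
are hypothetical, '*' and '?' are assumed to be the wildcard characters,
and an empty string stands in for the 'identity' bucket.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the suffix of `mask` that would be hashed
 * on insertion, or "" when the mask belongs in the identity bucket. */
const char *toy_mask_hash_part(const char *mask)
{
    const char *part = "";

    assert(mask != NULL);
    /* Scan from the right, remembering the text after the latest dot. */
    for (const char *p = mask + strlen(mask); p > mask; ) {
        p--;
        if (*p == '*' || *p == '?')
            return part;        /* hit the last wildcard */
        if (*p == '.')
            part = p + 1;
    }
    return mask;                /* no wildcard: hash the whole string */
}

/* Hypothetical helper: step from one lookup candidate to the next by
 * dropping the leading label. Returns NULL after the last candidate, at
 * which point the identity bucket is checked. */
const char *toy_next_candidate(const char *host)
{
    const char *dot;

    assert(host != NULL);
    dot = strchr(host, '.');
    return dot != NULL ? dot + 1 : NULL;
}
```

With these helpers, the mask *.example.com is stored under example.com,
and a lookup for irc.example.com tries irc.example.com, example.com and
com in turn, so the second probe reaches the right bucket. A mask such as
*test* has no dot after its last wildcard and ends up in the identity
bucket.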

Section 3: Exposed abstraction layer
====================================
Section 3.1: Parsing masks
--------------------------
Call parse_netmask() with the netmask, a pointer to an irc_inaddr
structure to be filled in, and a pointer to an integer where the number
of bits will be placed.
Always check the return value. HM_HOST means the mask is probably a
hostname mask, HM_IPV4 means it was an IPv4 address, and HM_IPV6 means it
was an IPv6 address. If parse_netmask() returns HM_HOST, no change is
made to the irc_inaddr structure or the number of bits.

Section 3.2: Adding configuration items
---------------------------------------
Call add_conf_by_address() with the hostname or IP mask, the username,
and the ConfItem* to associate with this mask.

Section 3.3: Initialising and rehashing
---------------------------------------
To initialise, call init_host_hash(). This only needs to be done once on
startup.
On rehash, call clear_out_address_conf() to wipe out the old, unwanted
confs and free those that have no remaining references.

Section 3.4: Finding IP/host confs
----------------------------------
Call find_address_conf() with the hostname, the username, the address,
and the address family.
To find a D-line, call find_dline() with the address and address family.

Section 3.5: Deleting entries
-----------------------------
Call delete_one_address_conf() with the hostname and the ConfItem*.

Section 3.6: Reporting entries
------------------------------
Call report_dlines(), report_exemptlines(), report_Klines() or
report_Ilines() with the client pointer to report to. Note that these
walk the hash, which is inefficient, but they are not called often enough
to justify the memory and maintenance clock cycles needed for a more
efficient data structure.