Commit | Line | Data |
---|---|---|
212380e3 | 1 | The hostmask/netmask system. |
2 | Copyright(C) 2001 by Andrew Miller(A1kmm)<a1kmm@mware.virtualave.net> | |
3 | $Id: hostmask.txt 6 2005-09-10 01:02:21Z nenolod $ | |
4 | ||
5 | Contents | |
6 | ======== | |
7 | * Section 1: Motivation | |
8 | * Section 2: Underlying mechanism | |
9 | - 2.1: General overview. | |
10 | - 2.2: IPv4 netmasks. | |
11 | - 2.3: IPv6 netmasks. | |
12 | - 2.4: Hostmasks. | |
13 | * Section 3: Exposed abstraction layer | |
14 | - 3.1: Parsing masks. | |
15 | - 3.2: Adding configuration items. | |
16 | - 3.3: Initialising and rehashing. | |
17 | - 3.4: Finding IP/host confs. | |
18 | - 3.5: Deleting entries. | |
19 | - 3.6: Reporting entries. | |
20 | ||
21 | Section 1: Motivation | |
22 | ===================== | |
23 | Looking up configured hostnames and IP addresses (such as for I-lines and | |
24 | K-lines) needs to be implemented efficiently. A hash-based algorithm like | |
25 | the one employed here performs well in the average case, which is the case | |
26 | we should be most concerned about. A profiling comparison with the mtrie | |
27 | code, using data from a real network, confirmed that this algorithm | |
28 | performs much better. | |
29 | ||
30 | Section 2: Underlying mechanism | |
31 | =============================== | |
32 | 2.1: General overview | |
33 | --------------------- | |
34 | In short, a hash table with linked lists for buckets is used to locate | |
35 | the correct hostname/netmask entries. To support CIDR IPs and wildcard | |
36 | masks, the entire key cannot be hashed, so a lookup has to rehash | |
37 | progressively shorter parts of the key. The means of deciding how much to | |
38 | hash differs between hostmasks and IPv4/6 netmasks. | |
39 | ||
40 | 2.2: IPv4 netmasks | |
41 | ------------------ | |
42 | In order to hash IPv4 netmasks for addition to the hash, the mask is first | |
43 | processed to a 32 bit address and a number of bits used. All unused bits | |
44 | are set to 0. The mask could be in the forms: | |
45 | 1.2.3.4 => 1.2.3.4 32 | |
46 | 1.2.3.* => 1.2.3.0 24 | |
47 | 1.2 => 1.2.0.0 16 | |
48 | 1.2.3.64/26 => 1.2.3.64 26 | |
49 | The number of whole bytes is then calculated, and only those bytes are | |
50 | hashed (e.g. 1.2.3.64/26 and 1.2.3.0/24 hash the same). | |
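
The four mask forms listed above can be normalised along the following
lines. This is only a sketch: normalise_ipv4_mask() is a hypothetical helper
written for this document, not the ircd's parse_netmask(), and its error
handling is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not the ircd's parse_netmask()): turns a textual
 * IPv4 mask into a 32 bit address plus a bit count, with all unused bits
 * set to 0. Returns 0 on success, -1 if the text is not a recognised form. */
int normalise_ipv4_mask(const char *text, uint32_t *addr, int *bits)
{
    assert(text != NULL && addr != NULL && bits != NULL);

    uint32_t value = 0;
    int octets = 0;
    const char *p = text;
    char *end;

    for (;;) {
        if (octets == 4)
            return -1;                      /* more than four components */
        if (*p == '*') {                    /* "1.2.3.*": wildcard ends the mask */
            if (p[1] != '\0')
                return -1;
            *bits = 8 * octets;
            break;
        }
        if (*p < '0' || *p > '9')
            return -1;
        unsigned long octet = strtoul(p, &end, 10);
        if (octet > 255)
            return -1;
        value |= (uint32_t)octet << (24 - 8 * octets);
        octets++;
        p = end;
        if (*p == '.') {                    /* another component follows */
            p++;
            continue;
        }
        if (*p == '/') {                    /* "1.2.3.64/26": explicit bit count */
            p++;
            if (*p < '0' || *p > '9')
                return -1;
            unsigned long n = strtoul(p, &end, 10);
            if (*end != '\0' || n > 32)
                return -1;
            *bits = (int)n;
            break;
        }
        if (*p == '\0') {                   /* "1.2" or "1.2.3.4" */
            *bits = 8 * octets;
            break;
        }
        return -1;
    }

    /* Clear every bit beyond the prefix length. */
    uint32_t mask = (*bits == 0) ? 0u : 0xFFFFFFFFu << (32 - *bits);
    *addr = value & mask;
    return 0;
}
```

With this helper, "1.2" yields the address 1.2.0.0 with 16 bits, and
"1.2.3.64/26" yields 1.2.3.64 with 26 bits, matching the list above.
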
51 | When a complete IPv4 address is given so that an IPv4 match can be found, | |
52 | the entire IP address is first hashed and looked up in the table. Then | |
53 | the most significant three bytes are hashed, followed by the most | |
54 | significant two and the most significant one, and finally the 'identity | |
55 | hash' bucket is searched (to match masks like 192/7). | |
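
The storage rule and the lookup order described in this section can be
sketched as follows. The hash function, the table size and all function
names here are assumptions made for illustration, not the ircd's own code.
The sketch treats the 'identity hash' bucket as the bucket obtained by
hashing zero bytes.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_SIZE 1024u   /* assumed table size, for illustration only */

/* Toy hash over the most significant `nbytes` bytes of addr. Hashing zero
 * bytes always gives bucket 0, which plays the 'identity hash' role here. */
unsigned hash_ipv4_prefix(uint32_t addr, int nbytes)
{
    assert(nbytes >= 0 && nbytes <= 4);
    unsigned h = 0;
    for (int i = 0; i < nbytes; i++)
        h = h * 31u + ((addr >> (24 - 8 * i)) & 0xFFu) + 1u;
    return h % TABLE_SIZE;
}

/* Bucket a netmask is stored in: only its whole bytes are hashed. */
unsigned bucket_for_mask(uint32_t addr, int bits)
{
    assert(bits >= 0 && bits <= 32);
    return hash_ipv4_prefix(addr, bits / 8);
}

/* Buckets probed for a complete address, in lookup order:
 * 4 bytes, 3 bytes, 2 bytes, 1 byte, then the identity bucket. */
void lookup_buckets(uint32_t addr, unsigned out[5])
{
    for (int nbytes = 4; nbytes >= 0; nbytes--)
        out[4 - nbytes] = hash_ipv4_prefix(addr, nbytes);
}
```

For the address 1.2.3.99, the second probe lands in the bucket that holds
both 1.2.3.0/24 and 1.2.3.64/26, and the last probe lands in the bucket that
holds a mask such as 192/7. Each entry in a probed bucket still has to be
compared against the address using its full bit count.
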
56 | ||
57 | 2.3: IPv6 netmasks | |
58 | ------------------ | |
59 | As per IPv4 netmasks, except that rehashing uses a 16 bit (two byte) | |
60 | granularity instead of a one byte granularity: the 16 rehashes needed per | |
61 | lookup at byte granularity are considered too great a fixed cost to be | |
62 | justified by a (possible) slight reduction in hash collisions. | |
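
The trade-off can be made concrete with a small sketch. All names and the
toy hash below are assumptions made for illustration, and the 'identity'
bucket is again assumed to be the result of hashing zero units.

```c
#include <assert.h>
#include <stdint.h>

/* Hash computations needed for one lookup: one per prefix length, from the
 * full address down to zero units (the 'identity' bucket). */
int probes_per_lookup(int addr_bits, int granularity_bits)
{
    assert(granularity_bits > 0 && addr_bits % granularity_bits == 0);
    return addr_bits / granularity_bits + 1;
}

/* Whole 16 bit groups hashed when an IPv6 mask of `bits` bits is stored. */
int ipv6_groups_hashed(int bits)
{
    assert(bits >= 0 && bits <= 128);
    return bits / 16;
}

/* Toy hash over the first `ngroups` 16 bit groups of a 16 byte address. */
unsigned hash_ipv6_prefix(const uint8_t addr[16], int ngroups)
{
    assert(ngroups >= 0 && ngroups <= 8);
    unsigned h = 0;
    for (int i = 0; i < 2 * ngroups; i++)
        h = h * 31u + addr[i] + 1u;
    return h % 1024u;
}
```

Under these assumptions, probes_per_lookup(128, 8) is 17 (the initial hash
plus 16 rehashes), while probes_per_lookup(128, 16) is 9. The cost is that a
/56 and a /48 sharing their first 48 bits land in the same bucket, since
both hash three groups.
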
63 | ||
64 | 2.4: Hostmasks | |
65 | -------------- | |
66 | On adding a hostmask to the hash, the part of the hostmask to the right of | |
67 | the first dot after the last wildcard character is hashed (e.g. for | |
68 | *.foo.example.com, foo.example.com is hashed). If the hostmask contains no | |
69 | wildcards, the entire string is hashed. | |
70 | On searching for a hostmask match, the entire hostname is hashed, followed | |
71 | by the part of the hostname after the first dot, followed by the part | |
72 | after the second dot, and so on. Finally, the 'identity' hash bucket is | |
73 | checked, to catch hostmasks like *test*. | |
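
The choice of what to hash can be sketched as below. Both functions are
hypothetical helpers written for this document. The sketch assumes that '*'
and '?' are the wildcard characters, and it represents the 'identity' bucket
by the empty string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Part of `mask` that is hashed when the mask is added: everything after
 * the first dot that follows the last wildcard, or the whole string if
 * there is no wildcard. "" means the mask goes in the 'identity' bucket. */
const char *hostmask_hash_part(const char *mask)
{
    assert(mask != NULL);
    const char *last_wild = NULL;
    for (const char *p = mask; *p != '\0'; p++)
        if (*p == '*' || *p == '?')
            last_wild = p;
    if (last_wild == NULL)
        return mask;                        /* no wildcards: hash everything */
    const char *dot = strchr(last_wild, '.');
    return (dot != NULL) ? dot + 1 : "";
}

/* The n-th key hashed when searching for `host`: n = 0 is the whole
 * hostname, n = 1 the part after the first dot, and so on. After the last
 * label comes "" (the 'identity' bucket), then NULL to end the search. */
const char *hostname_probe(const char *host, int n)
{
    assert(host != NULL && n >= 0);
    const char *p = host;
    for (int i = 0; i < n; i++) {
        if (*p == '\0')
            return NULL;                    /* identity bucket already tried */
        const char *dot = strchr(p, '.');
        p = (dot != NULL) ? dot + 1 : p + strlen(p);
    }
    return p;
}
```

A caller would loop with n = 0, 1, 2, ... until hostname_probe() returns
NULL, hashing each key and comparing the entries in that bucket against the
full hostname. For irc.foo.example.com the second key is foo.example.com,
which is exactly the part hashed when *.foo.example.com was added.
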
74 | ||
75 | Section 3: Exposed abstraction layer | |
76 | ==================================== | |
77 | Section 3.1: Parsing masks | |
78 | -------------------------- | |
79 | Call "parse_netmask()" with the netmask and a pointer to an irc_inaddr | |
80 | structure to be filled in, as well as a pointer to an integer where the | |
81 | number of bits will be placed. | |
82 | Always check the return value. If it returns HM_HOST, it means that the | |
83 | mask is probably a hostname mask. If it returns HM_IPV4, it means it was | |
84 | an IPv4 address. If it returns HM_IPV6, it means it was an IPv6 address. | |
85 | If parse_netmask returns HM_HOST, no change is made to the irc_inaddr | |
86 | structure or the number of bits. | |
87 | ||
88 | Section 3.2: Adding configuration items | |
89 | --------------------------------------- | |
90 | Call "add_conf_by_address" with the hostname or IP mask, the username, | |
91 | and the ConfItem* to associate with this mask. | |
92 | ||
93 | Section 3.3: Initialising and rehashing | |
94 | ---------------------------------------- | |
95 | To initialise, call init_host_hash(). This only needs to be done once on | |
96 | startup. | |
97 | On rehash, call clear_out_address_conf() to wipe out the old unwanted | |
98 | confs, freeing those that no longer have any references to them. | |
99 | ||
100 | Section 3.4: Finding IP/host confs | |
101 | ---------------------------------- | |
102 | Call find_address_conf() with the hostname, the username, the address, | |
103 | and the address family. | |
104 | To find a d-line, call find_dline() with the address and address family. | |
105 | ||
106 | Section 3.5: Deleting entries | |
107 | ----------------------------- | |
108 | Call delete_one_address_conf() with the hostname and the ConfItem*. | |
109 | ||
110 | Section 3.6: Reporting entries | |
111 | ------------------------------ | |
112 | Call report_dlines(), report_exemptlines(), report_Klines() or report_Ilines() | |
113 | with the client pointer to report to. Note these walk the hash, which is | |
114 | inefficient, but they are not called often enough to justify the memory and | |
115 | maintenance clock cycles for a more efficient data structure. |