]> jfr.im git - irc.git/blob - software/RELEASES/ircservices/achurch.org/services/lists/ircservices/2000/000875.html
rename -> *.git
[irc.git] / software / RELEASES / ircservices / achurch.org / services / lists / ircservices / 2000 / 000875.html
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
2 <HTML>
3 <HEAD>
4 <TITLE> [IRCServices] Closer looks at services databases?
5 </TITLE>
6 <LINK REL="Index" HREF="index.html" >
7 <LINK REL="made" HREF="mailto:ircservices%40ircservices.za.net?Subject=%5BIRCServices%5D%20Closer%20looks%20at%20services%20databases%3F&In-Reply-To=">
8 <META NAME="robots" CONTENT="index,nofollow">
9 <META http-equiv="Content-Type" content="text/html; charset=us-ascii">
10 <LINK REL="Previous" HREF="000867.html">
11 <LINK REL="Next" HREF="000884.html">
12 </HEAD>
13 <BODY BGCOLOR="#ffffff">
14 <H1>[IRCServices] Closer looks at services databases?</H1>
15 <B>Chris Knipe</B>
16 <A HREF="mailto:ircservices%40ircservices.za.net?Subject=%5BIRCServices%5D%20Closer%20looks%20at%20services%20databases%3F&In-Reply-To="
17 TITLE="[IRCServices] Closer looks at services databases?">cgknipe at mweb.co.za
18 </A><BR>
19 <I>Sat Oct 14 13:13:39 PDT 2000</I>
20 <P><UL>
21 <LI>Previous message: <A HREF="000867.html">[IRCServices] Ircd's and Services....
22 </A></li>
23 <LI>Next message: <A HREF="000884.html">[IRCServices] Closer looks at services databases?
24 </A></li>
25 <LI> <B>Messages sorted by:</B>
26 <a href="date.html#875">[ date ]</a>
27 <a href="thread.html#875">[ thread ]</a>
28 <a href="subject.html#875">[ subject ]</a>
29 <a href="author.html#875">[ author ]</a>
30 </LI>
31 </UL>
32 <HR>
33 <!--beginarticle-->
34 <PRE>Hi Andrew,
35
36 Just for interest sakes (I know we have talked about this *MUCH* in the
37 past), but just what exactly is ment by services &quot;reading once, and writing
38 many&quot;. I totally like do not understand that interpretation of what exactly
39 services is doing, and howcome this is the case in the scenario of a &quot;data
40 update&quot; or &quot;data retrieval&quot;.
41
42 If a nickname is registered, read not at all, write many
43 If a nick is changed, read once, write perhaps more than which is required?
44 If a memo is read, read ALLOT, write one bit?
45 etc.
46
47 Shouldn't there perhaps rather be looked into the ammount of information
48 services writes to the databases? I admit yes, I have not looked at the
49 code seeing that I don't have it at the moment, but from what I remember
50 back the last time I looked at it, yes, the database code was a mess, and
51 because of that, I don't understand it. Which brings me back to the
52 question of what exactly does services do with its database?
53
54 A quick example (I am talking under speculation here - as I said before, I
55 did not look at the code, and I wont bother looking at the code, because its
56 to compicated for a avarage programmer which I hope I can be seen as).
57
58 Say we have a nick registered in the database (moo). moo registers his
59 nick, and this time, the entire nickserv structure is copied into the
60 database for the nickname. However, now, user moo does a update. Lets say,
61 /msg nickserv set email <A HREF="http://www.ircservices.za.net/mailman/listinfo/ircservices">blah at blah.com</A>
62
63 My question is this: How exactly does services process this update?
64 Surely, if you use the nickname, like in this case moo as a reference, can
65 it be that severe on large networks if you just update a string in the
66 database? Sure, I can understand that if you delete the record entirely,
67 and regenerate it for the update, that it is a long process, but updates
68 should NOT be a complicated procedure.
69
70 In the past, I made (I think), quite a big effort on getting services to
71 support SQL based databases, and it is exactly for this reason. As far as
72 physical searches (locating of the user's registration details) through the
73 database goes, they are 1) accellerated, and 2) they are cached. This is
74 what makes SQL databases good for high-information stores (allot of
75 records).
76
77 Secondly, the records can also be automatically assigned registration ID.
78 Which, normally is ALLOT smaller than the actual nick (less bits in size,
79 the data travels faster through the wires). A simple scenario ( I realise I
80 am missing settings here, this is only as example), a database structure on
81 SQL may be:
82
83 NickID (int)
84 NickName (varchar)
85 EMailAddress (varchar)
86 SetKillImmed (bit) !!!!!!
87 SetKillQuick (bit) !!!!!!
88 SetKillWhatever (bit).
89
90 The basic idea behind this, is to use the smallest possible size of data in
91 the db. The bits, is either a 1 or a 0, and because the NickID is a
92 integer, it is also smaller than the NickName which is varchar.
93
94 Microsoft's SQL Server uses the following sizes for its tables: (This varies
95 from every type of Database, like MS-SQL, ProgreSQL, MySQL, DB1, DB2, DB3,
96 etc) These are the default field lengths for data to be stored as native
97 file storage type (nullable data is the same length as nonnull data, and
98 character data is always stored in character format).
99
100 Data type Default length (characters)
101 bit 1
102 binary Length defined for the column
103 varbinary Length defined for the column
104 image 0
105 datetime 8
106 smalldatetime 4
107 float 8
108 real 4
109 int 4
110 bigint 8
111 smallint 2
112 tinyint 1
113 money 8
114 smallmoney 4
115 decimal *See footnote
116 numeric *See footnote
117 uniqueidentifier 16
118 timestamp 8
119
120 * Numeric data types with fixed precision and scale.
121 Precision Storage bytes
122 1 - 9 5
123 10-19 9
124 20-28 13
125 29-38 17
126
127 The point at which I am getting, is that services is allocating allot of
128 bytes in its tables to chars which is not needed, and therefor, also needs
129 to update more than it needs to, aswell as send more information over the
130 IRC network than what is needed.
131
132 Secodly, by utilising a identifier (such as NickID set in my example), your
133 updates, sending of memos, and various other procedures related to a
134 nickname (for example), is ALLOT faster.
135
136 Say, I issue UPDATE table SET SetKillImmed='1' WHERE NickID='12' (The user
137 does a /msg nickserv set kill immed). This is 50 (If I counted correctly),
138 characters that services needs to send to the database. The error checking
139 is a simple matter of checking if return value is success or failure (7
140 characters), so in total, the entire update procedure, will send and receive
141 57 characters. Can this be said for the DBs currectly in use?
142
143 Furthermore, with regards to SQL's caching abilities.
144
145 I have a 166MMX, 80MB Ram, Running Windows NT 4.0 atm. In idle state, my PC
146 uses: CPU Load: 15%, Memory in Use (Pagefile): 138MB (Peaked at 150MB)
147
148 I also run MS SQL 2000, and my example is in a table I use to archive the
149 BugTraq Mailing list. The table looks as follows (to give you an idea of
150 the sizes):
151 ID (int) IDENTITY (1, 1) NOT NULL ,
152 MessageID (nvarchar(150)) NOT NULL ,
153 InReplyTo (nvarchar(150)) NULL ,
154 MsgDate (nvarchar(40)) NOT NULL ,
155 MsgFrom (ntext) NOT NULL ,
156 MsgSubject (ntext) NULL ,
157 MsgBody (ntext) NULL
158
159 For a indication of the specific field sizes:
160 int: (whole number) data from -2^31 (-2,147,483,648) through 2^31 - 1
161 (2,147,483,647).
162 nvarchar: Variable-length Unicode data with a maximum length of 4,000
163 characters.
164 ntext: Variable-length Unicode data with a maximum length of 2^30 - 1
165 (1,073,741,823) characters.
166
167 As you can realise from this, one record in the table, allocates quite allot
168 of data. BTW: the table currectly holds 1523 records. Now, let us do some
169 statistics:
170
171 First, a lookup without utilising the cached index on the table, the command
172 executed:
173 SELECT * FROM BugTraq WHERE ID='67'
174 Event Class: Duration: CPU: Reads: Writes:
175 SQL:StmtCompleted 140 10 0 0
176
177 Now, the same lookup BUT, I am using the cached index:
178 SELECT * FROM BugTraq WITH (INDEX(IX_BugTraq)) WHERE ID='67'
179 Event Class: Duration: CPU: Reads: Writes:
180 SQL:StmtCompleted 70 10 0 0
181
182 As you can see, the load stays the same, BUT, the duration (aka speed) of
183 the query droped by 50%, thus, giving better performance. Now, let us
184 update?
185 UPDATE BugTraq SET InReplyTo='something' WHERE ID='67'
186 Event Class Duration: CPU: Reads: Writes:
187 SQL:StmtCompleted 231 30 0 8
188
189 Note, writes 8, reads = 0!!
190
191 In other words, by using something like SQL, you will once more, SAVE on
192 performance and speed, because of the fact that the location of the record
193 in the database does NOT need to be located...
194
195 As a closing, I would just like to ask if someone DID in fact do detailed
196 analysis of what exactly is happening in the services database? As I've
197 shown you here, a SQL based database CAN and WILL increase the services
198 performance, aswell as to LOWER the actual ammount of information that is
199 needed to be transmitted to and from the services to the IRC Network. Can
200 the same results be given for the dbs services are currently using, and if
201 so, can I be proven wrong that a PROPERLY IMPLICATED SQL based Services
202 database, will NOT enhance IRC Services?
203
204 This read once / write many times in my books, is a pile of horse .... It
205 is of no relavence whether there is being read or written to the database.
206 The question is about WHAT is written / read, and WHERE in the database is
207 it being sent to, or read from, and if services actually KNOW where the
208 information is that needs to be updated.
209
210 Thus, my closing questions are:
211 1) Can someone provide me with the structures of all the dbs so that I can
212 properly port it to SQL.
213 2) Can there be added SQL support in a non-public version of Services for
214 speed analysis
215 3) How and where does services read / write / update information in its
216 database
217 4) How does Services keep record of where in its databases information is
218 stored.
219 5) Does anyone have specs (detailed) about the speed of the services
220 database at the current moment? (Note, I'm talking about accurate speed
221 analysis of the databases IN USE by services, accessing the information the
222 SAME way services does.) Surely, if you are looking at the database, you
223 CANT tell me SQL is not going to enhance performance if you DONT have
224 information backing up your accusations...
225
226 If for some reason you still believe I am wrong about SQL Andrew, let us
227 test this, and compare our results? I'll bet you R100 :P SQL is
228 multithread aswell, which adds ALLOT more speed into services by allowing it
229 to make more than one query at a time (for starters).
230
231 ---
232 Regards,
233 Chris Knipe
234 Cell: (083) 430-8151
235
236
237
238
239 ---------------------------------------------------------------
240 To unsubscribe, send email to <A HREF="http://www.ircservices.za.net/mailman/listinfo/ircservices">majordomo at ender.shadowfire.org</A>
241 with &quot;unsubscribe ircservices&quot; in the body, without the quotes.
242
243
244 </PRE>
245
246 <!--endarticle-->
247 <HR>
248 <P><UL>
249 <!--threads-->
250 <LI>Previous message: <A HREF="000867.html">[IRCServices] Ircd's and Services....
251 </A></li>
252 <LI>Next message: <A HREF="000884.html">[IRCServices] Closer looks at services databases?
253 </A></li>
254 <LI> <B>Messages sorted by:</B>
255 <a href="date.html#875">[ date ]</a>
256 <a href="thread.html#875">[ thread ]</a>
257 <a href="subject.html#875">[ subject ]</a>
258 <a href="author.html#875">[ author ]</a>
259 </LI>
260 </UL>
261
262 </body></html>