Aucf-cs.426
net.news
utcsrgv!utzoo!decvax!duke!ucf-cs!whm
Thu Feb 18 00:40:47 1982
Reducing Costs

Here are some thoughts I've got about ways to reduce the costs of a
Usenet connection.  The following suggestions are not based on solid
fact, but the hand-dialing of ucf-cs to Duke often finds me peeking at
/usr/spool/uucp to see if I've got any mail, so I think I've got a
general feel for the file sizes and transfer times being dealt with.

It seems to me that a great deal of time is being wasted by uucico in
looking for the files to transmit.  Duke!trt's directory
reorganization has sped things up somewhat, but the fundamental
concept of the X file and the D file seems to lie at the root of the
problem.

The first idea is to have a new file format, an E file (don't get
transfixed) for instance.  The E file would contain what is currently
in the X file and the D file in one package.  This might not be
suitable for some programs run by uuxqt, but for most sites I'd say
that 99.9 percent of their uux traffic is rmail and rnews, and I don't
really see a problem in training rmail and rnews to deal with E files.
(Actually, this might be transparent to rmail and rnews.)  An E file
would look like an X file and a D file stuck together, e.g.:

    U whm ucf-cs
    F D.dukeB3087
    I D.dukeB3087
    C rmail ......

    And the letter follows here.

My rough estimate is that for files transferred from Duke to UCF, a
four-second overhead is involved per X/D pair, so this change would
cut about 2 seconds from each pair.  Not much, but it adds up.

The real idea, though, is about news.  I'll be very surprised if
nobody has thought of this before, but I've never heard it discussed.
I analyzed our uucp logfiles over the last couple of months, and by
comparing bytes transferred with the real time for a session, I
determined that about 20-35% of the total session time is used not for
data transfer but for overhead on each end.  I arrived at these
figures by looking at the SYSLOG file.  I assume the time wasted would
be about .25*(20-35%) if the transfer were at 300 baud rather than at
1200, since the same fixed overhead would be spread over a transfer
taking four times as long.

Since Usenet is a store-and-forward network, I would imagine that for
most sites not in the "central ring", news queues up on the system(s)
they get it from, and when they are contacted (whether calling or
called) they might have 50-100 articles waiting.  Each of these
articles is represented by an X file and a D file, so an E file scheme
would save 2 seconds per article, or about two to four minutes per
session.

That cuts the overhead by half, but what about the other half?  If a
system only calls for news once or twice a day, the news builds up, so
why not send it all in one big file, or perhaps several files of
sufficient size to minimize file-transfer protocol overhead?
Basically, instead of queueing up a uux request each time rnews reads
an article to redistribute, rnews would append the article onto the
end of a file for each system that receives it.  (Each system has its
own file.)

The real problem is that of shipping out the "batched" file.  A very
naive solution would be to build a file for a system until it got
larger than X bytes, and then queue an rbnews (read batch news) uux
request for the system and file in question.  The problem with sending
a file only after it reaches a certain size is that a system could
call up and not get all the news that has come in, because the file
hasn't reached the highwater mark.
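To make the mechanics concrete, here's a rough sketch in C of what the
sending side might look like.  The rbnews name comes from above; the
batch directory, the highwater mark, and the count-line format are
just assumptions for illustration, and locking between concurrent
rnews runs is ignored.  A sketch of the idea, not a worked-out
implementation:

    /*
     * Hypothetical sending side: instead of queueing one uux request
     * per article, append each outbound article to a per-system
     * batch file and only queue the batch once it passes a highwater
     * mark.  BATCHDIR, BATCHSIZE, and the count-line format are all
     * made up for this sketch.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    #define BATCHDIR  "/usr/spool/news/batch"
    #define BATCHSIZE 50000L        /* highwater mark, in bytes */

    /*
     * Append one article to the batch file for the given system.
     * Each article is preceded by a count line giving its length in
     * bytes, so the receiver can split the batch without looking at
     * article contents.
     */
    int
    batch_article(const char *sys, const char *artfile)
    {
        char batch[256], cmd[512];
        FILE *in, *out;
        struct stat st;
        long len;
        int c;

        if (stat(artfile, &st) != 0)
            return -1;
        len = (long) st.st_size;

        snprintf(batch, sizeof batch, "%s/%s", BATCHDIR, sys);
        if ((out = fopen(batch, "a")) == NULL)
            return -1;
        if ((in = fopen(artfile, "r")) == NULL) {
            fclose(out);
            return -1;
        }

        fprintf(out, "#! rbnews %ld\n", len);  /* invented count line */
        while ((c = getc(in)) != EOF)
            putc(c, out);
        fclose(in);

        /*
         * Past the highwater mark: hand the whole batch to the
         * remote's rbnews via uux and start a fresh batch.
         */
        if (ftell(out) >= BATCHSIZE) {
            fclose(out);
            snprintf(cmd, sizeof cmd,
                "uux - %s!rbnews < %s && rm %s", sys, batch, batch);
            return system(cmd);
        }
        fclose(out);
        return 0;
    }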
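The receiving end would then be nearly trivial; a hypothetical rbnews
just splits the batch on those count lines and hands each article to
the ordinary rnews:

    /*
     * Hypothetical receiving side: read a batch on stdin, split it
     * on the count lines written above, and feed each article to
     * rnews through a pipe.
     */
    #include <stdio.h>

    int
    main(void)
    {
        char line[256];
        FILE *pf;
        long len;
        int c;

        while (fgets(line, sizeof line, stdin) != NULL) {
            if (sscanf(line, "#! rbnews %ld", &len) != 1) {
                fprintf(stderr, "rbnews: bad count line\n");
                return 1;
            }
            if ((pf = popen("rnews", "w")) == NULL)
                return 1;
            /* Copy exactly len bytes of article into rnews. */
            while (len-- > 0 && (c = getchar()) != EOF)
                putc(c, pf);
            pclose(pf);
        }
        return 0;
    }

The point of the count line is that the split doesn't depend on
anything in the article text itself, which is what I mean below by a
data-independent format.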
This scheme wouldn't present any problems for a system that is polled,
since a cron entry could queue the partially filled file for uux just
before the poll of the remote system is made.  The scheme seems fairly
easy to implement, wouldn't require E files (although those would
still help for mail), and would be well suited to the Usenet
philosophy.

I'm not really that familiar with news internals, but I would think
the modifications involved would be:

    - Add code to write news articles into a batch file rather than
      queueing a uux request.
    - Devise a data-independent format for the batch.
    - Modify rnews so that it can unroll the batch.
    - For polled systems, add cron entries and hooks in news to queue
      any partially filled files for uux.
    - For polling systems, determine a reasonable size for the batch
      file.  This could go in the .sys file entry for a system.
      (Batch file sizes should also be considered wrt the probability
      of a session being prematurely terminated.)

Unc!smb has suggested compaction of news articles; the compaction
program supposedly gets about 45% compaction on English text, so
working with batched, compacted files might be worth looking at as
well.

I would have liked to send firmer figures and more thought-out ideas,
but I've been thinking about this since the Netnews meeting at Usenix
and haven't found the time, so it seemed better to send something
half-baked now than to wait a month or two.

Comments?

Bill Mitchell
Univ. of Central Florida