|
ISSN 1320-0682 | |||
| Volume 02 | April 1995 | |||
Jim W. Davidson and Ian C. Goulter
Central Queensland University
Rockhampton, QLD 4702, Australia
Email: davidj@topaz.cqu.edu.au
This paper describes an adaptive learning system in which a rule-based computer program is developed to design the layout geometry of rectilinear pipe networks. The learning system consists of three components: the agitator, the rule formulator, and the production system. The agitator component of the learning system experiments with initial starting solutions proposed to the network layout problem by making small modifications to the layout solutions at random. When a random modification results in an improved solution, the rule formulator derives a rule that will perform the same modification if that same geometric pattern occurs in future. Rules developed in this manner are applied by the production system as the program continues experimenting with the problem. The performance of each rule, in terms of its ability to produce improvements to layout geometry, is monitored through subsequent iterations of the algorithm. Rules that are found to behave poorly are discarded through a process referred to as "forgetting".
The learning system is not a generic process. The system has been developed for a specific application and many of the procedures of the algorithm are derived directly from characteristics of the specific problem under consideration. Therefore, the possibility of applying the algorithm in its current form to other problems may be limited. The specific application for which the algorithm was developed is a simplified version of the looped water distribution network problem for flat terrain. The objective of this problem is to design the least cost pipe network that is a subset of a rectilinear grid, known as the Hanan grid, formed by extending lines in an east-west and north-south direction through a set of vertices that define the problem. This set of vertices, referred to as the basic vertices, consist of a single source node and several demand nodes. Figure 1(a) shows the basic vertices (shown as solid circles) and the Hanan grid formed by east-west and north-south lines. The grid derives its name from Hanan's work on rectilinear Steiner graphs [3]. (In the representation in Figures 1-3, links are shown separated by a small space so that individual links which constitute the Hanan grid can be clearly identified.)
Water distribution systems usually incorporate loops to ensure some degree of reliability in the event of pipe failure. In the simplified version of the problem considered in this paper, the probability of a link failing while another link that has already failed is being repaired, is assumed to be sufficiently remote that the possibility of more than one link failing at the same time may be neglected. Therefore, the only reliability constraint that is imposed on the design of the layout geometry is that none of the demand nodes can be isolated from the source if any single link fails. This constraint is referred to as the "single link failure constraint" elsewhere in this paper.
Determining the optimal size of components in the network is another aspect of the reliability problem. In actual practice, it is possible to vary the diameters of the pipes that constitute each link and thereby alter the cost per length of any link. Although several methods have been proposed for the optimal component sizing of looped distribution networks, under consideration of these cost and reliability objectives, for example, [1], [4], and [5], the issue is still subject to considerable debate and there is little agreement on either the most appropriate technique or the desired reliability objective. The debate on these issues is not resolved in this paper. Instead, for the purpose of demonstrating the adaptive learning procedure which is the core part of the work, the objective is the more simple one of minimising the total length of the network subject to the single link failure constraint discussed earlier. In this implementation of the algorithm, the cost of each link is implicitly assumed to be directly proportional to its length.
As stated in the introduction, the learning system requires initial starting solutions. However, randomly chosen subsets of the Hanan grid, that are feasible with respect to the single link failure constraint, are extremely rare even for small problems. A method for generating feasible starting solutions has been incorporated in the algorithm. The method assigns to each link in the complete Hanan grid a unique preference value derived at random. Links are then removed from this Hanan grid, in order of preference from the least preferred to most preferred, until a "threshold point", where any further removal of links results in an infeasible solution, is reached. The solution at the threshold point is the least length feasible solution for that randomly generated preference order. From this standpoint, the problem can be viewed as one of determining the appropriate preference order for links that will produce the optimal solution at the threshold point.
Figure 1: Hanan grid and preference and threshold solutions
Figure 1(b) is an example of a layout derived from the layout in Figure 1(a) as obtained using the "preference and threshold" process just described. As noted previously, each link in the Hanan grid has a preference value associated with it, and those links with the lowest preference value are removed first. As each link is removed, a test is performed to determine if the single link failure constraint has been violated. The layout shown in Figure 1(b) was formed by removing 23 links from the Hanan grid in the order dictated by the preference values which were obtained at random. To this point, the layout does not violate the single link failure constraint.
According to the particular preference order for this problem, the next link to be removed is the link shown as the dashed line in Figure 1(b). However, the removal of this link results in an infeasible solution, because if the link indicated by the arrow in Figure 1(b) fails, node isolation will occur if the link represented by the dashed line has been removed. Therefore, the layout in Figure 1(b) represents the threshold point for the preference order considered here. However, the layout can still be improved by removing any "redundant links", (that is, links which form branched or non-looped components of the layout). If such redundant links are removed from Figure 1(b), the layout shown in Figure 1(c) is produced.
Figure 2: Links are removed while new links are simultaneously added
If the preference order is changed by assigning new preference values to any of the links, the process of determining the threshold point will have to be repeated. Small changes in the preference order can produce changes in the layout when the new threshold point is determined. These layout changes occasionally result in improved solutions. The agitator uses a randomised search technique in which a starting solution is modified and replaced if the modification produces an improved solution. The modification process consists of selecting a small number of links at random and changing the position of those links in the preference order. A new threshold point is then determined and redundant links are removed to obtain a new layout. This process was performed on the layout in Figure 1(c) to obtain the layout in Figure 1(d). The process of generating a new solution through random modifications to an existing solution and replacing the existing solution with the new solution, if the new solution represents an improvement, is similar to a (1+1)-ES evolution strategy [6,7].
The changes produced by one iteration of the agitator component illustrated in Figure 1(c) and 1(d) can continue to produce further improvements beyond that shown in Figure 1(d). However, problems begin to occur in this process as the solutions improve. Initially, generated random solutions such as the solution in Figure 1(c), have a high degree of redundancy which can easily be eliminated without violation of the single link failure constraint. Random changes are, therefore, likely to produce improved solutions. However, as solutions improve, much of the redundancy is eliminated and improvements must often consist of a more complex set of changes in order to maintain feasibility. These complex changes are unlikely outcomes of the random alteration of the preference order. For example, the section of the network shown in Figure 2(a) can be improved to produce the section in Figure 2(b), but this improvement requires the elimination of one or more of the four links eliminated in Figure 2(a) while simultaneously adding both of the two links introduced in Figure 2(b). Only one of the four links needs to be eliminated because the remaining links will be removed by the elimination of redundant links as described previously. Improvements that require specific links to be introduced at the same time as others are eliminated are not impossible to produce with the proposed randomised search method used by the agitator, but these types of improvements are less probable than changes that involve only the removal of links.
The use of the agitator in repeated iterations has demonstrated that the randomised search method on which it is based is subject to diminishing returns with increasing effort. As the solutions improve, more iterations become necessary to produce improvements. Paradoxically, as the solutions improve, the changes required to produce further improvements often become obvious to someone observing the progress of the program. Previous work with rule-based systems has demonstrated that rules can be formulated to perform complex types of manipulations similar to those needed in the later stages of the improvement process [2]. The learning system was devised to both formulate and use rules for this purpose.
Rules are derived by comparing the geometry of a solution to the same solution after it has been improved by the agitator. The pattern formed - by the links that were altered and those links immediately adjacent to the altered links - is extracted. This pattern consists of information indicating whether a link was present or absent in the unaltered layout at the specific locations on the Hanan grid where changes take place. If the networks in Figure 1(c) and 1(d) are compared, the subset of the network that has been altered is shown in Figure 3(a). Figure 3(b) is formed when information regarding the links immediately adjacent to the links in Figure 3(a) is included in the pattern. (A solid line in Figure 3(b) indicates that the link was present at the specified location of the Hanan grid and a dashed line indicates that the link was not present.) Figure 3(c) is a similar representation for the same links in the Hanan grid after the layout has been altered. The patterns in Figures 3(b) and 3(c) represent a rule in which the pattern in Figure 3(b) is the antecedent, or "if", clause and the pattern in Figure 3(c) the consequent, or "then", clause.
Once the patterns in Figure 3(b) and 3(c) are recorded in the rule base, the rule can be used to alter a new layout solution. The rule is used by substituting the pattern of links in Figure 3(b) by the pattern in Figure 3(c). If a link is present at every position marked with a solid line in Figure 3(b) in the particular current layout solution under consideration, and no link is found in the layout solution at every position indicated with a dashed line, then the substitution can take place. During the comparison, the pattern in Figure 3(b) can be offset to any position in the Hanan grid. In other words, the rule is not only tested in the row and column positions in the Hanan grid where the pattern was first encountered, but it is to be offset to any row and column position where a match might occur. The length of the corresponding links in this comparison need not be the same in the layout solution as in the pattern in Figure 3(b), only the presence or absence of links is required to match for the rule to be applicable.
The third component of the learning system is the production system that applies the rules and resolves conflicts between rules. The control strategy of the production system is based on the irrevocable best-first approach described in [2], which has been modified such that each time a rule is used, the resulting solution is tested and the effectiveness of the rule is recorded. In this study, the rules are organised into four categories depending on their performance:
Acceptable rules are, according to their recorded history of use, used frequently, never produce increased total length and have never produced infeasible solutions.
The algorithm for the autonomous experimentation to generate rules is as follows:
The objective of the algorithm is ultimately to produce a rule-based program that is capable of transforming any starting solution to the optimal solution and not simply to find the optimal solution to a single given instance of a problem. It is not known at this time whether a finite set of rules can be derived that will accomplish this task and no obvious criterion for stopping the rule generation process suggests itself at this time.
In an improved version of the algorithm, the best solution to the problem under consideration that has been found to date is stored. The agitator is used in Step 3 to a maximum number of iterations. (The maximum number is arbitrary. Favourable results have been found with a maximum number of iterations set between 50 and 100.) If an improved solution is found before the maximum number of iterations is reached the algorithm proceeds as described previously. If, however, the maximum number of iterations is reached and no improved solution has been found, rules are derived as described in Step 4 by comparing the current solution and the stored best solution. In test trials with the improved algorithm, in cases where the optimal solution was known, the algorithm would find the optimal solution and store that solution. When difficult patterns in layout geometry occurred and the agitator failed to produce an improved solution after the maximum number of iterations, the algorithm was still able to derive new rules to these difficult patterns by comparing the current solution with the optimum.
The rule bases grow in size rapidly. Using a 486 33MHz computer, a learning session of one hour typically produces 1,000 rules. As the rule base improves in size, the ability of the production system to improve starting solutions increases, but the time required to obtain the improved solution also increases. The speed of the production system can be improved if, at periodic intervals, the algorithm is halted and all rules found to be behaving poorly are removed from the rule base. As mentioned previously, this process is referred to as "forgetting". In the learning system, as it is currently implemented, a portion of all categories of rules are forgotten based on their frequency of use, except for those rules defined as acceptable rules which are retained. Acceptable rules constitute roughly 10% of the rules learned in trials in which the starting rule base is empty.
Table 1: Comparison of rule base performance
total length (m)
Table 1 compares the results of two rule-based programs of similar size on the same set of starting solutions. Although the rule base used to improve the results in column 3 is smaller (359 rules) than the rule base used to obtain column 2 (430 rules), the smaller rule base was developed over a longer period using forgetting to eliminate rules with poor performance. Both rule bases used the first 10 solutions in column 1 as starting solutions for their respective learning algorithms. The smaller rule base that was developed using the forgetting algorithm clearly outperforms the larger rule base. The smaller rule base found the known optimal solution (194275.2 m) in 24 of the 30 trials as indicated by the asterisk in Table 1, while the larger rule base found the optimal solution in 8 of the 30 trials.
The rule-based learning system presented in this paper is based on two concepts. The first concept is that rules are derived from successful instances of randomised search on the assumption that modifications to solutions that were successful in the past are likely to be successful if repeated. The second concept is that the assumption that previous success will be repeated is a testable hypothesis. The performance of rules derived from past successes can be monitored and assessed. Rules that are not performing as well as expected can be rejected. The frequency of use of a rule, one of the performance criteria, may decrease if a more effective and conflicting rule is learned. A rule base of this type can be regarded as adaptive and self-organising in that it is able to forget obsolete knowledge as better knowledge is acquired.
One drawback of the method is that it is highly specific to the problem that has been considered, particularly in regard to the rule formulation process. Current research is directed at making the technique more universal.
Adaptive Learning in the Design of Pipe Network Layout Geometry
This document was generated using the LaTeX2HTML translator Version 95.1 (Fri Jan 20 1995) Copyright © 1993, 1994, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
l2h -dir test j_davids.tex.
The translation was initiated by Pam Milliken on Tue Sep 24 10:33:23 EST 1996