Neyman-Scott-based water distribution network modelling

Residential water demand is one of the most difficult parameters to determine when modelling drinking water distribution networks. It has been shown to be a stochastic process which can be characterised as a series of rectangular pulses having a set intensity, duration and frequency. Such parameters can be determined using stochastic models such as the Neyman-Scott rectangular pulse model (NSRPM). NSRPM is based on solving a non-linear optimisation problem that matches the theoretical moments of the (equiprobable) synthetic demand series to the observed moments obtained from field measurements, which statistically characterise the measured demand series. NSRPM has been applied to generating local residential demand. However, the model has not been validated for a real distribution network with residential demand aggregation, or compared to traditional methods; both are addressed here. This paper compares pressures and flow rates calculated from synthetic stochastic demand series generated with NSRPM to results obtained using traditional simulation methods based on the curve of hourly variation in demand, and to actual pressure and flow rate measurements. The Humaya sector of Culiacan, Sinaloa, Mexico, was used as the study area.


Introduction
Software programs have been developed recently for the detailed modelling of the hydraulic behaviour of drinking water distribution systems. Residential water demand is one of the hydraulic variables used in such models; it has been idealised as a variable that changes hourly according to a smooth hourly demand variation curve (HDVC). The HDVC is used in practically all known public domain and commercially available drinking water distribution network modelling software programs (e.g. EPANET, InfoWorks, ScadRED). However, this curve does not accurately reflect reality at residential service level. Residential demand is sporadic, characterised by sudden demand pulses, and tends to have a stochastic nature (Buchberger et al., 2003; Alvisi et al., 2003; Alcocer-Yamanaka, 2007), especially when considering time scales in the order of seconds. Models taking a stochastic approach have thus been developed recently to represent residential water demand, such as the Poisson rectangular pulse (PRP) model (Buchberger and Wu 1995; Buchberger et al., 2003) and the Neyman-Scott rectangular pulse model (NSRPM) (Alvisi et al., 2003; Alcocer-Yamanaka et al., 2008). To be applied, the PRP model requires directly registering the instantaneous water demand (with a one-second time interval), whilst the NSRPM considers temporal disaggregation of demand so that different registration time intervals can be used. These models were primarily applied in the hydrology field to generate synthetic series representing rainfall or storm events. The demand series so generated have statistical parameters, such as mean, variance, covariance and probability distribution, which are similar or identical to those of the observed demand series. NSRPM's main advantage is that it can work with different registration time intervals. Estimating parameters and generating synthetic demand series minimise the amount of information necessary to model residential water demand.

Application site
Deterministic and stochastic models were applied in this study to the drinking water network in the Humaya sector of the city of Culiacan, Sinaloa, because of the large amount of field data available (Alcocer et al. 2008a; Tzatchkov et al. 2005). The available data included pressure, flow rate and water quality measurements at the supply sources and at internal points in the water distribution network, as well as water levels in the regulation tanks. The area has two supply sources: one is a single well yielding an average flow rate of 51 L/s, and the other is a group of eight wells having a maximum capacity of 200 L/s. There are two regulation tanks, one with a capacity of 3,000 m³ (82.63 m above sea level) and the other with 2,000 m³ (80.00 m above sea level). The study area's population was estimated to be 85,483 in 2005, when the field measurements were made. This figure was based on the number of service connections (20,353) in each suburb and subdivision and the crowding index (4.20 inhabitants/service connection), according to information from the Culiacan Municipal Sewer and Drinking Water Authority. According to local water utility reports, physical leaks occur primarily at household connections and account for around 30% water loss.
Residential demand measurements were available for 69 homes, recorded at a one-minute time step over an average period of three days. Statistically, this number of homes equals the representative sample size for the 20,353 service connections, 95% of which are residential (95% confidence level, 5% margin of error).

Model used
The geometric data for the drinking water supply network, including the diameters of all the pipes (2 to 18 in), were entered into the EPANET programme. The stochastic model covered one week (168 hours). The results obtained using the deterministic and stochastic models were compared to field measurements. Figure 1 shows the location of the nodes and links analysed within the Humaya sector of Culiacan, Sinaloa; however, due to space and time constraints, this work only discusses a few nodes and links. An hourly variation curve is an idealised model of water demand. It was generated by the National Water Commission (Mexico) using measurements of water demand in residential and commercial areas in hydraulically isolated sectors of some distribution networks, called hydrometric districts or district metering areas (DMA) (Figure 2). It should be noted that the methodology used for constructing this curve included both users' water demand and leaks within the networks being analysed. The curve is smooth; however, when it was compared to continuous measurements of household demand, its smooth form was found not to represent the real situation.
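As a minimal sketch of how an hourly demand variation curve enters a deterministic model, the Python snippet below scales a node's base demand by an hourly multiplier that repeats every day. The 24 multipliers and the base demand are illustrative placeholders, not the National Water Commission curve of Figure 2, and the function name is hypothetical.

```python
# Hypothetical hourly multipliers; NOT the curve shown in Figure 2.
RAW_CURVE = [0.6, 0.6, 0.6, 0.6, 0.7, 0.9, 1.2, 1.4, 1.4, 1.3, 1.2, 1.2,
             1.3, 1.3, 1.2, 1.1, 1.0, 1.0, 1.1, 1.1, 1.0, 0.9, 0.8, 0.7]

# Normalise so the multipliers average to 1 and base demand is the daily mean.
_mean = sum(RAW_CURVE) / len(RAW_CURVE)
HDVC = [m / _mean for m in RAW_CURVE]


def hdvc_demand(base_demand_lps, hours):
    """Hourly demand (L/s) over `hours` hours, repeating the curve every 24 h."""
    return [base_demand_lps * HDVC[h % 24] for h in range(hours)]


# One week (168 hours) for a node with an assumed base demand of 2 L/s.
week = hdvc_demand(base_demand_lps=2.0, hours=168)
```

Because the same 24 values repeat each day, any quantity computed from this demand is strictly cyclic, which is the behaviour the stochastic model is meant to improve on.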

Stochastic approach
Recognising that demand is random has led some researchers (Buchberger and Wu 1995) to propose that demand follows a Poisson process as time elapses; Buchberger et al. (2003) verified this hypothesis. The process is non-homogeneous because demand varies considerably throughout the day. Each water demand event can be represented as a rectangular pulse whose height represents its intensity and whose width represents its duration. Demand simulation models have been developed recently which allow water demand series to be generated using certain stochastic criteria, e.g. the PRP model (Buchberger et al., 2003). These have been based on the following basic parameters: arrival rate λ (representing the mean frequency of individual pulses), mean pulse intensity μx, intensity variance Var(μx), mean pulse duration η and the variance of that duration Var(η). Although a PRP model is not limited to one-second time intervals, the parameters needed were obtained from demand measurements having a one-second time step. One-second measurements have the advantage of directly monitoring the real-time evolution of residential demand, but they require sophisticated measurement and data storage equipment, and analysing the data is computationally demanding (Buchberger et al., 2003). Techniques developed in recent years have been geared towards indirectly estimating the parameters λ, μx, Var(μx), η and Var(η) from demand data spanning longer intervals, especially when space and time disaggregation is required (Alcocer-Yamanaka et al., 2008a, 2008b; Alcocer-Yamanaka and Tzatchkov 2009; Guercio et al., 2001; Rodríguez-Iturbe et al., 1984, 1987). Parameter estimation was based on establishing an objective function expressing the relationship between the statistical moments of the observed data series and the model's theoretical moments. This objective function was minimised through non-linear programming techniques, yielding the desired parameters.
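The rectangular pulse idea can be illustrated with a short Python sketch that generates a demand series from a Poisson rectangular pulse process. It is a simplified illustration only: the arrival rate is held constant (a homogeneous process, valid at most within a short block of the day), and the exponential distributions for intensity and duration, the function name and the numerical values are all assumptions, not the distributions or values used by the cited authors.

```python
import random


def prp_series(arrival_rate, mean_intensity, mean_duration, n_steps, seed=0):
    """Volume drawn in each 1-s step by a homogeneous Poisson pulse process.

    arrival_rate: pulses per second; mean_intensity: L/s; mean_duration: s.
    Exponential intensities and durations are an assumption of this sketch.
    """
    rng = random.Random(seed)
    demand = [0.0] * n_steps
    t = rng.expovariate(arrival_rate)            # time of the first pulse (s)
    while t < n_steps:
        intensity = rng.expovariate(1.0 / mean_intensity)
        duration = rng.expovariate(1.0 / mean_duration)
        start, end = t, min(t + duration, n_steps)
        # Spread the rectangular pulse over every step it overlaps.
        for step in range(int(start), min(int(end) + 1, n_steps)):
            overlap = min(end, step + 1) - max(start, step)
            if overlap > 0:
                demand[step] += intensity * overlap
        t += rng.expovariate(arrival_rate)       # time of the next pulse
    return demand


# One hour at 1-s resolution: about one pulse per minute, 0.1 L/s, 40 s long.
series = prp_series(arrival_rate=1 / 60, mean_intensity=0.1,
                    mean_duration=40.0, n_steps=3600)
```

Overlapping pulses simply add, so simultaneous uses in the same household produce a higher combined demand in that second.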
NSRPM has been used to model water demand by Alvisi et al. (2003) and Alcocer-Yamanaka et al. (2008a). Each event is characterised by a certain number C of internal pulses, where C is a random number having mean μC. The second-order moments of the aggregated process Y_i^(h) are given by Entekhabi and Brass (1990) in terms of the following quantities: λ⁻¹ is the mean time between two events, β⁻¹ is the mean time between each individual pulse and the start of the event, η⁻¹ is the mean pulse duration, μx is the pulses' mean intensity and h is the aggregation/disaggregation interval analysed.
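The moment expressions themselves are not reproduced in this text. For orientation, the form commonly quoted in the rainfall modelling literature for a Neyman-Scott process with exponentially distributed pulse durations and pulse displacements (for example Rodríguez-Iturbe et al., 1987) is sketched below. It is an assumption that these coincide with the paper's own equations; E[X²] (second moment of pulse intensity), E[C² − C] (second factorial moment of the number of pulses per event) and the lag k are symbols introduced here.

```latex
\begin{align}
E\!\left[Y_i^{(h)}\right] &= \frac{\lambda\,\mu_C\,\mu_x\,h}{\eta} \\[4pt]
\operatorname{Var}\!\left[Y_i^{(h)}\right] &=
  \frac{\lambda}{\eta^{3}}\left(\eta h - 1 + e^{-\eta h}\right)
  \left[2\mu_C\,E[X^{2}] + \frac{E[C^{2}-C]\,\mu_x^{2}\,\beta^{2}}{\beta^{2}-\eta^{2}}\right]
  - \frac{\lambda\left(\beta h - 1 + e^{-\beta h}\right)E[C^{2}-C]\,\mu_x^{2}}
         {\beta\left(\beta^{2}-\eta^{2}\right)} \\[4pt]
\operatorname{Cov}\!\left[Y_i^{(h)},\,Y_{i+k}^{(h)}\right] &=
  \frac{\lambda}{\eta^{3}}\left(1 - e^{-\eta h}\right)^{2} e^{-\eta (k-1) h}
  \left[\mu_C\,E[X^{2}] + \frac{1}{2}\,\frac{E[C^{2}-C]\,\mu_x^{2}\,\beta^{2}}{\beta^{2}-\eta^{2}}\right]
  - \frac{\lambda\left(1 - e^{-\beta h}\right)^{2} e^{-\beta (k-1) h}\,E[C^{2}-C]\,\mu_x^{2}}
         {2\beta\left(\beta^{2}-\eta^{2}\right)}
\end{align}
```

The mean depends on λ and η only through their ratio, which is why the variance and covariance are needed to separate the individual parameters.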
Once the expressions of NSRPM have been defined, the objective function is formulated in terms of F1, F2 and F3, the mean, variance and covariance respectively. It compares the observed values of these moments with their theoretical values, which are functions of the parameter vector ξ = (λ, η, β). The observed moments were obtained directly from the registered field data in tabular and graphical form (Figure 3).
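Obtaining the observed moments from a demand record can be sketched as follows. The snippet aggregates a one-minute record to a chosen interval and computes the sample mean, variance and lag-1 covariance. The short record, the 1/n estimators and the function names are illustrative assumptions, not the field data or the exact estimators used in the study.

```python
def aggregate(series, h):
    """Sum consecutive blocks of h samples (a trailing partial block is dropped)."""
    n_blocks = len(series) // h
    return [sum(series[i * h:(i + 1) * h]) for i in range(n_blocks)]


def sample_moments(series, lag=1):
    """Return (mean, variance, lag-`lag` autocovariance) using 1/n estimators."""
    n = len(series)
    mean = sum(series) / n
    var = sum((y - mean) ** 2 for y in series) / n
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag)) / n
    return mean, var, cov


# Hypothetical one-minute demand record (L/min), for illustration only.
observed = [0, 0, 6, 6, 0, 0, 0, 9, 9, 9, 0, 0]

print(sample_moments(observed))                 # moments at h = 1 min
print(sample_moments(aggregate(observed, 3)))   # moments at h = 3 min
```

Computing the same moments at several aggregation intervals is what lets a model of this kind work with records taken at different registration time steps.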
The analysis interval had to be established when formulating NSRPM in order to implement the optimisation scheme (the time interval was one minute in this study). Next, the objective function was minimised through non-linear mathematical programming. This minimisation yielded values for each of the model's parameters (the decision variables in the optimisation).
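The moment-matching step can be sketched in Python as below. Several choices are assumptions made only for illustration: the objective is taken as a sum of squared relative errors between theoretical and observed moments (the paper's exact objective is not reproduced here), the theoretical moments use the standard Neyman-Scott expressions with exponential pulse intensity and a Poisson number of pulses, μC and μx are held fixed, and a simple derivative-free pattern search stands in for the non-linear programming method actually used.

```python
import math


def nsrp_moments(lam, eta, beta, mu_c, mu_x, h, k=1):
    """Theoretical mean, variance and lag-k covariance of the aggregated process."""
    ex2 = 2.0 * mu_x ** 2      # E[X^2] for exponential intensity (assumed)
    ec2c = mu_c ** 2           # E[C^2 - C] for a Poisson number of pulses (assumed)
    d = beta ** 2 - eta ** 2
    mean = lam * mu_c * mu_x * h / eta
    var = (lam / eta ** 3 * (eta * h - 1 + math.exp(-eta * h))
           * (2 * mu_c * ex2 + ec2c * mu_x ** 2 * beta ** 2 / d)
           - lam * (beta * h - 1 + math.exp(-beta * h))
           * ec2c * mu_x ** 2 / (beta * d))
    cov = (lam / eta ** 3 * (1 - math.exp(-eta * h)) ** 2
           * math.exp(-eta * (k - 1) * h)
           * (mu_c * ex2 + 0.5 * ec2c * mu_x ** 2 * beta ** 2 / d)
           - lam * (1 - math.exp(-beta * h)) ** 2
           * math.exp(-beta * (k - 1) * h)
           * 0.5 * ec2c * mu_x ** 2 / (beta * d))
    return mean, var, cov


def objective(params, observed, mu_c, mu_x, h):
    """Sum of squared relative errors between theoretical and observed moments."""
    lam, eta, beta = params
    # Keep the search inside a bounded box and away from the beta = eta singularity.
    if not all(1e-6 <= p <= 1e6 for p in params) or abs(beta - eta) < 1e-9:
        return float("inf")
    theoretical = nsrp_moments(lam, eta, beta, mu_c, mu_x, h)
    return sum((t / o - 1.0) ** 2 for t, o in zip(theoretical, observed))


def pattern_search(f, x0, step=0.5, tol=1e-6, max_iter=2000):
    """Derivative-free coordinate search in log-space (keeps parameters positive)."""
    x = [math.log(v) for v in x0]
    fx = f([math.exp(v) for v in x])
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                f_trial = f([math.exp(v) for v in trial])
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= 0.5        # refine the mesh when no neighbour is better
    return [math.exp(v) for v in x], fx


# Synthetic check: moments produced by known parameters (rates in 1/min).
true_params = (0.02, 0.5, 0.1)                 # lambda, eta, beta
target = nsrp_moments(*true_params, mu_c=3.0, mu_x=6.0, h=1.0)
start = (0.05, 1.0, 0.3)
fitted, residual = pattern_search(
    lambda p: objective(p, target, 3.0, 6.0, 1.0), start)
print(fitted, residual)
```

With only three moments and three unknowns the problem is just determined; in practice, moments at several aggregation intervals or lags are often added to make the fit more robust.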
To apply NSRPM to a drinking water distribution network, the stochastic patterns so generated had to be randomly assigned to the demand at each of the model's nodes (each node has a different number of houses). This resulted in introducing demand patterns consisting of 10,080 data points, one for each minute elapsed during one week. Assigning stochastic patterns must also consider the households' socioeconomic level; households were divided into three groups: lower, middle and upper socioeconomic levels. A separate set of stochastic patterns was generated for each group. The parameters needed to generate synthetic series were first determined for the 69 households in which the temporal variation of demand had been recorded; 50 synthetic series were subsequently generated for assembling and validation. Assembling meant generating 50 series and calculating the average values of their statistical moments. By comparing the observed moments with the assembled series' moments for each hourly block, it was determined whether the corresponding synthetic series should be accepted and used in the stochastic simulation model. Where the difference between the observed and assembled moments was large, the synthetic series were not considered valid for the analysed pattern and were discarded, and new series were generated. Where the difference between moments was close to zero, both the process and the generated synthetic series were considered valid.
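The assembling and acceptance check described above can be sketched as follows. The relative tolerance `rel_tol` is a hypothetical threshold introduced for illustration, since no numerical acceptance criterion is stated, and the function names are likewise assumptions.

```python
def moments(series):
    """Mean, variance and lag-1 covariance of one series (1/n estimators)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((y - mean) ** 2 for y in series) / n
    cov = sum((series[i] - mean) * (series[i + 1] - mean)
              for i in range(n - 1)) / n
    return mean, var, cov


def ensemble_moments(series_list):
    """Average each moment over all synthetic series (the 'assembled' moments)."""
    per_series = [moments(s) for s in series_list]
    count = len(per_series)
    return tuple(sum(p[j] for p in per_series) / count for j in range(3))


def accept(assembled, observed, rel_tol=0.1):
    """Accept the series if every assembled moment is within rel_tol of the observed one."""
    return all(abs(a - o) <= rel_tol * abs(o)
               for a, o in zip(assembled, observed))


# Two tiny hand-made series standing in for the 50 synthetic series.
assembled = ensemble_moments([[1, 2, 3, 4], [3, 2, 1, 0]])
print(assembled, accept(assembled, observed=(2.0, 1.25, 0.3125)))
```

If `accept` returns False, the series for that block would be discarded and regenerated with new random seeds, mirroring the procedure in the text.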

Generating synthetic series
The statistical parameters {λ, μx, μC, η, β} involved in the theoretical moments represented in equation 4 (objective function) were determined once the observed moments had been obtained from field data for the households where measurements were made. These parameters were then introduced into NSRPM.
The series were generated with the public domain model found in the Rainfall Data Modeling Portal (RDMP) (Mellor, 2007). As generating these series is a stochastic process, a certain number of simulations had to be performed within NSRPM, each using a different random number generation seed. The synthetic series obtained with NSRPM were compared to the series obtained in the field for verification purposes.
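The generation step can be illustrated with a self-contained Python sketch of a Neyman-Scott pulse generator run with different seeds. This is not the RDMP model: the Poisson number of pulses per event, the exponential intensities, the 180-minute block length and all parameter values are assumptions chosen only to show the mechanism.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's method for a Poisson variate (adequate for small means)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def nsrp_series(lam, beta, eta, mu_c, mu_x, n_minutes, seed):
    """One synthetic series: volume (L) drawn in each one-minute step.

    lam: event rate (1/min); beta: 1 / mean pulse displacement (1/min);
    eta: 1 / mean pulse duration (1/min); mu_c: mean pulses per event;
    mu_x: mean pulse intensity (L/min).
    """
    rng = random.Random(seed)
    demand = [0.0] * n_minutes
    t = rng.expovariate(lam)                       # first event origin
    while t < n_minutes:
        for _ in range(poisson(rng, mu_c)):        # pulses of this event
            start = t + rng.expovariate(beta)      # displacement from origin
            end = start + rng.expovariate(eta)     # pulse duration
            intensity = rng.expovariate(1.0 / mu_x)
            last = min(int(end), n_minutes - 1)
            for step in range(int(start), last + 1):
                overlap = min(end, step + 1) - max(start, step)
                if overlap > 0:
                    demand[step] += intensity * overlap
        t += rng.expovariate(lam)                  # next event origin
    return demand


# Fifty realisations of one block, each with its own seed.
ensemble = [nsrp_series(0.02, 0.1, 0.5, 3.0, 6.0, n_minutes=180, seed=s)
            for s in range(50)]
```

Fixing the seed makes each realisation reproducible, while varying it across the ensemble gives the equiprobable series needed for verification against the field record.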
The calculated parameters were used to generate 50 synthetic series with data every minute, for each of the eight hourly blocks and each of the 69 households. Thus, 27,600 synthetic series (50 × 8 × 69) were created, covering one week's worth of water demand and representing demand patterns for the 69 households analysed. The results obtained from the optimisation, and required by the Neyman-Scott model, have been presented by Hernández (2009). Each of the model's nodes had a number of assigned houses; each house was assigned a stochastic pattern, and each node a mean level of demand based on the number of houses at the node. The demand levels were assigned to the nodes according to the areas they cover. The assigned demands were obtained from the 69 previously generated simulated demand patterns. The synthetic patterns corresponded to household demand, and each simulated pattern corresponding to a particular household provided EPANET input. The 69 synthetic demand patterns were classified into three socioeconomic levels, and households within each socioeconomic level were assigned randomly selected demand patterns. Each pattern contained 10,080 values, representing demand with a one-minute time step over a seven-day duration.
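The random assignment of household patterns to a node can be sketched as below. The split of the 69 patterns into three equal groups, the constant placeholder patterns, the house counts and the function names are all hypothetical; only the pattern length (10,080 one-minute values) and the idea of drawing patterns within a socioeconomic level come from the text.

```python
import random

MINUTES_PER_WEEK = 7 * 24 * 60     # 10,080 one-minute values per pattern


def assign_patterns(houses_by_level, patterns_by_level, rng):
    """Pick one pattern id at random, within its level, for every house at a node."""
    chosen = []
    for level, n_houses in houses_by_level.items():
        chosen += [rng.choice(patterns_by_level[level]) for _ in range(n_houses)]
    return chosen


def node_demand(chosen_ids, pattern_data):
    """Sum the selected household patterns minute by minute."""
    total = [0.0] * MINUTES_PER_WEEK
    for pid in chosen_ids:
        for i, q in enumerate(pattern_data[pid]):
            total[i] += q
    return total


rng = random.Random(42)
# Hypothetical classification of the 69 pattern ids into three levels.
patterns_by_level = {"lower": list(range(0, 23)),
                     "middle": list(range(23, 46)),
                     "upper": list(range(46, 69))}
# Placeholder patterns (constant demand); real ones would be synthetic series.
pattern_data = {pid: [0.1 * (pid + 1)] * MINUTES_PER_WEEK for pid in range(69)}

node_houses = {"lower": 4, "middle": 2, "upper": 1}    # hypothetical node
chosen = assign_patterns(node_houses, patterns_by_level, rng)
demand = node_demand(chosen, pattern_data)
```

The resulting list of 10,080 values per node is the kind of time pattern that would then be supplied to the hydraulic model as node demand.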

Comparing results from the models to field measurements
Pressure and flow rate measurements were taken in the field at various nodes and pipes in the system where the deterministic and stochastic models were applied. Due to space constraints, a limited amount of data is presented in this paper. Figure 4 compares the measured pressures with those obtained with both models at node 165. The pressure variation in the deterministic model was strictly cyclic, as expected given that base demand was fixed and its pattern was introduced by the HDVC, with a maximum pressure head of 26.00 m and a minimum of 22.00 m. Pressure variation was abrupt and high in both the field measurements and the stochastic model, contrary to the smooth pressure variation predicted by the HDVC model; however, some values obtained by the stochastic model were much lower than the observed field values.
One possible explanation for the abnormally low pressures obtained by the stochastic model is that EPANET, being an extended period (quasi-dynamic) model, was insufficient to represent highly variable water demand, so that a truly dynamic model would have been needed. Such analysis was beyond the scope of this paper, however.
Figure 5 shows the flow rate in link 2957 (the supply to a zone in the analysed area), a 12-inch diameter pipe in which one-way flow rates were obtained. As with the pressure variation explained above, flow variation in this link was abrupt and high in both the field measurements and the stochastic model. Flow rate and pressure patterns were quite variable in this model. Sudden changes were caused by the random generation of demand patterns, which caused certain instants (in the order of minutes) to have high demand followed by near-zero demand in the next minute. Another advantage of the stochastic model was that it allowed leakage in the network to be estimated. The HDVC included physical losses and, when compared to direct measurements, it was possible to observe leakage when the HDVC and the mean flow rate were above the curve representing the real level of user water demand.

Conclusions
This paper has demonstrated the application of stochastic concepts to modelling residential water demand patterns. NSRPM was applied to a hydraulic simulation model of a drinking water distribution network, producing pressure and flow patterns that resembled the measured ones, and was compared to the traditional HDVC approach. The results of this work lay the foundations for a new, simple and practical tool for engineers and researchers designing and maintaining drinking water distribution systems. This model could be implemented by incorporating these methods into a module within commercial and public domain computer programmes such as EPANET.

Figure 1. Location of the network of nodes and links analysed here

Figure 2. Hourly demand variation curve for all of Mexico (Tzatchkov 2007)

Figure 3. Recorded water demand at a home, in tabular and graphical form

Figure 4. Comparing the pressures recorded at node 165 with the values from the models