Soil Dynamics and Earthquake Engineering, 1999, in press. 

COMMON PROBLEMS IN AUTOMATIC DIGITIZATION OF STRONG MOTION ACCELEROGRAMS 

M.D. Trifunac, V.W. Lee and M.I. Todorovska 
Civil Engineering Department, Univ. of Southern California, Los Angeles, CA 90089-2531 

Copyright:     M.D. Trifunac, V.W. Lee and M.I. Todorovska before publication and Elsevier Ltd. after publication.

 

Key Words: automatic digitization of accelerograms, accelerogram data processing, strong motion data, accelerogram image processing.


ABSTRACT

Common problems encountered in automatic digitization of strong motion accelerograms, recorded on film, are presented and discussed. These include synchronization of the time scale for the three components of motion, nonuniform film speed, trace following in case of scratches or trace crossings, distortions from high contrast preprocessing of the scanned image, and trace "rotation" resulting from rotated position of the scanned film record. Procedures for correcting or eliminating these problems are suggested. The image processing hardware has developed so much during the past 20 years, that at present it exceeds the technical requirements for processing strong motion accelerograms. The problems described in this paper result from lack of training of the operators and lack of quality control in the process, which still seems to be esoteric and highly specialized. This situation may have been caused by the low demand by the engineering profession for high quality and large volume of strong motion data.
 
 


 
 
LIST OF FIGURES
 
Figure 1 Evolution in the hardware capabilities for automatic digitization of accelerograms, cost reduction and operator time. 
Figure 2 First ten seconds of the 1994 Northridge earthquake accelerogram at Sylmar Converter Station-East, Valve Hall (ground floor). 
Figure 3  The 1994 Northridge earthquake accelerogram at Sylmar Converter Station-East (free-field): (a) first two seconds and (b) beginning of the vertical trace. 
Figure 4 Illustration of a bitmap image of a trace beginning for different threshold levels (200, 215 and 235), and the corresponding outcomes of automatic digitization. 
Figure 5  The vertical trace of the 1994 Northridge earthquake accelerogram at Sylmar Converter Station-Valve Group 1-6 (basement): (a) the beginning; (b) the beginning enlarged by macro-photography, showing gradual build up of optical density; (c) the beginning enlarged by a repeated Xerox process and the "old" digitized version of the record. The dashed line shows the "old" digitization.
Figure 6 Beginning of the accelerogram of the 20 March, 1994, Northridge aftershock at Sylmar Converter Station, Valve Group 7: (a) ground floor, and (b) free-field site. Two digitized versions, the "old" and the "new", are shown (by a dashed and a solid line), and the time delay between the two. 
Figure 7  Same as Figure 6 but for the accelerograms at Los Angeles Dam, West Abutment. The difference between the two digitized versions in choosing the beginning is significant. In the "old" version, the beginning seems to be chosen arbitrarily. 
Figure 8 Appearance of the scanned image of the record and the outcome of automatic trace following for different choices of threshold level. This example corresponds to an image scanned at 600 dpi resolution with 256 level gray scale. Notice the spurious peak in the digitized data due to noise in the scanned image for lower threshold level (190). For higher threshold levels (e.g., 230), the trace image becomes discontinuous. For clarity, the horizontal scale is stretched three times relative to the vertical. 
Figure 9 Beginning of the vertical trace of the 1994 Northridge earthquake accelerogram at Sylmar Converter Station-East (free-field), and two digitized versions of the record. The "old" version is delayed by 0.053 s, and has a spurious peak near 0.4 s.
Figure 10 An illustration of "pulses" in the digitized data due to scratches on the film and nonuniform optical density of a trace. 
Figure 11 An illustration of errors in digitized data near intersecting acceleration and fixed traces: the transverse trace of the 1994 Northridge record at Sylmar Converter Station-East, free-field, ~9.5 s after trigger. 
Figure 12 (a) The "old" and "new" digitized versions near the beginning of the vertical trace of the 1994 Northridge record at Sylmar Converter Station, Valve Hall Floor. The difference between the two versions is probably due to smoothing effects of the automatic trace following algorithm, more pronounced for lower threshold levels. (b) Illustration of differences in data digitized by the LeAuto software caused by different threshold levels, for a portion of the trace shown in part (a). The image was scanned with 600 dpi resolution and 256 level gray scale. 
Figure 13 (a) Segment of the 20 March aftershock record at Sylmar Converter Station-East, free-field, scanned from film. (b) An enlarged portion of the vertical trace after repeated Xeroxing, showing associated distortions. Also the "old" and "new" versions of digitized data are shown. The differences may be due to high contrast preprocessing of the scanned image in producing the "old" version. 
Figure 14 Illustration of a record imperfectly aligned with the scanner. There is an angle α between the coordinate system of the scanner (XSCAN-YSCAN) and the accelerogram coordinate system (xREC-yREC). The two fiducial marks can be used to evaluate α and correct for it. 
Figure 15 Differences in the "old" and "new" versions of a digitized record showing relative rotation by an angle α (see Figure 14). The "new" version was corrected for this angle with the help of the fiducial marks shown in Figure 14. 
 
 

1. INTRODUCTION

The hardware and software for automatic digitization of strong motion accelerograms benefited from the rapid growth of the image processing technology since the 1970's. Modern scanners were first used to scan large numbers of accelerograms in the late 1970's. The first system ran on a Data General NOVA computer and used an OPRONICS rotating drum scanner. Although the resolution of this scanner was 12.5 x 12.5 microns (2032 dpi), it was actually operated at a four times coarser resolution of 50 x 50 microns (508 dpi), which was found to be optimal for this type of application. The cost of hardware for this system in 1977 was $180,000 (Trifunac and Lee, 1979). About 10 years later, Personal Computers and inexpensive flat-bed scanners (at first, HP ScanJet II Plus, with 300 dpi optical resolution) appeared, and the automatic digitization software was modified to work on a PC (first on IBM PC AT; Lee and Trifunac, 1990). Today this system is running on a Pentium II PC with HP ScanJet 4c (600 dpi optical resolution). The cost of the most modern hardware system is under $5,000. Figure 1 illustrates the progress in hardware capabilities (scanner resolution, CPU speed, capacity of hard disk storage, typical scanning time, typical operator time, and cost of one system). In the hardware capabilities, there have been dramatic improvements in CPU speed and in hard disk capacity. Even more dramatic has been the reduction of cost of the hardware system and reduction of operator time since the 1970's.

Judging by the current trends in computer and information technologies, the possibilities for further improvements of the digitization system and for new developments are essentially unlimited. However, advanced hardware and software alone do not guarantee high quality of the digitized data. An experienced operator and rigorous quality control remain critical for high quality processed data. We have recently found many errors in commercially processed records of the 1994 Northridge earthquake and one aftershock, which we believe are due to operator inexperience, the false expectation that the software will perform all the tasks perfectly and automatically, and lack of quality control. Errors caused by lack of operator knowledge of the empirical procedures in the automatic trace following software, bad operator judgement and lack of quality control can be so severe that the quality of data from automatic digitizers may be inferior even to the old hand digitized accelerograms (Trifunac and Lee, 1973). Errors of such nature may be very common in digitized strong motion data worldwide, due to the large volume of recorded analogue data, and pressure to reduce the backlogs.

Empirical modeling of strong ground motion amplitudes and studies aiming to interpret the physical nature of strong motion depend on the availability of a large number of uniformly and accurately processed accelerograms. The amount of data for such analyses is growing, as more instruments are deployed and more earthquakes are recorded. For example, we had 186 uniformly processed three-component accelerograms in the mid 1970's, about 550 in the early 1980's, and about 2000 in the early 1990's. With such a large number of records, digitized and processed by many different organizations, it is becoming increasingly difficult to control the quality and uniformity of the data. At present, the digitized strong motion data worldwide is of nonuniform quality and often lacks complete supporting documentation. Recording strong motion is a long term process, changes in personnel responsible for network operation and data processing are inevitable, and it is often impossible to recover the original information. Unless this practice changes soon, it will become difficult to recover and preserve data it took so long to record, and to carry out large scale regression analyses and many new specialized studies.

Contrary to the popular perception, the bottleneck in data processing and dissemination is not in the digitization of analogue records, but in verification, quality control, inclusion in a database, and the lack of clearly defined priorities. Groups which gather, process and disseminate data are rarely involved in large scale data analyses, and so do not have first hand experience of what constitutes complete and necessary data. Thus the supporting information of processed accelerograms is often incomplete. For example, instrument characteristics are rarely supplied in adequate detail, so that more advanced instrument correction algorithms cannot be implemented (Novikova and Trifunac, 1991, 1992; Todorovska et al., 1995; Todorovska, 1998). Transducer characteristics are not calibrated periodically in field conditions; in many instances, nominal values are used, or 10 to 20 years old values supplied by the manufacturer (Todorovska et al., 1998). Cross-referencing data files with detailed information on the earthquake source and recording site characteristics is often not available or is incomplete.

The purpose of this paper is to point out some common problems encountered in automatic digitization of accelerograms, to describe their origin and how to avoid them, and to illustrate their appearance in digitized records. The intended audience includes both those who produce and those who use strong motion data: the former to recognize and avoid these problems, and the latter to be aware of these problems in analyzing the data. The presented examples come from digitization of real strong motion accelerograms, and are typical of errors we found in commercially digitized recordings of the 17 January, 1994, Northridge, California, earthquake (ML=6.4) and its 20 March, 1994, aftershock at the Van Norman complex of the Los Angeles Department of Water and Power (Lindvall-Richter-Benuska Associates, 1995). We redigitized these records for compatibility with records of the same earthquake in this area, which we digitized earlier. We will refer to the two versions of a record as "old" and "new." In this paper, we compare the two versions and try to explain the differences. For better understanding of the discussed problems, we describe briefly our system for automatic digitization and the principles around which the software was designed, and refer the readers to references which offer more detail. We could find no detailed description of the system used to process the "old" versions, in professional journals or in technical reports. So, in many instances we can only guess how the "old" version was digitized.


2. AUTOMATIC DIGITIZATION PROCEDURES

The main tasks in automatic digitization of accelerograms are: (1) converting the film image into a digital bitmap with a defined binary or gray-scale level for each pixel (corresponding to the optical density of the film record), (2) creating line segments out of the bitmap by automatic trace following (given a threshold gray level and minimum trace width in pixels, specified by the operator), (3) interactive editing of the set of line segments created automatically by the computer programs performing the previous task, and (4) trace concatenation, in case of long records and several separately scanned pages, and writing the trace coordinates (in the scanner coordinate system) and other information (e.g., scanning resolution and trace type) into a disk file, in a format recognizable by the software for further processing (involving instrument and baseline corrections, Trifunac 1971; 1972). These tasks are performed by four computer programs with respective generic names: Film, Trace, TV and Scribe (Trifunac and Lee, 1979). These programs have evolved with the evolution of hardware. Numerous major changes and improvements were added in the late 1980's (Lee and Trifunac, 1990) and again in 1995/96, but the sequence of tasks has not changed.

Our current software package, LeAuto, consists of computer programs LeFilm, LeTrace, LeTV and LeScribe. LeFilm is an interactive, menu driven program which drives the scanner to produce a gray-scale bitmap image of the film record (256 levels of gray is the default mode; the other two modes are binary black-and-white and gray-scale at 16 levels of gray). The operator interactively chooses a minimum size rectangular window containing all the necessary information to be saved for further processing, and an optimum threshold level and minimum trace width for the automatic trace following by the next program. The operator also defines the starting points for each trace and trace type. Program LeTrace performs the trace following and is entirely automatic. Program LeTV is entirely interactive and menu driven. The trace editing is the most time consuming part of the procedure. The program efficiency and the quality of the output depend on the operator training and experience. The operator intervention consists of deletion of line segments resulting from imperfections of the film record (scratches and dirt particles), joining consecutive segments of a particular trace, adding and deleting points in a line segment, and also completely redigitizing selected portions of traces at a threshold level different from the one used by LeTrace. Most of the modifications of this program have aimed to help the operator make decisions and ease the editing process as much as possible. Most of the examples discussed in the next section of this paper have been extracted from various operations performed by this program. The last program, LeScribe, can be executed in an automatic or interactive mode. The former is used for short, one page, film records. The latter is used for multiple page records and allows operator intervention and visual control of joining the trace segments from different pages.

Figure 2 shows an example of a film record (the first 10 s of an accelerogram record of the Northridge earthquake at the Van Norman Complex). It has three acceleration traces (L, V and T), one baseline (B) and two two-pulse-per-second (2PPS) traces (the segmented lines at the top and bottom). It is seen that the T-trace intersects at many places with the baseline. Significant operator intervention and use of a special automatic digitization software (employing dynamic optimization of threshold levels and monitoring the top or bottom edges of a trace) had to be employed to digitize this trace.


3. COMMON PROBLEMS ENCOUNTERED IN AUTOMATIC DIGITIZATION

3.1 Delays in Digitized Time Series for Different Components of Motion

The position of the first digitized point of an acceleration trace determines the origin of its time coordinates. Because of the common trigger mechanism, all three traces (L, V and T) start to be recorded simultaneously, but if the first point is not digitized properly, there will be a time delay in the digitized time series. Synchronization of the origin time and of the running time scales for all the recorded components in a film record is crucial for many applications, for example in inverse analyses of the earthquake source mechanism ("random" time delays in the digitized data affect the numerical stability of the inversion), applications that require linear combination of the recorded components of motion, such as computation of the radial and transverse components of motion (affect the accuracy of peak amplitudes in the rotated directions; Todorovska and Trifunac, 1997), in analyses of building response from wave propagation viewpoint, or in correction of accelerograms for cross-axis sensitivity and transducer misalignment (these corrections are meaningless unless the three components are synchronized; Wong and Trifunac, 1977; Todorovska 1998; Todorovska et al., 1995; 1998).

The difficulty in selecting the first point to be digitized is due to the fact that the traces are weak immediately after trigger (while the light bulb warms up; during ~0.1 s), and the position of the first digitized point changes with the choice of threshold level. This is illustrated in Figure 3 and Figure 4. Part (a) of Figure 3 shows the first 2 s of a film record and part (b) shows an enlargement of the beginning of the V trace. Figure 4 (an illustration of an image in LeTV) shows the position of the first digitized point and the bitmap (raw digitized data) for different choices of threshold levels of gray (between 200 and 240). It is seen that for lower threshold level, the trace width increases and the trace beginning "moves" to the left, while the opposite is true when the threshold level is increased. This situation is further complicated by the fact that the trace "darkness" on the film, translated to trace "thickness" after scanning, is in general different for each trace on the same film, and depends not only on time after trigger but also on the amplitude of recorded motions (larger amplitude results in "lighter", i.e. "thinner" trace for any given constant threshold level). The examples in Figures 3 and 4 are from a record of the 1994 Northridge earthquake; gray threshold levels of 180 to 240 were common in digitization of the Northridge records.
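The dependence of the trace beginning on the threshold level can be illustrated with a minimal sketch. It is not the procedure implemented in LeAuto; the function name `first_digitized_column` and the gray levels in `peak_gray` are hypothetical, chosen only to mimic a gradual build up of optical density after trigger. The sketch assumes that larger gray levels mean darker pixels, 600 dpi scanning and the nominal film speed of 1 cm/s.

```python
def first_digitized_column(peak_gray, threshold):
    """Index of the first pixel column whose darkest pixel reaches the threshold.

    Returns None if no column reaches the threshold.
    """
    for column_index, gray in enumerate(peak_gray):
        if gray >= threshold:
            return column_index
    return None


# Hypothetical darkest gray level in each successive pixel column after trigger
# (the trace darkens gradually while the light bulb warms up).
peak_gray = [150, 170, 185, 195, 205, 212, 220, 228, 236, 242]

# Assumed: 600 dpi scanning and nominal film speed of 1 cm/s.
SECONDS_PER_PIXEL = 2.54 / 600

for threshold in (200, 215, 235):
    start = first_digitized_column(peak_gray, threshold)
    print(threshold, start, round(start * SECONDS_PER_PIXEL, 4))
```

In this made-up example the apparent beginning of the trace moves by two pixels each time the threshold is raised, which is the kind of shift discussed above.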

Because of the above, choosing one common threshold level for the entire accelerogram (which was the default procedure in the old versions of program LeTrace) will result in variations of the origin time by at least several pixels. Other parameters specified for the trace following, e.g. minimum trace width (usually selected as 2 to 4 pixels for digitization with 600 dpi) will also contribute errors to the selection of the first point, and the overall uncertainty of the origin of the time axes can exceed 0.01 to 0.02 s (two to five pixels for 600 dpi digitization resolution). One way to reduce these errors is intervention by the operator who can choose manually the first point, but this is time consuming and is subjective. To increase the efficiency of this step and to eliminate operator subjectivity, since 1994 we have used a new special purpose algorithm which determines automatically the "best" starting point for each acceleration trace. This algorithm was tested on hundreds of accelerograms, and was found to be successful (error less than one pixel) in about 95 percent of cases. Difficult cases would still require operator intervention. An example of a circumstance that can complicate this task is when the gap between the end of the previous record and the onset of the current record is too short or the traces overlap, so that they appear as continuous on the scanned image. When such traces are displaced (i.e. the trace starts with a large amplitude), this problem is eliminated. An illustration of such "overlap" is shown in Figure 5. The dashed line in part (c) is an example of an inaccurately digitized trace (0.05 s was omitted in the beginning).
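The correspondence between pixels and seconds quoted above follows from the scanning resolution and the film speed. The short sketch below reproduces the arithmetic, assuming the nominal film speed of 1 cm/s; the function name `seconds_per_pixel` is introduced here for illustration only.

```python
CM_PER_INCH = 2.54


def seconds_per_pixel(dpi, film_speed_cm_per_s):
    """Time spanned by one pixel along the time axis of the film."""
    pixels_per_cm = dpi / CM_PER_INCH
    return 1.0 / (pixels_per_cm * film_speed_cm_per_s)


# Assumed: 600 dpi scanning and nominal film speed of 1 cm/s.
dt = seconds_per_pixel(dpi=600, film_speed_cm_per_s=1.0)
print(f"one pixel = {dt * 1000:.2f} ms")
for pixels in (2, 5):
    print(f"{pixels} pixels = {pixels * dt:.4f} s")
```

Under these assumptions one pixel spans about 4.2 ms, so two to five pixels correspond to roughly 0.01 to 0.02 s.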

Figure 6 shows an example of raw data images and inaccurately (arbitrarily) digitized trace beginnings (the dashed line, referred to as "old" digitization) and trace beginning digitized with the help of the new features of programs LeFilm, LeTV and LeTrace. In this example the magnitude of the error for the "old" digitization is difficult to interpret, because of the lack of apparent reasoning that guided the operator. For example, it is not clear why the "old" digitization of the L trace in Figure 6b starts early with a "ramp" of ~0.05 s, in a manner not related to the raw data. Parts (a) and (b) show respectively motions recorded at the ground floor of a structure and in the free-field (approximately 25 m away towards south-west), both recorded by the same multi-channel recorder (CR-1), on the same film and with common trigger time. Such records are invaluable for soil-structure interaction studies, and for analyses of differential motions between the two closely spaced points. Errors such as those seen in Figure 6 (in the origin times of different components) make the "old" digitization useless for detailed studies, and misleading for an unsuspecting user. "Conclusions" that could result from such data would be, e.g., that the high frequency strong motion amplitudes are not correlated at separation distance of 25 m, that the two sites have different soil properties (i.e. different wave velocities), that the wave train approached the two stations from a different direction (i.e. different phase velocities) and so on. Another example is shown in Figure 7 (Los Angeles dam, right abutment record of the Northridge earthquake). It is seen that, in the "old" digitization (dashed line), the operator started to digitize the L, V and T traces 0.135, 0.190 and 0.175 seconds after trigger.

Many such errors go undetected, because their amplitudes are small enough that visual comparison of raw data output from digitization with original film cannot detect the discrepancies, and because hardly any records happen to be digitized more than once, independently, allowing such comparisons. The errors illustrated in Figure 6a and Figure 6b and in particular the errors shown in Figure 7 show negligence and lack of quality control. The cumulative investment of time and of resources in the analysis of the recorded motions and of their effects on different structures is significant. In the light of these conditions the above described errors cannot be tolerated.

3.2 Trace Following

3.2.1 Nonuniform Optical Density

Ideally, the traces on the film should be thin, sharp and uniformly dark, resulting in a scanned trace image which is continuous. In reality, the optical density of the trace varies with the trace amplitude, and there are "random" variations resulting from imperfections of the instrument or handling of the film. The scanned trace image is, in general, lighter for larger trace amplitudes, but at the very peak it is darker. The reason for the former is that, for uniform film speed and light bulb intensity, the light beam moves faster and travels longer distances as it exposes the film, and for the latter is that the trace segments just below the peak partially overlap due to finite trace thickness, and the vertical trace velocity is zero at the peak, so that the light beam exposes the film longer. Typical imperfections include scratches or dust, poor focussing of the traces, fogged or dirty mirrors and lenses, variable darkness of the film background, etc., and result in "random" fluctuations of trace darkness.

The variations in darkness (optical density) of the film image result in variations of gray levels of the scanned pixels. The "width" of the scanned trace image depends on the chosen threshold level, and can be controlled by choosing this level. Pixels with smaller gray level than the threshold will be considered as "white" (background) in the case of dark traces on white or transparent background. Traces are "wider" for lower threshold level, and become discontinuous for sufficiently high threshold level. Figure 8 illustrates these variations. It shows a bitmap image of an acceleration trace for three threshold levels of gray, 190, 220 and 230 (600 dpi scanning resolution, 256 levels of gray). It is seen that, for threshold level 230, the trace bitmap image is narrow and discontinuous. For threshold level 190, near the acceleration peaks, the trace bitmap image is wide and continuous, and has rough boundaries (to the right of the negative peak). The rough boundaries are caused by random variations in the trace and background optical densities, and may lead to spurious peaks in the digitized data.
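The effect of the threshold level on the width and continuity of the trace image can be sketched as follows. The gray levels are hypothetical and the helper names `trace_width` and `is_continuous` are introduced here for illustration; the sketch only assumes, as stated above, that pixels with gray level below the threshold are treated as background.

```python
def trace_width(column, threshold):
    """Number of pixels in one vertical cross-section at or above the threshold."""
    return sum(1 for gray in column if gray >= threshold)


def is_continuous(columns, threshold):
    """True if every vertical cross-section keeps at least one trace pixel."""
    return all(trace_width(column, threshold) > 0 for column in columns)


# Hypothetical gray levels (larger = darker) for three successive
# vertical cross-sections; the middle one is a lighter part of the trace.
columns = [
    [120, 195, 240, 200, 130],
    [125, 190, 225, 195, 120],
    [118, 200, 235, 205, 125],
]

for threshold in (190, 220, 230):
    widths = [trace_width(column, threshold) for column in columns]
    print(threshold, widths, is_continuous(columns, threshold))
```

For these made-up values the trace is three pixels wide at threshold 190, one pixel wide at 220, and broken at 230, where the lighter cross-section drops out entirely.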

The amplitude of the digitized signal (y-coordinate) estimated by the automatic trace following algorithm is at the middle of the vertical cross-section of the bitmap trace image, and depends on the chosen threshold level. The signal is estimated more accurately from the bitmap if the traces are thinner but continuous. If the bitmap trace image is discontinuous, not enough measured information is included in the estimation, and if the threshold level is too low (thick trace) the signal is estimated from measurements which are too noisy. The "optimum" threshold level for the illustration in Figure 8 appears to be near 220. However, this level will not be the optimal one along the entire length of the record, because the trace darkness on the film, and consequently the thickness of the scanned trace image, depend on the signal amplitude. The operator adopts a threshold level for automatic trace following which is "optimal" for most parts of the signal, and in the finishing phase corrects the detectable errors due to too high or too low threshold levels.
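A minimal sketch of the midpoint estimate and of its sensitivity to the threshold level is given below. It is a simplification, not the LeTrace algorithm: the function `midpoint_estimate` is hypothetical, it takes the midpoint between the first and last pixel at or above the threshold, and the gray levels (including the one-sided blemish at row 2) are invented for illustration.

```python
def midpoint_estimate(column, threshold):
    """Middle row of the pixels at or above the threshold in one cross-section.

    Returns None when no pixel reaches the threshold (discontinuous trace).
    """
    rows = [row for row, gray in enumerate(column) if gray >= threshold]
    if not rows:
        return None
    return (rows[0] + rows[-1]) / 2.0


# Hypothetical cross-section (larger gray level = darker): the trace occupies
# rows 4 to 6, and a faint one-sided blemish sits at row 2.
column = [110, 120, 195, 130, 225, 240, 228, 125, 115]

for threshold in (190, 220, 250):
    print(threshold, midpoint_estimate(column, threshold))
```

With threshold 220 the estimate is at row 5, the center of the trace. Lowering the threshold to 190 admits the blemish and pulls the estimate to row 4, while at 250 no pixel qualifies and the cross-section yields no estimate.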

Figure 9 shows another example of a spurious peak. Such peaks are difficult to detect, because many local acceleration peaks do have similar "double" peaks. These spurious peaks can be detected only by detailed and time consuming visual inspection of all digitized signals versus the scanned image (raw data) for various threshold levels.

A related type of problem occurs when there is a "white" spot in the middle of the scanned trace image, illustrated in Figure 10a. The automatic trace following algorithm will follow the wider branch of the trace, producing a pulse-like error in the estimated signal. A related and often overlooked problem occurs when the trace has a one sided indentation (Figure 10b) or when a dust particle, a dark scratch or a dark shadow crosses the trace (Figure 10c). Digitization errors illustrated in Figure 10 are very common, difficult to detect and occur almost exclusively for data scanned with 600 dpi resolution. Such errors were very rare for scanning with 300 dpi. For higher resolutions, e.g. 1200 dpi, these errors will multiply and may be responsible for most of the operator's time in manual verification and correcting the data in LeTV.

Intersection of an acceleration trace with a fixed trace ("straight" line created by a fixed mirror) is usually not difficult to interpret automatically, particularly if the trace following algorithm can memorize, as well as control the variations of the slope of the estimated signal based on prior knowledge of the type of trace followed. However, situations when the baseline is unusually thick and the acceleration trace has a peak near the baseline are difficult to handle automatically, and the best approach is to check for such problems and fix them manually in the editing phase of program LeTV. Figure 11 shows an example of two spurious peaks in the digitized signal resulting from such a situation, which have remained undetected in a commercially digitized record of the Northridge earthquake.

3.2.2 Smoothing of the High Frequency Peaks

Another common problem with automatic trace following is "low-pass filtering" of high frequency peaks in the recorded signal. This is caused by merging and partial overlapping of the trace just below the peak, due to finite thickness of the light beam exposing the film. In the immediate vicinity of the peak, the resulting trace thickness is increased, but only on the side towards the zero-acceleration line, and the amplitude of the estimated signal at the peak is biased towards smaller values. This problem is pronounced for vertical acceleration traces, which usually contain more high frequencies than the horizontal accelerations, especially in the near-field of moderate and large earthquakes. High frequency, large vertical accelerations result in large amplitude, rapidly oscillating traces with low optical density (e.g. see the vertical traces in Figure 2 and Figure 3a). They require lower than average threshold levels for automatic trace following, otherwise the trace is not continuous. On the other hand, lowering the threshold level will increase the filtering of high frequencies, and to avoid this effect in portions of the signal with intermediate and small accelerations (e.g. see horizontal acceleration traces in Fig. 2 and 3a), the threshold level should be as high as possible. It should be mentioned here that this problem cannot be solved by reducing the pixel size. Its consequences can be diminished only by reducing the trace width on the film (by careful focusing), and by increasing the film speed beyond 1 cm/s.

The contradictory requirements calling for lower threshold level, so that the trace image is continuous, and for higher threshold level, so that peak amplitudes are not reduced, can be solved simultaneously if variable threshold levels are used in automatic trace following. This requires sophisticated and reliable software, and highly skilled operators. For these reasons, we run this software only in the manual mode in program LeTV. Such software is generally not available, and usually a compromise must be made by choosing some intermediate threshold level. This leads to (1) smoothing of the amplitudes of high frequency peaks ("low-pass filtering") where the chosen threshold level is too low (Figure 12), and (2) missing variations in the signal where the chosen threshold level is too high (e.g., ascending and descending line segments in the areas with large peaks; see Figure 5c and Figure 12). The latter causes high frequency large peak amplitudes to be connected by straight lines because the darkness of the portion in between is below the working threshold level. Figure 12b illustrates outcomes of automatic trace following for different threshold levels in the region outlined in Figure 12a. It is seen that, for threshold level <200, the high frequency acceleration (~ 20 Hz wave) is reduced, and it almost disappears for threshold level 150 or lower. For threshold levels 200-230, the trace image between the peaks is progressively lost (due to too high working threshold level) and is "digitized" as a straight line (e.g., see also "old" digitization in Figure 5c, between 0.4 and 0.6 s).
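The idea of a variable threshold can be illustrated with a deliberately simple rule: for each vertical cross-section, use the highest candidate threshold that still leaves a trace of some minimum width. This is only a sketch of the principle under that assumed rule, not the algorithm used in LeTV; the function `highest_usable_threshold`, the candidate levels and the gray values are all hypothetical.

```python
def highest_usable_threshold(column, candidates, min_width=2):
    """Highest candidate threshold leaving at least min_width trace pixels.

    Returns None if even the lowest candidate leaves too few pixels.
    """
    for threshold in sorted(candidates, reverse=True):
        width = sum(1 for gray in column if gray >= threshold)
        if width >= min_width:
            return threshold
    return None


candidates = (150, 200, 220, 230)

# Hypothetical cross-sections (larger gray level = darker):
dark_column = [120, 200, 240, 235, 190, 120]   # slow, well exposed trace
light_column = [120, 170, 205, 200, 160, 120]  # fast, weakly exposed trace

print(highest_usable_threshold(dark_column, candidates))
print(highest_usable_threshold(light_column, candidates))
```

In this example the well exposed cross-section keeps the high threshold of 230, while the weakly exposed one falls back to 200, so the trace stays continuous without lowering the threshold everywhere.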

3.2.3 Distortions Due to High-Contrast Preprocessing of the Film Image

Many of the above problems can be reduced or eliminated by working with 256 levels of gray, rather than 16 levels or a black-and-white image only. This, however, requires large capacity and fast access hard disk, as well as a fast CPU, and was difficult or impractical to implement in the past, when only 286, 386 or 486 PCs were available. Simple, well recorded accelerations can be processed successfully by using only 16 levels of gray (0 - "white"; 15 - "black"). However, digitizing a binary black-and-white scanned image is very difficult, will inevitably require extensive photographic or digital enhancement of contrast, and will lead to situations which are very difficult to process correctly. Moderate enhancement of contrast can be useful (sometimes it is essential), but it may also lead to complicated distortions of the traces, to the extent that is not acceptable. Such distortions of the bitmap are illustrated in Figure 13. Part (a) shows a segment of a record of the 20 March, 1994, Northridge earthquake aftershock (M = 5.3) copied directly from the film. In part (b), an enlarged portion of the vertical acceleration trace is shown, after successive Xeroxing of the record (equivalent to multiple high-contrast processing of the film image). It can be seen that this process introduced characteristic elongation of peak amplitudes (see also the vertical traces in Figures 6a,b in which the film image was enlarged by successive Xeroxing). Figure 13b shows comparison of two independent digitizations of this segment which differ significantly near the peaks. It appears that the "old" digitized version has distortions near peaks of the same type as the repeatedly Xeroxed image shown below. The "new" version, digitized by the LeAuto software (600 dpi resolution and 256 levels of gray), shows no such distortions. 
We do not know the exact reason for the distortion in the "old" version of this record, and we have never encountered such distortions in our digitization of strong motion records. We speculate that this record was digitized after excessive high-contrast enhancement of the bitmap image by software, or by a lithographic photo process prior to scanning. These types of distortions should have been detected in the quality control phase of the job.

3.3 Nonuniform Film Speed

The nominal film speed for typical strong motion accelerographs is 1 cm/s (SMA-1, CR-1). For the M02 accelerograph (New Zealand), which records on 35 mm film, the actual film speed is 1.5 cm/s, which is equivalent to 3 cm/s for 70 mm film (SMA-1). Increasing the film speed improves the resolution and accuracy of digitization of high frequency accelerations not only in time, but also in amplitude (see the discussion on the low-pass filtering of high frequency and large amplitude accelerations in the previous section). The old AR-240 accelerograph (which recorded the Pacoima Dam accelerogram during the 1971 San Fernando, California, earthquake) had a recording speed of 2 cm/s (equivalent to 0.5 cm/s on 70 mm film; Hudson, 1970).

Most instruments have one or two relays which produce a two-pulses-per-second signal (2PPS), recorded along the top and bottom edges of the film or paper. About 20 years ago, one of these relays was converted to work with a local clock which produces a binary code of the Julian day, hour, minute and second every 10 seconds. At first, it was believed that absolute trigger time was not necessary for recorded strong ground motion (Hudson, 1970), but since it was first introduced in the early 1970's (Dielman et al., 1975), it opened many new possibilities for advanced wave propagation studies in strong motion seismology.

The 2PPS signal accuracy is believed to be ~1 percent, but it is rarely calibrated. Since the early 1970's, we have assumed that the time coordinate of analogue records is scaled more accurately using 2PPS pulses (generated electronically) rather than by the nominal speed of the film (driven mechanically), and have used it to correct for minor variations in the film speed (Trifunac and Lee 1973; 1979; 1990). Exceptions are records for which the 2PPS and absolute clock relays malfunctioned simultaneously (e.g. Trifunac et al., 1998). Figures 7, 9, 12a and 13 show differences in the digitized accelerations when the time is scaled using the digitized 2PPS signal ("new") and when uniform film speed is assumed ("old"). The difference is manifested by time dependent delays between the two records.
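The two ways of scaling the time coordinate can be compared in a short sketch. It is not the procedure implemented in our software. It assumes that the abscissae of consecutive 2PPS pulses have already been digitized, that time is measured from the first pulse, and that time varies linearly with distance between neighboring pulses. The function names are hypothetical.

```python
import numpy as np

def time_from_2pps(x, pulse_x, pulses_per_second=2.0):
    """Map film abscissae x to time (s) using digitized 2PPS pulses.

    pulse_x: increasing abscissae of consecutive pulses (same units as x).
    Time is interpolated linearly between pulses, so each half-second
    interval absorbs the local film speed. Outside the pulse range
    np.interp holds the end values, so x should lie within the pulses.
    """
    pulse_x = np.asarray(pulse_x, dtype=float)
    pulse_t = np.arange(pulse_x.size) / pulses_per_second
    return np.interp(x, pulse_x, pulse_t)

def time_from_nominal_speed(x, x0, speed=1.0):
    """Map film abscissae x (cm) to time (s) assuming uniform film speed (cm/s)."""
    return (np.asarray(x, dtype=float) - x0) / speed
```

For pulses digitized at 0.0, 0.5, 1.02 and 1.52 cm, the point at 1.02 cm is assigned t = 1.00 s by the first function and t = 1.02 s by the second (nominal speed of 1 cm/s). Delays of this kind accumulate along the record.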

Occasionally the film speed may experience abrupt changes and stalls. This is caused by friction in the film driving mechanism, friction in the film cassette or by faulty motors, and in general cannot be corrected uniquely. The duration of short stalls can be estimated by measuring the shortening of the distance between consecutive pulses of the 2PPS signal. Approximate corrections of the digitized data can then be performed by inserting "gaps" into the scanned bitmap image, and recreating manually the missing portion of the traces (Lee and Trifunac, 1984). A description of processing an accelerogram with many stalls can be found in Trifunac et al. (1998).
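The estimate of stall duration from the shortened pulse spacing can be illustrated as follows. This sketch assumes that the film moved at its normal speed whenever it was moving, and that each stall is shorter than one pulse interval. The function name, the tolerance and the argument names are hypothetical, and the result is only as good as these assumptions.

```python
import numpy as np

def estimate_stalls(pulse_x, nominal_spacing, pulse_interval=0.5, tolerance=0.1):
    """Find pulse intervals shortened by a stall and estimate the stall time.

    pulse_x: abscissae of consecutive 2PPS pulses along the film.
    nominal_spacing: distance between pulses when the film runs normally.
    pulse_interval: time between pulses in seconds (0.5 s for 2PPS).
    tolerance: relative shortening below which an interval is ignored.
    Returns a list of (interval index, estimated stall duration in s).
    """
    spacing = np.diff(np.asarray(pulse_x, dtype=float))
    shortening = 1.0 - spacing / nominal_spacing
    stalled = np.flatnonzero(shortening > tolerance)
    return [(int(i), float(pulse_interval * shortening[i])) for i in stalled]
```

For example, if pulses normally 0.5 cm apart are found at 0.0, 0.5, 0.8 and 1.3 cm, the second interval is shortened by 40 percent, which corresponds to a stall of about 0.2 s under the stated assumptions.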
 

3.4 Rotation of the Digitized Traces

Because the record is digitized directly from the scanned image, it is important that the film or paper original is well aligned with the scanner. Perfect alignment would mean that the time axis of the record is parallel to one of the scanner axes. In reality, however, this is difficult to achieve, and the two are off by some small but finite angle, α. The problem of imperfect alignment is one of the oldest problems recognized in digitization and processing of strong motion accelerograms (Trifunac, 1971). By careful placement of the original onto the scanner or the digitizing table, the angle α can be kept small, i.e. of the order of 1° (Figure 14). One simple way to correct for this angle is by marking and digitizing at least one pair of fiducial points on one of the fixed traces, evaluating α from the position of these marks, and rotating the digitized signals back by the same angle. The accuracy of this correction is sufficient for typical acceleration records. It may be limited by the noise in the scanned image of the baseline only for very short records (less than 5 to 10 s long). While marking and digitizing fiducial points is essential for long records that require scanning multiple pages (the fiducial marks are used to match the independently digitized pages), it is often neglected for one-page records, and in those cases the correction for the angle α is not performed. The associated distortions for a real record are illustrated in Figure 15. It shows a segment of the transverse acceleration trace of the record in Figure 14. The "old" digitized trace appears to be rotated clockwise relative to the "new" trace by α ~ 0.9°. This is most obvious near the positive and negative peaks.
Assuming that the scanners used for the "old" and "new" digitized versions were both accurate, and knowing that the "new" version was corrected for a rotation α, the difference between the "old" and "new" versions could be explained as an error in the "old" version due to imperfect alignment of the original record with the scanner (α ~ 1°). If this record was photographically enlarged before scanning, the observed rotation could also be explained as distortion caused by imperfections of the lens, or by nonparallel planes of the negative and of the projected image.
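The correction by a pair of fiducial points can be sketched as a plane rotation. The sketch assumes that both fiducial points lie on a fixed trace, which should be parallel to the time axis, and that the coordinates are in the same units along both scanner axes. The function names are hypothetical.

```python
import numpy as np

def rotation_angle(p1, p2):
    """Angle (radians) of the line through two fiducial points on a fixed
    trace, measured from the scanner x axis."""
    (x1, y1), (x2, y2) = p1, p2
    return np.arctan2(y2 - y1, x2 - x1)

def rotate_back(x, y, alpha, origin=(0.0, 0.0)):
    """Rotate digitized points by -alpha about origin, so that the fixed
    trace becomes parallel to the x (time) axis."""
    x0, y0 = origin
    dx = np.asarray(x, dtype=float) - x0
    dy = np.asarray(y, dtype=float) - y0
    c, s = np.cos(alpha), np.sin(alpha)
    return x0 + c * dx + s * dy, y0 - s * dx + c * dy
```

The angle is estimated once from the fiducial pair and then applied to all digitized traces of the same page. For a baseline inclined by 1°, the rotated ordinates of the baseline return to zero.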


4. DISCUSSION AND CONCLUSIONS

A selection of the most common problems encountered in automatic digitization of accelerograms recorded on film was illustrated and discussed. This was preceded by a brief review of the changes in hardware and software capabilities since the 1970's. Differences between two digitized versions of the same original records were discussed and explained in most of the presented examples (the "old" version was digitized commercially for the LADWP in 1994, and the "new" version was digitized by the authors of this paper using the LeAuto software package). The problems discussed include errors in identifying the origin of the common time coordinate for the three recorded components of motion, smoothing of the high frequency peaks as well as of amplitude variations between peaks (both in segments of the record with moderate and large signal amplitudes), spurious peaks and noise in the digitized signal due to imperfections of the film image or specific circumstances (e.g. intersection of traces), possible distortions due to high contrast preprocessing of the scanned image, scaling of the time coordinate and rotation of the digitized signal. The errors due to all of these problems can be either eliminated or significantly reduced, automatically by intelligent algorithms (performing accurate identification of trigger times for each trace individually and correction for common time scale fluctuations) or manually by operator intervention.

The most important factors determining the quality of the processed data are the experience of the operator and rigorous quality control of the outcome of the automatic trace following. An experienced operator can avoid most of these problems by carefully using the software. Finally, critical comments and suggestions by experienced operators are invaluable for further improvement of the automatic digitization software. The efficiency and reliability of this software can be improved only by close interaction between the programmer and an experienced operator.

The digitization of accelerograms recorded on film can be viewed as an estimation problem using noisy measurements, and the result is nonunique. A threshold level that is too high excludes recorded information from the estimation process, and a threshold level that is too low includes too much noise. We showed that systematic biases in the estimate due to this "noise" (e.g. spurious peaks and additional pulses, and smoothing of the sharp peaks) can be eliminated or at least abated by careful choice of the threshold level. Scanning the image with a 256 level gray-scale is highly recommended, while digitization from a binary black-and-white bitmap image is discouraged. A major improvement of the software for automatic trace following would be a self-learning algorithm with adaptive threshold level. Such an algorithm has been incorporated in the editing phase of our procedures (LeTV). Within this phase, the algorithm is used to digitize a trace segment in a window defined manually by the operator. Much work remains to be done to include this algorithm in the fully automatic trace following procedure (LeTrace).

There is a common misperception that the accuracy of estimating the high frequencies in a record will increase significantly by scanning the film record at a higher resolution. This is true only up to a limit. For example, the "smoothing" of the amplitudes of the sharp peaks can be eliminated only by better focussing of the light beam (possible only up to a certain degree) or by increasing the film speed. Our experience shows that, for higher scanning resolution (600 dpi), the digitized signal contains more noise (high frequency errors) due to imperfections on the film, such as scratches and dust (these imperfections are "not seen" or are "smoothed out" by the larger pixels at lower scanning resolutions, e.g. 300 dpi).
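For orientation, the nominal time sampling implied by the pixel size alone can be computed as below. This is simple arithmetic, with a hypothetical function name, and it does not model the limits discussed above (focus of the light beam, trace thickness, film imperfections), which are what actually govern the usable high frequency content.

```python
def time_sampling(dpi, film_speed_cm_per_s=1.0):
    """Nominal time step (s) and Nyquist frequency (Hz) of one pixel column.

    dpi: scanning resolution in dots per inch (1 inch = 2.54 cm).
    film_speed_cm_per_s: film speed, 1 cm/s nominal for SMA-1 type records.
    """
    pixels_per_cm = dpi / 2.54
    samples_per_second = pixels_per_cm * film_speed_cm_per_s
    return 1.0 / samples_per_second, samples_per_second / 2.0
```

At a film speed of 1 cm/s, 300 dpi gives about 118 pixel columns per second (nominal Nyquist frequency of about 59 Hz), and 600 dpi gives about 236 columns per second (about 118 Hz).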

We conclude that the governing factors for high quality digitized accelerograms are rigorous quality control and an experienced and conscientious operator. Intelligent software as well as the increased hardware capabilities significantly speed up the process and reduce the labor cost. However, no software will take care of all of the difficulties. Continuous and systematic quality control by the operator, at all the phases of the process, will always remain the determining factor for the quality of the end product.


5. ACKNOWLEDGEMENTS

The authors thank Ron Tognazzini and Craig Davis of the Los Angeles Department of Water and Power for making available the original films and the commercially processed data of the 1994 Northridge earthquake records, used to illustrate the problems discussed in this paper.


REFERENCES

  1. Dielman, R.J., T.C. Hanks and M.D. Trifunac (1975). "An Array of Strong Motion Accelerographs in Bear Valley, California," Bull. Seism. Soc. Amer., 65, 1-12.
  2. Hudson, D.E. (1970). "Ground Motion Measurements," Chapter 6 in "Earthquake Engineering" edited by R.L. Wiegel, Prentice Hall, Inc., Englewood Cliffs, N.J.
  3. Lee, V.W. and M.D. Trifunac (1984). "Current Developments in Data Processing of Strong Motion Accelerograms," Dept. of Civil Eng. Report 84-01, Univ. of Southern California, Los Angeles, California.
  4. Lee, V.W. and M.D. Trifunac (1990). "Automatic digitization and processing of accelerograms using PC," Dept. of Civil Eng. Report 90-03, Univ. of Southern California, Los Angeles, California.
  5. Lindvall-Richter-Benuska Associates (1995). "Processed LADWP Power System Strong Motion Records from the Northridge, California, Earthquake of 17 January, 1994," Report LRB 007-027, prepared for the Los Angeles Department of Water and Power, Los Angeles, California.
  6. Novikova, E.I. and M.D. Trifunac (1991). "Instrument correction for the coupled transducer-galvanometer system," Dept. of Civil Eng. Report 91-02, Univ. of Southern California, Los Angeles, California.
  7. Novikova, E.I. and M.D. Trifunac (1992). "Digital instrument response correction for the Force Balance Accelerometer," Earthquake Spectra, 8(3), 429-442.
  8. Todorovska, M.I. (1998). "Cross-axis Sensitivity of Accelerographs with Pendulum like Transducers-Mathematical Model and the Inverse problem," Earthquake Engineering and Structural Dynamics, 27(10), 1031-1051.
  9. Todorovska, M.I. and M.D. Trifunac (1997). "Amplitudes, Polarity and Time of Peaks of Strong Ground Motion During the 1994 Northridge, California Earthquake," 16(4), 235-258.
  10. Todorovska, M.I., E.I. Novikova, M.D. Trifunac and S.S. Ivanovic (1995). "Correction for Misalignment and Cross Axis Sensitivity of Strong Earthquake Motion recorded by SMA-1 Accelerographs," Dept. of Civil Eng. Rep. No. 95-06, Univ. of Southern California, Los Angeles, California.
  11. Todorovska, M.I., E.I. Novikova, M.D. Trifunac and S.S. Ivanovic (1998). "Advanced Sensitivity Calibration of the Los Angeles Strong Motion Array," Earthquake Engineering and Structural Dynamics, 27(10), 1053-1068.
  12. Trifunac, M.D. (1971). "Zero Baseline Correction of Strong Motion Accelerograms," Bull. Seism. Soc. Amer., 61, 1201-1211.
  13. Trifunac, M.D. (1972). "A Note on correction of Strong Motion Accelerograms for Instrument Response," Bull. Seism. Soc. Amer., 62, 401-409.
  14. Trifunac, M.D. and V.W. Lee (1973). "Routine Computer Processing of Strong Motion Accelerograms," Report EERL 73-03, Calif. Inst. of Tech., Pasadena, California.
  15. Trifunac, M.D. and V.W. Lee (1979). "Automatic Digitization and Processing of Strong Motion Accelerograms," Parts I and II, Dept. of Civil Eng. Report No. 79-15, Univ. of Southern California, Los Angeles, California.
  16. Trifunac, M.D., M.I. Todorovska and V.W. Lee (1998). "The Rinaldi Strong Motion Accelerogram of the Northridge, California, Earthquake of 17 January 1994," Earthquake Spectra, 14(1), 225-239.
  17. Wong, H.L. and M.D. Trifunac (1977). "Effects of Cross-Axis Sensitivity and Misalignment on Response of Mechanical Optical Accelerographs," Bull. Seism. Soc. Amer., 67, 929-956.



Contact: mtodorov@usc.edu
