# **Issues in Line Edge and Linewidth Roughness Metrology**

# J. S. Villarrubia

National Institute of Standards and Technology,<sup>†</sup> Gaithersburg, MD, 20899, USA

Abstract. In semiconductor electronics applications, line edge and linewidth roughness are generally measured using a root mean square (RMS) metric. The true value of RMS roughness depends upon the length of edge or line that is measured and the chosen sampling interval. Additionally, the true value is obscured by a number of measurement errors: Different finite-length sections of line have randomly differing roughnesses, producing a sampling error, the expected magnitude of which depends upon the length of line that is sampled and details of its roughness power spectrum. Noise in the microscope images from which roughness is computed results in both a random measurement error and a non-random measurement bias. These issues and proposed solutions in the literature are reviewed. It is also suggested that there may be a plausible role for non-RMS metrics, for example estimation of the likelihood of width or edge position extremes based upon direct measurements of the roughness amplitude density function.

**Keywords:** line edge roughness (LER), linewidth roughness (LWR), measurement algorithms, measurement bias

**PACS:** 06.20.-f, 06.30.Bp

# **INTRODUCTION**

Integrated circuit functional structures are generally designed with simple linear or smoothly curved shapes. Thus, transistor gate edges are drawn linear and parallel, as are the edges of lines in interconnect layers. Contact holes are meant to be cylindrical, with smoothly rounded edges. As is now widely appreciated, actual printed features do not entirely preserve the smoothness of their designs. A number of different materials (chrome in the mask, photoresist, polycrystalline silicon), each with its own characteristic roughness depending upon an interplay of material properties and processing conditions, contribute to the roughness of the final device features. Material and process properties generally have little connection with the size of the devices, so roughness does not scale with technology node.

Yamaguchi et al. divide the effects of roughness into two categories: degradation of device properties and variation of device properties.[1] A number of published studies have connected roughness to device performance, either via simulation or experiment. At the transistor gate level the most frequently cited effects are variability in transistor threshold voltage and a significant increase in off-state leakage current.[1-9] Xiong et al.[4] additionally note that high frequency roughness can lead to a decrease in the length of the conducting channel due to enhanced lateral diffusion of the self-aligned source/drain extension. Most studies to date concern transistor performance. However, Lin et al.[6] note a surprising increase of breakdown temperature with roughness for TaN barriers used to prevent diffusion from Cu interconnect lines into the surrounding low-K dielectric. They also raise the possibility that Cu conductivity will suffer for rough interconnects due to enhanced electron surface scattering compared to smooth-edged conductors.

The need for roughness metrology is recognized in the International Technology Roadmap for Semiconductors (ITRS).[10] Specifications for linewidth roughness (LWR) control and metrology are contained in Table 117 of its Metrology section. In the ITRS, LWR is measured as 3 standard deviations of the linewidth (or CD, for critical dimension) including all roughness wavelengths shorter than twice the technology node. The specifications call for LWR control of 2.6 nm in 2005, improving to 1.6 nm by 2009. To support this control, the ITRS specifies that the metrology tool needs a 3 standard deviation repeatability of better than about 0.5 nm for LWR measurements in 2005, improving to approximately 0.3 nm by 2009. The recognition of this metrology need and these specifications are relatively new. As recently as 1999, the Metrology section of the ITRS contained no specifications for roughness measurement. Specifications for line edge roughness (LER) were first added in 2001.

<sup>†</sup> Official contributions by the National Institute of Standards and Technology are not subject to copyright.

Given the newness of concern about this issue, it is not surprising that roughness metrology in semiconductor electronics applications exhibits some symptoms of immaturity. For example, as we will see, a complete measurement specification for root mean square (RMS) roughness should specify both maximum and minimum roughness wavelengths to include in the measurement. The ITRS currently specifies only the longest roughness wavelength. CD-SEM (critical dimension scanning electron microscope) tools generally offer an RMS measure of LER or LWR, but the number and spacing of the CD measurements from which this measure is computed vary from tool to tool. The ITRS contains a specification for maximum allowable bias for CD measurements, but there is no similar requirement for roughness measurements, perhaps because the possibility that measurement bias could be an important problem in LWR metrology is not widely recognized.

The best roughness metric should be based upon a theoretical and observed relationship between its measure of roughness and device properties. Despite the good start in correlating RMS roughness to transistor leakage and threshold voltage that was just described, the mechanisms by which roughness propagates from one manufacturing process step to the next and ultimately affects performance are for the most part only sketchily known. While almost all discussion of roughness in semiconductor applications is currently in terms of an RMS metric, other metrics exist. When more is known, we may discover that one or more of these others serve some of our purposes better.

In the next section we review some metrology issues for RMS measures of roughness from the literature. These issues are the dependence of the measured value upon sampling length and sampling interval, random errors due to measurement noise and sampling, and nonrandom (bias) errors due to noise. Afterwards, we consider the possibility that non-RMS metrics may sometimes be appropriate. This consideration is necessarily somewhat speculative, since the present state of the evidence is not sufficient to definitively establish the superiority of another metric. Nevertheless, some examples are given that I hope are plausible enough to encourage future open-mindedness on this subject.

# **SOME ISSUES FOR RMS ROUGHNESS METRICS**

**FIGURE 1.** Definition of width residuals used for determining various LWR metrics. Linewidths are measured at intervals, $\Delta$, along a sampled length, *L* (top). The average width is subtracted, resulting in a curve (bottom) showing width residuals vs. position along the line.

LWR can be quantified in a number of different ways. Most begin by measuring the width of a line at $N$ positions separated by a distance $\Delta$, representing a total sampling length of $L = N\Delta$ (Fig. 1). Let us call these widths $W_i$, $i = 1, \ldots, N$. Let the *i*th width residual be $w_i = W_i - \overline{W}$ with $\overline{W}$ the average of the measured $W_i$ values. Most metrics are based upon these width residuals. Some that have been used are the average roughness (i.e., $\sum |w_i|/N$), the correlation length, the fractal dimension, and the bearing ratio (among a much longer possible list). In semiconductor industry applications, however, only the familiar root mean square (RMS) or standard deviation metric is encountered with any frequency. If we call this metric $R_0$, it is defined as

$$R_0^2 = \frac{1}{N-1} \sum_{i=1}^{N} w_i^2 \,. \tag{1}$$

(A similar definition applies for LER, except that the  $w_i$  are replaced by the edge position residuals, generally determined by subtracting a best-fit line from the edge positions, and the sum is divided by N-2 instead of N-1 because the linear fit removes two degrees of freedom instead of the average's one.) As noted, the ITRS specifications are stated in terms of three times this standard deviation metric. In this section we make note of three different issues in the measurement of  $R_0$ . These are (1) the dependence of  $R_0$  upon L and  $\Delta$ , (2) random errors associated with sampling, and (3) bias due to noise in images from which  $R_0$  is determined.
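The two definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `rms_lwr` and `rms_ler` and the sample numbers are assumptions made here, not part of the original work.

```python
import math


def rms_lwr(widths):
    """R_0 per Eq. (1): RMS of the width residuals, divided by N - 1."""
    n = len(widths)
    if n < 2:
        raise ValueError("need at least two width measurements")
    mean_width = sum(widths) / n
    residuals = [w - mean_width for w in widths]
    return math.sqrt(sum(r * r for r in residuals) / (n - 1))


def rms_ler(positions, edges):
    """LER analogue: residuals from a least-squares line, divided by N - 2."""
    n = len(edges)
    if n < 3 or n != len(positions):
        raise ValueError("need at least three matched (position, edge) pairs")
    x_mean = sum(positions) / n
    y_mean = sum(edges) / n
    s_xx = sum((x - x_mean) ** 2 for x in positions)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(positions, edges))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    residuals = [y - (intercept + slope * x) for x, y in zip(positions, edges)]
    return math.sqrt(sum(r * r for r in residuals) / (n - 2))


# Hypothetical widths in nm, chosen only to make the arithmetic easy to follow.
example_widths = [70.0, 72.0, 71.0, 69.0, 73.0]
print(rms_lwr(example_widths))      # R_0 in nm
print(3 * rms_lwr(example_widths))  # the ITRS-style 3-sigma value
```

The only difference between the two functions is the reference that is subtracted (a mean versus a fitted line) and the matching number of degrees of freedom removed.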

#### **Dependence upon *L* and $\Delta$**

The value of  $R_0$  depends upon the metrologist's choice of L and  $\Delta$ . (An example of the L dependence is shown for self-affine edges by Constantoudis et al.[11])



**FIGURE 2.** Power spectrum derived from width residuals. Parseval's theorem relates the root mean square roughness to the area under the PSD curve. Real measurements have finite spacing,  $\Delta$ , and sample length, *L*, so the measured roughness is related to the area between the limits shown.

This is not a measurement artifact. Rather, it reflects the fact that different choices of L and  $\Delta$  cause the measurement to sample different ranges of roughness frequencies. This is best illustrated by considering the power spectral density (PSD), which is proportional to the square of the magnitude of the Fourier transform. Parseval's theorem relates  $R_0$  to the area beneath the PSD between frequency limits 1/L and  $1/(2\Delta)$  (Fig. 2). Conceptually, one may imagine a measurement of an infinitely long line with perfect spatial resolution. The resulting PSD would extend over all frequencies from 0 to infinity, and the area beneath it would correspond to the RMS roughness with all frequencies included. In practice, of course, we are limited by sample size, instrument resolution, and finite time to finite nonzero values for L and  $\Delta$ . The resulting RMS roughness is equivalent to integrating a finite interval of the PSD curve between frequency limits 1/L and  $1/(2\Delta)$ , as indicated by the area shaded with diagonal lines in Fig. 2. Clearly, the value of such an integral depends upon the frequency limits.
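The Parseval relation can be checked numerically. The sketch below uses a direct discrete Fourier transform from the Python standard library and assumes one particular PSD normalization, $P_k = |F_k|^2 / (N(N-1))$, chosen here so that the components sum to $R_0^2$; other normalizations are in common use, and the function name `psd_components` is hypothetical.

```python
import cmath
import math


def psd_components(widths):
    """Two-sided PSD components of the width residuals.

    Assumed normalization: P_k = |F_k|^2 / (N (N - 1)), so that the sum over
    all k equals R_0^2 as defined in Eq. (1). Component k corresponds to the
    spatial frequency k / L for k <= N / 2.
    """
    n = len(widths)
    mean_width = sum(widths) / n
    residuals = [w - mean_width for w in widths]
    components = []
    for k in range(n):
        # Direct O(N^2) DFT; adequate for a short illustrative profile.
        f_k = sum(
            residuals[i] * cmath.exp(-2j * math.pi * k * i / n)
            for i in range(n)
        )
        components.append(abs(f_k) ** 2 / (n * (n - 1)))
    return components


# Hypothetical widths in nm.
widths = [70.0, 72.0, 71.0, 69.0, 73.0]
psd = psd_components(widths)

n = len(widths)
mean_width = sum(widths) / n
r0_squared = sum((w - mean_width) ** 2 for w in widths) / (n - 1)

print(sum(psd), r0_squared)  # the two values agree to rounding error
```

Because the mean has been subtracted, the zero-frequency component vanishes, which is one way to see why wavelengths longer than the sampled length do not contribute to the measured roughness.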

Ignoring this dependence can lead to common measurement "fallacies." For example, it may be tempting to compare $R_0$ measured over some segment with a second measurement made over only a subset of the segment, in an attempt to establish that the subset has a roughness that differs from the segment as a whole. Such a comparison is not valid, since the roughness measure of the whole segment includes low frequency roughness components that are excluded from the measurement of the shorter segment. The differences in this case may be large, because PSDs are typically largest at low frequencies, as in Fig. 2. Similarly, SEMs currently differ in their capabilities, so that measurements performed with different instruments, even when they measure the same total length of line, may use different values of $N$ (and hence $\Delta$). This can also in principle invalidate the comparison, although the sensitivity of $R_0$ to choice of $\Delta$ tends to be low because the roughness power at reasonably high frequencies is generally negligible. In both cases, the measurements contain enough data that a valid comparison *can* be made, but only if one accounts for the different roughness frequency intervals.

**FIGURE 3.** Transistor geometry. Should the longest roughness wavelength of interest be defined as the long dimension of the transistor (or some multiple of that length)?

Because the roughness value depends upon $L$ and $\Delta$, measurement specifications for RMS roughness should include specification of these limits. How should these limits be decided? An illustrative rationale is suggested in Fig. 3 and Fig. 4. Figure 3 is a schematic showing some aspects of transistor geometry. It shows roughness along the long dimension of the transistor gate electrode. The electrical properties of the transistor are determined by dopant concentrations that are formed when the dopant is implanted and then diffused. We could identify the longest roughness wavelength of interest with the long dimension of the transistor, or perhaps with twice (or some other small multiple of) this length. Roughness wavelengths much longer than this cause variability in gate widths between transistors, but are not responsible for significant intra-transistor variability, so they do not correspond to our intuitive notion of roughness. For this reason, one possible scheme would account separately for long-wavelength roughness as "within-chip CD variability."

**FIGURE 4.** Schematic top view of dopant after implant but before diffusion (dark shading) and after diffusion (lighter shading). Diffusion smooths roughness wavelengths that are comparable to or smaller than the diffusion length.

A variety of physical processes may determine the short wavelength limit. One natural length scale is set by the average distance between dopant atoms, which is typically in the range of several to several tens of nanometers [12]. Another natural limit, as pointed out by Xiong et al.[4], may be set by dopant diffusion lengths on the order of 10 nm. This would naturally tend to smooth the electrically relevant post-diffusion dopant profile, even if the physical gate edge contained roughness wavelengths shorter than this (Fig. 4). (However, as Xiong et al. also point out, the average position of the channel edge may be sensitive to shorter wavelengths, even if these do not affect its variability.) We might set the high frequency limit at a given ITRS technology node to one of these lengths. The details of this argument would obviously change if we were to consider roughness other than in the gate. For example, the short wavelength cutoff for roughness in a photomask is likely to be much larger than for a gate, since the exposure process does not have the resolution to transfer the shortest wavelengths. On the other hand, metal interconnect conductivity may well be affected mainly by short-wavelength roughness that scatters electrons near the edges of the conductor.

The best way to determine a roughness metric, including frequency limits in the case of an RMS metric, is by demonstrated experimental correlation of the candidate metric to important aspects of device performance. The sorts of pre-experimental considerations discussed here are meant to suggest possibilities to test experimentally and to suggest that there is unlikely to be one answer that serves for all device layers.

#### **Random Errors Due to Noise and Sampling**

One source of random error in roughness measurements is the inevitable noise in images. As a result of image noise, assigned linewidths or edge positions, even for the same point on the sample, vary randomly from one measurement to the next. If the linewidth determination at each position along a line has a normally distributed random component with standard deviation,  $\sigma_{\varepsilon}$ , and  $\sigma_{\varepsilon}$  is small compared to  $R_0$ , then the standard deviation of the roughness determination is  $\sigma_{R_0} = \sigma_{\varepsilon}/\sqrt{N-1}$ .[13] For typical values of  $\sigma_{\varepsilon}$  near 1 nm and N near 100 this results in a standard deviation in the measured roughness on the order of 0.1 nm contributed by this source.

A second source of error is the intrinsic randomness of the roughness itself. The roughness of a given line segment is one particular realization of a stochastic process. Even in the absence of any measurement error, the roughness of that segment will differ from that of another randomly chosen segment. If the size of this sampling uncertainty is larger than the allowed uncertainty of our roughness measurement, then it can be reduced by averaging the roughness variances from multiple segments. This average will be more representative of the average roughness than would be the roughness of a lone segment. It is convenient to express the desired uncertainty in  $R_0$  in the form  $u_{R_0} = \eta R_0$  where  $\eta$  is a fraction smaller than 1. For example, the ITRS calls for a roughness measurement precision of 20% of the required roughness value, so  $\eta = 0.2$ . One should then average the roughness values determined from *m* segments, where [13]

$$m \ge \frac{\sum_{k=0}^{N/2} \overline{P_k}^2}{4\eta^2 \left(\sum_{k=0}^{N/2} \overline{P_k}\right)^2} \,. \tag{2}$$

Here the  $\overline{P_k}$  are the average components of the PSD. The dependence of *m* on  $1/\eta^2$  means that halving the uncertainty requires quadrupling the number of measurements that must be included in the average. That is, it is a manifestation of the familiar rule that the uncertainty in an averaged quantity decreases like the square root of the number of independent measurements included in the average. The ratio of sums of PSD components accounts for correlations in the width residuals. The more uncorrelated are the width residuals at neighboring measurement positions, the closer the PSD will be to a white spectrum. (In a white spectrum the  $\overline{P_k}$  are all the same, regardless of k). In this case the N residuals provide N-1 independent estimates of the roughness, reducing the number of repeats, *m*, that would otherwise be required, a fact that is reflected in the small value of the ratio of sums in Eq. (2) in the limit of white noise. On the other hand, high correlation in the width residuals means the roughness power will reside in fewer components in the PSD. This results in a larger ratio of the sums, and a corresponding larger value for m.
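The two limiting cases described above can be illustrated with a short Python sketch of Eq. (2). The function names and the toy spectra (50 equal components versus a single nonzero component) are assumptions made purely for illustration; they are not measured PSDs.

```python
import math


def single_segment_fractional_error(psd):
    """Fractional uncertainty in R_0 for one segment (Eq. (2) with m = 1)."""
    total = sum(psd)
    return math.sqrt(sum(p * p for p in psd)) / (2.0 * total)


def segments_needed(psd, eta):
    """Smallest whole number of segments m that satisfies Eq. (2)."""
    total = sum(psd)
    bound = sum(p * p for p in psd) / (4.0 * eta ** 2 * total ** 2)
    return max(1, math.ceil(bound))


# Toy spectra with the same total power (arbitrary units).
white = [1.0] * 50                  # power spread evenly: uncorrelated residuals
concentrated = [50.0] + [0.0] * 49  # all power in one component: high correlation

for name, psd in (("white", white), ("concentrated", concentrated)):
    print(
        name,
        round(single_segment_fractional_error(psd), 3),
        segments_needed(psd, eta=0.2),
    )
```

With the power spread evenly, one segment already meets a 20 % target in this toy case, whereas concentrating the same power in a single component pushes the single-segment error to 50 % and requires several segments.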

Figure 2 shows a typical power spectrum, but a typical size for sampling error depends upon our assumptions concerning the overall roughness,  $\lambda_{min}$ , and  $\lambda_{max}$ . The wavelength limits, as we have seen, determine which part of the power spectrum is relevant. The 2003 ITRS specifies that  $\lambda_{max}$  should be twice the technology node, i.e., 180 nm for the 90 nm node. If we take  $\lambda_{min} = 10$  nm then the PSD in Fig. 2 leads to a fractional sampling error for a single (m = 1) measurement of  $R_0$  of almost 40 %. (For  $3R_0 = 2.6$  nm, as required for the 90 nm node by the 2003 ITRS, one standard deviation of sampling uncertainty would be about 0.3 nm, which is larger than the uncertainty contributed by noise.) This is larger than our required 20%, so as stated in Ref. 13 it is necessary to increase the number of measurements to at least 4, for a total measured length of line equal to 8 times the node. On the other hand, if we take  $\lambda_{max} = 2 \ \mu m$  then Eq. (2) with the same PSD says the sampling error is already only  $0.12R_0$  even for a single segment of this length. For  $3R_0 = 2.6$  nm sampling error now contributes a one standard deviation uncertainty of 0.1 nm. This uncertainty is comparable to that contributed by noise. (See the discussion in the first paragraph of this section.) However, the ITRS roughness requirement is often a rather tight specification. It is a well-known problem that roughnesses of some materials (resists for example) often significantly exceed the specifications. Since the sampling uncertainty scales with the roughness but the noise uncertainty does not, this means that in many practical situations the sampling error will be the dominant source of random error.
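The arithmetic behind these figures can be written out explicitly. Setting $m = 1$ in Eq. (2) gives the single-segment fractional error, denoted here by $\eta_1$ (a symbol introduced only for this check), and "almost 40 %" is rounded to 0.4:

$$\eta_1 = \frac{\sqrt{\sum_k \overline{P_k}^2}}{2\sum_k \overline{P_k}} \approx 0.4, \qquad m \ge \left(\frac{\eta_1}{\eta}\right)^2 = \left(\frac{0.4}{0.2}\right)^2 = 4, \qquad 4 \times 180\ \text{nm} = 720\ \text{nm} = 8 \times 90\ \text{nm}.$$

$$R_0 = \frac{2.6\ \text{nm}}{3} \approx 0.87\ \text{nm}, \qquad 0.4\,R_0 \approx 0.35\ \text{nm}, \qquad 0.12\,R_0 \approx 0.10\ \text{nm}.$$

Because $\eta_1$ is slightly below 0.4, the first product is consistent with the quoted value of about 0.3 nm.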

#### **Bias Due to Image Noise**

The random measurement errors in edge positions and linewidths discussed in the last section and represented by  $\sigma_{\varepsilon}$  do more than introduce *random* errors in  $R_0$ . They also introduce a bias (sometimes called a "systematic" error).[13] This is because it is not possible on the basis of a single measurement to distinguish randomness in the true position or width from the apparent randomness caused by noise. The "noise roughness" variance,  $\sigma_{\varepsilon}^2$  adds to the true roughness,  $R_t^2$  to produce the measured  $R_0^2$ . That is,

$$R_0^2 = R_t^2 + \sigma_\varepsilon^2. \tag{3}$$

This bias has been observed experimentally [14], as reproduced (upper curve) in Fig. 5. This curve shows the value of $R_0$ measured from images of the same sample position with varying pixel integration time. Lower pixel integration time results in noisier images, so the data points on the left side of the graph correspond to noisier images than those on the right. The vertical bars indicate ±1 standard deviation of the repeatability in the roughness measurement. The repeatability is obviously poorer on the left, as one would expect. The bias manifests itself in the fact that the average $R_0$ is higher for the noisier images on the left than for those on the right. Since all of these images were of the same field of view, the actual roughness is constant. Any observed differences are therefore measurement errors.

**FIGURE 5.** Measured LWR vs. pixel integration time for two different LWR metrics. Vertical bars are $\pm 1$ standard deviation of the observed repeatability. Since all measurements were at the same location, any observed dependence of LWR upon pixel integration time must be a measurement artifact. Figure reproduced from Ref. 14.

In practice the size of the bias is likely to be an important measurement issue. For dense lines, for example, the ITRS specification for CD "metrology tool precision" in 2005 is 2 nm. The metrology tool precision is closely related to $\sigma_\varepsilon$; $3\sigma_\varepsilon$ is essentially the metrology tool precision under the conditions employed for a roughness measurement. These conditions are likely to be somewhat more demanding than those for a CD measurement, because CD measurements can afford to average over some length of the line whereas such spatial averaging removes short wavelength roughness that might be of interest in a roughness measurement. For this reason, it is quite likely that in 2005 $3\sigma_\varepsilon$ is 2 nm or greater. Contrast this to the ITRS requirement that $3R_t \le 2.6$ nm in the same year. Since these numbers differ only a little, it means the true roughness term and the bias term in Eq. (3) are comparable in size.
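To make the size of the effect concrete, Eq. (3) can be multiplied through by 9 and evaluated at these values, taking $3\sigma_\varepsilon = 2$ nm (the lower end of the estimate above) and $3R_t = 2.6$ nm:

$$3R_0 = \sqrt{(3R_t)^2 + (3\sigma_\varepsilon)^2} = \sqrt{(2.6\ \text{nm})^2 + (2\ \text{nm})^2} \approx 3.3\ \text{nm}.$$

Under these assumed values, an uncorrected measurement would overstate the true roughness by roughly a quarter.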

If repeated measurements of the same area are made, the data contain the information that one needs to estimate and correct for the bias. The steps in this process are as follows [14]: (1) Divide the total electron dose per pixel in two. (2) Instead of a single image at the full dose, acquire two images each at half the dose. (3) Determine the roughness variances, $R_{01}^2$ and $R_{02}^2$, from these two images as before. (4) Determine the differences in CDs (or edge positions) at the same measurement locations in the two images. The average square of these differences is an estimate of $2\sigma_\varepsilon^2$. (5) Determine the bias-corrected roughness variance as $R_q^2 = (R_{01}^2 + R_{02}^2)/2 - \sigma_\varepsilon^2$. In this way, randomness in the measured linewidth at a fixed position on the line, which can be observed by comparing one image to the other, provides an estimate of $\sigma_\varepsilon^2$. By subtracting the bias term from the conventionally determined roughness, one obtains a corrected roughness measure. Roughness determined by this measure is shown by the lower curve in Fig. 5. The corrected metric agrees with the previous metric in the low noise limit, but maintains the same average value as the noise level is increased.
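Steps (3) to (5) can be sketched in Python as follows. This is a minimal illustration on simulated data, not the implementation of Ref. 14: the function names, the Gaussian noise model, the random seed, and the 1 nm values for both true roughness and per-image noise are all assumptions made here.

```python
import random


def roughness_variance(widths):
    """R_0^2 per Eq. (1)."""
    n = len(widths)
    mean_width = sum(widths) / n
    return sum((w - mean_width) ** 2 for w in widths) / (n - 1)


def bias_corrected_variance(widths_1, widths_2):
    """Return (R_q^2, estimated noise variance) from two half-dose images."""
    if len(widths_1) != len(widths_2):
        raise ValueError("both images must use the same measurement locations")
    n = len(widths_1)
    r01_sq = roughness_variance(widths_1)
    r02_sq = roughness_variance(widths_2)
    # Step (4): the mean squared difference estimates 2 * sigma_eps^2.
    mean_sq_diff = sum((a - b) ** 2 for a, b in zip(widths_1, widths_2)) / n
    noise_var = mean_sq_diff / 2.0
    # Step (5): subtract the noise variance from the averaged roughness variance.
    return (r01_sq + r02_sq) / 2.0 - noise_var, noise_var


# Simulated line: 5000 positions, ~1 nm true roughness, 1 nm noise per image.
rng = random.Random(0)
true_widths = [70.0 + rng.gauss(0.0, 1.0) for _ in range(5000)]
image_1 = [w + rng.gauss(0.0, 1.0) for w in true_widths]
image_2 = [w + rng.gauss(0.0, 1.0) for w in true_widths]

rt_sq = roughness_variance(true_widths)
r01_sq = roughness_variance(image_1)
rq_sq, noise_var = bias_corrected_variance(image_1, image_2)

print("true variance      ", rt_sq)
print("uncorrected (img 1)", r01_sq)
print("bias-corrected     ", rq_sq)
print("noise variance est.", noise_var)
```

Note that the corrected variance is a difference of two estimates, so for short or very noisy data sets it can come out negative; a practical procedure would need to decide how to report such cases.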

# **IS RMS THE RIGHT METRIC?**

In the previous section we devoted considerable attention to measurement issues and solutions associated with the root mean square roughness metric. This attention is justified by its near-exclusive use in semiconductor electronics applications. However, it is still only a few years ago, in 2001, that LER and LWR made their first appearance in the industry roadmap. Given the recentness of the industry's concern about edge and width roughness, we may question whether the emphasis on this metric is the industry's considered choice or, rather, simply a kind of default choice resulting from its greater familiarity.

We have seen that $R_0$ (or the bias-corrected version, $R_q$) integrates all roughness frequencies between $1/\lambda_{\rm max}$ and $1/\lambda_{\rm min}$ (Fig. 2). This means roughness frequencies are effectively sorted into two kinds: those outside of these limits, which are given no weight, and those within these limits, all of which are given the same weight. Is this a realistic description of how roughness affects device performance? It seems unlikely. For example, it seems likely that the fidelity with which edge roughness transfers from a photomask line edge to the resist pattern will vary continuously, from very high at roughness wavelengths that are long compared to the exposure tool's spatial resolution to very low at wavelengths that are short compared to it. If this were the relevant process step, we would need roughness frequencies to have smoothly varying weights between 0 and 1. If we are considering some different process step (gate leakage due to channel length variation or interconnect conductivity due to diffuse electron scattering from the conductor's edges) we should expect different weighting. Thus, one might imagine a weighted RMS metric, with the weights determined by the process step under consideration.
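One way to write such a weighted metric, offered only as a sketch, is shown below. The symbols $R_g$ and $g_k$ are introduced here for illustration, and the PSD components are assumed to be normalized so that their unweighted sum over the measured band equals $R_0^2$:

$$R_g^2 = \sum_k g_k\, \overline{P_k}, \qquad 0 \le g_k \le 1 .$$

The conventional metric is recovered as the special case $g_k = 1$ for frequencies between $1/\lambda_{\rm max}$ and $1/\lambda_{\rm min}$ and $g_k = 0$ elsewhere. A process-specific choice would replace this step function with a smoothly varying transfer function.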

**FIGURE 6.** (a) Amplitude density function (the more jagged curve) of a data set compared to the Gaussian with the same average and standard deviation (smoother curve). (b) The low width tail portion of the figure in a is shown enlarged. The Gaussian fit is now shown with a dash-dot curve. The measured ADF (continuous curve) is shown inside a $\pm 1$ standard deviation uncertainty interval (dashed curves).

There are other metrics that are not RMS-based at all. Some that seem promising are based on the amplitude density function (ADF). The ADF for width variation is the probability per unit length that a randomly selected measured width will lie between $W$ and $W + dW$. It can be estimated by normalizing the histogram of the binned widths:

$$\text{ADF}(W) = \frac{H(W, \Delta W)}{N\Delta W} \,. \tag{4}$$

Here, $H(W, \Delta W)$ is the histogram of the *N* measured widths with bin size $\Delta W$. The normalization ensures that the sum of $\text{ADF}(W)\,\Delta W$ over all the bins is 1. An example of an actual ADF, taken from CD-SEM measurements of five approximately 72 nm wide lines measured at 2 nm intervals over a 2 µm length, is shown (the jagged curve) in Fig. 6a.
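A minimal Python sketch of Eq. (4) follows. The function name `amplitude_density`, the choice to start the first bin at the smallest width, and the sample widths are assumptions made for illustration.

```python
import math


def amplitude_density(widths, bin_size):
    """Estimate the ADF per Eq. (4).

    Returns a list of (bin_lower_edge, density) pairs for the occupied bins.
    Bins start at the smallest measured width (an arbitrary choice made here).
    """
    n = len(widths)
    origin = min(widths)
    counts = {}
    for w in widths:
        index = int(math.floor((w - origin) / bin_size))
        counts[index] = counts.get(index, 0) + 1
    return [
        (origin + index * bin_size, counts[index] / (n * bin_size))
        for index in sorted(counts)
    ]


# Hypothetical widths in nm.
sample_widths = [70.2, 71.1, 71.4, 72.0, 72.3, 72.9, 73.5, 74.8]
adf = amplitude_density(sample_widths, bin_size=1.0)

for lower_edge, density in adf:
    print(f"{lower_edge:6.1f} nm  {density:.3f} per nm")

# Normalization check: the densities times the bin size sum to 1.
print(sum(density * 1.0 for _, density in adf))
```

Empty bins are simply omitted from the returned list; they carry zero density and do not affect the normalization.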

It is often assumed that ADF(W) is a normal (Gaussian) distribution, and this appears to be true to a reasonable approximation in Fig. 6a. If the mean and standard deviation of the measured W values are used to plot a normal probability function, that function (smooth curve in Fig. 6a) fits the data as shown. However, it has been remarked that real distributions are more often non-Gaussian in the tails than near their centers.[15] Such is the case for this one, as can be seen in Fig. 6b, which enlarges the small rectangular region marked in the lower left corner of Fig. 6a. Here the measured distribution is shown as a continuous line, with dashed lines above and below representing a $\pm 1$ standard deviation uncertainty interval. The Gaussian estimate of the distribution (the dash-dot line) lies for the most part below the measured distribution and consistently outside of this interval. The difference is reasonably large: the Gaussian distribution underestimates the likelihood of large deviations (widths 7 nm or more below the average width) by almost 40 % of the true likelihood. Is this important? After all, this region accounts for only about 1 % of the measured widths. If the Gaussian estimate is wrong in so few cases, then should we be concerned?

Maybe so. One can imagine plausible scenarios in which this part of the width distribution is nearly all-important. Example 1: Consider off-state leakage in a transistor gate. If the gate electrode has nonzero LWR then the length of the conducting channel will vary. If we model the various parts of the transistor as parallel conductors (a model that appears to be only approximate,[4] but good enough for our present illustrative purpose) then the total leakage is the integral of the leakages from various infinitesimal segments:

$$I_{\text{off}} = \int_{0}^{\infty} I(W)\, \text{ADF}(W)\, dW \,. \tag{5}$$

Here $W$ is the gate width (or channel length), $I(W)$ is the off-state current appropriate to a transistor with uniform channel length $W$, and $\text{ADF}(W)\,dW$ is the probability of encountering channel length $W$. Diaz et al.[2] and Xiong et al.[4] both model leakage using a super-exponential increase of leakage current with decreasing gate length. The super-exponentially high values of current for small $W$ mean the low width tail of the ADF is significantly overweighted in this integral. That is, a significant part of the leakage comes from just those unusually narrow regions that the Gaussian curve in Fig. 6 underrepresents. Example 2: Suppose we are concerned about the impact of roughness on device yield. Even without considering any specific mechanism by which roughness affects the yield, we can say that for any well-designed process the probability of a yield-reducing event in a given transistor had better be very low. We therefore expect to find such events in the tails of the distribution.
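The overweighting of the narrow tail in Example 1 can be illustrated with a discrete version of Eq. (5), in which each of the $N$ measured widths carries probability $1/N$. Everything specific in this sketch is an assumption made for illustration: the plain exponential `leakage_current` is only a stand-in for the super-exponential models cited above, and the nominal width, decay length, and width data are invented round numbers.

```python
import math


def leakage_current(width, nominal=72.0, decay_length=2.0):
    """Hypothetical I(W), normalized to 1 at the nominal width (all in nm).

    A simple exponential placeholder, NOT the model of Refs. 2 or 4.
    """
    return math.exp(-(width - nominal) / decay_length)


def off_state_leakage(widths, current=leakage_current):
    """Discrete Eq. (5): average of I(W) over the measured widths."""
    return sum(current(w) for w in widths) / len(widths)


def tail_share(widths, threshold, current=leakage_current):
    """Fraction of the total leakage contributed by widths below threshold."""
    total = sum(current(w) for w in widths)
    tail = sum(current(w) for w in widths if w < threshold)
    return tail / total


# Invented data: 99 % of positions at nominal width, 1 % narrowed by 10 nm.
widths = [72.0] * 99 + [62.0]

print(off_state_leakage(widths))  # relative to a perfectly uniform 72 nm gate
print(tail_share(widths, 65.0))   # share of leakage from the narrow 1 %
```

In this invented case the narrowest 1 % of positions carries more than half of the leakage, which is the qualitative point of Example 1; the actual share depends entirely on the true form of $I(W)$.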

If either of the above scenarios or another like them turns out to be important, we will require our model probability distribution to get the tails right. We have seen that the SEM (or another high spatial resolution microscopy, like atomic force microscopy) can provide a direct measure of the ADF. Estimating the tail indirectly by a parameter (such as the standard deviation) determined from the main body of the distribution is likely to be less valuable under this circumstance than a metric determined directly from the tail. This will create its own set of metrology challenges. Although unusual but real width excursions inhabit the tails of the distribution, measurement outliers are also to be found there. Good measurements of the actual distribution of rare events will require robust metrology.

For some applications, such as rank ordering samples from least to most rough, a curve or function like the ADF will not do. One needs a simple numerical metric. There are a number of ways that such metrics can be derived from the ADF. For example, one might define a probability metric by summing the ADF over all bins that correspond to widths smaller than a predetermined width. For example, $P_{-x}$ could be the probability that a measured width will be more than $x$ nm below the average width. The larger the roughness, the larger the value of $P_{-x}$. For the distribution in Fig. 6 it happens that $P_{-7 \text{ nm}} \approx 0.01$. Alternatively, one could define a percentile metric as the width deviation at which a particular probability is reached. Thus, in this example, a probability of 0.01 (the 1.0 percentile) corresponds to a width deviation of -7 nm. The larger the magnitude of this deviation, the larger the roughness.
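Both tail metrics can be computed directly from the measured widths without binning, as in the following sketch. The function names, the simple empirical-quantile rule, and the sample data are assumptions made here for illustration.

```python
import math


def probability_below(widths, x):
    """P_{-x}: fraction of widths more than x below the average width."""
    n = len(widths)
    mean_width = sum(widths) / n
    return sum(1 for w in widths if w < mean_width - x) / n


def percentile_deviation(widths, probability):
    """Width deviation from the average at which the cumulative fraction
    of measurements first reaches the given probability."""
    n = len(widths)
    mean_width = sum(widths) / n
    deviations = sorted(w - mean_width for w in widths)
    index = max(0, math.ceil(probability * n) - 1)
    return deviations[index]


# Invented data in nm: average 72, with one narrow and one wide excursion.
widths = [72.0] * 98 + [64.0, 80.0]

print(probability_below(widths, 7.0))      # fraction more than 7 nm narrow
print(percentile_deviation(widths, 0.01))  # deviation at the 1st percentile
```

Either number can be used to rank samples, but both rest on very few data points in the tail, so their uncertainty and their sensitivity to measurement outliers deserve the same scrutiny as the ADF itself.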

# **SUMMARY AND CONCLUSIONS**

The standard deviation, an RMS measure of roughness, is currently the most commonly used metric in semiconductor electronics applications. Three problems that arise in roughness metrology when using this metric were reviewed. (1) The RMS roughness of a line is a function of the length of line that is measured and the sampling interval with which it is measured. (2) Image noise and finite sampling result in random errors in the measured roughness. These can be significant compared to ITRS specifications for roughness measurement precision. (3) Image noise can result in a positive measurement bias. The measured roughness is larger than the true roughness because of a false "noise roughness." In practical measurements the noise roughness can be comparable in size to the ITRS-specified sample roughness. For this reason it should be regarded as a significant issue, even though the ITRS does not have a specification for maximum bias.

Solutions to these problems were also reviewed. (1) Roughness comparisons should be based upon measurements taken with the same sampling length and sampling interval. The sampling length and interval should be chosen purposefully, based upon some model of how the various roughness frequencies affect device performance. (2) Random errors can be reduced by averaging. A simple expression, Eq. (2), allows a determination of how much sampling must be done to achieve a desired RMS roughness measurement repeatability. (3) Bias can be reduced by slightly changing the measurement procedure in a way that permits measurement of the bias. With an estimate of the bias, a correction can be applied.
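The idea behind solution (3) can be illustrated with a short Python sketch. The procedure assumed here is a hypothetical stand-in and may differ in detail from the one reviewed in this report: the same line segment is measured twice, each measurement is taken to be the true width plus independent zero-mean noise, and the noise variance is subtracted from the measured variance. All names and the synthetic data are illustrative.

```python
import numpy as np

def noise_corrected_roughness(scan_a, scan_b):
    """Return (measured, noise, corrected) RMS roughness from two repeat scans.

    Assumes scan = true width + independent zero-mean noise, so that
    var(measured) = var(true) + var(noise).
    """
    a = np.asarray(scan_a, dtype=float)
    b = np.asarray(scan_b, dtype=float)
    # Biased estimate: variance of the measured widths, averaged over both scans.
    var_measured = 0.5 * (a.var(ddof=1) + b.var(ddof=1))
    # The true width cancels in the difference, leaving twice the noise variance.
    var_noise = 0.5 * (a - b).var(ddof=1)
    # Clamp at zero in case sampling error makes the difference negative.
    var_true = max(var_measured - var_noise, 0.0)
    return tuple(float(np.sqrt(v)) for v in (var_measured, var_noise, var_true))

# Synthetic example: 2.0 nm true roughness, 1.5 nm noise per scan (hypothetical).
rng = np.random.default_rng(2)
n = 50_000
true_widths = rng.normal(100.0, 2.0, size=n)
scan_a = true_widths + rng.normal(0.0, 1.5, size=n)
scan_b = true_widths + rng.normal(0.0, 1.5, size=n)

sigma_measured, sigma_noise, sigma_corrected = noise_corrected_roughness(scan_a, scan_b)
print(f"measured  = {sigma_measured:.2f} nm")
print(f"noise     = {sigma_noise:.2f} nm")
print(f"corrected = {sigma_corrected:.2f} nm")
```

The sketch uses uncorrelated widths for simplicity. Real edges have correlated roughness, and repeated imaging may itself alter the sample, neither of which is modeled here.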

A metric should eventually be tied to device performance (or something else that we care about) both theoretically, through some reasonable model, and experimentally, through a demonstrated correlation. The RMS roughness metric is not the only possible one, and although it is currently the most commonly used, it is not clear that it is the best, or that if it is the best for some process steps it will be the best for all. Too little is currently known about how roughness affects performance to be certain. For some processes it may be necessary to measure roughness in order to predict the likelihood of rare but important excursions from average width or edge position. In such cases, to rely upon the standard deviation is to extrapolate the behavior of the bulk of the probability distribution to predict the behavior of the tail. This extrapolation requires one to assume a particular form for the distribution, for example to assume that it is Gaussian. Such an assumption, as we have seen, may not necessarily be correct in the tails even when it is a good approximation elsewhere. It is, moreover, not a necessary assumption, since the kind of data we already acquire when determining LWR (at least when using high spatial resolution microscopy like SEM or atomic force microscopy) can be used to construct the amplitude density function (ADF). The ADF is an estimate of the probability distribution, tails and all. (The amount of data and the robustness of the metrology required to estimate the tails reliably were, however, not explored for this report.) Since there exist plausible scenarios in which other roughness metrics are superior to the RMS metric, we should remain open-minded about metrics until more is known about whether these or similar scenarios are important.

# **ACKNOWLEDGMENTS**

This work was funded by the NIST Office of Microelectronics Programs, through the National Semiconductor Metrology Program and by the Nanomanufacturing Program of NIST's Manufacturing Engineering Laboratory.

# **REFERENCES**

1. A. Yamaguchi, R. Tsuchiya, H. Fukuda, O. Komuro, H. Kawada, and T. Iizumi, "Characterization of Line-Edge Roughness in Resist Patterns and Estimation of its Effect on Device Performance," Proc. SPIE **5038**, 689-698 (2003).
2. C. Diaz, H.-J. Tao, Y.-C. Ku, A. Yen, and K. Young, "An Experimentally Validated Analytical Model for Gate Line-Edge Roughness (LER) Effects on Technology Scaling," IEEE Electron Device Letters **22**(6), 287-289 (2001).
3. K. Patterson, J. L. Sturtevant, J. Alvis, N. Benavides, D. Bonser, N. Cave, C. Nelson-Thomas, B. Taylor, and K. Turnquest, "Experimental Determination of the Impact of Polysilicon LER on sub-100 nm Transistor Performance," Proc. SPIE **4344**, 809-814 (2001).
4. S. Xiong, J. Bokor, Q. Xiang, P. Fisher, I. Dudley, and P. Rao, "Study of Gate Line Edge Roughness Effects in 50 nm Bulk MOSFET Devices," Proc. SPIE **4689**, 733-741 (2002).
5. A. Asenov, S. Kaya, and A. R. Brown, "Intrinsic Parameter Fluctuations in Decananometer MOSFETs Introduced by Gate Line Edge Roughness," IEEE Trans. Electron. Devices **50**(5), 1254-1260 (2003).
6. Q. Lin, C. Black, C. Detavernier, L. Gignac, K. Guarini, B. Herbst, H. Kim, P. Oldiges, K. Petrillo, and M. Sanchez, "Does Line Edge Roughness Matter?: FEOL and BEOL Perspectives," Proc. SPIE **5039**, 1076-1085 (2003).
7. A. Yamaguchi, K. Ichinose, S. Shimamoto, H. Fukuda, R. Tsuchiya, K. Ohnishi, H. Kawada, and T. Iizumi, "Metrology of LER: influence of line-edge roughness (LER) on transistor performance," Proc. SPIE **5375**, 468-476 (2004).
8. J.-Y. Lee, J. Shin, H.-W. Kim, S.-G. Woo, H.-K. Cho, W.-S. Han, and J.-T. Moon, "Effect of line edge roughness (LER) and line width roughness (LWR) on Sub-100 nm Device Performance," Proc. SPIE **5376**, 426-433 (2004).
9. K. Shibata, N. Izumi, and K. Tsujita, "Influence of line edge roughness on MOSFET devices with sub-50nm gates," Proc. SPIE **5375**, 865-873 (2004).
10. International Technology Roadmap for Semiconductors (ITRS), 2003 Edition, http://public.itrs.net.
11. V. Constantoudis, G. P. Patsis, L. H. A. Leunissen, and E. Gogolides, "Line edge roughness and critical dimension variation: Fractal characterization and comparison using model functions," J. Vac. Sci. Technol. B **22**, 1974-1981 (2004).
12. Daniel J. C. Herr (private communication) called this to my attention.
13. B. D. Bunday, M. Bishop, D. McCormack, J. S. Villarrubia, A. E. Vladár, R. Dixson, T. Vorburger, and N. G. Orji, "Determination of Optimal Parameters for CD-SEM Measurement of Line Edge Roughness," Proc. SPIE **5375**, 515-533 (2004).
14. J. S. Villarrubia and B. D. Bunday, "Unbiased Estimation of Linewidth Roughness," Proc. SPIE **5752** (2005), in press.
15. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, *Numerical Recipes in C* (Cambridge University Press, Cambridge, 1988), p. 520.