# The Banana-Doughnut Debate Explained (Montelli)

I have written this somewhat simplified account after many requests from colleagues and students to explain what the series of articles, *Comments* and *Replies* in Geophysical Journal is about. It is not intended to replace that discussion, which should be the ultimate source of information. We ourselves have struggled to understand some of the issues raised by our colleagues, and to clarify our own writings in a more accessible language - we are aware that the original literature is not easily accessible. I wish to state from the outset that we have the highest respect for our opponents in this debate. In fact, apart from a few quibbles regarding the purely mathematical formalism, there is much common ground. We differ with Rob van der Hilst in one important aspect: we consider it much more urgent to replace ray theory with more powerful methods than he does. I hope the following will help those interested to enter the debate.

## Why go bananas?

For more than a hundred years, seismologists have treated seismic P and S waves as if they satisfy the same laws as optical rays (*ray theory*). This approach has been extraordinarily successful: it enabled Beno Gutenberg to determine the radius of the Earth's core in 1910, Inge Lehmann to discover the tiny inner core in 1936, and Harold Jeffreys and Keith Bullen to derive a spherical Earth model that could accurately predict seismic travel times and be used to locate earthquakes by reading the times of P and S waves from seismograms recorded far away. But even after those gigantic accomplishments, ray theory continued to help seismologists make fundamental discoveries. In the 1970's, it had become evident that a spherical Earth model was not sufficient: some regions of the Earth are slower or faster than others, basically because they are warmer or cooler. Researchers at MIT and Harvard, led by Keiti Aki and Adam Dziewonski, pioneered the technique of seismic tomography. This gave us the first tangible evidence that the deep seismicity near the ocean trenches is actually caused by oceanic lithosphere sinking back into Earth's mantle. In 1990, an undergraduate at Utrecht, Suzan van der Lee - now at Northwestern University - came up with the first image of a slab (the Aegean) sinking into the lower mantle for her senior thesis project; because this image ran counter to the prevalent opinion that the less dense upper mantle did not mix with heavier lower mantle rock, it took a while to double-check it with her advisors and get it into print [1].

At the same time Rob van der Hilst - also at Utrecht, now at MIT - came up with extensive evidence that the slabs around the Pacific were able to sink into the lower mantle [2]. These findings seemed to make mincemeat of the so-called *two layered* convection model. This model states that Earth cools off with little or no exchange of mass between upper and lower mantle. It was successful in explaining geochemical observations. These required a hidden *reservoir* where argon and helium could be held since Earth's formation without escaping into the atmosphere, and the lower mantle was the obvious candidate for that as long as its rock stayed where it was: deep.

But after all these successes, limitations of *ray theory* became apparent. Wavelengths of seismic waves often measure in the hundreds of kilometers, and this makes it difficult to see small objects with seismic waves. This is very similar to the limitations of the optical microscope, which does not allow us to see beyond a very small resolution, given by the Fresnel zone. If we treat seismic waves like optical rays, they must also have Fresnel zones. Because seismic rays curve back to the surface of Earth, these zones take on the shape of bananas, as is shown in the following figure:

## Why doughnuts?

The simple idea of a Fresnel zone gave us an idea of what we could resolve in the limit of *ray theory*, but it gave us no way to improve on it. But independently from the work on P and S waves, seismologists working with very low-frequency waves (*surface waves* and *normal modes*) had already figured out how complicated the actual sensitivity of seismic waves is. Figure 2 shows that the sensitivity for a normal mode is also spread out over the surface of the Earth in a band-like structure (though it looks more like a caterpillar than a banana).

In the mid 1980's, Roel Snieder at Utrecht (now at Colorado School of Mines) was the first to derive similar sensitivities for surface waves at much higher frequencies. He also attempted to interpret perturbations in such seismic waveforms tomographically, using a first order perturbation theory. But it was not until about ten years later that it became evident that one could get a Fresnel zone-like *sensitivity kernel* by summing over many such surface wave modes, as shown in Figure 3.

These kernels still look very much like the Fresnel zone sketched in Figure 1. The *doughnut hole* made its appearance in the theory when postdoc Henk Marquering at Princeton started to look at the sensitivity of *arrival times* of seismic S waves, using techniques very similar to those used earlier by Snieder and by Li and Tanimoto.

Marquering's results were initially baffling and seemed to contradict every seismology textbook published in the 20th century: the location of the ray itself was shown to be a region of *zero* sensitivity for the travel time of the wave! In striking contrast, *ray theory* defines the ray as the *only* region where the travel time is sensitive to Earth's structure, and the two theories are in obvious conflict with each other. But after grappling for a while with this paradox we understood it (an explanation is provided in [18]), and extensive numerical tests soon proved that it was, in fact, *ray theory* that is deficient, not the new *finite frequency* theory. Princeton's Tony Dahlen soon developed a very efficient, though approximate, way to compute the sensitivity kernels for travel time (named *banana-doughnut kernels* by Marquering because the region of zero sensitivity creates a doughnut hole at the center of the banana). Figure 4 shows two such kernels, computed with Dahlen's theory [3].

## Why dispute it?

The concept that travel times cannot be influenced by Earth structures located *on the ray itself* is counterintuitive, and has met with resistance of two kinds. Many - even very esteemed - seismologists initially expressed disbelief, but that hesitancy usually disappeared when we showed how well the theory worked when tested on purely synthetic seismograms computed with the pseudospectral method (see the publications of Shu-Huei Hung and Adam Baig in Geophys. J. Int. [4-8]). The second resistance was of the *it doesn't make a difference in practice* type. Yet the kernels soon turned out to make a decisive contribution to the resolution of small features when Princeton graduate student Raffaella Montelli started to test the new theory on actual data and - to her own surprise - discovered more than a dozen plumes (hot uprisings under islands like Hawaii) in the lower mantle of Earth. By coincidence, this discovery came at a time when the plume hypothesis itself was under fire. I suspect that this has helped to fuel the debate about the banana-doughnut kernels (paradoxically, plume opponents and I may soon find common ground, since analysis of the lower mantle plumes seems to raise significant problems for the model of *whole mantle convection* and bring us back to a very limited mass exchange between upper and lower mantle). So the kernels do make a difference!

Not everyone agrees, however. Martijn de Hoop (now Purdue) and Rob van der Hilst wrote a paper in Geophysical Journal [9] in which they claim that, despite "the sensitivity being identical zero on the unperturbed source-receiver ray", "the kernel itself does not have a zero on the unperturbed ray". The discussion is very much about mathematical niceties that an ordinary seismologist (such as me) would not ever wish to worry about, but in the end this is nonsense in our view, as we have detailed in Dahlen, F.A. and G. Nolet, Geophys. J. Int. 163, 949-951, 2005 [10].

For a reply by de Hoop and van der Hilst, see [11].

One particular aspect of the paper by de Hoop and van der Hilst may lead readers astray, and for a proper understanding of finite frequency theory we explore it in some detail. They claim that one can *mollify* kernels to make the zero sensitivity disappear. They state that this *mollifying* happens implicitly when the matched filter that is used in the measurement of the time delay is not exactly equal to the observed P or S pulse, or when it is time-shifted, e.g. as a consequence of an error in the earthquake's origin time.

First of all, this view reflects a basic misconception of the role of banana-doughnut kernels: they are *designed* to interpret the difference between the matched and observed waveforms! This has nothing to do with the zero sensitivity on the ray. The argument that origin time errors can annihilate the zero sensitivity is correct, but *only if the tomographer allows origin time bias to propagate into the model* - something I teach my students to avoid at all cost. The rather complicated mathematical arguments can be paraphrased as follows.

Assume we have a tomographic system of equations with a model m, data d, time errors e and a matrix A (containing the banana-doughnut kernels in some discretized version):

**A m = d + e**

De Hoop and van der Hilst map the error e back into the model by a transformation e = E m, and rephrase the system as:

**(A-E) m = d**

to conclude that when one reads back the kernels in the new *mollified* matrix (A-E), these kernels have no zeroes on the ray. Magic? Well, this is correct in so far as one could actually do the mapping e = E m. That is a problem unless the errors can be adequately modeled by m and, most importantly, unless one actually knows the errors e; but in that case, why not simply subtract the errors from the data? If the errors, on the other hand, represent a systematic shift (e.g. because all origin times are estimated too early), no one would do tomography this way, because it maps the bias explicitly into the model. In practice, tomographers center the distribution of delay times around 0 precisely to prevent such bias from affecting the model.
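The centering step mentioned above can be illustrated with a small sketch. The delay values and the size of the origin-time bias are made-up numbers, and `center_delays` is a hypothetical helper, not part of any of the programs discussed here. The point is only that a shift common to all delay times disappears once the distribution is centered on zero, so it cannot leak into the model.

```python
from statistics import fmean


def center_delays(delays):
    """Subtract the mean so the delay-time distribution is centered on zero."""
    mean_delay = fmean(delays)
    return [t - mean_delay for t in delays]


# Hypothetical delay times in seconds (illustrative values only).
true_delays = [0.4, -0.2, 0.1, -0.3, 0.0]

# Assumed systematic error: every origin time is off by the same amount,
# which adds one constant to all observed delays.
origin_time_bias = 1.5
biased_delays = [t + origin_time_bias for t in true_delays]

centered_true = center_delays(true_delays)
centered_biased = center_delays(biased_delays)

# After centering, the biased and unbiased data sets coincide
# (up to floating-point rounding).
max_difference = max(abs(a - b) for a, b in zip(centered_true, centered_biased))
print(f"largest difference after centering: {max_difference:.2e}")
```

Note that centering also removes any genuine average delay in the data. Whether the mean is removed per event or over the whole data set is a choice this sketch does not address.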

The second attack by van der Hilst and de Hoop is in another paper published in Geophysical Journal [12]. The argument is mainly that the beneficial effects of banana-doughnut kernels are limited to a small magnification of amplitude anomalies that could easily have been accomplished by reducing the *damping* (or *regularization*) of the inversion that is at the basis of every tomographic interpretation. We disagree with the authors that there is that much freedom in choosing one's damping strategy, because data errors are known quite well and the misfit of the data (or *chi-square* in the language of statistics) cannot be too large. We reject the examples shown by the authors to illustrate their point of view because they are selected for cases where the Fresnel zone is narrow (near the surface) or where the anomalies are much larger than the Fresnel zone (slabs). We also reject their statistical analysis because it fails to take into account that wavefront healing has both positive and negative signs - resulting in a null result when averaging over all effects, as the authors do. But these papers are much easier to understand, and we refer the interested reader to [12] and to a preprint of our response.

As a footnote, there is also a paper that questions the beneficial effects of finite frequency inversions for surface wave phases [14]. Such kernels have been developed by Ying Zhou (now at Virginia Tech) [15-17]. In contrast to Zhou, Anne Sieminski does not invert for 3D structure of the Earth, but attempts to retrieve a 2D map of *phase velocities*. This approach has two limitations. First of all, inverting data for one frequency only ignores the beneficial effects of the different widths of sensitivity kernels at different frequencies (which is correlated with the depth of the sensitivity, but that is no reason not to exploit it). But most importantly, there is again a conceptual misunderstanding: the concept of *local phase velocity* implies the validity of *ray theory* for surface waves! If wavefronts are deformed by heterogeneities off the geometrical raypath, their local speed is determined by their past history, and is not unique. Ying Zhou [15] has shown how large the approximations are that one has to make to adopt a local velocity, and that these approximations usually annihilate the beneficial effects of finite frequency interpretations. If the path coverage is dense, and the kernel width need not be exploited to get a good resolution, even the local velocity approach - despite its significant shortcomings - can yield a phenomenal increase in resolution [19].

## Further reading

The original paper to study finite-frequency tomography for body waves is Dahlen et al., GJI, 2000 [3].

However, many may find this a hard nut to crack. A simplified derivation of the same results, together with a method to compute kernels in local, 3D structures, can be found in [18].


## Banana-doughnut software

Software is being cleaned up and documented, and has been tested for the situations in which we used it, but it is not yet in a really user-friendly form. However, I have begun to give out the three most fundamental programs to students and colleagues who are willing to test them on their own data sets and commit to giving feedback (and not just bug me with questions, because for that I have no time!):

[a] **grafbdyn.f** is a program that computes travel times and geometrical spreading in 3D media; when combined with **bdsub.f** it computes banana-doughnut kernels for travel times and amplitudes. It can be used for local studies (typically models of 100 × 100 × 50 nodes). A first application, local tomography in the Gulf of Corinth, was presented by Stephanie Gauthier at the EGU conference in Vienna, 2006 (poster).

[b] **raydyntrace.f** computes all necessary parameters to calculate kernels in the paraxial approximation as in [3], and a whole lot more (such as crustal and ellipticity corrections).

[c] Whether you apply ray theory or finite frequency theory, the principle of parsimony dictates that you should not solve for more detail in a model than is warranted by the data coverage. The way we accomplish this is to space deep model nodes further apart than shallow nodes (in addition to regularizing with a smoothness constraint). Program **springs3d.f** computes an optimum grid distribution by connecting nodes with (virtual) springs of specified length and minimizing the potential energy of the parameterization [20].

If you are interested in collaborating to make the software well-tested and user-friendly, please email me at nolet@princeton.edu.
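For readers who want a feel for the spring idea behind **springs3d.f**, here is a deliberately simplified, one-dimensional sketch. It is not the algorithm of [20] and not the Fortran program: the unit spring stiffness, the fixed end nodes, the plain gradient-descent update and the function name `relax_nodes` are all assumptions made for illustration. Nodes along a depth axis are joined by springs whose rest lengths grow with depth, and the interior nodes are moved until the potential energy is minimal.

```python
def relax_nodes(rest_lengths, start, end, step=0.4, iterations=2000):
    """Relax a 1D chain of nodes joined by unit-stiffness springs.

    The two end nodes stay fixed. Interior nodes move down the gradient of
    E = 0.5 * sum((x[i+1] - x[i] - rest_lengths[i])**2).
    """
    n = len(rest_lengths)
    # Start from evenly spaced nodes between the fixed end points.
    x = [start + (end - start) * i / n for i in range(n + 1)]
    for _ in range(iterations):
        # Extension of each spring relative to its specified rest length.
        ext = [x[i + 1] - x[i] - rest_lengths[i] for i in range(n)]
        for i in range(1, n):
            # dE/dx[i] = ext[i-1] - ext[i]; take a step against the gradient.
            x[i] -= step * (ext[i - 1] - ext[i])
    return x


# Hypothetical rest lengths: short springs near the surface, long ones at depth.
rest_lengths = [1.0, 1.0, 2.0, 2.0, 4.0]
nodes = relax_nodes(rest_lengths, start=0.0, end=10.0)
spacings = [nodes[i + 1] - nodes[i] for i in range(len(rest_lengths))]
print("node positions:", [round(p, 6) for p in nodes])
print("node spacings: ", [round(s, 6) for s in spacings])
```

Because the rest lengths in this example add up exactly to the length of the interval, the relaxed spacings approach the rest lengths themselves, so the nodes end up denser near the surface and sparser at depth. In a real 3D parameterization the desired spring lengths would have to come from the data coverage, which this sketch does not model.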

## References

[1] Spakman, W., van der Lee, S., & van der Hilst, R., Travel-time tomography of the European-Mediterranean mantle down to 1400 km, Phys. Earth Plan. Int., 79, 3-74, 1993.

[2] van der Hilst, R., Engdahl, E.R., Spakman, W. & Nolet, G., Tomographic imaging of subducted lithosphere below northwest Pacific island arcs, Nature, 353, 37-43, 1991.

[3] Dahlen, F.A., Hung, S.-H. & Nolet, G., 2000. Fréchet kernels for finite-frequency travel times - I. Theory, Geophys. J. Int., 141, 157-174.

[4] Baig, A. & Dahlen, F., 2004. Statistics of traveltimes and amplitudes in random media, Geophys. J. Int., 158, 187-210.

[5] Baig, A., Dahlen, F., & Hung, S.-H., 2003. Traveltimes of waves in three-dimensional random media, Geophys. J. Int., 153, 467-482.

[6] Hung, S.-H., Dahlen, F., & Nolet, G., 2000. Fréchet kernels for finite-frequency travel times - II. Examples, Geophys. J. Int., 141, 175-203.

[7] Hung, S.-H., Dahlen, F., & Nolet, G., 2001. Wavefront healing: a banana-doughnut perspective, Geophys. J. Int., 146, 289-312.

[8] Baig, A. & Dahlen, F., 2004. Traveltime biases in random media and the S-wave discrepancy, Geophys. J. Int., 158, 922-938.

[9] de Hoop, M. & van der Hilst, R., 2005. On sensitivity kernels for wave equation transmission tomography, Geophys. J. Int., 160, 621-633.

[10] Dahlen, F. & Nolet, G., 2005. Comment on the paper on sensitivity kernels for wave equation transmission tomography by de Hoop and van der Hilst, Geophys. J. Int., 163, 949-951.

[11] de Hoop, M. & van der Hilst, R., 2005. Reply to a comment by F.A. Dahlen and G. Nolet on: On sensitivity kernels for wave equation tomography, Geophys. J. Int., 163, 952-955.

[12] van der Hilst, R. & de Hoop, M., 2005. Banana-doughnut kernels and mantle tomography, Geophys. J. Int., 163, 956-961.

[14] Sieminski, A., Lévêque, J.-J., & Debayle, E., 2004. Can finite-frequency effects be accounted for in ray theory surface wave tomography?, Geophys. Res. Lett., 31, doi:10.1029/2004GL021402.

[15] Zhou, Y., Dahlen, F., & Nolet, G., 2004. Three-dimensional sensitivity kernels for surface wave observables, Geophys. J. Int., 158, 142-168.

[16] Zhou, Y., Dahlen, F., Nolet, G., & Laske, G., 2005. Finite-frequency effects in global surface wave tomography, Geophys. J. Int., 163, 1087-1111.

[17] Zhou, Y., Nolet, G., Dahlen, F., & Laske, G., 2005. Global upper mantle structure from finite-frequency surface-wave tomography, J. Geophys. Res., in press.

[18] Nolet, G., Dahlen, F., & Montelli, R., 2005. Traveltimes and amplitudes of seismic waves: a re-assessment, in: A. Levander and G. Nolet (eds.), Array analysis of broadband seismograms, AGU Monograph Ser., 37-48.

[19] T. Yang and D. Forsyth, Regional tomographic inversion of the amplitude and phase of Rayleigh waves with 2-D sensitivity kernels, Geophys. J. Int., in press, 2006.

[20] Nolet, G. and R. Montelli, Optimum parameterization of tomographic models, Geophys. J. Int., 161, 365-372, 2005.