Statistical analysis of seismicity is critical for understanding earthquake observations, testing proposed prediction and forecast methods, and assessing seismic hazard. Unfortunately, despite its importance to seismology — especially to those studies that potentially impact public policy — statistical seismology is mostly ignored in the education of seismologists, and there has been no central repository for relevant software. To remedy these deficiencies, and with the broader goal of enhancing the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is an educational platform that is designed to be authoritative, up-to-date, prominent, and useful. We anticipate an audience that ranges from beginning graduate students to experienced researchers.
Every co-author of this article has served as a referee for at least one seismology manuscript in which the author(s) made a questionable or incorrect application or interpretation of statistics. We suspect that most readers have had a similar experience. This is not a matter of stupidity — not even the important kind championed by Schwartz (2008) — but rather a lack of understanding and/or awareness of sometimes sophisticated mathematical concepts and how they should be applied to uncertain data. We seek to fill this gap in knowledge, understanding, and application, and to promote excellence in statistical seismology, by providing the information and resources necessary to understand and implement the best practices, with the hope that readers will apply these methods to their own research.
Given that seismology is a field of applied physics, it is reasonable that, starting with only Hooke’s law, students are taught to derive the wave equation, Snell’s law, reflection/refraction coefficients, and the behavior of surface waves. But seismology is also increasingly becoming a field of applied statistics, and few seismology students are taught even the most basic statistical methods, let alone the underlying theory. For instance, while most seismology texts mention the Gutenberg-Richter magnitude distribution, few include Aki’s (1965) demonstration and Weichert’s (1980) additional treatment suggesting that one should use maximum likelihood to estimate the model parameters (the a- and b-values). Such disregard for statistics in seismology texts might be explained by the fact that seismology evolved from physics and had an early emphasis on theoretical understanding. Relying on supplementary statistical courses is an imperfect solution for seismology students: Even with some basic training, it is rarely simple to apply textbook statistical procedures to problems of seismicity, where clustering undermines the common assumption of independent data, and issues of data quality related to seismic networks are unique.
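To make the maximum likelihood point concrete, the following is a minimal Python sketch of the b-value estimator usually attributed to Aki (1965), b = log10(e) / (mean(M) − Mc), where Mc is the completeness magnitude. The function name, the optional `bin_width` half-bin correction, and the large-sample standard error b/√N are assumptions of this illustration rather than anything prescribed by CORSSA, and the sketch does not estimate the a-value.

```python
import math


def estimate_b_value(magnitudes, completeness_magnitude, bin_width=0.0):
    """Maximum likelihood b-value in the style of Aki (1965).

    Only events at or above completeness_magnitude are used. bin_width is an
    assumption of this sketch: set it to the catalog's magnitude rounding
    (for example 0.1) to apply the common half-bin correction, or leave it
    at 0.0 for continuous magnitudes.
    """
    usable = [m for m in magnitudes if m >= completeness_magnitude]
    if len(usable) < 2:
        raise ValueError("need at least two events at or above the completeness magnitude")

    mean_magnitude = sum(usable) / len(usable)
    excess = mean_magnitude - (completeness_magnitude - bin_width / 2.0)
    if excess <= 0.0:
        raise ValueError("mean magnitude must exceed the completeness magnitude")

    b_value = math.log10(math.e) / excess
    # First-order standard error, valid for large samples of independent events.
    standard_error = b_value / math.sqrt(len(usable))
    return b_value, standard_error
```

Calling `estimate_b_value(catalog_magnitudes, 2.0, bin_width=0.1)` returns the estimate and its approximate standard error. Because the standard error assumes independent events, it is likely to be too optimistic for clustered catalogs, which is exactly the kind of pitfall described above.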
Because statistics is so little emphasized in seismology texts, the audience that stands to benefit from CORSSA is quite varied. CORSSA material can serve undergraduate students as a starting point to understand the issues, and it should serve graduate students as a resource for their own research. Moreover, it should serve experienced researchers from outside the statistical seismology community, and even researchers within that group, as a point of reference to enhance the quality of their work. To serve this diverse audience, CORSSA covers a wide variety of material, which we categorize using the following seven themes:
I. Introductory material
II. Basics of earthquakes
III. Statistical foundations
IV. Understanding seismicity catalogs
V. Models and techniques for analyzing seismicity
VI. Earthquake predictability and related hypothesis testing
VII. Data formats and standardized datasets
The thematic structure was devised to make it easy for readers to focus on their personal requirements: to get an introduction to statistical seismology (Theme I), or to learn about the basics of earthquakes (Theme II), statistics (Theme III), and/or the intricacies of seismicity catalogs (Theme IV) before moving on to the applications found in Themes V and VI. Theme VII provides information about data formats and standardized datasets that can be used for testing software.
Each of these themes comprises a series of articles. Articles act as tutorials and rely on previously published, peer-reviewed literature. Each article deals with a specific task or topic and includes some subset of the following: discussion of why the topic is useful for research; a brief review of theory; a list of methods and software that address this topic; a discussion of trade-offs between analysis choices; pitfalls to be aware of; example results; examples of applications in scientific journals; recommendations for further reading; and next steps for the reader to take.
CORSSA is a collection of review articles related to statistical seismicity analysis, organized by a few thematic elements, and supplemented by software packages, data, a glossary, news items, and discussion forums. To more fully understand this project, it is useful to compare it with three common contemporary research outlets: peer-reviewed journals, textbooks, and wikis. As with textbooks and unlike wikis and regular issues of journals, a comprehensive design guides CORSSA development. But unlike a book and similar to a wiki, individual CORSSA articles and other content are made available immediately once they are deemed ready, rather than waiting for everything to be completed. Moreover, given that technology now allows a more accurate representation of the ever-evolving, incremental nature of scientific advancement, the concept of a final state of knowledge is obsolete; in short, CORSSA articles and content can be revised and updated, and version information can be included when appropriate. Also as in a wiki, large datasets can be curated and presented in the context of an article and as standalone resources. Because CORSSA is primarily an educational resource, its articles will not contain new interpretive science; instead, and given that content can be updated, CORSSA will feature “living” review articles.
We believe that identifying authors and using an optionally anonymous peer-review system provide an authority that is sometimes missing in anonymous wiki entries (and Web pages in general). Therefore, as with journals, CORSSA authors are clearly identified, and articles are peer-reviewed and subject to editorial approval. By identifying authors, we also acknowledge their efforts, which are crucial to CORSSA’s existence. CORSSA articles can be cited in much the same way as traditional peer-reviewed journal articles. Although we categorize the articles by theme, the relatively small number of articles allows us a simple citation scheme without specifying a volume or issue number: Articles are cited by author(s), year, and a unique digital object identifier (DOI). CORSSA is not a traditional journal, so its articles are not indexed in the Web of Science databases. But because these articles have DOIs, the Web of Science Cited Reference Search and other tools such as Google Scholar can track citations.
Recognizing that the portable document format (PDF) is the current standard for research articles, we present articles as PDF files. Nevertheless, readers can search the text of all articles directly via the Web interface, rather than having to open each article file. The PDF also allows authors to easily include long equations and in-line vector graphics, an advantage over most Web-based content, which tends to present equations and figures as low-resolution images. Authors can also provide standalone, high-quality graphics that are appropriate for presentations. Because these articles are more educative than most research articles, we anticipate that authors will include illustrative examples and accompanying code. To accommodate this need, the CORSSA system allows authors to link an article with software, data, and accompanying explanatory text (Figure 1).
Like most textbooks and wikis, CORSSA maintains a glossary of relevant terms. If one of these terms is used in a CORSSA article, its first occurrence within the article is linked directly to the definition in the glossary (similar to the electronic version of the New York Times). The glossary includes community-developed definitions that are specific to statistical analysis of seismicity, but the definitions are general enough to be shared by multiple articles, much like a wiki. Perhaps the most important feature for readers, and unlike most textbooks and journals, is that CORSSA content is free to all.
In May 2010, 24 scientists from 11 nations attended a workshop in Zürich, Switzerland, to flesh out a plan for CORSSA and begin drafting an initial set of articles and accompanying material (Figure 2). This was a workshop in the literal sense: The majority of the time was dedicated to working in small groups, designing the contents of each thematic section. During the workshop, the authors of this article formed a CORSSA executive committee; by volunteering for this committee, we pledged our commitment to implement and publicize CORSSA, including sharing editorial and administrative responsibilities.
After the conclusion of the Zürich workshop, CORSSA participants continued drafting articles; soon thereafter, two article templates were designed and distributed: one for authors who prefer Microsoft Word and another for authors who prefer LaTeX. These templates provide a consistent look for each article with minimal typesetting.
The workshop participants agreed to use the Silva content management system for the CORSSA Web presence. Silva allows us to quickly add and edit all CORSSA content without requiring detailed knowledge of Web development technology. Silva has an open source license and was a natural choice because we rely on the technical support of the Swiss Seismological Service IT group, which was already familiar with using and supporting Silva.
We worked with colleagues at the ETH-Bibliothek (http://www.doi.ethz.ch) to obtain DOIs for CORSSA content. Serendipitously, we discovered that ETH-Bibliothek is a member of the DataCite consortium (http://datacite.org), which is one of only seven DOI registration agencies worldwide. This drastically reduced the administrative overhead and cost for DOIs. We reserved the prefix doi:10.5078/corssa, to which we append a unique eight-digit number for each article. We note that we could also register DOIs for CORSSA datasets, software, and other content in the same way, but we have not yet chosen to do so.
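As an illustration of this identifier scheme, the short Python sketch below builds and checks DOIs of the form used in the reference list of this article (for example, 10.5078/corssa-00180805). The function names are hypothetical, and the hyphen separator and zero padding are inferred from those reference entries rather than from any published CORSSA specification.

```python
import re

# Pattern inferred from the DOIs in this article's reference list (an assumption).
CORSSA_DOI_PATTERN = re.compile(r"10\.5078/corssa-[0-9]{8}")


def make_corssa_doi(article_number: int) -> str:
    """Return a CORSSA-style DOI for an article number of at most eight digits."""
    if not 0 <= article_number <= 99_999_999:
        raise ValueError("article number must fit in eight digits")
    # Zero-pad so that, for example, 180805 becomes 00180805.
    return f"10.5078/corssa-{article_number:08d}"


def is_corssa_doi(text: str) -> bool:
    """Check whether text is a CORSSA-style DOI, with or without a 'doi:' prefix."""
    cleaned = text.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    return CORSSA_DOI_PATTERN.fullmatch(cleaned) is not None
```

A check like `is_corssa_doi` could be used, for instance, to catch DOIs that were split by a stray space during typesetting.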
With an initial set of DOI-registered articles and accompanying content, the CORSSA Web presence was officially publicized to attendees of the General Assembly of the European Seismological Commission in Montpellier, France, in September 2010.
In this section, we describe CORSSA as it existed at the time of this writing; because it is a living resource, we don’t expect that the description will remain exactly accurate in the future, but this section provides the reader with an informative snapshot. We encourage the reader to visit http://www.corssa.org for current information.
At the time of this writing, CORSSA includes seven published articles across five themes. In Theme I, introductory material, Michael and Wiemer (2010) described the motivation for, and some historical development related to, the CORSSA project. Vere-Jones (2010) adapted his keynote presentation from the 2007 International Statistical Seismology (StatSei) conference, suggesting how statistical tools can aid seismicity analyses and how students of seismology can obtain effective statistical training. As part of Theme III, statistical foundations, Naylor et al. (2010) mentioned some of the difficulties that a new researcher may face when attempting exploratory data analysis with earthquake catalogs, and they provided several practical exercises and code snippets. Husen and Hardebeck (2010) contributed a review of an important topic that many researchers neglect — accuracy and precision of earthquake locations — to Theme IV, understanding seismicity catalogs. They outlined in clear language how events are usually located or relocated; they also reported typical assumptions and highlighted the coupled nature of seismic velocity models and earthquake location estimates. For Theme V, models and techniques for analyzing seismicity, Hainzl et al. (2010) reviewed work related to spatiotemporal seismicity models based on rate-and-state friction and Coulomb stress transfer; they supplemented a brief theoretical treatment with discussion of numerical algorithms for parameter value estimation. Marsan and Wyss (2011) described the challenges in robustly identifying and understanding seismicity rate changes. In Theme VI, earthquake predictability and related hypothesis testing, Zechar (2010) compared various methods for evaluating earthquake predictions and earthquake forecasts, noting advantages and disadvantages for each strategy. He also contributed several software implementations and practical applications to accompany the article.
An additional six articles exist in various states of draft. Gulia et al. (under review) discuss methods for investigating the quality of a seismic catalog, including techniques for deblasting, or identifying non-tectonic events. Mignan and Woessner (under review) comprehensively review methods used to estimate catalog completeness — the magnitude level above which all earthquakes are believed to be reliably reported. Woessner et al. (under review) tell the story of how a seismicity catalog is generated and maintained. Zhuang et al. (forthcoming) provide a broad overview of several statistical models used to describe seismicity distributions. Iwata (under review) discusses earthquake triggering caused by forces other than tectonic loading, e.g., tides and passing seismic waves. The important issue of declustering — identifying and removing aftershock sequences from catalogs — is reviewed by van Stiphout et al. (under review).
Moreover, several articles that were planned during the initial CORSSA workshop have not yet been drafted. The complete list of envisioned articles is maintained at http://www.corssa.org/articles/draft_toc.pdf and, while this list associates potential authors with potential articles, we would happily consider additional volunteer authors or article suggestions (e-mail contributions@corssa.org).
The CORSSA glossary contains 69 terms. The list of terms was originally compiled during the organizational meeting in Zürich, primarily by eavesdropping on the discussions of the individual working groups; this list was then augmented by combing through the submitted articles for frequently used terms. The linking between articles and the glossary is automated with a few simple scripts — one implemented as a macro for articles drafted using the Word template, and another implemented as a Java function that operates on LaTeX articles. Although the glossary is closely linked to the articles, it can also be used as a standalone resource for readers seeking seismicity-related definitions. The purpose of the CORSSA glossary is to provide a concise, contextual definition of each term, but we also provide links to more comprehensive treatments; for example, pointing to relevant research articles, U.S. Geological Survey Web pages for earthquake-specific terms, and Wikipedia for statistical terms. We also link some glossary terms to other glossary terms with which they can be contrasted; for example, “moment magnitude” is in this way linked with “local magnitude.”
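The linking step described above can be sketched in a few lines. The Python example below is not the Word macro or Java function used by CORSSA; it is an illustrative assumption of how first-occurrence linking might work for LaTeX source, wrapping the first match of each glossary term in an `\href` (which assumes the hyperref package). The function name and the example.org URLs are hypothetical.

```python
import re


def link_first_occurrences(text, glossary_urls):
    """Wrap the first occurrence of each glossary term in a LaTeX \\href.

    glossary_urls maps a term to the URL of its definition. Matching is
    case-insensitive and respects word boundaries.
    """
    lookup = {term.lower(): url for term, url in glossary_urls.items()}
    if not lookup:
        return text

    # Try longer terms first so "moment magnitude" wins over "magnitude".
    terms = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )

    seen = set()

    def replace(match):
        found = match.group(1)
        key = found.lower()
        if key in seen:
            return found  # later occurrences stay as plain text
        seen.add(key)
        return "\\href{" + lookup[key] + "}{" + found + "}"

    # A single pass means inserted links are never rescanned for terms.
    return pattern.sub(replace, text)


example = link_first_occurrences(
    "The moment magnitude differs from local magnitude. "
    "Moment magnitude saturates less.",
    {
        "moment magnitude": "https://example.org/glossary/moment-magnitude",
        "magnitude": "https://example.org/glossary/magnitude",
    },
)
```

In this example, the first "moment magnitude" and the standalone "magnitude" are each linked once, and the repeated "Moment magnitude" in the second sentence is left as plain text.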
All contributors to CORSSA are acknowledged on the Web presence; the list of contributors includes workshop attendees, article authors, referees, and individuals who shared software. Because we want CORSSA articles to be as useful and accurate as is practical, we host forums that allow open communication between the authors of each article and the readers; these forums also allow communication among readers. At the time of this writing, these forums have been little used, which may indicate that the forums are too new, that our reader community is too small to merit this functionality, or perhaps that readers are not accustomed to this type of interaction. We note that journals such as Nature and Nature Geoscience allow similar functionality in the form of online comments, and these too are often unused.
As a service to readers, we maintain a minimal news section that includes a list of recent and upcoming relevant meetings and a growing list of relevant journal articles. We suspect that these manually curated lists will serve as a convenient central point of access to the latest advances in statistical seismicity research. Such a resource is increasingly useful; as David Foster Wallace pointed out in 1996, as the amount of information that we receive daily continues to grow, we need some method for filtering what is important (Lipsky 2010, 38). We also intend to use this news section sparingly to announce calls for papers for special issues of journals or conference sessions.
We have received anecdotal positive feedback regarding CORSSA articles. The content has been effective for introducing new and continuing graduate students to complex topics, and article reprints that we brought to conferences have been very popular souvenirs.
Less than one year after work began on CORSSA, we have made tremendous progress in building a resource that we believe will educate students and researchers.
When designing CORSSA, we made a deliberate decision to limit the initial scope of the project; as none of the participants had done something quite like this before, and because we were mostly reliant on volunteer efforts, we resisted the temptation to make an excessively ambitious plan. It was primarily for this reason that we chose to emphasize statistical seismicity analysis rather than the broader field of statistical seismology. Moreover, seismicity analysis has tended to dominate the recent StatSei meetings (e.g., Schorlemmer and Jackson 2009).
Nevertheless, nothing about the design of CORSSA precludes us from expanding to cover other topics within statistical seismology. As research interests evolve, so too can CORSSA, provided that a sufficiently energetic community persists. We suspect that many other subfields would benefit from a resource similar to what we have designed and implemented, and because so many of the features of CORSSA are not specific to a knowledge domain, we hope that it can serve as a blueprint for others.
Portions of this article appear in slightly different form in the CORSSA article by Michael and Wiemer (2010). We thank an anonymous referee for many insightful comments and useful suggestions. We thank Benno Luthiger and Philipp Kästli for general technical assistance. We thank Angela Gastl and Francesco Croci for assistance with DOIs. We thank the following organizations for supporting CORSSA: Network of Research Infrastructures for European Seismology (NERIES), the Swiss Seismological Service, Southern California Earthquake Center, and the U.S. Geological Survey. JDZ was partially supported by NSF grant EAR-0944202. We especially thank Mietta Petronio for her patient and energetic support of this work.
Aki, K. (1965). Maximum-likelihood estimate of b in the formula log N = a − bM and its confidence limits. Bulletin of the Earthquake Research Institute 43, 237–239.
Gulia, L., S. Wiemer, and M. Wyss (under review). Catalog artifacts and quality control. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-93722864.
Hainzl, S., S. Steacy, and D. Marsan (2010). Seismicity models based on Coulomb stress calculations. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-32035809.
Husen, S., and J. L. Hardebeck (2010). Earthquake location accuracy. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-55815573.
Iwata, T. (under review). Earthquake triggering caused by the external oscillation of stress/strain changes. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-65828518.
Lipsky, D. (2010). Although of Course You End Up Becoming Yourself: A Road Trip with David Foster Wallace. New York: Broadway, 352 pp.
Marsan, D., and M. Wyss (2011). Seismicity rate changes. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-25837590.
Michael, A. J., and S. Wiemer (2010). CORSSA: The Community Online Resource for Statistical Seismicity Analysis. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-39071657.
Mignan, A., and J. Woessner (under review). Completeness magnitude in earthquake catalogs. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-00180805.
Naylor, M., K. Orfanogiannaki, and D. Harte (2010). Exploratory data analysis: Magnitude, space, and time. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-92330203.
Schorlemmer, D., and D. D. Jackson (2009). Seismologists and statisticians establish new research targets. Eos, Transactions, American Geophysical Union 90 (43); 10.1029/2009EO430008.
Schwartz, M. A. (2008). The importance of stupidity in scientific research. Journal of Cell Science 121, 1,771.
van Stiphout, T., J. Zhuang, and D. Marsan (under review). Seismicity declustering. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-52382934.
Vere-Jones, D. (2010). How to educate yourself as a statistical seismologist. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-17728079.
Weichert, D. H. (1980). Estimation of the earthquake recurrence parameters for unequal observation periods for different magnitudes. Bulletin of the Seismological Society of America 70 (4), 1,337–1,346.
Woessner, J., J. L. Hardebeck, and E. Hauksson (under review). What is an instrumental seismicity catalog? Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-38784307.
Zechar, J. D. (2010). Evaluating earthquake predictions and earthquake forecasts: A guide for students and new researchers. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-77337879.
Zhuang, J., M. J. Werner, D. Harte, S. Hainzl, and S. Zhou (forthcoming). Basic models of seismicity. Community Online Resource for Statistical Seismicity Analysis; 10.5078/corssa-47845067.
Posted: 24 August 2011