Citations: Link, Locate, Discover, Connect

Why cite?

Citation is an essential aspect of scholarly writing. Reference to other literature embeds the current work in the context of existing knowledge, supports or disputes other works, and establishes the authority of the scholarly assertion.  References serve these purposes when they allow the reader to locate the cited work, validate and explore it, and build on it through further understanding and research.

Citation indexes empower the essential functions of citation: to link materials, to locate relevant works, and to discover new papers through the network of connections created by the authors.  Garfield described this essential feature in his 1975 introduction to the first Journal Citation Reports:  “A citation index is based on the principle that there is some meaningful relationship between one paper and some other that it cites or that cites it, and thus between the work of the two authors or two groups of authors who published the papers.”[1]  These relationships, aggregated at scale, provide a deep, textured understanding of the development of the scholarly corpus.  In the context of a citation index, referencing a prior work does not just attribute credit; it creates a literal connection that reflects the intellectual connection between published works.

How to cite

The usefulness of citations to both authors and indexes depends on their providing sufficiently accurate bibliographic metadata to uniquely identify the item the author intends.  Some of the elements are primarily signifiers to human readers; others are more useful for an automated linking infrastructure.  Style guides vary in how they expect metadata to be presented, but a minimal set consists of the following:

  • Author(s) – for both human reader recognition and machine-linking
  • Publication date – for both human-reader information and machine linking
  • Item title – often optional, but generally useful to human readers
  • Source title – the book, journal or other source that published the work
  • Volume/issue/page and/or unique ID (such as a DOI or article number) – pages are a historic, print-based specifier that continues to drive both human and machine location of the item; a DOI functions primarily to facilitate automated linking for indexing and online navigation.

As with most critical systems, some level of redundancy is helpful so that small errors do not invalidate the reference or obscure its intended target.  An author name, item title and source title can assist a human reader in finding the referenced paper in the absence of a machine-readable DOI.

Turning citations into metrics in the JCR

In the processing of publication data for the creation of Web of Science, Clarivate Analytics functions, in effect, as a specialized “reader” of the citation record.  Instead of identifying cited works to read, we use the cited reference metadata to build proprietary metadata “fingerprints” of both the source material and the cited references, and use them to create a navigable link between two records.  These links can then clarify and expand the reference data, allowing us to display a single, authoritative form of the citation.  When JCR is produced, all references cited in any of the Web of Science Core Editions[2] from the completed prior year are extracted from our central production database.  When those citations are linked to a source item in the journal, we use the source record to ensure a complete and accurate reference attribution.  When a citation is not linked, it is not discarded; rather, we employ a unique set of additional aggregation steps to include these citations as part of the journal metrics.[3]  The citation dataset that is used for the JCR is a snapshot of the data at that time, rather than a continuously updated link to the citation data that are presented in Web of Science.
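
The distinction between linked and unlinked citations can be sketched in a few lines. The example below is not the proprietary fingerprinting or aggregation process; `key_of`, `Ref` and `source_index` are hypothetical names, and the match key is a deliberately simple stand-in. It only illustrates the idea that a reference which fails to link to a source record can still be counted at the level of the cited journal.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class Ref(NamedTuple):
    journal: str      # cited source title as written in the reference list
    year: int         # cited publication year
    volume: str
    first_page: str


Key = Tuple[str, int, str, str]


def key_of(ref: Ref) -> Key:
    """Hypothetical match key; a simple stand-in for a real metadata fingerprint."""
    return (ref.journal.strip().lower(), ref.year,
            ref.volume.strip(), ref.first_page.strip())


def count_journal_citations(
    refs: Iterable[Ref], source_index: Dict[Key, str]
) -> Tuple[Counter, Counter]:
    """Return (linked, unlinked) citation counts per cited journal.

    source_index maps a match key to a source record ID. A reference with
    no matching source record is kept and counted against its journal title.
    """
    linked: Counter = Counter()
    unlinked: Counter = Counter()
    for ref in refs:
        journal = ref.journal.strip().lower()
        if key_of(ref) in source_index:
            linked[journal] += 1
        else:
            unlinked[journal] += 1
    return linked, unlinked


# One indexed source item, cited once correctly and once with a wrong page.
source_index = {("learned publishing", 2011, "24", "133"): "REC-1"}
refs = [
    Ref("Learned Publishing", 2011, "24", "133"),
    Ref("Learned Publishing", 2011, "24", "135"),
]
linked, unlinked = count_journal_citations(refs, source_index)
total = linked + unlinked  # both kinds contribute to the journal-level count
print(linked, unlinked, total)
```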

Suppression from the JCR

Journals are not editorially selected for appearance in the JCR. Eligibility is the result of selection by the in-house team of Web of Science Editors for indexing in the Science Citation Index Expanded or the Social Sciences Citation Index in Web of Science.  The goal of the JCR, then, is to publish accurate, complete citation data that reflect the citation impact of those journals in and by the surrounding literature.  The metrics should not be merely mathematically accurate, but should give a realistic reflection of the journal’s use.

There are two reasons for a journal to be editorially suppressed from appearing in the JCR, and both are to ensure the integrity of the data and the resulting rankings. 

The first, and most common, is when source content for the journal in one or more of the prior years is incomplete.  Missing content will under-represent the size of the scholarly content in the journal and generate a falsely low denominator in the Journal Impact Factor (JIF) and other ratio-based metrics[4].  Although every effort is made to ensure indexing is complete prior to the JCR extraction, missing articles or issues are sometimes identified during the production cycle, and the journal is manually removed from the final metrics.
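
The effect of a falsely low denominator is easy to see with the JIF definition given in note [4]: citations in the JCR year to content from the two prior years, divided by the count of scholarly items published in those two years. The following sketch uses invented numbers purely for illustration; the function name is an assumption.

```python
def journal_impact_factor(citations_to_prior_two_years: int,
                          scholarly_items_prior_two_years: int) -> float:
    """JIF as a simple ratio, following the definition in note [4]."""
    if scholarly_items_prior_two_years <= 0:
        raise ValueError("the denominator must be a positive item count")
    return citations_to_prior_two_years / scholarly_items_prior_two_years


# Invented figures: 300 citations to a journal that published 200 items,
# of which only 150 were indexed when the data were extracted.
complete = journal_impact_factor(300, 200)
incomplete = journal_impact_factor(300, 150)

print(complete)    # 1.5
print(incomplete)  # 2.0
```

With the same citation count, the incompletely indexed journal shows a higher ratio than its full content would support, which is why such a title is removed rather than published with a misleading value.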

The second is when the numerator includes an exceptional pattern of citation concentration that alters the JIF to a degree where the ranking of the journal within its category or categories is distorted and no longer accurately represents the citation use of the journal.  Currently, there are two types of numerator analysis:  journal self-citation and “citation stacking.”[5]

The analysis of journal self-citation begins with establishing the range of journal self-citation that characterized the general population of journals in each of the major coverage areas – Science and Social Sciences – in the prior year.  Journals are further assessed against the performance and journal self-citation contribution within individual categories to compare the rank of journals with and without journal self-citation. We then identify titles where that differential indicates an extraordinary contribution of journal self-citation to the final rank of the journal compared to related titles.  Because the effect on ranking is compared within the category, it is unlikely that a small field with a limited set of relevant citation partners would be inordinately affected.
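
The with-and-without comparison can be sketched as follows. This is a simplified illustration rather than the actual JCR analysis: the journal names and counts are invented, the helper names are assumptions, and no decision threshold is implied, since none is stated here.

```python
from typing import Dict, NamedTuple


class JournalCounts(NamedTuple):
    citations: int       # JIF numerator, including journal self-citations
    self_citations: int  # part of the numerator that comes from the journal itself
    items: int           # JIF denominator


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank journals within a category, 1 = highest score."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_shift(category: Dict[str, JournalCounts]) -> Dict[str, int]:
    """Places each journal drops (positive) or gains (negative) once self-citations are removed."""
    with_self = ranks({n: c.citations / c.items for n, c in category.items()})
    without_self = ranks(
        {n: (c.citations - c.self_citations) / c.items for n, c in category.items()}
    )
    return {n: without_self[n] - with_self[n] for n in category}


# Invented category of four journals; "B" draws most of its numerator from itself.
category = {
    "A": JournalCounts(citations=400, self_citations=40, items=100),
    "B": JournalCounts(citations=300, self_citations=200, items=100),
    "C": JournalCounts(citations=250, self_citations=10, items=100),
    "D": JournalCounts(citations=150, self_citations=15, items=100),
}
shift = rank_shift(category)
print(shift)  # {'A': 0, 'B': 2, 'C': -1, 'D': -1}
```

Journal "B" falls from second to last place once self-citations are excluded, while its neighbours barely move. A differential of that kind, judged against related titles in the same category, is what the analysis is looking for.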

Citation stacking is the concentration of the citation exchange between two or more journals; that is, one or a few articles in a “donor” journal concentrate a large number of citations to the JIF denominator content in a “recipient” journal, often to the exclusion of any other references or reference years.  Journal sets in small, strongly overlapping fields will naturally show interdependence in citations, but these will extend across many years; journals that are only a few years in publication will show citations concentrated only within those few years, leading to what might appear to be a temporal concentration.  Both of these factors are corrected by other aspects of the analysis.

Suppression of a journal from a given year of the JCR is not an evaluation of cause, only an assessment of effect; therefore suppression from JCR does not always result in de-selection from Science Citation Index Expanded or the Social Sciences Citation Index.  Evaluation of the accuracy of the metrics and evaluation of Editorial contribution are separate.
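
The citation stacking pattern described above amounts to a question of concentration: what share of the citations that count toward a recipient's JIF comes from a single donor? The sketch below computes that share from simple citation records. It is illustrative only; the record layout, names and numbers are assumptions, and it makes none of the corrections for small fields or young journals that the full analysis applies.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# One record per citation: (citing journal, cited journal, cited publication year).
Citation = Tuple[str, str, int]


def donor_shares(citations: Iterable[Citation], recipient: str,
                 jif_years: Tuple[int, int]) -> Dict[str, float]:
    """Share of the recipient's JIF-window citations contributed by each other journal."""
    counts = Counter(
        citing
        for citing, cited, year in citations
        if cited == recipient and citing != recipient and year in jif_years
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {donor: n / total for donor, n in counts.items()}


# Invented 2017 data: citations to "Recipient" content from 2015 and 2016,
# plus a few citations to older content that fall outside the JIF window.
records = (
    [("Donor", "Recipient", 2016)] * 60
    + [("Other A", "Recipient", 2015)] * 25
    + [("Other B", "Recipient", 2016)] * 15
    + [("Donor", "Recipient", 2010)] * 5
)
shares = donor_shares(records, "Recipient", (2015, 2016))
print(shares)
```

In this invented case one donor supplies well over half of the citations in the JIF window and almost nothing outside it, which is the shape of exchange the text describes.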

Citation ethics as a community effort

Ideally, citation is an article-level activity that constitutes a part of authorship.  Appropriate citation should begin with the authors’ accurate representation of works that contributed to the content of the work at hand.  Citation should be included only for the purpose of contributing to the scholarly completeness of the work.  The assessment of both appropriateness and completeness of the cited references in a work is a critical aspect of both peer review and editorial oversight[6], and the use of editorial position to alter a reference list to enhance an individual’s or journal’s profile is ethically problematic[7],[8].  Review of the purpose or academic value of citations within an article is the work of reviewers and editors to ensure the integrity of the scholarly content they publish.

Citation data alone are not able to determine whether the individual citations in an article are academically necessary and consistent with the citation norms of the field, or extraneous, incomplete or incorrect.  The JCR Editorial team can only review the consistency and accuracy of the resulting metrics, and act to ensure the integrity of that record.  While we are often alerted by members of the scholarly and publishing community when they have ethical concerns regarding the citation behavior of a journal, our investigation of these matters must be supported by demonstrable evidence that the reported actions have distorted the resulting metrics.

The suppression of a title from the JCR is often misunderstood as a “blacklisting” or a punitive action taken by Clarivate in response to intentional manipulation of citation counts or deliberate inflation of the JIF value.  While the citation behavior that results in suppression is often extreme, the cause or motive of the behavior cannot be a consideration, as it cannot be equally or objectively assessed across all journals.

The JCR and JIF are not created by Clarivate Analytics.  Rather, they are calculated by Clarivate based on the citation network that is created by the authors, reviewers and editors of the articles and journals.  It is a shared responsibility to protect the integrity of the scholarly record, as well as the citation metrics that result.

Marie E. McVeigh                 Nandita A. Quaderi
Product Director, JCR          Editor-in-Chief, Web of Science

Clarivate Analytics

--------------------------------------------------------------------------------------------------------------------------------------

[1] Garfield E. (1975). “Preface and Introduction to Journal Citation Reports.” Volume 9 of Science Citation Index.  Available at:  http://garfield.library.upenn.edu/papers/jcr1975introduction.pdf
[2] Cited references from the Book Citation Indexes were added as contributing materials to the JCR in 2018, and are included in the recently published 2017 Journal Impact Factors and other JCR metrics.  The set of content whose citations are included in the JCR now comprises:  Science Citation Index Expanded, Social Sciences Citation Index, Arts & Humanities Citation Index, Emerging Sources Citation Index, Conference Proceedings Citation Indexes, and the Book Citation Indexes, making the WoS Core Collection, in toto, the source of references that determine the JCR metrics.  Only the Science Citation Index Expanded and Social Sciences Citation Index journals currently have JCR metrics published.
[3] Hubbard SC and McVeigh ME (2011). “Casting a Wide Net:  the Journal Impact Factor denominator.”  Learned Publishing 24: 133-137.  https://doi.org/10.1087/20110208
[4] http://ipscience-help.thomsonreuters.com/incitesLiveJCR/10667-TRS.html  The JIF for 2017 is calculated as the ratio between the number of citations in 2017 to any of the journal’s content in 2015 or 2016, and the count of scholarly items published in the journal in 2015 or 2016.
[5] Clarivate Analytics (2017). “Title Suppression from Journal Citation Reports.” http://wokinfo.com/media/pdf/jcr-suppression.pdf
[6] Penders B (2018).  “Ten simple rules for responsible referencing.”  PLoS Computational Biology, 14(4): e1006036.  https://doi.org/10.1371/journal.pcbi.1006036
[7] https://publicationethics.org/files/u7141/Forum%20discussion%20topic_final.pdf
[8] Wilhite AW and Fong EA (2012). “Coercive Citation in Academic Publishing.” Science, 335(6068), pp. 542-543. https://doi.org/10.1126/science.1212540