Comparing Web of Science, Scopus and Google Scholar from an Environmental Sciences perspective

This paper presents a macro- and micro-level comparison of the citation resources Web of Science (WOS), Scopus and Google Scholar (GS) for the environmental sciences scholarly journals in South Africa during 2004-2008. The macro-level measuring instruments consisted of 26 evaluation criteria in the following broad categories: content, access, services, interface, searching, search results, cost, citation and analytical tools, and linking abilities. The micro-level measuring instrument’s evaluation criteria represented the data fields of the journal records and were used to establish comprehensivity. The macro-level evaluation results indicated that Scopus surpassed both WOS and GS, whereas the micro-level evaluation results indicated that WOS surpassed both Scopus and GS. Based on the macro- and micro-level evaluation results, the study was able to establish that GS is not yet a substitute but rather a supplementary citation resource for the fee-based WOS and/or Scopus for the South African internationally accredited scholarly environmental sciences journals during the period 2004-2008.

The identification of the citation resource/s best suited to environmental sciences research will prove invaluable in assisting academic libraries to make prudent collection management decisions concerning citation resource subscriptions. Currently, no study has compared these citation resources in a South African context, and the citation coverage of the environmental sciences as a discipline in South Africa has not been established. It was therefore decided to use scholarly environmental sciences journals accredited by the South African Department of Higher Education and Training (DoHET) as the sample population. This would also determine the citation coverage of the environmental sciences scholarly journals in South Africa.
The objectives of this article are to compare WOS, Scopus and GS on a macro-level and a micro-level, and to verify the content of the exported data for the journal sample population in terms of completeness and quality, in order to determine whether GS could be considered a substitute for the fee-based citation resources WOS and Scopus. The significance of this study lies in establishing the existing citation coverage of South African scholarly environmental sciences journals. In addition, it adds new knowledge, as no citation coverage of environmental sciences in a South African context from 2004 to 2008 has been established. This is of benefit to environmental sciences academics in terms of their internationally accredited research profile. This study is beneficial to academic libraries for collection development and service delivery, as it determines whether the free Web citation resource GS can substitute or complement a fee-based citation resource like WOS and Scopus. Ultimately it contributes to determining whether these resources should be retained on a library's budget.

A general overview of citation resources
Bauer and Bakkalbasi (2005: 1) suggest that citation analysis is used as a tool to track scholarly research and to measure the impact on scholarly research in order to justify tenure and promotion. Citations form the foundation of citation tracking and analysis. In essence, a citation can be defined as a written reference to a specific work or portion of a work by a particular author that identifies the document in which the work can be located. Reitz (2004: 142) describes the frequency with which a work is cited as a measure of the importance which can be assigned to that work. The number of citations for a work can be expressed as its citation count. Citation analysis therefore involves counting the number of times a paper or a researcher is cited (Pringle 2008: 90; Wohlin 2007: 2; Grant 1991: 557).
In addition, Vucovich, Baker and Smith (2008: 63) suggest that citation counts are based on the following assumptions:
• The work cited by an author implies that the document is being used.
• The citation of an article reflects the merit or significance of the article.
• Citations of an article imply that the references are derived from the best literature on the topic.
• The content of the articles being cited is related to the topic of the article.
Often researchers obtain all possible citation counts pertaining to their research in order to determine their publication impact on the subject discipline and on the body of research. Citation analysis can be described as an effective and efficient indicator of the publication impact and quality of the research. It is understandable that citation tracking and analysis have therefore become more prominent amongst academics (Kloda 2007: 89). The literature refers to citation index, citation tool, source index and subject index when describing a resource which lists and tracks the various scholarly citations during a specific year and lists the citations alphabetically by the author cited, followed by the sources or names of the citing authors (Bar-Ilan 2007: 26; Reitz 2004: 143). For the purpose of this article, a citation resource can be described as any print, electronic or Web-based resource which includes citation references, cited references and citation analysis tools for the purpose of accessing citation trends.
The rising cost and volume of scholarly publications have ushered in a very challenging time for information professionals (Branin & Case 1998: 475). Many libraries and information centres are faced with maintaining expensive subscriptions to fee-based citation resources with shrinking or stagnant library budgets. The traditional fee-based citation resource WOS is considered an expensive citation resource (Schroeder 2007: 246). More citation resources are being produced, either as alternatives to or as supplements for the existing citation resources. With the introduction of new citation resources, many scholarly articles have been produced which compare these citation resources, debate their advantages and disadvantages, and emphasize the need for comparative studies on citation resources (Kloda 2007: 89; Branin & Case 1998: 475). Kloda (2007: 89) asserts that there is a need to know which citation resource is the best and the most appropriate to use.
In the scholarly literature, different terminology is used when referring to citation coverage within citation resources, which varies from 'coverage', 'subject coverage', 'journal coverage', 'meta-resource coverage', 'database coverage', etc. It is therefore necessary to clarify these terms and define citation coverage.
'Coverage', in general, would imply everything included in or the entire composition or the potentiality of data for retrieval on all possible resources which would include peer-reviewed scholarly journals (including Open Access), conference proceedings, books and chapters from books, patents and peer-reviewed reports on webpages of a scholarly nature. 'Resource coverage' is described as the coverage of all references and potentially available journal articles available for retrieval on various types of resources including databases, meta-resources, online books, information websites and online search engines.
'Data coverage' can be defined as the coverage of data potentially available for retrieval by the client. 'Database coverage' refers to the number of journals or number of journal titles contained in the citation resource (Gavel & Iselid 2008: 8; Schroeder 2007: 243). For the purpose of this article, 'database coverage' is described as the coverage of all data potentially available for retrieval on a database. 'Meta-resource coverage' can be described as the coverage of all data and meta-data which are potentially available for retrieval on a meta-resource, of which search engines and online journals in a searchable format are examples (Hung et al., 2008: 361). 'Journal coverage' can be defined as the coverage of all journal articles which are potentially available for retrieval within the resource (Mayr & Walter 2008: 82). 'Subject coverage' can be described as the coverage of all subject disciplines represented by the indexed or assigned descriptors, which are potentially available for retrieval within the resource (Meho & Yang 2007: 2106; Golderman & Connolly 2007: 19). 'Citation coverage' can be defined as the coverage of all references of journal articles, cited references of journal articles, cited authors and cited journals which are potentially available for retrieval on the resource (Bakkalbasi et al., 2006: 2). One can therefore accept that coverage of a resource would include the concepts of citation coverage, journal coverage, data coverage, meta-resource coverage and subject coverage. While each of the above terms has a specific coverage focus, the element which distinguishes citation coverage from all the other types of coverage is the presence of citation references and cited references.
For the purpose of this article, 'citation resource coverage' can be defined as the coverage of all references of journal articles, cited references of journal articles, cited authors and cited journals which are potentially available for retrieval on various types of citation resources, which include citation databases, meta-resources containing citations, online books, search engines and information websites.

Figure 1 Schematic representation of citation resource relationships
Given the definitions above, 'resource coverage' and 'citation resource coverage' can be described as similar concepts, with 'citation coverage' being the defining factor in 'citation resource coverage'. The concepts 'database coverage' and 'meta-resource coverage' can be described as types of resource coverage (see Figure 1). The concepts 'journal coverage', 'subject coverage' and 'data coverage' can be seen as having elements of 'resource coverage', 'citation resource coverage', 'meta-resource coverage' and 'database coverage' in common. This implies that 'journal coverage', 'subject coverage' and 'data coverage' can be determined in a resource, which includes a database, i.e. WOS and Scopus, and a meta-resource, i.e. GS. The concept of 'citation coverage', however, is a unique attribute pertaining to resources which contain the references of an article, cited references, cited authors and cited journals of the specific item. The term 'citation resource coverage' is derived from the one attribute of a resource which defines it as a citation resource, i.e. 'citation coverage'. Citation information is only included in resources which are designed for the storing, listing and retrieval of citation information. WOS, Scopus (both databases) and GS (a meta-resource) can be compared on equal grounds as citation resources by assigning the attribute 'citation coverage' as the common factor.
The hierarchical relationship between resource coverage and the concepts of database coverage and meta-resource coverage, in relation to the concepts of data coverage, journal coverage, subject coverage and citation coverage, is represented schematically in Figure 1. The defined concept 'citation resource coverage' is illustrated in relation to resource coverage, database coverage, meta-resource coverage, data coverage, journal coverage, subject coverage and citation coverage.
It was decided for the purpose of this article that the term 'citation resource coverage' would be the most appropriate to use when referring to the coverage of citation resources like WOS, Scopus and GS.

An overview of WOS, Scopus and GS
WOS originated in the 1960s as a print-based citation index with the aim of facilitating citation tracking and analyses. WOS in its current online form was launched in 1997, and it was considered the only credible citation resource for more than 40 years. WOS, a product of Thomson ISI, continued to enjoy the monopoly until Scopus and GS were introduced in 2004.
WOS enjoyed the reputation of a popular multidisciplinary citation resource for various reasons, which include consistency (Schroeder 2007: 246); broad subject-centred coverage (Thomson Reuters 2009: 1); controlled vocabulary and authority control (Schroeder 2007: 244); inclusion of refereed content only (Norris & Oppenheim 2007: 141); an extensive date range which spans 40 years (Schroeder 2007: 244); guaranteed quality and accuracy (Mikki 2009a: 1); stability as the oldest resource (Norris & Oppenheim 2007: 141); and the ability for clients to personalize the services. WOS has also been criticised on various points, which include underestimating the international citation impact of an individual researcher (Harzing & Van der Wal 2008: 4); not accommodating the research published in Open Access journals (Meho 2007: 4); limited coverage of scholarly journals (Meho 2007: 2); limited coverage of non-Western European and non-North American sources (Meho & Yang 2006: 2); being an expensive citation resource (Schroeder 2007: 246); and being considered a non-user-friendly citation resource (LaGuardia 2005: 40). Despite the disadvantages noted in the literature, WOS is still the citation resource preferred by many researchers.
GS was introduced in 2004 by Google with the aim of providing a simple method of searching interdisciplinary scholarly information via the Web. It has steadily gained a reputation for challenging all information professionals to take notice of this new free citation resource (Felter 2005: 43). The fascination with GS includes the democratization of citation analysis and free access for all with Internet access. GS, however, has also gained critics who describe GS as the "barbarian at the gate" and a citation resource which has given other citation resources a  (Noruzi 2005: 170); supports automatic Boolean and truncation operators (Burright 2006: 2); retrieves international, non-English language journals (Meho & Yang 2006: 26); and supports Open Access scholarly resources (Dess 2006: 1).
The disadvantages include the lack of transparency concerning content (Golderman & Connolly 2007: 18); lacks authority files (Burright 2006: 1); lacks subject hierarchy for subject searching (Burright 2006: 1); uneven coverage of disciplines (Harzing & Van der Wal 2008: 9); less successful at tracking older citations (Harzing & (Gardner & Eng 2005: 42); only indexes the first 100-120 Kb of the data collected (Jacso 2005c: 210); contains fragmented scholarly literature and content omission occurs (Bornmann et al., 2009: 33); and is not updated on a regular and frequent basis (Giustini & Barsky 2005: 85). However, GS has established a reputation in a short time and is well used by researchers and students. It is evident in the literature that it has become a citation resource to be reckoned with.
In 2004, Elsevier developed the fee-based citation resource Scopus, described as an interdisciplinary citation resource which represents print and Web sources. Manafy (2005: 13) described Scopus as creating a new niche by trying to wedge itself between WOS and GS by providing peer-reviewed citations with Web-based tools and features. Scopus has various advantages, which include content which is subject-centred (LaGuardia 2005: 42); comprehensive coverage as a citation resource (Meho & Sugimoto 2008: 5); inclusion of refereed materials (Manafy 2005: 12); a reputation for including value-adding features (LaGuardia 2005: 42); inclusion of languages other than English (Dess 2006: 1); a transparency policy regarding content in Scopus (Manafy 2005: 12); a large percentage of abstracts in journal records (Jacso 2005a: 1539); and inclusion of scholarly resources other than peer-reviewed journals and conference proceedings (Jacso 2005a: 1539). Scopus has also received criticism, which includes the lack of citation coverage before 1996 (Gavel & Iselid 2008: 14); a limited coverage of the Humanities as a discipline (Jacso 2005a: 1539); and being an expensive citation resource (Schroeder 2007: 246). Despite the disadvantages noted in the literature, Scopus has made considerable progress as a citation resource in a short time (Gavel & Iselid 2008: 19).

Research on comparative studies between WOS, Scopus and GS
Since 2004 there has been considerable debate in the literature surrounding the capabilities of both Scopus and GS compared to WOS. The literature includes studies which use comparative criteria based on the features and attributes of the citation resources, including the size and dimension of the citation resource, content, coverage, subject scope, etc. Jacso (2005a: 1537; 2005b: 365) concluded that WOS was the superior citation resource. The study by Bosman et al. (2006: 44) found a large percentage of citation coverage between Scopus and WOS compared to a smaller percentage in GS. No substantial difference in the citation patterns in WOS and GS across various disciplines could be observed in the study by Pauly and Stergiou (2005: 34), and it was then suggested that GS could be a substitute citation resource for WOS.
The Neuhaus and Daniel (2008: 193) study gives an overview of WOS, Chemical Abstracts, GS and Scopus as citation sources for citation analysis. Tarantino (2006: 23) investigated WOS, Scopus and GS with the purpose of establishing how they function, and not necessarily to find which of the citation resources was superior, and concluded that there was no superior citation resource. The study by Golderman and Connolly (2007: 18) supported this view. Jacso's (2008: 525) investigation concluded that Scopus produced more reliable h-index rankings than GS. The study by Meho and Sugimoto (2008: 4), concerning the top 15 citing sources, universities or countries of scientific networks, found no significant differences in the retrieved results.

Methodology
The research study consisted of a detailed literature review, followed by an empirical component using a comparative research design (Mouton 2001: 154). This consisted of a pilot study and the development of three measuring instruments based on the literature review. The results were then analyzed and the comparison concluded.
Purposive non-probability sampling was used in order to define the sample population for the study. The South African scholarly environmental sciences journals were chosen as the predetermined target population. The sampling involved formulating an inclusive definition of environmental sciences, determining the South African accredited environmental sciences journals and determining the internationally accredited South African environmental sciences journals.
The process of collecting the data involved conducting searches on the citation resources to retrieve the citation references of the journal articles contained in the journals. The extracted data were placed into Excel spreadsheets in order to facilitate the evaluation and comparison of the data. The citation resources were evaluated in four phases: pilot study, macro-level evaluation, micro-level evaluation and content verification.
The research approach that has been followed for this study can be classified under Pasteur's quadrant of Stokes's research classification quadrants (1997) as "use-inspired basic research".

Findings and discussion
The three citation resources WOS, Scopus and GS were subjected to a macro- and micro-level evaluation in order to establish a citation resource of relevance and preference for South African scholarly environmental sciences journals. The macro-level comparison determined the extent of comprehensivity of WOS, Scopus and GS using the identified macro-level evaluation criteria.
The micro-level measuring instrument focussed on the specific record content of the journal articles, comparing the content of the data fields on the journal article records in order to establish the degree of comprehensivity of those records on the three citation resources, that is, the extent of the content present in the data fields.

Summary results and analysis of macro-level evaluation
A macro-level measuring instrument consisting of 26 evaluation criteria was used (see table 1). If a citation resource adhered to an evaluation criterion, a rating of 1 was awarded; a rating of 2 was awarded for partial adherence and a rating of 3 for no adherence. An asterisk "*" was used where it was not possible to determine a result. The evaluation criteria included the following: types of publications, scope, subject/discipline coverage, size, time span in years and currency, preservation strategy and target audience. The macro-level evaluation criteria dealing with interface included the end-user interface, search screen interface and search results screen. The criteria dealing with searching included data search options, advanced searching options, browsing options, cited reference searching options and classification codes. The search results criteria consisted of bulk saving, marking, sorting, e-mailing, exporting, search history and alerts. The last three criteria included costs, criteria dealing with citation and analytical tools, and linking abilities, which consisted of inter-database and intra-database linking abilities.
The macro-level evaluation results (table 2) indicated that, out of 26 criteria, WOS received a rating of 1 on 18 (69.2%), Scopus on 20 (76.9%) and GS on 3 (11.5%). It can be deduced that, on a macro scale, Scopus surpassed WOS and GS.
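The rating arithmetic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the instrument used in the study: the names `tally_ratings`, `share_of_ones` and `ONE_RATINGS` are hypothetical, and the short example rating list is made up for illustration. The 1-rating counts (18, 20 and 3 out of 26 criteria) are the ones reported in the text.

```python
from collections import Counter

TOTAL_CRITERIA = 26  # number of macro-level evaluation criteria reported in the text


def tally_ratings(ratings):
    """Count how often each rating occurs.

    Ratings follow the scale described in the text: "1" = adherence,
    "2" = partial adherence, "3" = no adherence, "*" = not determinable.
    """
    allowed = {"1", "2", "3", "*"}
    unknown = set(ratings) - allowed
    if unknown:
        raise ValueError(f"unexpected rating(s): {sorted(unknown)}")
    return Counter(ratings)


def share_of_ones(ones, total=TOTAL_CRITERIA):
    """Percentage of criteria rated 1, rounded to one decimal place."""
    return round(100 * ones / total, 1)


# 1-rating counts as reported in the text (out of 26 criteria).
ONE_RATINGS = {"WOS": 18, "Scopus": 20, "GS": 3}

percentages = {name: share_of_ones(n) for name, n in ONE_RATINGS.items()}
ranking = sorted(percentages, key=percentages.get, reverse=True)

# Made-up ratings for five criteria, for illustration only (not study data).
example = tally_ratings(["1", "1", "2", "3", "*"])

if __name__ == "__main__":
    for name in ranking:
        print(f"{name}: {percentages[name]}%")
    print(dict(example))
```

Keeping the asterisk as its own category, rather than treating it as a rating of 3, mirrors the distinction the instrument draws between "no adherence" and "could not be determined".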

Summary results and analysis of micro-level evaluation
The micro-level measuring instrument involved identifying the data fields of the journal article records that were represented on the citation resources. There were 39 data fields identified for the micro-level measuring instrument.
The micro-level comparison results were derived from the journal sample population with year range 2004-2008. The results contained content which could be viewed and exported from the citation resources. The retrieved results were reported in two formats: Web Interface (WI) and export (E) format. A summary of the micro-level comparison results is shown in Table 3.
The study by Schroeder (2007: 244) states that GS is lacking in indexing capacity when compared to WOS. The findings of this study support this statement as the micro comparison revealed that GS does not index to the same extent as Scopus and WOS. It is evident that GS, as a citation resource, is not a substitute for the existing fee-based citation resources WOS and/or Scopus regarding indexing.
The E format of WOS received the highest percentage, 76.9% (30 of the 39 data fields contained data), for comprehensive representation of the data on the records of the journal articles contained in the citation resources. The WI format of WOS received the second highest score of 71.8% (28 of the 39 data fields contained data). The WI format of Scopus received the third highest score of 59%, and its E format 46.2%. The E format of GS scored 30.8% and its WI format 23.1%.
Based on the results, according to table 3, WOS rates first concerning the comprehensiveness of the representation of the data on the records of the journal articles contained in the citation resources, with Scopus second and GS third.
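The micro-level percentages follow from dividing the number of populated data fields by the 39 fields in the instrument. A minimal Python sketch of that arithmetic is given here; the names `completeness`, `REPORTED`, `best` and `ranking` are hypothetical, the two field counts are the ones stated for WOS, and the remaining values are simply the percentages as reported in the text rather than recomputed figures.

```python
TOTAL_FIELDS = 39  # data fields in the micro-level measuring instrument


def completeness(populated, total=TOTAL_FIELDS):
    """Percentage of data fields that contain data, rounded to one decimal."""
    if not 0 <= populated <= total:
        raise ValueError("populated must lie between 0 and total")
    return round(100 * populated / total, 1)


# Field counts stated in the text for WOS.
wos_export = completeness(30)
wos_web = completeness(28)

# Percentages as reported in the text, keyed by (resource, format).
REPORTED = {
    ("WOS", "E"): 76.9,
    ("WOS", "WI"): 71.8,
    ("Scopus", "WI"): 59.0,
    ("Scopus", "E"): 46.2,
    ("GS", "E"): 30.8,
    ("GS", "WI"): 23.1,
}

# Best-scoring format per resource, then resources ranked by that score.
best = {}
for (resource, _fmt), pct in REPORTED.items():
    best[resource] = max(best.get(resource, 0.0), pct)
ranking = sorted(best, key=best.get, reverse=True)

if __name__ == "__main__":
    print(wos_export, wos_web)
    print(ranking)
```

Ranking on the best-scoring format per resource is one possible choice; because WOS leads in both formats, ranking on either format alone would place it first as well.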

Conclusions
The results from the macro-level evaluation indicate that Scopus received the highest rating (76.9%) compared to WOS (69.2%) and GS (15.4%). Using these results, it can be stated that Scopus performed the best as a citation resource on a macro scale. The results from the micro-level evaluation indicate that journal article records on WOS contained the most comprehensive representation of the data of the three citation resources. The results determined that WOS surpassed both Scopus and GS in both the WI format and the E format concerning the fields contained in the journal article records.
The final conclusion that can be made from this research is that GS is not a substitute for WOS and/or Scopus when using the citation references from the South African environmental sciences journals which were internationally accredited during the period 2004-2008. The study by Harzing and Van der Wal (2008: 20) supports the findings of this study, which indicate that GS can be used as a supplementary citation resource.
It was further determined that the citation resource Scopus can be considered as a substitute for WOS, which was traditionally the citation resource of choice of academic researchers. The Norris and Oppenheim (2007: 168) study supports this finding and recommends Scopus be used as an alternative to WOS for the Social Sciences.
A last recommendation is that the current study should form the basis for a follow-up study, as part of a longitudinal research project, repeated with data collected from the period 2009-2013.