
The history, evolution, and future of big data and analytics

Table 2.1: Search keywords

BDA keywords                 AND  (Organizational level  AND  Performance)
“machine learning”                organization                performance
“data science”                    company
analytics                         firm
“deep learning”                   unit
“big data”                        team
“artificial intelligence”         group
                                  individual
                                  employee
                                  worker

On September 7th, 2017, we searched the ISI Web of Knowledge bibliographic database, acknowledged as the most reliable database (Jacso, 2008; Bar-Ilan, 2008), for these keyword combinations and extracted the results of the relevant work-related domains (i.e., operations research, management science, business, business finance, psychology, psychology applied, management, sport sciences, and economics). This retrieved dataset included 324 primary documents which, in turn, provided 14,767 unique secondary (cited) documents.¹ To reduce the complexity of this latter dataset of secondary documents, we determined a citation threshold, i.e., the minimum number of citations a secondary document had to have to be included. Via an iterative approach (Zupic & Čater, 2015), a minimum threshold of two citations reduced our sample of secondary documents to 1,252 papers. Table 2.2 shows which journals published our primary and secondary papers.

2.2.2 Analyses

Three bibliometric analyses were conducted. Document co-citation analysis and algorithmic historiography were applied to the sample of secondary papers, whereas bibliographic coupling was applied to the sample of primary papers. These three methods are explained in detail later.

Clusters of nodes in networks can be detected using modularity optimization. Detecting clusters in a network requires partitioning the network into communities of densely connected nodes. Here, one prefers the nodes belonging to different communities to be only sparsely connected.
The quality of the partitioning can thus be quantified via the overall modularity of the network, a value that represents the density of links within communities as compared to links between communities. Hence, the best clustering solution is that in which the modularity is highest (Blondel, Guillaume, Lambiotte, & Lefebvre, 2008; Newman, 2004). Because iterative clustering algorithms work with a random starting point, we examined the robustness of our clustering solution. We ran Blondel and colleagues’ clustering algorithm (2008) 50 times for each analysis (using Gephi’s default resolution setting, i.e., 1.0) and calculated the average optimal number of clusters. Subsequently, we chose the next solution in which the number of clusters was equal to this average optimal number.

¹ Datasets available via https://bit.ly/2pHSb57
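
To make the robustness procedure concrete, the following is a minimal illustrative sketch in Python, not the Gephi workflow used in this study. It rests on several assumptions: the graph is undirected and unweighted; the clustering step is a simplified stand-in that performs only the local-moving phase of Blondel and colleagues' (2008) method, without the multi-level aggregation; the average number of clusters is rounded to the nearest integer; and the search for a matching solution is capped at 1,000 extra runs. All function names (`modularity`, `local_moving`, `robust_partition`, `two_cliques`) are hypothetical.

```python
import random
from collections import defaultdict


def modularity(adj, community):
    """Newman modularity Q for an undirected, unweighted graph.

    adj: dict mapping node -> set of neighbours
    community: dict mapping node -> community label
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())  # 2 * number of edges
    internal = defaultdict(int)  # per community: 2 * number of internal edges
    degree = defaultdict(int)    # per community: sum of node degrees
    for node, nbrs in adj.items():
        degree[community[node]] += len(nbrs)
        for nbr in nbrs:
            if community[nbr] == community[node]:
                internal[community[node]] += 1
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


def local_moving(adj, rng):
    """Simplified stand-in: greedy local moving with a random node order."""
    community = {n: n for n in adj}            # start with singletons
    degree = {n: len(adj[n]) for n in adj}
    tot = dict(degree)                         # total degree per community
    two_m = sum(degree.values())
    nodes = list(adj)
    improved = True
    while improved:
        improved = False
        rng.shuffle(nodes)                     # source of run-to-run variation
        for n in nodes:
            old = community[n]
            tot[old] -= degree[n]              # take n out of its community
            links = defaultdict(int)           # links from n to each community
            for nbr in adj[n]:
                links[community[nbr]] += 1
            # Modularity gain of joining c, up to a constant positive factor.
            best = old
            best_gain = links[old] - tot[old] * degree[n] / two_m
            for c, k_in in links.items():
                gain = k_in - tot[c] * degree[n] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            community[n] = best
            tot[best] += degree[n]
            improved = improved or best != old
    return community


def robust_partition(adj, runs=50, seed=0, max_extra_runs=1000):
    """Average the cluster count over `runs`, then return the next matching run."""
    rng = random.Random(seed)
    counts = [len(set(local_moving(adj, rng).values())) for _ in range(runs)]
    target = round(sum(counts) / runs)         # assumption: round to nearest
    for _ in range(max_extra_runs):            # assumption: bounded search
        community = local_moving(adj, rng)
        if len(set(community.values())) == target:
            return community, target, counts
    return None, target, counts


def two_cliques(size=5):
    """Toy graph: two fully connected groups joined by a single edge."""
    adj = defaultdict(set)
    for offset in (0, size):
        for i in range(offset, offset + size):
            for j in range(i + 1, offset + size):
                adj[i].add(j)
                adj[j].add(i)
    adj[size - 1].add(size)
    adj[size].add(size - 1)
    return dict(adj)
```

Calling `robust_partition(two_cliques())` returns the selected partition, the rounded average number of clusters, and the list of per-run cluster counts; `modularity` can then be applied to the returned partition to inspect its quality.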