Part 1: Document Description
|
Citation |
|
---|---|
Title: |
The Enhanced Microsoft Academic Knowledge Graph |
Identification Number: |
doi:10.17903/FK2/TZWQPD |
Distributor: |
Κατάλογος Δεδομένων SoDaNet |
Date of Distribution: |
2024-04-30 |
Version: |
1 |
Bibliographic Citation: |
Pollacci, Laura, 2024, "The Enhanced Microsoft Academic Knowledge Graph", https://doi.org/10.17903/FK2/TZWQPD, Κατάλογος Δεδομένων SoDaNet, version 2 |
Holdings Information: |
https://doi.org/10.17903/FK2/TZWQPD |
Citation |
|
Title: |
The Enhanced Microsoft Academic Knowledge Graph |
Alternative Title: |
EMAKG |
Identification Number: |
doi:10.17903/FK2/TZWQPD |
Authoring Entity: |
Pollacci, Laura (University of Pisa) |
Grant Number: |
GA 870661 |
Grant Number: |
654024 |
Distributor: |
Κατάλογος Δεδομένων SoDaNet |
Date of Distribution: |
2024-04-30 |
Holdings Information: |
https://doi.org/10.17903/FK2/TZWQPD |
Study Scope |
|
Keywords: |
SCIENTIFIC PUBLICATIONS, SCIENTIFIC MIGRATION FLOWS, SCIENTIFIC COLLABORATION NETWORK |
Topic Classification: |
SCIENCE AND TECHNOLOGY, Social and occupational mobility, SOCIETY AND CULTURE, OTHER |
Abstract: |
The Enhanced Microsoft Academic Knowledge Graph (EMAKG) is a large dataset of scientific publications and related entities, including authors, institutions, journals, conferences, and fields of study. The dataset originates from the Microsoft Academic Knowledge Graph (MAKG, https://makg.org), one of the most extensive freely available knowledge graphs of scholarly data. To build the dataset, we first assessed the limitations of the current MAKG. Based on these limitations, several methods were designed to enhance the data and broaden the range of use case scenarios, particularly in mobility and network analysis. EMAKG provides two main advantages: (1) it has improved usability, facilitating access for non-expert users; and (2) it includes more types of information, obtained by integrating various datasets and sources, which help expand the application domains. For instance, geographical information could help mobility and migration research. The completeness of the knowledge graph is improved by retrieving and merging information on publications and other entities no longer available in the latest version of MAKG. Furthermore, geographical and collaboration network details are employed to provide data on authors as well as their annual locations and career nationalities, together with worldwide yearly stocks and flows. Among others, the dataset also includes: (1) fields of study (and publications) labelled by their discipline(s); (2) abstracts and linguistic features, i.e., standard language codes, tokens, and types; (3) entities’ general information, e.g., date of foundation and type of institutions; and (4) academia-related metrics, i.e., h-index. The resulting dataset maintains all the characteristics of the parent datasets and includes a set of additional subsets and data that can be used for new case studies relating to network analysis, knowledge exchange, linguistics, computational linguistics, and mobility and human migration, among others. |
Time Period: |
1800-01-01 to 2021-12-31 |
Geographic Coverage: |
Worldwide |
Unit of Analysis: |
Individual |
Unit of Analysis: |
Other |
Universe: |
The dataset includes data on scientists across the world. |
Methodology and Processing |
|
Time Method: |
Longitudinal |
Sampling Procedure: |
Total universe/Complete enumeration |
Characteristics of Data Collection Situation: |
Total authors: 243,042,675; disambiguated: 151,355,324 |
Data Access |