-A collection of Jupyter notebooks with examples of querying different PID providers like [ORCID](https://orcid.org/), [ROR](https://ror.readme.io/), [Crossref](https://www.crossref.org/) and PID graphs like the [FREYA PID Graph](https://blog.datacite.org/powering-the-pid-graph/) and [OpenAlex](https://openalex.org/about) for connected objects.
+A collection of Jupyter notebooks with examples of querying different PID providers like [ORCID](https://orcid.org/), [ROR](https://ror.readme.io/), [Crossref](https://www.crossref.org/) and PID graphs like the [FREYA PID Graph](https://blog.datacite.org/powering-the-pid-graph/), [OpenAlex](https://openalex.org/about) and [OpenAIRE](https://www.openaire.eu/) for connected objects.

 Currently included connections:
 * organization-organization
@@ -17,7 +17,11 @@ Currently included connections:
 * person-works
 * input: ORCID
 * output: list of works authored/created by the person, each identified by their DOI
-* data sources: Crossref, FREYA PID Graph, OpenAlex, ORCID
+* data sources: Crossref, FREYA PID Graph, OpenAlex, ORCID, OpenAIRE
+* work-projects
+* input: DOI
+* output: list of projects the work was produced in, each identified by their OpenAIRE project ID
+* data sources: OpenAIRE


 Please navigate into the respective folder to see the list of available notebooks.
@@ -35,3 +39,5 @@ you can use this link to launch the notebooks on Binder where you can execute an
 In the joint project [TAPIR](https://projects.tib.eu/tapir/en/) (Partially Automated Persistent Identifier-based Reporting), partially automated procedures for research reporting are being tested in the context of university and non-university research. To this end, the question is being investigated :

 To what extent can the necessary data aggregation be carried out on the basis of openly available research information using persistent identifiers?
+
+*More information in our blog post "[Project TAPIR: Harvesting the power of PIDs](https://blogs.tib.eu/wp/tib/2022/03/01/project-tapir-harvesting-the-power-of-pids/)"*
organization-people/orcid_get_people_by_organization.ipynb (+13 -10)
@@ -8,7 +8,7 @@
 "source": [
 "### Query ORCID for people affiliated with an organization and filter for current employees only\n",
 "\n",
-"This notebook queries the [ORCID API](https://api.orcid.org/v3.0/) for all [people affiliated with an organization](https://info.orcid.org/faq/how-do-i-find-orcid-record-holders-at-my-institution/) and additionally narrows down the affiliation to people **currently employed** by the organization. From the resulting list of people we output the ORCID iDs.\n",
+"This notebook queries the [ORCID Public API](https://api.orcid.org/v3.0/) for all [people affiliated with an organization](https://info.orcid.org/faq/how-do-i-find-orcid-record-holders-at-my-institution/) and additionally narrows down the affiliation to people **currently employed** by the organization. From the resulting list of people we output the ORCID iDs.\n",
 "\n",
 "*Disclosure:\n",
 "The process of querying the ROR API for additional identifiers and using them to query the ORCID API for affiliated people is the same as used by the [FREYA PID Graph](https://blog.datacite.org/powering-the-pid-graph/) and is implemented in [DataCite Application API](https://doi.org/10.5438/8gb0-v673).*"
"print(\"Wikidata ID: \" + str(organization_wikidata_id or ''))"
 ]
@@ -261,7 +262,7 @@
 },
 "source": [
 "### Connection organization -> people\n",
-"The second part of the process is to query for the people affiliated with the organization. For this we use the ORCID API and search for people affiliated with an organization like it is explained in the ORCID tutorial [\"How do I find ORCID record holders at my institution?\"](https://info.orcid.org/faq/how-do-i-find-orcid-record-holders-at-my-institution/). As parameters for the query we use the Grid ID and Ringgold ID for the organization.\n"
+"The second part of the process is to query for the people affiliated with the organization. For this we use the ORCID API and search for people affiliated with an organization like it is explained in the ORCID tutorial [\"How do I find ORCID record holders at my institution?\"](https://info.orcid.org/faq/how-do-i-find-orcid-record-holders-at-my-institution/). As parameters for the query we use the ROR ID, Grid ID and Ringgold ID for the organization.\n"
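
The changed cell above describes searching ORCID with the ROR ID, Grid ID and Ringgold ID of an organization. A minimal sketch of how such a query string might be assembled is shown here; it is not the notebook's code. The search field names (`ror-org-id`, `grid-org-id`, `ringgold-org-id`), the `/search/` path and the paging values are assumptions based on the linked ORCID tutorial and should be checked against it, and the identifiers in the usage example are made-up placeholders.

```python
from urllib.parse import urlencode

# Assumed base URL of the ORCID Public API search endpoint.
ORCID_SEARCH_URL = "https://pub.orcid.org/v3.0/search/"


def build_affiliation_query(ror_id=None, grid_id=None, ringgold_id=None):
    """Combine the available organization identifiers into one OR query."""
    clauses = []
    if ror_id:
        # ROR IDs are URLs, so they are quoted inside the query.
        clauses.append(f'ror-org-id:"{ror_id}"')
    if grid_id:
        clauses.append(f"grid-org-id:{grid_id}")
    if ringgold_id:
        clauses.append(f"ringgold-org-id:{ringgold_id}")
    if not clauses:
        raise ValueError("at least one organization identifier is required")
    return " OR ".join(clauses)


def build_search_url(query, start=0, rows=1000):
    """Build the full search URL; start and rows are illustrative paging values."""
    return ORCID_SEARCH_URL + "?" + urlencode({"q": query, "start": start, "rows": rows})


if __name__ == "__main__":
    # Placeholder identifiers, not those of a real organization.
    query = build_affiliation_query(
        ror_id="https://ror.org/00example0",
        grid_id="grid.0000.0",
        ringgold_id="00000",
    )
    print(build_search_url(query))
```

Joining the identifiers with `OR` means a record is found if it carries any one of them, which matters because ORCID records may reference an organization under different identifier schemes.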
"### Query OpenAIRE for publications authored by a person\n",
+"This notebook queries the [OpenAIRE HTTP API](https://graph.openaire.eu/develop/api.html) via its `/publications` endpoint for publications authored by a person. It takes an ORCID iD as input which is used to filter for publications where one of the creators' `orcid` field matches the given ORCID iD. From the resulting list of publications we output all DOIs.\n",
+"\n",
+"*Note:\n",
+"The API has several different endpoints for research outputs: they are divided into publications, research data, software metadata and other research products, so to get a full picture about a person's research output, you would have to query all of these endpoints and union their results.*"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 1,
+"metadata": {
+"pycharm": {
+"name": "#%%\n"
+}
+},
+"outputs": [],
+"source": [
+"# Prerequisites:\n",
+"import requests # dependency for making HTTP calls\n",
+"from benedict import benedict # dependency for dealing with json"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {
+"collapsed": true,
+"pycharm": {
+"name": "#%% md\n"
+}
+},
+"source": [
+"The input for this notebook is an ORCID iD, e.g. '`0000-0003-2499-7741`'."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 2,
+"metadata": {
+"pycharm": {
+"name": "#%%\n"
+}
+},
+"outputs": [],
+"source": [
+"# input parameter\n",
+"example_orcid_id=\"0000-0003-2499-7741\""
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"We use it to query the OpenAIRE HTTP API for publications that specified the ORCID iD within their metadata in one of the creators `orcid` field. Since the API uses pagination, we need to loop through all pages to get the complete result set."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 3,
+"metadata": {
+"pycharm": {
+"name": "#%%\n"
+}
+},
+"outputs": [],
+"source": [
+"# OpenAIRE endpoint to query for publications\n",
"From the resulting list of publications we extract and print out each title and DOI. \n",
+"\n",
+"*Note: publications that do not have a DOI assigned, will not be printed.*"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 4,
+"metadata": {},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"Number of publications found: 6\n",
+"\n",
+"10.15488/11463, Roadmap to FAIR Research Information in Open Infrastructures\n",
+"10.1515/bd.2006.40.4.466, Informationsvermittlung: Personalisiertes Lernen in der Bibliothek: das Düsseldorfer Online-Tutorial (DOT) Informationskompetenz\n",
+"10.1080/00048623.2006.10755322, Teaching Information Literacy with the Lerninformationssystem\n",
+"10.3389/frma.2021.694307, Enhancing Knowledge Graph Extraction and Validation From Scholarly Publications Using Bibliographic Metadata\n",
+"10.3897/rio.7.e66264, OPTIMETA – Strengthening the Open Access publishing system through open citations and spatiotemporal metadata\n",
+"10.1016/j.procs.2019.01.074, The Research Core Dataset (KDSF) in the Linked Data context\n"
+]
+}
+],
+"source": [
+"# from the result pages, extract the data about each publication\n",
+"def extract_publications_from_page(page):\n",
+" return [pub for pub in benedict.from_json(page).get('response.results.result') or []]\n",
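
The notebook text above says that the API is paginated and that all pages have to be looped through, but the query cell itself is not shown in this view. The following is a rough sketch of such a loop, not the notebook's implementation. It uses only the standard library instead of `requests` and `benedict`, and the endpoint URL, the parameter names (`orcid`, `format`, `page`, `size`) and the helper names `fetch_page` and `collect_publications` are assumptions for illustration. Only the `response.results.result` path is taken from the cell above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the OpenAIRE API documentation.
OPENAIRE_PUBLICATIONS_URL = "https://api.openaire.eu/search/publications"


def fetch_page(orcid, page, size=50):
    """Fetch one result page as JSON text (performs a network call)."""
    params = urlencode({"orcid": orcid, "format": "json", "page": page, "size": size})
    with urlopen(f"{OPENAIRE_PUBLICATIONS_URL}?{params}") as response:
        return response.read().decode("utf-8")


def extract_publications_from_page(page_text):
    """Return the list found at response.results.result, or [] if it is missing."""
    response = json.loads(page_text).get("response") or {}
    results = response.get("results") or {}
    return results.get("result") or []


def collect_publications(orcid, fetch=fetch_page, max_pages=1000):
    """Request pages 1, 2, ... until a page without results is returned."""
    publications = []
    for page in range(1, max_pages + 1):
        batch = extract_publications_from_page(fetch(orcid, page))
        if not batch:
            break
        publications.extend(batch)
    return publications
```

Passing the page fetcher in as the `fetch` argument keeps the loop testable without network access, and `max_pages` is a safety cap so that an unexpected response cannot cause an endless loop.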
person-works/orcid_get_works_by_person.ipynb (+1 -1)
@@ -8,7 +8,7 @@
 "source": [
 "### Query ORCID for works authored by a person\n",
 "\n",
-"This notebook queries the [ORCID API](https://pub.orcid.org/v3.0/) to retrieve works listed in a person's ORCID record. It takes an ORCID URL or iD as input to retrieve the ORCID record of a person and the works listed on it. From the resulting list of works we output all DOIs."
+"This notebook queries the [ORCID Public API](https://pub.orcid.org/v3.0/) to retrieve works listed in a person's ORCID record. It takes an ORCID URL or iD as input to retrieve the ORCID record of a person and the works listed on it. From the resulting list of works we output all DOIs."
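
Since this notebook accepts either an ORCID URL or a bare iD, the input has to be normalized before a request URL can be built. A small sketch of that step, under stated assumptions: the helper names `normalize_orcid` and `works_url` are hypothetical, the `/works` path is assumed from the ORCID Public API, and the check only validates the shape of the iD, not its checksum.

```python
import re

# Four groups of four characters; the last character may be the check digit X.
ORCID_ID_PATTERN = re.compile(r"(?<![\d-])(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def normalize_orcid(value):
    """Accept an ORCID URL or a bare iD and return the bare iD."""
    match = ORCID_ID_PATTERN.search(value.strip().rstrip("/"))
    if match is None:
        raise ValueError(f"not an ORCID iD or URL: {value!r}")
    return match.group(1)


def works_url(orcid_or_url):
    """Build the (assumed) works URL for a person's ORCID record."""
    return f"https://pub.orcid.org/v3.0/{normalize_orcid(orcid_or_url)}/works"


if __name__ == "__main__":
    print(works_url("https://orcid.org/0000-0003-2499-7741"))
```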