---
lang: en
description: Presentation of ISIDORE, the search engine for publications, digital data and profiles of researchers in the humanities and social sciences from around the world.
---

# ISIDORE

## What is ISIDORE?

ISIDORE is a search engine for discovering and finding publications, digital data and profiles of researchers in the humanities and social sciences (SHS, from the French *sciences humaines et sociales*) from around the world.

It allows full-text searching of several million documents (articles, theses and dissertations, reports, datasets, web pages, database records, descriptions of archival holdings, etc.) and event reports (seminars, conferences, etc.). In addition, ISIDORE links these millions of documents together by enriching them with scientific concepts drawn from the work of the SHS research communities.

It is accessible on the Web at the portal [isidore.science](https://isidore.science).

It also offers scientific social-networking features. As such, it belongs to the category of search engines and research assistants, and offers many features for organizing scientific monitoring.

Launched on December 8, 2010, ISIDORE is the result of a collaboration between the CNRS very large facility (*très grand équipement*) Adonis (2007-2013), the Centre pour la Communication Scientifique Directe (CCSD) and the companies Antidot, Mondéca and Sword. It is currently developed, updated and operated by the TGIR Huma-Num.

References on the history of ISIDORE:

- Yannick Maignien, "ISIDORE, de l'interconnexion de données à l'intégration de services", Hyper Article en Ligne - Sciences de l'Homme et de la Société, [10670/1.k9lck9](https://isidore.science/document/10670/1.k9lck9)
- Stéphane Pouyllau et al., "Bilan 2011 de la plateforme ISIDORE et perspectives 2012-2015", MoDyCo, Modèles, Dynamiques, Corpus - UMR 7114, [10670/1.bqexsj](https://isidore.science/document/10670/1.bqexsj)
- Philippe Bourdenet, "L'espace documentaire en restructuration : l'évolution des services des bibliothèques universitaires", Le serveur TEL (thèses-en-ligne), [10670/1.lnieuv](https://isidore.science/document/10670/1.lnieuv)

## How does ISIDORE work?

ISIDORE harvests metadata and full text, enriches them and then indexes them. It uses both the metadata of the documents and their full text: the goal is to analyze this information in order to enrich it, to link it to the concepts of scientific repositories (thesauri, etc.) and to link it to authors' identifiers (ORCID, IDRef, IDHAL, VIAF, etc.).

Several enrichments are performed:

- Semantic annotation: the words present in the metadata of the documents are compared to the entries of the repositories using an algorithm based on a morphological analysis of the terms. If a term from the document matches an entry in one of the repositories, the resource is linked to that repository entry. The repositories are multilingual and aligned with each other, so the semantic annotation is multilingual as well.

- Disciplinary categorization: ISIDORE uses a semantic classifier that, after being trained on a reference corpus, categorizes all documents in ISIDORE into the SHS disciplines of the MORESS repository. The classifier is trained on the manual categorization performed by researchers in HAL when they deposit their publications.

- Author detection: ISIDORE detects the authors of the documents and enriches the author form (first name and last name) with international (ORCID, VIAF, ISNI) and national (IDHAL, IDRef) author identifiers.

ISIDORE indexes, in its search engine:

- Document metadata;
- The full text (if it is available in open access);
- The semantic annotations;
- The disciplinary classification;
- Author enrichment and normalization.

More information is available on [the "Repositories" page](https://isidore.science/vocabularies) of ISIDORE.

### Can ISIDORE index multilingual documents and data?

Yes. Since 2015, documents and datasets in English, Spanish and French have been indexed, enriched and linked to scientific repositories by ISIDORE (metadata and full text). Full text in any other language is indexed in the language of the document, but no enrichment takes place.
For more information, you can consult our post on the subject: [Isidore speaks English, sino también español et toujours en français](https://humanum.hypotheses.org/921).

### How often is ISIDORE updated?

ISIDORE is updated incrementally, on average once a month. Why this delay? In addition to harvesting and indexing documents, ISIDORE enriches them with concepts from scientific repositories (thesauri, taxonomies, etc.). This semantic enrichment is automatic and allows us to offer you reading suggestions: it helps you discover documents other than those you were looking for. This requires a certain amount of processing and computation time. Updates to the documents that concern you, which are offered in your user account as documents to be claimed, follow the same monthly rhythm.

## How to use ISIDORE?

ISIDORE offers several tools to search, discover, collect and organize the content it indexes:

### The isidore.science portal

The [isidore.science](https://isidore.science) portal is a website in three languages that provides a [relevance search engine](https://isidore.science) that can be used with several query methods.

- By default, ISIDORE searches for all the words in the user's query, after removing stop words ("of", "the", etc.);
- It is possible to search for an exact phrase or group of words by putting quotation marks around it. For example, "direction of consciousness" will search for exactly this phrase. In this case, the "of" is not treated as a stop word.

#### Search operators

Several boolean search operators are available in ISIDORE. Note that the syntax of the operators matters: they are always written in UPPERCASE (e.g. AND):

- AND: the intersection finds documents containing all the terms (or groups of terms) of the query. For example:
    - consciousness AND gender
    - "cold war" AND migration
- OR: the union finds documents containing either one of the terms (or groups of terms), or both. For example:
    - "semantic web" OR "web 3.0"
- EXCEPT (NOT): the exclusion reduces noise by excluding terms. For example:
    - revolution NOT French
- NEAR(n): the NEAR(n) operator (read "close to") links terms by specifying a proximity value "n" between them. It works like an AND with n word(s) between the terms: the value "n" indicates the number of words separating the two terms. NEAR also works without the value n, in which case it is equivalent to NEAR(10), i.e. 10 words between the searched terms (standard spacing). For example:
    - house NEAR(4) nobility: searches for house and nobility within a proximity of 4 words
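
The operator syntax above can be illustrated with a minimal Python sketch that assembles query strings. The helper names (`phrase`, `combine`, `near`) are hypothetical and exist only for this example; they are not part of ISIDORE. The sketch only builds the text typed into the search box, under the assumption that multi-word terms should be quoted.

```python
def phrase(text: str) -> str:
    """Quote a multi-word expression so that it is searched as an exact phrase."""
    return f'"{text}"' if " " in text else text


def combine(operator: str, left: str, right: str) -> str:
    """Join two terms with a boolean operator, forcing the operator to UPPERCASE."""
    op = operator.upper()
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return f"{phrase(left)} {op} {phrase(right)}"


def near(left: str, right: str, distance=None) -> str:
    """Build a proximity query; without a distance, plain NEAR is used."""
    op = "NEAR" if distance is None else f"NEAR({distance})"
    return f"{phrase(left)} {op} {phrase(right)}"


print(combine("and", "cold war", "migration"))
print(near("house", "nobility", 4))
```

Forcing the operator to uppercase in one place avoids the most common mistake, a lowercase `and` that would be read as an ordinary word.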

#### Sorting of search results

By default, [isidore.science](https://isidore.science) sorts results by semantic relevance. The sort order of the search results can be changed to:

- sorting by novelty
- sorting by author's name in alphabetical order
- sorting by author's name in reverse alphabetical order
- sorting by ascending date
- sorting by descending date

The following will be available again very soon:

- sorting by title in alphabetical order
- sorting by title in reverse alphabetical order

### The advanced search

An advanced search is also available at [https://isidore.science/as](https://isidore.science/as) and is accessible from the first page of the [portal](https://isidore.science/as).

### The personal space for researchers

isidore.science offers researchers a personal space allowing them to:

- collect, classify and organize the documents found;
- gather all their scientific output in order to publish it on a personal profile page;
- follow the output of colleagues;
- save and publish queries and their results for monitoring purposes;
- create bibliographies that can be exported to Zotero.

### The APIs of isidore.science

The [isidore.science search engine APIs](https://api.isidore.science) are available at [https://api.isidore.science](https://api.isidore.science) through the GET method over HTTP or HTTPS.
They provide a fast, accurate and reliable query service for ISIDORE data with advanced search features (auto-completion, spell checking, multi-criteria, boolean and faceted searches, sorting, aggregation of answers, etc.).

Each request to the engine is submitted by means of a URI pointing to a specific web service. The response is a stream in XML (the default format) or JSON.

The [isidore.science API web page](https://api.isidore.science) details all the commands available for the different services.
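
As a rough illustration of the request pattern (a GET request on a URI, answered in XML or JSON), here is a minimal Python sketch. The base URL comes from the text; the service path `resource/search` and the parameter names `q` and `output` are assumptions made for this example and should be checked against the API page before use.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.isidore.science"  # base URL given in the text


def build_search_url(query: str, service: str = "resource/search", output: str = "json") -> str:
    """Build a GET URL; the service path and parameter names are assumed, not confirmed."""
    return f"{API_BASE}/{service}?{urlencode({'q': query, 'output': output})}"


def fetch_json(url: str) -> dict:
    """Send the GET request and decode a JSON response (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


print(build_search_url('"cold war" AND migration'))
```

Keeping URL construction separate from the network call makes it easy to inspect the request before sending it.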

### Enriched metadata for *Linked Open Data*

ISIDORE's metadata, ontologies and repositories are available in an [RDF (Resource Description Framework)](https://fr.wikipedia.org/wiki/Resource_Description_Framework) triple store (*TripleStore*), thus placing ISIDORE data in the *Linked Open Data*. The ISIDORE graph can be queried with the SPARQL language and browsed through two web interfaces:

- A documented SPARQL query interface with a presentation of the ISIDORE data model: https://isidore.science/sqe
- The basic Virtuoso software interface: https://isidore.science/sparql

In the ISIDORE *TripleStore*, the main vocabularies for structuring information are:

- RDF and RDFS
- Dublin Core Element Set
- Dublin Core Terms
- SIOC
- FOAF
- OWL
- SKOS
- ORE
- DBpedia

(The complete list is available at <https://isidore.science/sparql?nsdecl>)

### Complementarity between ISIDORE and Zotero

#### Using the Zotero connector from ISIDORE to feed your bibliographic database

ISIDORE is compatible with Zotero: once the user has installed [the Zotero connector](https://www.zotero.org/download/) in their browser, document references can be imported at two levels:

- On the page listing the results of a search;
- On the page displaying a document.

#### Using the ISIDORE search connector from Zotero

Zotero (Linux, macOS and Windows client) can use search engines to look up or complete bibliographic references directly from the Zotero interface. We offer two ISIDORE connectors for Zotero that make it possible to query ISIDORE through an author search.

Adding ISIDORE to Zotero makes it possible:

- To complete references from a search on the author's name: this is "ISIDORE, help me find what he/she has published."
- To find documents in which the author is cited: this is "ISIDORE, what do you have on the author?"

These [connectors and installation documentation are available on the TGIR Huma-Num GitLab](https://gitlab.huma-num.fr/spouyllau/ISIDORtero).

### Use of RSS feeds

ISIDORE can provide its search results as RSS feeds in order to feed scientific monitoring software (including Zotero, for example), research notebooks, etc. The RSS feeds created in ISIDORE are updated, like all the contents of the search engine, approximately once a month during the general update of the ISIDORE contents. It is thus possible to follow, from Zotero, updates to the ISIDORE documents matching saved queries.

To do so, ask ISIDORE for the link to the RSS feed of a saved query by going, in your personal space, to "My queries":

![My Image](media/isidore.png)

For a saved query, click on the "Request RSS feed of the request" icon available on the right ![My Image](media/isidore-rss-001.png){: style="width:170px"} and copy the link with ![My Image](media/isidore-requeteRSS.png){: style="width:120px"}.

The copied link is of the form: `https://isidore.science/feed/lt3913`.

If your browser is equipped with a module for reading RSS feeds, you can use this link directly in your browser.
In our example, we will use it in Zotero.

In Zotero, choose: New feed > From URI:

![My Image](media/zot-001.png){: style="width:60%;margin-left:20%"}

Then take the URL of the feed provided by ISIDORE (with the Safari browser under macOS, take care to remove the "feed:" prefix from the URL) and paste it in the "URL" field of Zotero's RSS feed creation window, as in the example below:

![My Image](media/zot-002.png)

Then give a title to your feed, for example: "isidore.science - watch over ...".
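
Outside Zotero, the same feed link can be read by a short script. The sketch below assumes that the feed is a plain RSS 2.0 document whose entries are `<item>` elements with `<title>` and `<link>` children; this structure is an assumption, not something stated by ISIDORE. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def parse_feed(xml_text):
    """Return the title and link of each <item> of an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


def fetch_feed(url):
    """Download a feed and parse it (requires network access)."""
    with urlopen(url) as response:
        return parse_feed(response.read())


# Inline sample standing in for a real feed, so the sketch runs offline.
SAMPLE = """<rss version="2.0"><channel><title>Saved query</title>
<item><title>First document</title><link>https://example.org/1</link></item>
<item><title>Second document</title><link>https://example.org/2</link></item>
</channel></rss>"""

for entry in parse_feed(SAMPLE):
    print(entry["title"], entry["link"])
```

Since the feeds follow the monthly update rhythm described above, polling them more than a few times a month is unlikely to return anything new.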

## What can be found in ISIDORE?

### Organization of documents and data in ISIDORE

ISIDORE contains several million SHS documents that are harvested, enriched using scientific repositories and indexed. They are organized into:

- Research documents and data (archives, raw materials, photographs, films, datasets, statistics, etc.), identified in the ISIDORE ontology by: http://isidore.science/class/primaires
- Published documents and data (articles, books, dissertations and theses, reports, etc.), identified in the ISIDORE ontology by: http://isidore.science/class/secondaires
- Scientific events (conferences, study days, etc.), identified in the ISIDORE ontology by: http://isidore.science/class/evenementielles

For a large number of SHS disciplines, ISIDORE allows searching documents coming from the main publication platforms worldwide, as well as a large number of digitized collections from national, university and municipal libraries.
For advanced uses, the [ISIDORE advanced search](https://isidore.science/as) offers, for example, the possibility of searching for documents between two dates and by discipline or by collection.

The main publication platforms (journals and books) present in ISIDORE are:

- OpenEdition
- Cairn
- Persée
- Érudit
- OAPEN
- Redalyc
- SciELO Books

The complete list of collections containing publications can be obtained by querying [the ISIDORE triple store](https://isidore.science/sqe) with the [following SPARQL query](https://isidore.science/sparql?query=SELECT+*+WHERE+%7B%0D%0A%3Fs+rdf%3Atype+%3Chttp%3A%2F%2Fisidore.science%2Fclass%2FCollection%3E.%0D%0A%3Fs+rdf%3Atype+%3Chttp%3A%2F%2Fisidore.science%2Fclass%2Fpublications%3E.%0D%0A%3Fs+dcterms%3Atitle+%3Ftitre%0D%0A%7D+ORDER+BY+ASC%28%3Ftitre%29&format=text%2Fhtml&debug=on&timeout=0):

```
SELECT * WHERE {
 ?s rdf:type <http://isidore.science/class/Collection>.
 ?s rdf:type <http://isidore.science/class/publications>.
 ?s dcterms:title ?title
} ORDER BY ASC(?title)
```

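
The query above can also be sent from a script. The following Python sketch builds the request URL for the `https://isidore.science/sparql` endpoint using the `query` and `format` parameters visible in the link above. The JSON result format (`application/sparql-results+json`) and the explicit `PREFIX` declarations are assumptions added for this example; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SPARQL_ENDPOINT = "https://isidore.science/sparql"  # endpoint given in the text

# Prefixes are declared explicitly in case the endpoint does not predefine them.
PUBLICATION_COLLECTIONS = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT * WHERE {
 ?s rdf:type <http://isidore.science/class/Collection>.
 ?s rdf:type <http://isidore.science/class/publications>.
 ?s dcterms:title ?title
} ORDER BY ASC(?title)
"""


def sparql_url(query, result_format="application/sparql-results+json"):
    """Build the GET URL for a SPARQL query (the result format is an assumption)."""
    params = urlencode({"query": query.strip(), "format": result_format})
    return f"{SPARQL_ENDPOINT}?{params}"


def titles(results):
    """Extract the ?title values from a SPARQL JSON results document."""
    return [binding["title"]["value"] for binding in results["results"]["bindings"]]


def fetch_titles(query):
    """Run the query and return the collection titles (requires network access)."""
    with urlopen(sparql_url(query)) as response:
        return titles(json.load(response))


print(sparql_url(PUBLICATION_COLLECTIONS))
```

Changing the second `rdf:type` line to another class, such as `primaires`, reuses the same helpers for a different list of collections.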
The main digital libraries (municipal, national, etc.) present in ISIDORE are:

- Gallica
- Selene
- E-rara
- NuBIS
- Octaviana
- Burgerbibliothek
- Berkeley Library Digital Collections
- Argonnaute
- BNE
- Cornell
- Didómena

The complete list of collections containing archival holdings and book collections can be obtained by querying [the ISIDORE triple store](https://isidore.science/sqe) with the [following SPARQL query](https://isidore.science/sparql/?default-graph-uri=&query=SELECT+*+WHERE+%7B%0D%0A%3Fs+rdf%3Atype+%3Chttp%3A%2F%2Fisidore.science%2Fclass%2FCollection%3E.%0D%0A%3Fs+rdf%3Atype+%3Chttp%3A%2F%2Fisidore.science%2Fclass%2Fprimaires%3E.%0D%0A%3Fs+dcterms%3Atitle+%3Ftitre%0D%0A%7D+ORDER+BY+ASC%28%3Ftitre%29&format=text%2Fhtml&timeout=0&debug=on):

```
SELECT * WHERE {
 ?s rdf:type <http://isidore.science/class/Collection>.
 ?s rdf:type <http://isidore.science/class/primaires>.
 ?s dcterms:title ?title
} ORDER BY ASC(?title)
```

### Indexing of the main data platforms in SHS

ISIDORE harvests (to use the customary term) and indexes the contents of many SHS data platforms, allowing researchers to gather all their data in their user profile. We encourage researchers to use, for their research programs, platforms that offer open interoperability mechanisms and protocols for exposing documentary and scientific metadata.

The main data platforms (sources, archives but also publications) are harvested by ISIDORE.

The complete list of collections can be obtained by querying [the ISIDORE triple store](https://isidore.science/sqe) with the [following SPARQL query](https://isidore.science/sparql/?default-graph-uri=&query=SELECT+*+WHERE+%7B%0D%0A+%3Fs+rdf%3Atype+%3Chttp%3A%2F%2Fisidore.science%2Fclass%2FCollection%3E.%0D%0A+%3Fs+dcterms%3Atitle+%3Ftitre%0D%0A%7D+ORDER+BY+ASC%28%3Ftitre%29%0D%0A&format=text%2Fhtml&timeout=0&debug=on):

```
SELECT * WHERE {
 ?s rdf:type <http://isidore.science/class/Collection>.
 ?s dcterms:title ?title
} ORDER BY ASC(?title)
```

Please feel free to report any such platforms to us.

#### Can data deposited and documented in NAKALA be referenced by ISIDORE?

Yes, data deposited and documented in NAKALA can be made accessible in ISIDORE. NAKALA offers as standard the [OAI-PMH interoperability protocol](https://fr.wikipedia.org/wiki/Open_Archives_Initiative_Protocol_for_Metadata_Harvesting), which allows document metadata to be harvested, and therefore referenced, enriched and indexed by ISIDORE.

However, referencing by OAI-PMH harvesting is not automatic for the moment, in particular to give users time to prepare and organize their data and metadata. To be referenced, simply send an email requesting indexing in ISIDORE to <isidore-sources@huma-num.fr>.

#### How will scientific articles and images deposited in the HAL, HAL-SHS and MédiHAL open archive be accessible in ISIDORE?

All the files (PDF, illustrations, photographs, audio and video) deposited and documented in the open archive HAL, including HAL-SHS, as well as MédiHAL, are automatically referenced in ISIDORE and indexed at the level of their metadata. All these documents and their records are thus accessible through ISIDORE's various query interfaces.

#### Can the data deposited in the Didómena (EHESS) repository be referenced by ISIDORE?

Yes, [Didómena](https://didomena.ehess.fr) (the research data repository of EHESS) offers OAI-PMH interoperability. Note that harvesting is not automatic. To have your collection referenced, please send us its OAI-PMH access point via <isidore-sources@huma-num.fr>.

#### Can data deposited in Calames (ABES) be referenced by ISIDORE?

Yes, descriptions of archival holdings cataloged in [Calames](http://calames.abes.fr) (the catalog of archives and manuscripts of French university libraries) are indexed in ISIDORE. However, the EAD-XML standard used in Calames does not always allow optimal documentary indexing, mainly in terms of metadata richness. This is due to the way the EAD-XML standard encodes information across the levels of description of the holdings.

#### Can the data deposited in the Data.sciencespo repository be referenced by ISIDORE?

Yes, [Data.sciencespo](https://data.sciencespo.fr) (Dataverse) offers OAI-PMH interoperability, and the data deposited and documented there is harvested automatically by ISIDORE.

#### Can the data deposited in the COCOON platform be referenced by ISIDORE?

Yes, [the COCOON platform](https://cocoon.huma-num.fr) offers OAI-PMH interoperability, and the data deposited and documented there is harvested automatically by ISIDORE.

#### Can files and documents deposited in the European Zenodo platform be referenced by ISIDORE?

Yes, ISIDORE can reference the files and documents deposited and documented on the [Zenodo](https://zenodo.org) platform.

Referencing is based on OAI-PMH harvesting of a set of files and data (and thus their metadata) corresponding to one or more "communities" identifiers in Zenodo (see https://developers.zenodo.org/#sets). We can also group several Zenodo identifiers in the same ISIDORE collection, allowing depositors of several corpora in Zenodo to group them in ISIDORE to give them more visibility.

To add your Zenodo repositories to ISIDORE, [please send us the OAI-PMH URL](mailto:isidore-sources@huma-num.fr?subject=%22Je%20souhaiterai%20faire%20moissonner%20mes%20dépôts%20Zenodo%22) of your repository (see <https://developers.zenodo.org/#oai-pmh>).

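
To find the OAI-PMH URL to send, the following Python sketch may help. It assumes, based on the Zenodo developer documentation linked above, that the OAI-PMH base URL is `https://zenodo.org/oai2d` and that a community is exposed as the set `user-<identifier>`; both points should be verified there. The community identifier used below is a placeholder.

```python
from urllib.parse import urlencode

ZENODO_OAI = "https://zenodo.org/oai2d"  # assumed base URL, see the Zenodo documentation


def zenodo_community_oai_url(community, metadata_prefix="oai_dc"):
    """Build a ListRecords URL restricted to one Zenodo community (assumed set naming)."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": f"user-{community}",
    }
    return f"{ZENODO_OAI}?{urlencode(params)}"


# "my-community" is a placeholder, not a real community identifier.
print(zenodo_community_oai_url("my-community"))
```

Opening the resulting URL in a browser is a quick way to check that the set actually returns your records before writing to the ISIDORE team.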
## How do I get data referenced by ISIDORE?

There are several ways to get data and documents referenced by ISIDORE:

- Expose your data via [an XML stream of standardized metadata using the OAI-PMH protocol](#how-to-signal-data-in-isidore-with-metadata-and-the-oai-pmh-protocol), with metadata in Dublin Core format. This method is suited to documentary databases, corpora, scientific archives and document/data libraries. As an example, [a tool such as Omeka (Classic or S) offers the OAI-PMH protocol via modules](#a-website-using-omeka-classic-and-omeka-s-can-be-referenced-by-isidore).
- Expose your data via RDFa metadata embedded in your web pages. This method is suited to research program websites presenting document or data corpora, scientific blogs (except Hypotheses.org), and web pages in general.

These two methods are also often implemented by data publication tools (CMS, etc.), for example:

### Can a website using Drupal be indexed by ISIDORE?

Yes, web pages generated by the Drupal CMS can be indexed by ISIDORE. There are two ways to do this, depending on the nature of the content of your pages:

- Either via the OAI-PMH protocol, for which several Drupal modules exist; see [https://www.drupal.org/search/site/OAI-PMH](https://www.drupal.org/search/site/OAI-PMH?f%5B0%5D=ss_meta_type%3Amodule "OAI-PMH for Drupal").
- Or via a Dublin Core metadata structure embedded, using RDFa, in the web pages generated by Drupal, together with a sitemap.xml. An article dedicated to this approach is available at the above address.

### Can a website using Omeka Classic and Omeka S be referenced by ISIDORE?

Yes, Omeka *Classic* and Omeka S offer modules to expose metadata according to the OAI-PMH protocol:

- Module for [Omeka S](https://omeka.org/s/modules/OaiPmhRepository/)
- Module for [Omeka Classic](https://omeka.org/classic/docs/Plugins/OaiPmhRepository/)

### How to report data in ISIDORE with metadata and OAI-PMH protocol?

To report your data in ISIDORE using the OAI-PMH protocol, you just have to:

- Prepare your data and metadata using the Dublin Core Element Set or Dublin Core Terms documentary vocabulary, depending on the level of precision you want, and make them accessible via [the OAI-PMH protocol](https://fr.wikipedia.org/wiki/Open_Archives_Initiative_Protocol_for_Metadata_Harvesting);
- Organize and document the *Sets* in your OAI-PMH repository;
- Report the address of your repository to Huma-Num at <isidore-sources@huma-num.fr>.

#### Document sets in OAI-PMH: *Sets*

The OAI-PMH protocol makes it possible, through the creation of *Sets*, to bring together in a coherent group records whose perimeter makes sense from a scientific or editorial point of view; this perimeter is left to the judgment of the data producer.

It also allows a hierarchy to be defined among the *Sets*, with an inheritance mechanism, by specifying in the set name the name of the parent *Set* and the child *Set*, separated by the `:` character. ISIDORE is able to use these *Sets* to limit harvesting to a set of records or to differentiate between data sources within the same repository. The producer should therefore specify the harvesting arrangements that seem appropriate to make the most of their resources within ISIDORE. To do this, they must indicate the *Set* or *Sets* concerned, or a rule for identifying the *Sets* to be taken into account.

The *Sets* can carry their own metadata, in Dublin Core Element Set. For example:

```xml
<set>
 <setSpec>OuvColl</setSpec>
 <setName>OuvColl</setName>
 <setDescription>
  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
   <dc:description>Research works distributed on Cairn.info</dc:description>
  </oai_dc:dc>
 </setDescription>
</set>
```
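
The `:` hierarchy described above can be handled with a few lines of Python. This is a minimal sketch of how a harvesting rule of the kind "this *Set* and all its children" could be expressed; the function names are hypothetical and do not describe ISIDORE's actual harvester.

```python
def set_ancestors(set_spec):
    """Return a setSpec together with all its parents, from the root down."""
    parts = set_spec.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]


def select_sets(set_specs, parent):
    """Keep the given parent Set and every Set nested below it."""
    return [s for s in set_specs if s == parent or s.startswith(parent + ":")]


specs = ["halshs", "SHS:ECO", "SDV:BIO", "SDV:SA:AEP", "SHS"]
print(set_ancestors("SDV:SA:AEP"))
print(select_sets(specs, "SDV"))
```

Testing for the prefix `parent + ":"` rather than just `parent` avoids selecting unrelated *Sets* whose name merely begins with the same letters.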

#### Records in OAI-PMH: *Records*

In the ISIDORE framework, each OAI-PMH "record" corresponds to a document. The ISIDORE harvester uses the metadata described according to the application profile defined by the Open Archives Initiative for the Dublin Core Element Set (also known as "simple" Dublin Core). In addition, the harvester collects the full-text document(s) whose URLs (beginning with `https://` or `http://`) are specified in the `<dc:identifier>` element.

We recommend that data producers provide records that are as rich as possible in metadata, since relevance ranking in ISIDORE favors the richest metadata. Fields such as:

```xml
<dcterms:description>
<dcterms:creator>
<dcterms:date>
```

are essential.

##### Example of a complete record according to the OAI-PMH protocol:

```xml
<record>
 <header>
  <identifier>oai:halshs.archives-ouvertes.fr:halshs-00514304</identifier>
  <datestamp>2010-09-02T11:06:50Z</datestamp>
  <setSpec>halshs</setSpec>
  <setSpec>SHS:ECO</setSpec>
  <setSpec>SDV:BIO</setSpec>
  <setSpec>INFO:INFO_BT</setSpec>
  <setSpec>SDV:SA:AEP</setSpec>
  <setSpec>SDV:SA:STA</setSpec>
  <setSpec>CIRAD</setSpec>
  <setSpec>SHS</setSpec>
 </header>
 <metadata>
  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
   <dc:identifier>http://halshs.archives-ouvertes.fr/halshs-00514304/en/</dc:identifier>
   <dc:identifier>http://halshs.archives-ouvertes.fr/docs/00/51/43/98/PDF/Regulation_GMO_pprint.pdf</dc:identifier>
   <dc:identifier>http://halshs.archives-ouvertes.fr/docs/00/51/43/98/PDF/ppt_nocmt_broader_regulation.pdf</dc:identifier>
   <dc:title>Broadening the scope of regulation: a prerequisite for a positive contribution of transgenic crop use to sustainable development</dc:title>
   <dc:creator>Fok, Michel</dc:creator>
   <dc:subject>[SHS:ECO] Humanities and Social Sciences/Economy and finances</dc:subject>
   <dc:subject>[SDV:BIO] Life Sciences/Biotechnology</dc:subject>
   <dc:subject>[INFO:INFO_BT] Computer Science/Biotechnology</dc:subject>
   <dc:subject>[SDV:SA:AEP] Life Sciences/Agricultural sciences/Agriculture, economy and politics</dc:subject>
   <dc:subject>[SDV:SA:STA] Life Sciences/Agricultural sciences/Sciences and technics of agriculture</dc:subject>
   <dc:subject>regulation</dc:subject>
   <dc:subject>coordination</dc:subject>
   <dc:subject>GMO</dc:subject>
   <dc:subject>biotechnology</dc:subject>
   <dc:subject>seed price</dc:subject>
   <dc:subject>research</dc:subject>
   <dc:subject>weed resistance</dc:subject>
   <dc:subject>pest complex shift</dc:subject>
   <dc:description>Ex-ante regulation of transgenic crop use generally prevails, before the authorization of commercial release. This kind of regulation addresses the concerns of biosafety and coexistence, under pressure of pros and/or cons of GMO. After fifteen years of large scale use of transgenic crops (notably soybean and cotton) in various countries (USA, China, Brasil, India...), ecological and economic phenomena are observed and which could threaten the sustainable use of transgenic varieties. I advocate that the regulation scope must be extended so as to a) promote a systemic and coordinated approach of transgenic crop use, b) ensure seed purity with regard to the transgenic trait, c) maintain research on non-transgenic varieties, and d) warrant fair pricing of transgenic seeds.</dc:description>
   <dc:coverage>Montpelier</dc:coverage>
   <dc:coverage>France</dc:coverage>
   <dc:date>2010-08-29</dc:date>
   <dc:language>English</dc:language>
   <dc:type>proceeding with peer review</dc:type>
   <dc:source>Proceedings of Agro2010, the XIth ESA Congress</dc:source>
   <dc:source>Agro2010, the XIth ESA Congress</dc:source>
  </oai_dc:dc>
 </metadata>
</record>
```
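
To check which URLs of a record are candidates for full-text collection, the `<dc:identifier>` rule described earlier can be reproduced with a short Python sketch. It assumes that the record declares the Dublin Core namespace `http://purl.org/dc/elements/1.1/`; the function name is hypothetical and the sketch is not ISIDORE's actual harvester.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core Element Set namespace


def candidate_urls(record_xml):
    """Return the dc:identifier values that begin with http:// or https://."""
    root = ET.fromstring(record_xml)
    urls = []
    for element in root.iter(DC + "identifier"):
        value = (element.text or "").strip()
        if value.startswith(("http://", "https://")):
            urls.append(value)
    return urls


# Small stand-in record with placeholder values, so the sketch runs on its own.
SAMPLE_RECORD = """<record><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
 <dc:identifier>http://example.org/doc/1 </dc:identifier>
 <dc:identifier>doi:10.1234/abc</dc:identifier>
 <dc:identifier>https://example.org/doc/1.pdf</dc:identifier>
</oai_dc:dc></metadata></record>"""

print(candidate_urls(SAMPLE_RECORD))
```

Identifiers that are not URLs, such as a bare DOI, are ignored by this rule, which is one reason to expose the direct link to each file.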

502
503
504
505
In addition to this description in *Dublin Core Element Set*, each
record can be described in one or more metadata formats, the choice of which is
metadata formats, the choice of which is left to the
the administrator of the OAI-PMH warehouse.
506

507
508
509
510
511
The ISIDORE harvester is able to use the *Dublin Core Terms* format and any XML schema allowing
full-text exposure (including TEI or EAD) thus improving its indexing.
its indexing. The data producer will have to take care to respect
scrupulously respect the specifications of the OAI-PMH protocol in its version
2.0 in particular on :
512

513
514
515
- The strict respect of the "datestamp" values in the *records* in order to synchronize the updates between the producer and ISIDORE;
- The good management of deleted data ([detail on the OAI-PMH protocol documentation](http://www.openarchives.org/OAI/openarchivesprotocol.html#DeletedRecords));
- In the case of a publisher's data warehouse or one of significant size, access to its OAI-PMH warehouse via the IP addresses of ISIDORE's OAI-PMH harvesters (harvesting reported by ISIDORE to its IT department).
516

517
518
519
We advise producers to regularly validate the compliance of their repository using, for example, the [tools of the Open Archives Initiative](https://www.openarchives.org/pmh/tools/). Finally, we advise data producers to contact the Huma-Num team with any questions.

### How to report data in ISIDORE with RDFa metadata?

RDFa makes it possible to express a metadata structure following the principles of the Semantic Web (RDF stands for *[Resource Description Framework](https://fr.wikipedia.org/wiki/Resource_Description_Framework)*) directly in the HTML code of Web pages. The "a" in RDFa stands for "in attributes", i.e. inside the attributes of HTML tags.

How can the metadata of a web page be expressed very simply using the [RDFa syntax](https://tcuvelier.developpez.com/tutoriels/web-semantique/rdfa/introduction/)? Take, for example, a blog post published with WordPress. Although [plugins exist to do this](https://wordpress.org/plugins/search/RDFa/), they can become obsolete, which makes them difficult to maintain over time. Another solution is to implement RDFa in the HTML code of the WordPress theme you have chosen. To keep this simple and manageable over time, the easiest way is to place `<meta>` tags containing the metadata in the HTML header.

Expressing metadata according to the RDF model via the RDFa syntax allows machines (mainly search engines and indexers) to process information better because it becomes more explicit: for a machine, a string could be a title or a summary, and unless you tell it which one it is, it will not guess. So, at the very least, these tags can be used to define an RDF structure carrying minimal metadata, for example with the Dublin Core Element Set documentary vocabulary.

#### How to do it practically?

First of all, the DOCTYPE of the web page must indicate that the page will contain information based on the RDF model. The DOCTYPE will therefore be:

```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
```

The `<html>` tag must declare the addresses of the ontologies (via their *XML namespaces*) used to "type" the information. RDFa, which places metadata within the Semantic Web, requires at least the RDF and RDF Schema ontologies and the Dublin Core Element Set (dc). The Dublin Core Terms (dcterms) can be used in addition to refine the metadata:

```xml
<html xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/">
```
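
To make the role of these namespace declarations concrete, the Python sketch below expands a prefixed property name such as `dc:title` into the full URI that the prefix stands for, and shows that a prefix missing from the `<html>` tag cannot be resolved. This only illustrates the principle and is not how ISIDORE processes pages; the function names `declared_prefixes` and `expand` and the list of sample properties are assumptions.

```python
import re

# Opening <html> tag with the namespace declarations shown above.
HTML_OPEN_TAG = """<html xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/">"""

# Hypothetical property names; "foaf" is deliberately not declared above.
PROPERTIES = ["dc:title", "dcterms:created", "foaf:name"]


def declared_prefixes(html_open_tag):
    """Map each xmlns prefix declared on the <html> tag to its namespace URI."""
    return dict(re.findall(r'xmlns:(\w+)="([^"]+)"', html_open_tag))


def expand(prop, prefixes):
    """Expand 'prefix:name' to a full URI, or return None if the prefix is undeclared."""
    prefix, _, name = prop.partition(":")
    namespace = prefixes.get(prefix)
    return namespace + name if namespace else None


prefixes = declared_prefixes(HTML_OPEN_TAG)
for prop in PROPERTIES:
    print(prop, "->", expand(prop, prefixes))
```

In this sketch `foaf:name` cannot be expanded because the `foaf` prefix is not declared, which is why every vocabulary used in the page has to be declared in the `<html>` tag first.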

To encode more information, additional documentary ontologies can be declared:

```xml
<html
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:skos="http://www.w3.org/2004/02/skos/core#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:cc="http://creativecommons.org/ns#">
```

In the example above, [foaf](http://www.foaf-project.org/) is used to encode information about a person or object described by the metadata. The [CC](https://creativecommons.org) ontology is used to indicate which *Creative Commons* license applies to the content.

The RDFa structure is expressed through tags placed in the `<head>` header of the HTML page. As a first step, a `<link>` tag defines the digital object to which the RDF-encoded information will be attached:

```xml
<link rel="dc:identifier" href="http://monblog.com/monbillet.html" />
```

This tag therefore defines a container for the information that we are going to provide using the `<meta>` tags. This container is identified by a URI that is also a URL, i.e. the address of the page on the web.


The `<meta>` tags then define a set of metadata, which in our case is descriptive information about the blog post's web page:

```xml
<meta property="dc:title" content="The title of my post" />
<meta property="dc:creator" content="First name Last name of author 1" />
<meta property="dc:creator" content="First name Last name of author 2" />
<meta property="dcterms:created" content="2011-01-27" />
<meta property="dcterms:abstract" content="Un résumé descriptif du contenu de ma page" xml:lang="fr" />
<meta property="dcterms:abstract" content="A summary in English" xml:lang="en" />
<meta property="dc:subject" content="keyword 1" />
<meta property="dc:subject" content="keyword 2" />
<meta property="dc:subject" content="keyword 3" />
<meta property="dc:type" content="blog post" />
<meta property="dc:format" content="text/html" />
<meta property="dc:relation" content="A link to a complementary web page" />
```
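
To check what such a header exposes to a machine, the following Python sketch collects the `property`/`content` pairs of the `<meta>` tags with the standard `html.parser` module. It is a simplified illustration, not ISIDORE's extraction code; the class name `MetaPropertyCollector` and the sample header are assumptions.

```python
from html.parser import HTMLParser


class MetaPropertyCollector(HTMLParser):
    """Collect (property, content) pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if "property" in attributes and "content" in attributes:
            self.pairs.append((attributes["property"], attributes["content"]))


# Hypothetical page header reusing the tags described above.
SAMPLE_HEAD = """
<head>
<link rel="dc:identifier" href="http://monblog.com/monbillet.html" />
<meta property="dc:title" content="The title of my post" />
<meta property="dc:creator" content="First name Last name of author 1" />
<meta property="dcterms:created" content="2011-01-27" />
</head>
"""

collector = MetaPropertyCollector()
collector.feed(SAMPLE_HEAD)
for prop, content in collector.pairs:
    print(f"{prop}: {content}")
```

Each pair makes explicit what a given string is (a title, an author, a creation date), which is the information an indexer cannot guess from the text alone.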

Depending on the nature of the web page's content, the encoded information can of course be made more precise, more refined and more complete. For example, it would be wise to use the DC Terms vocabulary.

DC Terms makes it possible, for example, to include a precise bibliographic reference for the content:


```xml
<meta property="dcterms:bibliographicCitation" content="Put a bibliographic reference here" />
```

It would also be possible to pass the entire text of a web page [using a property of the SIOC vocabulary](http://www.lespetitescases.net/rdfaiser-votre-blog-2-la-pratique).

It is also possible to link web pages together (to define a corpus of authors, for example) by using the `dcterms:isPartOf` property of the DC Terms vocabulary:

```xml
<meta property="dcterms:isPartOf" content="URL of another web page" />
```

#### Creating the Sitemap

Once the RDFa encoding is done in the HTML pages, you still need to create a Sitemap XML file listing the pages you want ISIDORE to harvest, and then submit the URL of this sitemap:

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
	<url>
		<loc>http://monsiteweb.com/</loc>
		<lastmod>2018-01-01</lastmod>
		<changefreq>monthly</changefreq>
		<priority>1.0</priority>
	</url>
	<url>
		<loc>http://monsiteweb.com/page1/</loc>
		<lastmod>2018-03-05</lastmod>
		<changefreq>weekly</changefreq>
		<priority>0.5</priority>
	</url>
</urlset>
```
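
If the list of pages is available to a script, such a file can be generated rather than written by hand. The Python sketch below builds a sitemap with the standard `xml.etree.ElementTree` module; the function name `build_sitemap` and the page list are assumptions, and only the `<loc>` and `<lastmod>` elements are produced.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap XML string from (url, last_modified) pairs."""
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical list of pages to expose to the harvester.
sitemap_xml = build_sitemap([
    ("http://monsiteweb.com/", "2018-01-01"),
    ("http://monsiteweb.com/page1/", "2018-03-05"),
])
print(sitemap_xml)
```

The resulting string can be written to a file such as `sitemap.xml` at the root of the site (the file name is an assumption).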

You can test how ISIDORE will extract your RDFa metadata using the "ISIDORE on demand" application, available at <https://rd.isidore.science/ondemand/fr/rdfa.html>.

## ISIDORE perimeter

### Why are some items not found in ISIDORE?

If you do not find all of your scientific output in [ISIDORE](https://isidore.science), there may be several explanations. Your articles may be published in journals that are not electronic, or that do not make their articles available even long after publication. Indeed, since its creation, [ISIDORE](https://isidore.science) has favored open access: indexing is better for articles available in open access. Many electronic journals have made this choice through portals such as OpenEdition Journals (formerly Revues.org), Érudit, Persée, Cairn.info, Redalyc and OApen, and the articles from these journals are therefore collected and indexed by [ISIDORE](https://isidore.science).

It is also possible that your articles are published online, but on an ordinary website rather than an electronic publishing platform, or on an electronic publishing platform that does not allow indexing via the OAI-PMH protocol (see the question and answer on OAI-PMH).

Other journals make their articles available, but only after an embargo period. In this case, [ISIDORE](https://isidore.science) indexes only the metadata of the article. If you connect via your university library, documentation center or BibCNRS, you may still have access to these articles.

It is possible to search the collections indexed by [ISIDORE](https://isidore.science) by using the engine itself and indicating that you want to search the collections.

Your article may also be published as an image-only PDF; in this case, [ISIDORE](https://isidore.science) can reference the article but cannot index its full text.

Finally, it is possible that some of your articles are published in journals that are not classified as SHS.

In all these cases, you can deposit your articles in an open archive such as HAL (HAL-SHS in particular), which is also indexed by [ISIDORE](https://isidore.science), or contact your university library or documentation center.

If none of these cases applies and you therefore think there is an error, you can send us an e-mail at isidore@huma-num.fr.

### Why are some books/chapters of books not reported in ISIDORE?

ISIDORE can identify that a document is of type "book"; as a result, more than 500,000 books and book chapters are referenced in ISIDORE.

It should be noted that there are relatively few platforms for publishing online books in open access. In SHS, ISIDORE indexes, for example, the contents of book platforms such as:

- [OpenEdition Books](https://isidore.science/search/?collection=10670/3.szxq6s) (indexed and reported at the chapter level);
- [Scielo Books](https://isidore.science/search/?collection=10670/3.7oraz1) (Brazil);
- [OApen](https://isidore.science/search/?collection=10670/3.pwofj8) (Netherlands);
- [Erudit](https://isidore.science/s/collection?q=erudit) (Canada);
- ...

In addition, you can, in agreement with your publisher, deposit your book or book chapters in the open archive [HAL-SHS](https://halshs.archives-ouvertes.fr). They will then be indexed by ISIDORE as part of its indexing of HAL-SHS and recognized as a book or book chapter.

### Why are some databases not reported in ISIDORE?

Harvesting by ISIDORE requires metadata (documentary, scientific, etc.) to be exposed in a standardized way, either using the OAI-PMH protocol or using an XML Sitemap and RDFa metadata (see above).

If you know of any databases that are not present in ISIDORE, please let us know so that we can check with their publishers/data producers.

## ISIDORE trainings

Here we list training courses, functional presentations and online self-training courses on the use of ISIDORE. Do not hesitate to let us know about any training you would like to organize:

- [Urfist Méditerranée offers a new e-learning course on Isidore](https://urfist.univ-cotedazur.fr/nouvelle-formation-en-ligne-une-initiation-a-isidore/) (March 2021)
- ["Isidore, my personal research assistant"](https://ig.hypotheses.org/2215) by Johanna Daniel (April 2020)