Teaching
Introduction to TEI XML
URFIST de Rennes, September 27-28, 2021
During this 12-hour training course on the Text Encoding Initiative, participants were given an introduction to XML, then to TEI, followed by several encoding exercises to help them understand how it works (encoding of metadata, body, text layout, named entities, etc.). The last part of the course was a demonstration of a few applications that can be built with TEI, such as transformation with XSLT and publication with TEI Publisher.
Course program: https://sygefor.reseau-urfist.fr/#/training/8931/
Course repository: https://github.com/FloChiff/Introduction_TEI
Introduction to TEI XML
URFIST de Rennes, November 24-25, 2022
During this 12-hour training course on the Text Encoding Initiative, participants were given an introduction to XML and to TEI, followed by several encoding exercises to help them understand how it works (encoding of metadata, body, text layout, named entities, etc.). The last part of the course was a demonstration of a few applications that can be built with TEI, such as text mining with XPath and XQuery, transformation with XSLT, and publication with TEI Publisher.
Course program: https://sygefor.reseau-urfist.fr/#/training/9717/11500
Course repository: https://github.com/FloChiff/Introduction_TEI_2022
Create a Digital Scientific Edition
ObTIC Workshop, Room 70 (BNF), 2 PM-5 PM
- Workshop 1 - "Automatic Text Recognition" - 01/17/25
This first workshop is dedicated to automatic text recognition, a constantly evolving field that today, with the help of trained models, allows for efficient and rapid acquisition of machine-readable versions of text corpora. After an introduction to the discipline, the workshop will put into practice what has been presented, by applying segmentation and transcription models to the Gallica corpora we will work on, in order to obtain a usable version afterward.
- Workshop 2 - "Text Encoding and Annotation" - 03/14/25
This second workshop is dedicated to encoding texts in XML-TEI, the standard currently used to encode literary texts. After an introduction to the XML markup language and the components of the TEI standard, participants will engage in hands-on practice, encoding metadata, the body of the text, and various annotations (semantic, critical, etc.) relevant to the corpus being worked on.
- Workshop 3 - "Web Display of Text" - 05/16/25
This third and final workshop in the series is dedicated to the web display of the encoded corpus, allowing participants to concretely observe the various enhancements brought to their corpus through encoding. After a brief introduction to the importance and methods of such a step, the workshop will aim to present and work on several tools for web display.
Course repository: https://github.com/FloChiff/AtelierObTIC-creer-une-edition-scientifique-numerique
ATRIUM ATR Summer School
DARIAH Coordination Office, Berlin, Germany, September 1-5, 2025
The ATRIUM Summer School will provide an in-depth approach to automatic text recognition with a focus on practical applications in concrete research scenarios. Participants will gain insights into the latest developments in OCR and HTR, focusing on open-source tools such as eScriptorium and workflows that facilitate the digitization and analysis of historical and modern texts.
Over the course of one week, the team of trainers will alternate between methodological input and supervised hands-on sessions in which participants improve their automatic text recognition pipelines. The input will cover not only pre-processing, segmentation, layout analysis, and post-processing, but also data management, empowering participants to achieve concrete goals in terms of the management, processing and reusability of their data within the duration of the summer school and beyond.
Course website: https://atrium-research.eu/events/atrium-atr-summer-school/
Course repository: https://zenodo.org/records/17159181
TEI Summer School
University of Oslo Library, Oslo, Norway, September 22-26, 2025
This workshop provides a general introduction and guided practice for creating digital scholarly editions using the community-standard eXtensible Markup Language (XML).
We will begin with a general introduction to digital scholarly editions, focusing on XML and the data model for textual resources provided by the Text Encoding Initiative (TEI) P5 standard and community standards, such as EpiDoc for textual sources of Classical Antiquity, among others.
We will demonstrate how editions created with MS Word or other word processors can be automatically transformed into valid XML TEI P5 files and further enhanced.
Afterwards, we will introduce visualisation and publication tools for digital scholarly editions, as well as how to transform XML files into more reader-friendly publication formats such as HTML or PDF.
We will address how any part of a scholarly edition, from raw text files to transformation scenarios, can be preserved and archived on institutional repositories like Dataverse.no.
We leave ample time for practice and hands-on sessions to work with your materials.
Course website: link
TD Techniques numériques pour l'édition
Sorbonne University, Mondays, 5 PM-7 PM
Teaching digital tools for publishing: Word, InDesign, Photoshop, Acrobat Pro, etc.
Creating a digital edition
ObTIC Workshop, Maison de la Recherche Serpente, 2 PM-5 PM
- Workshop - "Automatic Text Recognition" - 31/10/25
This first workshop is dedicated to automatic text recognition, a constantly evolving discipline that now allows, using trained models, the efficient and rapid acquisition of a machine-readable version of a text corpus. After an introduction to the discipline, the workshop will put the concepts into practice by applying segmentation and transcription models to the corpora we will work with, in order to obtain a usable version.
- Workshop - "Text Encoding and Annotation" - 05/12/25
This second workshop is dedicated to text encoding in XML-TEI, the standard currently used for encoding literary texts. After an introduction to the XML markup language and the components of the TEI standard, participants will proceed with practical exercises, encoding metadata, the body of the text, and various annotations (semantic, critical, etc.) relevant to the corpus being studied.
Course repository: https://github.com/FloChiff/AtelierObTIC-edition-numerique