Conference paper

Automatic Subject Indexing and Classification Using Text Recognition and Computer-Based Analysis of Tables of Contents

Jan Pokorny (1)
(1) ENKI, o.p.s.
Publication details
Submitted on: June 20, 2018
Accepted on: June 20, 2018
Published on: June 20, 2018
Last modified on: March 31, 2025
Proceedings 1: Connecting the Knowledge Commons: From Projects to Sustainable Infrastructure (Long Papers)
DOI: 10.4000/proceedings.elpub.2018.19
License: Attribution 4.0 International (CC BY 4.0)
Indicators: 385 views, 1196 downloads

Abstract
This paper describes a method for machine-based creation of high-quality subject indexing and classification for both electronic and print documents using tables of contents (ToCs). The technology primarily targets electronic and print documents whose full text cannot be indexed for technical or licensing reasons. It would also be useful for full-text documents, because analyzing the structure of ToCs could significantly improve the accuracy and relevance of the subject description.
Keywords
  • [SHS.INFO]Humanities and Social Sciences/Library and information sciences
  • machine learning system
  • computer-generated keywords
  • library automatization
  • text mining
  • computer-generated subject headings
Cited by

Source: OpenCitations

  • Hierarchical Multi-Label Classification of Library Subject Headings

    2022 International Conference on Cybernetics and Innovations (ICCI)

    Authors: Worrawan Wandee, Pokpong Songmuang

    Journal reference: Volume 8, 2022, pp. 1-5

    DOI: 10.1109/icci54995.2022.9744189
  • Automated Subject Indexing of Domain Specific Collections Using Word Embeddings and General Purpose Thesauri

    Communications in computer and information science

    Authors: Michalis Sfakakis, Leonidas Papachristopoulos, Kyriaki Zoutsou, Giannis Tsakonas, Christos Papatheodorou

    Journal reference: 2019, pp. 103-114

    DOI: 10.1007/978-3-030-36599-8_9