Package: JATSdecoder 1.2.0

JATSdecoder: A Metadata and Text Extraction and Manipulation Tool Set

Provides a collection of functions to extract metadata, sectioned text and study characteristics from scientific articles in 'NISO-JATS' format. Articles in PDF format can be converted to 'NISO-JATS' with the 'Content ExtRactor and MINEr' ('CERMINE', <https://github.com/CeON/CERMINE>). For convenience, two functions bundle the extraction heuristics. JATSdecoder() converts 'NISO-JATS'-tagged XML files to a structured list with the elements title, author, journal, history, 'DOI', abstract, sectioned text and reference list. study.character() extracts multiple study characteristics, such as the number of included studies, statistical methods used, alpha error, power, statistical results, correction method for multiple testing and software used. The sample size is estimated from reports within the abstract and from the degrees of freedom reported within statistical results. In addition, the package contains some useful functions to process text (text2sentences(), text2num(), ngram(), strsplit2(), grep2()). See Böschen, I. (2021) <doi:10.1007/s11192-021-04162-z>, Böschen, I. (2021) <doi:10.1038/s41598-021-98782-3> and Böschen, I. (2023) <doi:10.1038/s41598-022-27085-y>.

Authors: Ingmar Böschen [aut, cre]

JATSdecoder_1.2.0.tar.gz
JATSdecoder_1.2.0.zip (r-4.5), JATSdecoder_1.2.0.zip (r-4.4), JATSdecoder_1.2.0.zip (r-4.3)
JATSdecoder_1.2.0.tgz (r-4.4-any), JATSdecoder_1.2.0.tgz (r-4.3-any)
JATSdecoder_1.2.0.tar.gz (r-4.5-noble), JATSdecoder_1.2.0.tar.gz (r-4.4-noble)
JATSdecoder_1.2.0.tgz (r-4.4-emscripten), JATSdecoder_1.2.0.tgz (r-4.3-emscripten)
JATSdecoder.pdf | JATSdecoder.html
JATSdecoder/json (API)

# Install 'JATSdecoder' in R:
install.packages('JATSdecoder', repos = c('https://ingmarboeschen.r-universe.dev', 'https://cloud.r-project.org'))
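
The description above names two wrapper functions and a few text helpers. A minimal usage sketch follows, assuming default arguments throughout. The file name "article.xml" and the example sentence are hypothetical placeholders, not shipped with the package, and the calls are guarded so that nothing runs if the package or its Java dependency cannot be loaded.

```r
# Hypothetical path to a NISO-JATS XML file; replace it with a real file.
xml_file <- "article.xml"

# Hypothetical example text for the text helpers.
txt <- "Twenty-five students took part. The test was significant."

# Run only if the package (and its Java dependency) loads on this system.
if (requireNamespace("JATSdecoder", quietly = TRUE)) {
  # Split the text into sentences.
  sentences <- JATSdecoder::text2sentences(txt)
  print(sentences)

  # Convert written numbers in the text to a digit representation.
  print(JATSdecoder::text2num(txt))

  # The wrappers need an actual NISO-JATS file, so skip them if it is missing.
  if (file.exists(xml_file)) {
    # Metadata and sectioned text as a structured list.
    meta <- JATSdecoder::JATSdecoder(xml_file)
    str(meta, max.level = 1)

    # Study characteristics (methods, alpha error, power, and so on).
    characteristics <- JATSdecoder::study.character(xml_file)
    str(characteristics, max.level = 1)
  }
}
```

str() with max.level = 1 prints only the top-level element names, which is usually enough to see what each wrapper returned before drilling into single elements.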

Bug tracker: https://github.com/ingmarboeschen/jatsdecoder/issues

Uses libs:
  • openjdk: OpenJDK Java runtime, using Hotspot JIT

On CRAN:

cermine, niso-jats, pubmedcentral, text-extraction, text-mining, xml-files

4.43 score | 18 stars | 7 scripts | 230 downloads | 47 exports | 4 dependencies

Last updated 1 day ago from: 0e6c2d0b5c. Checks: OK: 1, WARNING: 6. Indexed: yes.

Target           Result   Date
Doc / Vignettes  OK       Nov 22 2024
R-4.5-win        WARNING  Nov 22 2024
R-4.5-linux      WARNING  Nov 22 2024
R-4.4-win        WARNING  Nov 22 2024
R-4.4-mac        WARNING  Nov 22 2024
R-4.3-win        WARNING  Nov 22 2024
R-4.3-mac        WARNING  Nov 22 2024

Exports: allStats, est.ss, get.abstract, get.aff, get.alpha.error, get.assumptions, get.author, get.category, get.contrib, get.country, get.doi, get.editor, get.history, get.journal, get.keywords, get.method, get.multi.comparison, get.n.studies, get.outlier.def, get.power, get.R.package, get.references, get.sentence.with.pattern, get.sig.adjectives, get.software, get.stats, get.subject, get.tables, get.test.direction, get.text, get.title, get.type, get.vol, grep2, has.interaction, JATSdecoder, letter.convert, ngram, pCheck, preCheck, standardStats, strsplit2, study.character, text2num, text2sentences, vectorize.text, which.term

Dependencies: NLP, openNLP, openNLPdata, rJava

Readme and manuals

Help Manual

Help page             Topics
allStats              allStats
est.ss                est.ss
get.abstract          get.abstract
get.aff               get.aff
get.alpha.error       get.alpha.error
get.assumptions       get.assumptions
get.author            get.author
get.category          get.category
get.country           get.country
get.doi               get.doi
get.editor            get.editor
get.history           get.history
get.journal           get.journal
get.keywords          get.keywords
get.method            get.method
get.multi.comparison  get.multi.comparison
get.n.studies         get.n.studies
get.outlier.def       get.outlier.def
get.power             get.power
get.R.package         get.R.package
get.references        get.references
get.sig.adjectives    get.sig.adjectives
get.software          get.software
get.stats             get.stats
get.subject           get.subject
get.tables            get.tables
get.test.direction    get.test.direction
get.text              get.text
get.title             get.title
get.type              get.type
get.vol               get.vol
grep2                 grep2
has.interaction       has.interaction
JATSdecoder           JATSdecoder
letter.convert        letter.convert
ngram                 ngram
pCheck                pCheck
standardStats         standardStats
strsplit2             strsplit2
study.character       study.character
text2num              text2num
text2sentences        text2sentences
vectorize.text        vectorize.text
which.term            which.term