This repository contains a reproducible research compendium for the case study used in the book: Manika Lamba and Margam Madhusudhan (2021), Text Mining for Information Professionals: An Uncharted Territory, Springer Nature.
📫 For corrections/suggestions reach me at lambamanika07@gmail.com or create an issue here
Please cite this compendium as: Lamba, Manika, & Madhusudhan, Margam. (2021). Burst Detection of Documents using Three Different Tools (Version 1.2). https://doi.org/10.5281/zenodo.5203298
The compendium contains the data, code, and notebook associated with the case studies. It is divided into two case studies, 6A and 6B. The 6A case study used the Sci2 tool and the 6B case study used the R programming language to perform burst detection. It is organized as follows:
- The `6a_dataset.csv` file contains the data for the 6A case study.
- The `6a_maximum_burst_level.csv` file is a supplementary file associated with the 6A case study.
- The `6a_results_barSizes.csv` file is a supplementary file associated with the 6A case study.
- The `6a_results_horizontal-line-graph.ps` file is a supplementary file associated with the 6A case study.
- The `6b_dataset.csv` file contains the data for the 6B case study.
- The `burst_detection.R` file contains the R code for the 6B case study.
- The `Case_Study_6B.ipynb` file contains the Jupyter notebook for the 6B case study.
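Once you have a local copy, a quick way to confirm that the files listed above are all present is a short R check. This is a minimal sketch and not part of the compendium: the helper name `missing_compendium_files` is hypothetical, and the code assumes (without confirmation from the repository layout) that all files sit together in a single directory.

```r
# File names taken from the list above. Assumption: they all live in one directory.
expected_files <- c(
  "6a_dataset.csv",
  "6a_maximum_burst_level.csv",
  "6a_results_barSizes.csv",
  "6a_results_horizontal-line-graph.ps",
  "6b_dataset.csv",
  "burst_detection.R",
  "Case_Study_6B.ipynb"
)

# Hypothetical helper: returns the expected files that are missing from `dir`.
missing_compendium_files <- function(dir = ".") {
  expected_files[!file.exists(file.path(dir, expected_files))]
}

# Report on the current working directory.
missing <- missing_compendium_files(".")
if (length(missing) == 0) {
  message("All expected files are present.")
} else {
  message("Missing files: ", paste(missing, collapse = ", "))
}
```

If the files are stored elsewhere, pass that directory to the helper, for example `missing_compendium_files("path/to/unpacked/archive")`.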
There are several ways to use the compendium’s contents and reproduce the analysis:
- Download the compendium as a zip archive from this GitHub repository.
  - After unpacking the downloaded zip archive, you can explore the files on your computer.
- Reproduce the analysis in the cloud without having to install any software. The same Docker container replicating the computational environment used by the authors can be run using BinderHub on mybinder.org:
  - Click RStudio to launch an interactive RStudio session in your web browser for hands-on practice with the 6B case study. In the virtual environment, open the `burst_detection.R` file to run the code.
  - Click Jupyter+R to launch an interactive Jupyter Notebook session in your web browser using the R kernel. When you execute code within the notebook, the results appear beneath the code.
- Limitations of Binder
  - The server has limited memory, so you cannot load large datasets or run big computations.
  - Binder is meant for interactive and ephemeral coding, so an instance will die after 10 minutes of inactivity.
  - An instance cannot be kept alive for more than 12 hours.
- Figures, Code, Data, Hex-sticker: MIT License
