Software

We develop software packages, primarily for literature reviews. Our software is available on GitHub in the CoLRev Environment and fs-ise organizations.

CoLRev

CoLRev (Collaborative Literature Reviews) is an open-source environment for collaborative literature reviews. It integrates with different synthesis tools, takes care of the data, and facilitates Git-based collaboration. To accomplish these goals, CoLRev advances the design of review technology at the intersection of methods, design, cognition, and community building. The following features stand out:

Supports all literature review steps: problem formulation, search, dedupe, (pre)screen, PDF retrieval and preparation, and synthesis
An open and extensible environment based on shared data and process standards
Builds on Git and its transparent collaboration model for the entire literature review process
Offers a self-explanatory, fault-tolerant, and configurable user workflow
Operates a model for data quality, content curation, and reuse
Enables typological and methodological pluralism throughout the process

SearchQuery

SearchQuery is a Python package for parsing, validating, simplifying, and serializing search queries for academic databases. It currently supports PubMed, EBSCOHost, and Web of Science, using a standardized JSON schema (Haddaway et al., 2022).

Programmatic use, a command-line interface, and optional integration via pre-commit hooks
Zero dependencies: easily embeddable across environments
Extensible parser/validator architecture
Tested on real-world queries from searchRxiv
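
To make the parse-and-serialize idea concrete, here is a minimal sketch in plain Python. It is not the SearchQuery API: the names `tokenize`, `parse_query`, and `to_string` are hypothetical, the tree layout is an illustrative assumption rather than the package's JSON schema, and the grammar covers only `AND`, `OR`, parentheses, and quoted or bare terms, evaluated left to right without operator precedence.

```python
import re

# Tokens: parentheses, double-quoted phrases, or bare words.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')
OPERATORS = ("AND", "OR")


def tokenize(query: str) -> list[str]:
    return TOKEN.findall(query)


def parse_query(query: str) -> dict:
    """Parse a Boolean query into a nested dict (hypothetical layout)."""
    tokens = tokenize(query)
    pos = 0

    def parse_operand() -> dict:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
            return node
        if tok == ")" or tok.upper() in OPERATORS:
            raise ValueError(f"unexpected token {tok!r}")
        return {"term": tok.strip('"')}

    def parse_expr() -> dict:
        nonlocal pos
        node = parse_operand()
        while pos < len(tokens) and tokens[pos].upper() in OPERATORS:
            op = tokens[pos].upper()
            pos += 1
            right = parse_operand()
            if node.get("operator") == op:
                # Flatten chains of the same operator: a OR b OR c.
                node["children"].append(right)
            else:
                node = {"operator": op, "children": [node, right]}
        return node

    tree = parse_expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


def to_string(node: dict) -> str:
    """Serialize the tree back into a fully parenthesized query string."""
    if "term" in node:
        term = node["term"]
        return f'"{term}"' if " " in term else term
    inner = f' {node["operator"]} '.join(to_string(c) for c in node["children"])
    return f"({inner})"


tree = parse_query('("digital health" OR telemedicine) AND review')
print(to_string(tree))
```

Validation falls out of the parser in this sketch: an unbalanced query such as `(a OR b` raises a `ValueError` instead of producing a tree.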

BibDedupe

BibDedupe is an open-source Python library for deduplication of bibliographic records, tailored for literature reviews. Unlike traditional deduplication methods, BibDedupe focuses on entity resolution, linking duplicate records instead of simply deleting them.

Automated Duplicate Linking with Zero False Positives: BibDedupe automates the duplicate linking process with a focus on eliminating false positives.
Preprocessing Approach: BibDedupe uses a preprocessing approach that reflects the error generation process specific to academic databases, such as author re-formatting, journal abbreviations, or translations.
Entity Resolution: Rather than deleting duplicates, BibDedupe links them to resolve the underlying entity and integrates the data. This allows for validation and undo operations.
Programmatic Access: BibDedupe provides programmatic access, so it can be incorporated into scripts, applications, and existing research workflows.
Transparent and Reproducible Rules: BibDedupe’s blocking and matching rules are transparent, so deduplication results can be inspected and reproduced.
Continuous Benchmarking: Continuous integration tests running on GitHub Actions ensure ongoing benchmarking, maintaining the library’s reliability and performance across datasets.
Efficient and Parallel Computation: BibDedupe implements computations efficiently and in parallel, using appropriate data structures and functions for optimal performance.
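
The difference between linking and deleting can be sketched in a few lines of plain Python. This is an illustrative sketch only, not BibDedupe's API or its actual rules: the function names, the blocking key (publication year), the title-similarity measure, and the 0.9 threshold are all assumptions chosen for brevity.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so formatting differences do not matter."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def link_duplicates(records: list[dict], threshold: float = 0.9) -> list[list[str]]:
    """Return clusters of record IDs that are linked as duplicates."""
    parent = {r["ID"]: r["ID"] for r in records}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Blocking (assumed rule): only compare records with the same year.
    blocks = defaultdict(list)
    for record in records:
        blocks[record["year"]].append(record)

    # Matching (assumed rule): near-identical normalized titles.
    for block in blocks.values():
        for i, a in enumerate(block):
            for b in block[i + 1:]:
                similarity = SequenceMatcher(
                    None, normalize(a["title"]), normalize(b["title"])
                ).ratio()
                if similarity >= threshold:
                    parent[find(b["ID"])] = find(a["ID"])

    # Linking: group IDs by cluster root; no record is deleted.
    clusters = defaultdict(list)
    for record in records:
        clusters[find(record["ID"])].append(record["ID"])
    return [ids for ids in clusters.values() if len(ids) > 1]


records = [
    {"ID": "r1", "year": "2020",
     "title": "Designing a novel strategy for exploring literature corpora"},
    {"ID": "r2", "year": "2020",
     "title": "Designing a Novel Strategy for Exploring Literature Corpora."},
    {"ID": "r3", "year": "2021",
     "title": "Classifying the ideational impact of information systems review articles"},
]
print(link_duplicates(records))
```

Because the output is a set of links rather than a reduced record list, every original record is still available, which is what makes later validation or undoing a merge possible.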

PRISMA Flow Diagram

PRISMA Flow Diagram is a Python package for creating PRISMA 2020-compliant flow diagrams programmatically. It is designed for transparency, sensible defaults, and layouts that adapt automatically to the structure and counts of a review.

The package supports:

New systematic reviews (standard PRISMA 2020 flow)
Updated reviews (previous + newly identified studies)
Other search methods as an optional, structured extension

Users can either:

Pass structured counts directly via a concise Python API, or
Derive counts automatically from CoLRev records.bib files, enabling seamless integration into review workflows.

Key features

Validated PRISMA logic that prevents common reporting errors (e.g., screened ≤ identified − removed)
Support for multi-lane PRISMA diagrams, including databases/registers and other search methods
PNG output (with additional formats planned)
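
The kind of consistency check mentioned in the first feature can be sketched as follows. This is a simplified illustration, not the package's API: the `PrismaCounts` class, its field names, and the `problems` method are hypothetical, and the full PRISMA 2020 flow contains more boxes than the five counts used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrismaCounts:
    """Hypothetical container for a simplified PRISMA flow."""
    identified: int
    removed_before_screening: int
    screened: int
    excluded_at_screening: int
    included: int

    def problems(self) -> list[str]:
        """Return human-readable descriptions of inconsistent counts."""
        issues = []
        available = self.identified - self.removed_before_screening
        if self.screened > available:
            issues.append(
                f"screened ({self.screened}) exceeds "
                f"identified - removed ({available})"
            )
        remaining = self.screened - self.excluded_at_screening
        if remaining < 0:
            issues.append(
                f"excluded ({self.excluded_at_screening}) exceeds "
                f"screened ({self.screened})"
            )
        elif self.included > remaining:
            issues.append(
                f"included ({self.included}) exceeds "
                f"screened - excluded ({remaining})"
            )
        return issues


ok = PrismaCounts(1000, 200, 800, 700, 100)
bad = PrismaCounts(1000, 200, 900, 700, 100)
print(ok.problems())   # no inconsistencies reported
print(bad.problems())  # reports that more records were screened than available
```

Returning a list of problems instead of raising on the first one lets a caller report every inconsistency at once before a diagram is drawn.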

PRISMA Flow Diagram can be used standalone or as part of the CoLRev workflow, where it acts as a data package endpoint to generate PRISMA diagrams directly from review data. This ensures consistency between review data and the reported flow diagrams.

Deep-CENIC

Deep-CENIC is a deep learning classifier that measures the ideational impact of Information Systems review articles.

Highlights

Combines citation context, sentiment, position, and semantic similarity as predictive features
Offers a gold-standard coded dataset for evaluating ML and DL models
Reproducible pipeline using Docker and a Cookiecutter-like data structure
Published in Decision Support Systems (Prester et al., 2021), with a scientometric and NLP evaluation
Includes transparent feature engineering for citation-based impact research

ENLIT

ENLIT supports scholars in exploring new literature by making backward searches more efficient and by guiding how to read a literature corpus.

What ENLIT does

Extracts references from a literature corpus (set of PDFs) and compiles a deduplicated reference list
Provides statistics on journals and authors that are frequently cited in the corpus
Implements a novel exploratory reading strategy: first read the most influential papers, then skim the remaining ones
Builds on GROBID for robust extraction of bibliographic information
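
The steps above can be sketched in plain Python, assuming the references have already been extracted from each PDF (for example by GROBID). This is not ENLIT's implementation: the function name, the input layout, the title-plus-year matching key, and the use of in-corpus citation counts as a proxy for influence are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical input: extracted reference lists, keyed by corpus PDF.
corpus = {
    "paper_1.pdf": [
        {"title": "Reference A", "journal": "Journal X", "year": 2001},
        {"title": "Reference B", "journal": "Journal X", "year": 2005},
    ],
    "paper_2.pdf": [
        {"title": "reference a ", "journal": "Journal X", "year": 2001},
        {"title": "Reference C", "journal": "Journal Y", "year": 2010},
    ],
    "paper_3.pdf": [
        {"title": "Reference A", "journal": "Journal X", "year": 2001},
        {"title": "Reference B", "journal": "Journal X", "year": 2005},
    ],
}


def summarize_references(
    corpus: dict[str, list[dict]],
) -> tuple[Counter, Counter]:
    """Count how often each distinct reference and journal is cited."""
    cited = Counter()
    journals = Counter()
    for references in corpus.values():
        seen = set()
        for ref in references:
            # Assumed matching key: normalized title plus year.
            key = (ref["title"].lower().strip(), ref["year"])
            if key in seen:
                continue  # count each reference once per citing paper
            seen.add(key)
            cited[key] += 1
            journals[ref["journal"]] += 1
    return cited, journals


cited, journals = summarize_references(corpus)

# Reading order: most frequently cited references first.
for (title, year), count in cited.most_common():
    print(f"{count}x  {title} ({year})")
print(journals.most_common())
```

The keys of `cited` form the deduplicated reference list, and sorting them by count gives one possible ordering for reading the most-cited papers first and skimming the rest.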

References

Eckhardt, P., Ernst, K. M., Fleischmann, T., Geßler, A., Schnickmann, K., & Wagner, G. (2026). Search-query: A Python package for queries in academic literature searches (Version 0.15.0) [Computer software].

Prester, J., Wagner, G., Schryen, G., & Hassan, N. R. (2021). Classifying the ideational impact of information systems review articles: A content-enriched deep learning approach. Decision Support Systems, 140, 113432. https://doi.org/10.1016/J.DSS.2020.113432

Wagner, G. (2026). BibDedupe: An open-source Python library for bibliographic record deduplication (Version 0.11.0) [Computer software].

Wagner, G., Empl, P., & Schryen, G. (2020). Designing a novel strategy for exploring literature corpora. European Conference on Information Systems, 1–17. https://aisel.aisnet.org/ecis2020_rp/44

Wagner, G., & Prester, J. (2026). CoLRev: An open-source environment for collaborative reviews (Version 0.16.1) [Computer software].
