Skip to content

Latest commit

 

History

History
25 lines (18 loc) · 769 Bytes

README.md

File metadata and controls

25 lines (18 loc) · 769 Bytes

PMC Article Downloader

This Python script retrieves PMC article identifiers (PMCIDs) based on a search term, downloads the corresponding PDF files from the NCBI FTP server, and extracts the downloaded data.

Features

  • Retrieves PMC article identifiers based on a search term using NCBI's E-utilities.
  • Downloads PDF files from the NCBI FTP server.
  • Extracts and cleans up the downloaded files.

Requirements

  • Python 3.12
  • Required Python libraries:
    • requests
    • pandas
    • beautifulsoup4
    • tarfile (included in Python standard library)
    • shutil (included in Python standard library)
    • os (included in Python standard library)

You can install the required libraries using pip:

pip install requests pandas beautifulsoup4