An interactive command-line tool for managing Internet Archive repositories.
Use this script to list files, upload files, download files, delete files, move files, and create new repositories with detailed metadata input.
- Internet Archive Interact
- Usage
- Running the Script
- Full Breakdown and Feature Notes
- Overview
- Detailed Breakdown by Function
  - 1. get_repo_identifier(repo_link)
  - 2. upload_file_with_progress(identifier, file_path, directory)
  - 3. list_repository_files(identifier)
  - 4. delete_file(identifier, file_path)
  - 5. move_file(identifier, file_name, source_dir, target_dir)
  - 6. create_rules_file(folder_path)
  - 7. prompt_metadata()
  - 8. initialize_repository(folder_path, identifier, metadata, mode)
  - 9. print_help()
  - 10. main()
- Creator Notes
- Interactive Menu: Choose options to list files, upload files, download files, delete or move files, or create a new repository.
- Basic GUI Mode: Includes a login page for S3 keys, repository file list selection for download, and local file list selection for upload.
- Single Launcher: One executable supports both modes (CLI by default, GUI via --gui).
- Flexible Repository Input: Accepts a plain identifier (for example my_item) or Archive URLs such as /details/, /download/, and /metadata/.
- Test Mode & Permanent Mode: Run in simulation (Test Mode, where no changes are made) or execute actual changes (Permanent Mode).
- Metadata Support: Input metadata including title, description, creator, date, language, license URL, collection, subject tags, and test item status.
- Collection Options: Supports collections such as community, opensource, texts, movies, audio, image, etree, folksoundomy, games, and software.
- Progress Bars: Uses tqdm to display file upload progress.
- S3 Authentication: Uses S3 access keys (set as environment variables) for secure communication with the Internet Archive.
Before you begin, ensure you have:
- A Linux environment (Ubuntu, Debian, etc.)
- Python 3 installed
- Tkinter support (python3-tk on Linux if your distro does not include it by default)
- S3 Access Keys from Internet Archive
- An active internet connection
- Internet Archive Python Tool
Update Your System:
sudo apt update && sudo apt upgrade -y
Install Python 3 and pip:
sudo apt install -y python3 python3-pip
Install the Internet Archive CLI via pipx (install pipx first if it is not already available):
pipx install internetarchive
Download the script into your home or Internet Archive folder:
- Option 2: Clone from GitHub (if available).
- Option 3: Create the script manually:
  - Open your text editor and create a file named ia-interact.py.
  - Paste the full script code into the file, then save and exit.
(Optional) Make the Script Executable:
chmod +x ia-interact.py
Install Required Libraries:
pip3 install requests tqdm
Verify the Library Installation:
pip3 show requests tqdm
Obtain Your S3 Access Keys:
Retrieve your S3 keys from your Internet Archive account page.
Set Up Environment Variables:
Edit your shell configuration file (e.g., ~/.bashrc or ~/.zshrc) and add:
export S3_ACCESS_KEY="your-access-key"
export S3_SECRET_KEY="your-secret-key"
Replace "your-access-key" and "your-secret-key" with your actual keys.
Reload the Configuration:
source ~/.bashrc
Test the Environment Variables:
echo $S3_ACCESS_KEY
echo $S3_SECRET_KEY
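A Python script can read the same two variables at startup and fail early when one is missing. The following is a minimal sketch, not the script's actual code: the helper names load_s3_keys and auth_headers are hypothetical, and the "LOW access:secret" authorization format is the one documented for the Internet Archive's S3-like API, assumed here to be what the script sends.

```python
import os


def load_s3_keys(env=None):
    """Return (access_key, secret_key), or raise if either variable is unset."""
    env = os.environ if env is None else env
    access = env.get("S3_ACCESS_KEY", "").strip()
    secret = env.get("S3_SECRET_KEY", "").strip()
    missing = [
        name
        for name, value in (("S3_ACCESS_KEY", access), ("S3_SECRET_KEY", secret))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return access, secret


def auth_headers(access, secret):
    """Build the authorization header (assumed IA S3 'LOW' scheme)."""
    return {"authorization": f"LOW {access}:{secret}"}
```

Passing a plain dictionary as env makes the check easy to try without touching your real shell configuration.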
Install build dependencies:
pip3 install -r requirements-build.txt
Build a single-file Linux executable:
pyinstaller --clean --onefile --name ia-interact ia-interact.py
Release icon asset used by packaged binaries:
assets/icons/internet-archive.png
The binary will be output to:
dist/ia-interact
On Windows, the file will be:
dist/ia-interact.exe
Run CLI mode from the packaged binary:
./dist/ia-interact --cli
Run GUI mode from the same packaged binary:
./dist/ia-interact --gui
Build a Linux .AppImage (x86_64):
chmod +x scripts/build-appimage.sh
./scripts/build-appimage.sh
The x86_64 AppImage will be output to:
release/ia-interact-linux-x86_64.AppImage
Build a Linux .AppImage (arm64):
APPIMAGE_ARCH=aarch64 ./scripts/build-appimage.sh
The arm64 AppImage will be output to:
release/ia-interact-linux-aarch64.AppImage
This repository includes:
.github/workflows/build-binaries.yml
The workflow builds portable release artifacts for:
- Linux x86_64: ia-interact-linux-x86_64.AppImage
- Linux arm64: ia-interact-linux-aarch64.AppImage
- Windows x86_64: ia-interact-windows-x86_64.exe
- Windows arm64: ia-interact-windows-arm64.exe
- macOS universal (arm64 + x86_64): ia-interact-macos-universal.app.zip
These release targets use Internet Archive icon assets from assets/icons/.
Each run uploads these as workflow artifacts.
When you push a tag matching v* (for example v1.0.0), the workflow also publishes these files to the GitHub Release for that tag.
Issue reference: https://github.com/harrypm/IA-Interact/issues/1
The issue confirms the script can be run directly on Windows with Python.
Recommended flow:
- Install Python on Windows and enable Add Python to PATH during install.
- Open cmd in the folder containing ia-interact.py (File Explorer address bar -> type cmd).
- Install dependencies: python -m pip install requests tqdm
- Run the tool: python -m ia-interact
  Alternative: python ia-interact.py
- Configure S3 keys before use (recommended: environment variables rather than hardcoding keys in the script).
  - The issue notes also mention replacing os.getenv("S3_ACCESS_KEY") and os.getenv("S3_SECRET_KEY") inline in the script.
  - If you do that for troubleshooting, keep it local-only and do not commit secrets.
Large upload note from the issue:
- In the upload flow, choosing new folder is more reliable for large uploads.
- Using existing folder with ./ may return a 400 error.
Run the GUI:
python3 ia-interact-gui.py
Or use the unified launcher:
python3 ia-interact.py --gui
GUI flow:
- Enter S3 access and secret keys on the login page.
- Enter your repository URL or identifier (for example archive.org/details/<id>, archive.org/download/<id>/..., archive.org/metadata/<id>, or just <id>).
- Select repository files (left list) to download.
- Add/select local files (right list) to upload.
- Set target upload directory and run upload/download actions.
- Upload can start directly from a valid repository field value.
- Download requires a loaded repository file list.
To execute the script, run:
python3 ia-interact.py
To launch the GUI version, run either:
python3 ia-interact.py --gui
python3 ia-interact-gui.py
To force CLI mode explicitly, run:
python3 ia-interact.py --cli
To interact with a repo, you can use either the item identifier directly or a full Archive URL:
xxxxxxxxxx
https://archive.org/details/xxxxxxxxxx
https://archive.org/download/xxxxxxxxxx/some/file.ext
https://archive.org/metadata/xxxxxxxxxx
From AppImage output, run CLI mode:
./release/ia-interact-linux-x86_64.AppImage --cli
From the same AppImage, run GUI mode:
./release/ia-interact-linux-x86_64.AppImage --gui
When opened/clicked from a desktop environment, the packaged app defaults to GUI mode.
When the script runs, it displays an interactive menu with the following options:
- List Files: Display the contents of an existing repository.
- Upload Files: Add files to a repository.
- Download Files: Download one file or all files from a repository to a local folder.
- Delete/Move Files: Manage files within a repository.
- Create a New Repository: Upload an entire folder and configure repository metadata.
During repository creation, you will be prompted to:
- Input metadata (title, description, creator, date, language, license URL).
- Select a collection from the provided list.
- Enter subject tags (e.g., music, history).
- Specify if the repository is a test item (Note: Test items are automatically deleted after 30 days).
- Choose between Test Mode (simulate actions without an actual upload) and Permanent Mode (execute actual uploads).
- Missing Libraries: If you encounter errors about missing libraries, run: pip3 install requests tqdm
- Environment Variables Not Set: Ensure your environment variables are defined in your shell configuration file and reload it: source ~/.bashrc
- API or Network Issues: Verify that your S3 keys are correct and that your internet connection is stable.
- Logging: To help with debugging, you can redirect output to a log file: python3 ia-interact.py > script_output.log 2>&1
"IA Interact" is an interactive command-line tool designed to manage repositories on the Internet Archive. It supports operations such as:
- Uploading Files: Upload individual files or entire folders to an Internet Archive repository.
- Listing Repository Contents: Retrieve and display the contents of a repository using the metadata API.
- Deleting Files: Remove specified files from a repository.
- Moving Files: Change a file’s location within a repository by copying it and then deleting the original.
- Downloading Files: Download a single file or all files from a repository to a local path.
- Creating a New Repository: Upload a folder as a new repository and submit metadata.
- User Interaction: Offers an interactive menu with a help option, test mode (simulation) vs. permanent mode, and filtering to avoid showing files from ".thumbs" directories.
This script uses the Internet Archive’s S3-compatible interface and Metadata API. Its upload logic reads files in chunks, shows progress bars, and retries failed requests.
1. get_repo_identifier(repo_link)
- Purpose: Extracts the repository identifier from either a full Internet Archive URL or a plain identifier.
- How It Works: Normalizes and parses Archive URLs (/details/, /download/, /metadata/) and returns the item identifier. If a plain identifier is provided, it is accepted directly. If input is invalid, it alerts the user and returns None.
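The parsing described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the script's own code: the name extract_identifier is hypothetical, the character set accepted for identifiers is a guess, and the sketch returns None silently where the real function also alerts the user.

```python
import re
from urllib.parse import urlparse

# Assumed identifier shape: letters, digits, dot, underscore, hyphen.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")
_HOSTS = ("archive.org", "www.archive.org")


def extract_identifier(repo_link):
    """Return the item identifier from an Archive URL or a plain identifier."""
    text = (repo_link or "").strip()
    if not text:
        return None
    # Accept links pasted without a scheme, e.g. "archive.org/details/<id>".
    if "://" not in text and text.lower().startswith(tuple(h + "/" for h in _HOSTS)):
        text = "https://" + text
    if "://" in text:
        parsed = urlparse(text)
        if parsed.netloc.lower() not in _HOSTS:
            return None
        parts = [part for part in parsed.path.split("/") if part]
        # The identifier is the path segment right after the route name.
        if len(parts) >= 2 and parts[0] in ("details", "download", "metadata"):
            text = parts[1]
        else:
            return None
    return text if _IDENTIFIER_RE.match(text) else None
```

For example, extract_identifier("https://archive.org/download/my_item/some/file.ext") and extract_identifier("my_item") both yield the same identifier.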
2. upload_file_with_progress(identifier, file_path, directory)
- Purpose: Uploads a file to a specified directory within a repository.
- Key Features:
  - Chunking: Reads the file in 2 MB chunks.
  - Progress Tracking: Uses the tqdm library to display a real-time progress bar.
  - Retry Logic: Implements a retry strategy (5 retries) using an HTTPAdapter.
  - S3 Authentication: Reads S3 keys from environment variables and includes them in the request headers.
- Note: The function sends HTTP PUT requests to the URL https://s3.us.archive.org/{identifier}/{directory}/{filename} to perform the upload.
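To make the chunking and URL layout concrete, here is a small standard-library sketch. It is not the script's implementation: build_upload_url and read_in_chunks are hypothetical names, percent-encoding the path is an assumption, and the HTTP PUT itself and the retry adapter are deliberately left out.

```python
from urllib.parse import quote

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB, matching the chunk size described above


def build_upload_url(identifier, directory, filename):
    """Build the PUT URL for a file inside a repository directory."""
    parts = [part for part in directory.strip("/").split("/") if part]
    key = "/".join(parts + [filename])
    # Assumption: the path is percent-encoded; "/" is kept as a separator.
    return f"https://s3.us.archive.org/{identifier}/{quote(key)}"


def read_in_chunks(path, chunk_size=CHUNK_SIZE, on_progress=None):
    """Yield the file's bytes chunk by chunk, reporting each chunk's size."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            if on_progress is not None:
                on_progress(len(chunk))
            yield chunk
```

The on_progress callback is where a progress bar would plug in; for instance, a tqdm bar's update method accepts the number of bytes just read. The generator can then be handed to an HTTP client as the request body.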
3. list_repository_files(identifier)
- Purpose: Retrieves and lists the files in a repository.
- Key Features:
  - Metadata API: Sends a GET request to https://archive.org/metadata/{identifier} to fetch repository metadata (in JSON).
  - Filtering: Excludes any files that reside in directories with names ending in ".thumbs" (i.e. if any component of the path ends with ".thumbs").
  - Display: Prints out a numbered list of the filtered file names.
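The filtering and display steps can be illustrated without any network access. The sketch below assumes the Metadata API response has already been decoded into a dictionary with a "files" list whose entries carry a "name" field; the function names are hypothetical.

```python
def visible_file_names(metadata):
    """Return file names from a metadata response, skipping '.thumbs' paths."""
    names = []
    for entry in metadata.get("files", []):
        name = entry.get("name", "")
        if not name:
            continue
        # Skip the file if any component of its path ends with ".thumbs".
        if any(part.endswith(".thumbs") for part in name.split("/")):
            continue
        names.append(name)
    return names


def print_numbered(names):
    """Print a numbered list, starting at 1."""
    for index, name in enumerate(names, start=1):
        print(f"{index}. {name}")
```

A quick check with a hand-written response such as {"files": [{"name": "a.mp4"}, {"name": "a.thumbs/a_000001.jpg"}]} should list only a.mp4.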
4. delete_file(identifier, file_path)
- Purpose: Deletes a file from the repository.
- Key Features:
  - HTTP DELETE: Sends a DELETE request to the S3 endpoint https://s3.us.archive.org/{identifier}/{file_path}.
  - S3 Authentication: Utilizes S3 credentials stored in environment variables.
  - Feedback: Notifies the user whether the file deletion succeeded (checks for HTTP 200 or 204).
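As a rough illustration of the request shape, the sketch below builds (but does not send) a DELETE request using only the standard library. The function names are hypothetical, the real script may use a different HTTP client, and the "LOW access:secret" header format is assumed from the Internet Archive's S3-like API documentation.

```python
from urllib.parse import quote
from urllib.request import Request


def build_delete_request(identifier, file_path, access, secret):
    """Prepare a DELETE request for one file; nothing is sent here."""
    key = quote(file_path.lstrip("/"))  # assumption: path is percent-encoded
    url = f"https://s3.us.archive.org/{identifier}/{key}"
    headers = {"authorization": f"LOW {access}:{secret}"}
    return Request(url, method="DELETE", headers=headers)


def delete_succeeded(status_code):
    """Treat HTTP 200 and 204 as success, as described above."""
    return status_code in (200, 204)
```

Sending the prepared request (for example with urllib.request.urlopen) permanently removes the file, so try it only against a test item.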
5. move_file(identifier, file_name, source_dir, target_dir)
- Purpose: Moves a file from one location in the repository to another.
- Key Features:
  - Copy-Delete Approach:
    - Copy: Uses an HTTP PUT request with the x-amz-copy-source header to copy the file to the target directory.
    - Delete: If the copy is successful, deletes the original file.
  - S3 Authentication: Requires S3 keys from environment variables.
  - Error Handling: Provides error messages if the copy or delete fails.
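The copy-then-delete control flow can be sketched with the HTTP calls injected as plain functions, which keeps the example runnable offline. Everything here is illustrative: move_file_sketch, put, and delete are hypothetical names, the "/<identifier>/<key>" header value follows the Amazon S3 convention and is assumed to apply here, and treating any 2xx copy status as success is also an assumption.

```python
S3_BASE = "https://s3.us.archive.org"


def _join_key(directory, file_name):
    """Join a directory and file name into a repository-relative key."""
    directory = directory.strip("/")
    return f"{directory}/{file_name}" if directory else file_name


def move_file_sketch(identifier, file_name, source_dir, target_dir, put, delete):
    """Copy the file to target_dir, then delete the original if the copy worked.

    put(url, headers) and delete(url) must each return an HTTP status code.
    """
    source_key = _join_key(source_dir, file_name)
    target_key = _join_key(target_dir, file_name)
    # Assumed header value format: "/<identifier>/<key>", as in Amazon S3.
    copy_headers = {"x-amz-copy-source": f"/{identifier}/{source_key}"}
    copy_status = put(f"{S3_BASE}/{identifier}/{target_key}", copy_headers)
    if not 200 <= copy_status < 300:
        print(f"Copy failed with HTTP {copy_status}; original left in place.")
        return False
    delete_status = delete(f"{S3_BASE}/{identifier}/{source_key}")
    if delete_status not in (200, 204):
        print(f"Copied, but deleting the original failed with HTTP {delete_status}.")
        return False
    return True
```

Deleting only after a successful copy means a failed move leaves the original file untouched, at the cost of a possible duplicate if the delete step fails.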
6. create_rules_file(folder_path)
- Purpose: Ensures that a local folder contains a _rules.conf file.
- Key Features:
  - Default Rules: If _rules.conf does not exist, it is created with the default content CAT.ALL.
  - Usage: This file can help control file visibility during repository uploads.
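A create-if-missing helper of this kind might look like the sketch below. The name ensure_rules_file and the boolean return value are illustrative choices, not taken from the script.

```python
from pathlib import Path


def ensure_rules_file(folder_path, default_content="CAT.ALL"):
    """Create _rules.conf with the default content if it does not exist.

    Returns True when the file was created, False when it was already there.
    """
    rules_path = Path(folder_path) / "_rules.conf"
    if rules_path.exists():
        return False  # never overwrite rules the user already wrote
    rules_path.write_text(default_content, encoding="utf-8")
    return True
```

Calling it twice on the same folder is safe: the second call leaves the existing file untouched.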
7. prompt_metadata()
- Purpose: Collects from the user the metadata required to create a new repository.
- Collected Metadata Includes:
  - Basic Information: Title, description, creator, date, language, license URL.
  - Collection: The user selects one from a predefined list (e.g., community, opensource, texts, movies, audio, image, etree, folksoundomy, games, software).
  - Subject Tags: A comma-separated list, such as "music, history".
  - Test Item Flag: A flag to indicate if the repository is a test item (if "yes" is entered, it sends "true"; if "no", the field is omitted).
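Once the answers are collected, they need to be turned into a metadata dictionary. The sketch below shows one way to do that, with several assumptions worth flagging: the dictionary keys (including "licenseurl" and especially "test_item") are placeholders rather than the script's confirmed field names, empty text answers are simply dropped, and build_metadata is a hypothetical helper that takes the answers as a dictionary instead of prompting.

```python
COLLECTIONS = (
    "community", "opensource", "texts", "movies", "audio",
    "image", "etree", "folksoundomy", "games", "software",
)
TEXT_FIELDS = ("title", "description", "creator", "date", "language", "licenseurl")


def build_metadata(answers):
    """Turn raw prompt answers (a dict of strings) into a metadata dict."""
    metadata = {}
    for field in TEXT_FIELDS:
        value = answers.get(field, "").strip()
        if value:  # assumption: blank answers are omitted
            metadata[field] = value
    collection = answers.get("collection", "").strip()
    if collection not in COLLECTIONS:
        raise ValueError(f"Unknown collection: {collection!r}")
    metadata["collection"] = collection
    # "music, history" -> ["music", "history"]
    tags = [tag.strip() for tag in answers.get("subject", "").split(",") if tag.strip()]
    if tags:
        metadata["subject"] = tags
    # "yes" sends "true"; any other answer omits the field entirely.
    if answers.get("test_item", "").strip().lower() == "yes":
        metadata["test_item"] = "true"  # hypothetical key name
    return metadata
```

Keeping the conversion separate from the prompting makes it straightforward to check the result before anything is submitted.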
8. initialize_repository(folder_path, identifier, metadata, mode)
- Purpose: Uploads all files from a specified folder as a new repository.
- Key Features:
  - Mode Selection:
    - Test Mode: Simulates the upload process without transferring any files.
    - Permanent Mode: Uploads each file via HTTP PUT requests.
  - Recursive Upload: Iterates through all files in the folder (using os.walk).
  - Metadata Submission: After file uploads, sends repository metadata via a POST request.
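The recursive walk and the Test Mode / Permanent Mode split can be sketched as follows. This is an outline under assumptions, not the script's code: the function names, the "test" mode string, and the injected upload callable are all hypothetical, and the metadata POST step is omitted.

```python
import os


def plan_uploads(folder_path):
    """Yield (local_path, remote_key) for every file under folder_path."""
    for root, dirs, files in os.walk(folder_path):
        dirs.sort()  # visit subfolders in a stable order
        for name in sorted(files):
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, folder_path)
            # Remote keys always use forward slashes, even on Windows.
            yield local_path, relative.replace(os.sep, "/")


def initialize_repository_sketch(folder_path, identifier, mode, upload):
    """Walk the folder; simulate in test mode, call upload(...) otherwise."""
    handled = []
    for local_path, remote_key in plan_uploads(folder_path):
        url = f"https://s3.us.archive.org/{identifier}/{remote_key}"
        if mode == "test":
            print(f"[TEST] would upload {local_path} -> {url}")
        else:
            upload(local_path, url)
        handled.append(remote_key)
    return handled
```

Running it in test mode first shows exactly which remote paths a permanent run would create.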
9. print_help()
- Purpose: Displays a help message describing each menu option and its usage.
- Features: Provides detailed instructions for each operation and references the official Internet Archive CLI documentation for further details.
10. main()
- Purpose: Serves as the entry point of the script with an interactive menu.
- Key Features:
  - Main Menu: Displays options for uploading, listing, deleting, moving, and downloading files, creating a repository, or viewing help.
  - Conditional Prompts:
    - For Existing Repositories (options 1–5): Prompts for the repository URL after the option selection.
    - For Folder-based Repository Creation (option 6): Gathers folder path, mode, and metadata.
  - Action Dispatch: Calls the corresponding function based on the user’s selection.
This script was built using Microsoft Copilot and 5 hours of Harry Munday's lifespan. Enjoy.


