Kickstarter: Automating Backup, Replication & Restore in Greenplum
This toolset is a core component of Mugnano Data Consulting’s DBA Enablement service offering and ships in our DBA Operations Kickstarter suite, providing turnkey practices and tooling for Greenplum operations.
Disaster Recovery isn’t just “take a backup.” At MPP scale, you need a repeatable end-to-end flow that tags each backup, replicates it to DR, and restores it with minimal coordination and downtime. This post walks through the DBA Operations Kickstarter framework that wraps gpbackup, gpbackup_manager, and gprestore into a fully automated pipeline.
Why this matters: We tag each backup with a timestamp, then use that tag to drive replication and restore so the entire DR flow is deterministic and scriptable.
High-Level Approach
- Backup: Separate METADATA (catalog) from DATA to reduce lock time, then parallelize data sets.
- Replicate: Ship artifacts to DR (e.g., Data Domain) using configurable interfaces.
- Restore: Read the latest “marker file” for the backup timestamp and rehydrate the target DB.
Why Wrapper Scripts Are Necessary
Vendor tools are excellent, but at scale you hit operational realities the wrappers solve:
- Parallelism vs. Single-File Backups: Backup appliances (e.g., Data Domain) perform best with fewer, larger files, so the single-data-file option is preferred. That disables the built-in parallel jobs flag—our wrappers reintroduce safe parallelism by splitting work into logical backup tags (e.g., by schema) and coordinating them via auto-generated Makefiles.
- Timestamp Collisions: gpbackup uses timestamps as the backup key, so parallel runs can collide if they start in the same second. The wrappers serialize tag start times (a predecessor check plus a “running” marker) to guarantee unique timestamps and to propagate the correct TS downstream to replicate/restore; see the sketch after this list.
- Thread Limits & Race Conditions: Data Domain can have different limits for write/read/replication threads. Wrappers decouple backup, replicate, and restore into independent atomic units so each stage can saturate its own thread pool, and they add backoff/retry to avoid rare collisions.
- Deterministic DR: Marker files (<ts>|<location>|<DD|DIR>) make the entire chain deterministic. Replicate/restore run in wait mode and wake exactly when the right marker appears.
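Here is a minimal sketch of how a predecessor check plus a “running” marker can serialize tag start times. The marker directory, file names, and arguments are illustrative assumptions, not the Kickstarter scripts’ actual internals; a production version would also need an atomic lock.
# Hypothetical serialization sketch: ensure no two gpbackup tags start in the same second.
MARKER_DIR="$HOME/gptools/backup_restore/markers"   # assumed marker location
TAG="$1"                                            # logical backup tag, e.g. a schema name
DBNAME="$2"                                         # target database
# Predecessor check: wait until no other tag is inside its startup window.
while ls "$MARKER_DIR"/*.running >/dev/null 2>&1; do
  sleep 1
done
touch "$MARKER_DIR/$TAG.running"                    # claim the startup window
gpbackup --dbname "$DBNAME" --include-schema "$TAG" --single-data-file &   # start this tag's backup
sleep 1                                             # the next tag now gets a later, unique timestamp
rm -f "$MARKER_DIR/$TAG.running"                    # release the window
wait                                                # wait for this tag's gpbackup to finish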
Key Components
1) Environment Configuration
Global and per-database shell vars live under ~/gptools/backup_restore/conf:
- env_vars.sh – cluster-wide defaults (paths, marker dir, plugin slots).
- <dbname>_vars.sh – DB-specific overrides (interfaces, target DB names, etc.).
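As a rough illustration, the split might look like this; every variable name below is a hypothetical example, not the actual contents of the Kickstarter configuration files.
# env_vars.sh -- hypothetical cluster-wide defaults (names are illustrative assumptions)
export BACKUP_BASE_DIR=/data/backups                          # local staging area for backup artifacts
export MARKER_DIR=$HOME/gptools/backup_restore/markers        # where marker files are written
export PLUGIN_CONF_DIR=$HOME/gptools/backup_restore/conf/yml  # generated Data Domain YAMLs

# <dbname>_vars.sh -- hypothetical per-database overrides, sourced after env_vars.sh
export TARGET_DB=analytics                     # database this config applies to
export RESTORE_DB=analytics_dr                 # name to restore into on the DR cluster
export DD_INTERFACE=dd-backup01.example.com    # Data Domain interface used for this DB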
2) Data Domain Integration
If you use Dell EMC Data Domain, YAML plugin files define local and remote targets, streams, and optional “restore-from” settings. Passwords are encrypted with gpbackup_manager encrypt-password (stored via pgcrypto), and a helper (gen_backup_ymls.sh) generates consistent YAMLs per environment.
Password tip: encrypt once, then copy the hidden .encrypt key from $MASTER_DATA_DIRECTORY to all hosts that must decrypt it.
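A minimal sketch of that tip, assuming the --plugin-config flag documented for gpbackup_manager; the YAML path and the DR host name are hypothetical placeholders.
# Encrypt the Data Domain password referenced by the plugin YAML (the command prompts for it).
gpbackup_manager encrypt-password --plugin-config ~/gptools/backup_restore/conf/yml/ddboost_local.yml

# Copy the hidden key to every host that must decrypt it, e.g. the DR master.
# Host name and destination path below are illustrative assumptions.
scp "$MASTER_DATA_DIRECTORY/.encrypt" dr-master.example.com:"$MASTER_DATA_DIRECTORY/"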
3) Backup Set Configuration (SQL)
Declare how to split and parallelize backups in a single table, dbaconfig.backup_config.
The framework auto-generates Makefiles on each run—no manual edits required.
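The column layout below is a hypothetical sketch of what such a configuration table could hold; the real dbaconfig.backup_config schema may differ.
# Hypothetical sketch only; the actual dbaconfig.backup_config columns may differ.
psql -d <dbname> <<'SQL'
CREATE TABLE dbaconfig.backup_config (
    dbname        text,   -- database the rule applies to
    backup_type   text,   -- daily | weekly | monthly | yearly
    backup_tag    text,   -- logical unit of work, e.g. a schema name
    include_list  text,   -- schemas/tables that belong to this tag
    parallel_slot int     -- parallel slot the tag runs in
);
INSERT INTO dbaconfig.backup_config
VALUES ('analytics', 'daily', 'sales', 'sales', 1);
SQL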
4) Marker Files (Orchestration)
Each backup writes a “marker” with the backup timestamp and location: <timestamp>|<plugin-or-dir>|<DD|DIR>. Replication and restore read these markers to know exactly what to act on, enabling hands-off DR chaining.
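For illustration, a marker might look like the commented line below, and a wait-mode consumer could poll for it roughly like this; the file name, directory, and polling logic are assumptions, not the actual scripts.
# Hypothetical marker content: gpbackup timestamp | plugin config or directory | storage type
# 20240115093045|/home/gpadmin/gptools/backup_restore/conf/yml/ddboost_local.yml|DD

# Minimal wait-mode loop: sleep until the marker appears, then act on its timestamp.
MARKER_DIR=$HOME/gptools/backup_restore/markers     # assumed location
while [ ! -s "$MARKER_DIR/<dbname>_daily.marker" ]; do
  sleep 60
done
TS=$(cut -d'|' -f1 "$MARKER_DIR/<dbname>_daily.marker")
echo "Acting on backup timestamp $TS"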
Running the Pipeline
Backup
# Backup specific type/db; runs METADATA, then parallel DATA sets
./backup.sh -t <all|daily|weekly|monthly|yearly> -d <dbname>
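Under the hood, a wrapper like this plausibly issues gpbackup in two phases; the exact invocation is internal to the scripts, but with documented gpbackup flags the split looks roughly like:
# Phase 1: catalog only, to keep lock time short.
gpbackup --dbname <dbname> --metadata-only --plugin-config <ddboost_yaml>

# Phase 2: one data-only run per backup tag, single-data-file for appliance efficiency.
gpbackup --dbname <dbname> --data-only --include-schema <schema_tag> \
         --single-data-file --plugin-config <ddboost_yaml>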
Replication
# Wait mode: sits idle until a new marker arrives, then replicates that TS
./replicate.sh -t <type> -d <dbname> --wait
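For Data Domain targets, copying a completed backup to the remote appliance is what gpbackup_manager’s replicate-backup command does, so the wrapper presumably drives something like the following; the timestamp and YAML path are placeholders filled in from the marker file.
# Replicate the backup set identified by the timestamp read from the marker file.
gpbackup_manager replicate-backup <timestamp> --plugin-config <ddboost_yaml>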
Restore
# Wait mode: restores when backup marker arrives; runs METADATA, then DATA
./restore.sh -t <type> -d <dbname> --wait
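The restore side mirrors the backup split; with standard gprestore flags the two phases look roughly like this, with the timestamp again taken from the marker file.
# Phase 1: rebuild the catalog first.
gprestore --timestamp <timestamp> --metadata-only --create-db --plugin-config <ddboost_yaml>

# Phase 2: load data per tag, in parallel.
gprestore --timestamp <timestamp> --data-only --include-schema <schema_tag> --plugin-config <ddboost_yaml>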
Process Visuals
[Flow diagrams of the backup, replicate, and restore stages]
Monitoring & Control
Use dr_process_manager.sh to view running PIDs, kill/restart by tag, and launch ad-hoc schema/table restores.
Scheduling & Validation
- Run validate_backup_config.sh before scheduled windows.
- Email results via rpt_validate_backup_config.sh.
- Include the marker archive directory in your retention policy.
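As one possible scheduling layout (crontab times, script locations, and log paths here are assumptions; the invocations match the ones shown above):
# Hypothetical gpadmin crontab: validate, then kick off the nightly backup.
30 22 * * * ~/gptools/backup_restore/validate_backup_config.sh >> ~/gptools/logs/validate.log 2>&1
0 23 * * * ~/gptools/backup_restore/backup.sh -t daily -d <dbname> >> ~/gptools/logs/backup.log 2>&1
# Long-running wait-mode consumers are started once (e.g. at boot), not from cron:
# ./replicate.sh -t daily -d <dbname> --wait
# ./restore.sh -t daily -d <dbname> --wait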
Conclusion
Automating backup, replication, and restore turns DR from a manual “hope it works” exercise into a repeatable, observable pipeline. The wrapper scripts give you determinism (timestamped markers), performance (parallel tags with single-file efficiency), and control (Makefile orchestration and process management) on top of Greenplum’s native tools.
- Faster & safer: Metadata/Data split minimizes lock time while keeping restores predictable.
- Production-ready: Central SQL config (backup_config) + marker files + validators.
- Operable: dr_process_manager.sh to monitor, kill, or restart by tag when you need to intervene.