When “Nothing Changed” Breaks Your Greenplum Performance

If you run Greenplum and have ever said, “Nothing changed, but now Informatica jobs are slow,” this post is for you.

Recently we worked with a customer running Greenplum on bare metal who experienced significant performance degradation in ETL workloads, especially Source Qualifier-heavy Informatica jobs. The slowdown began shortly after what was described as a minor infrastructure change.

  • Nothing was down
  • No application errors
  • CPU, disk, and memory looked normal

But throughput dropped noticeably.


The symptoms pointed in the wrong direction

At first glance, the environment looked healthy:

  • Client connectivity was stable
  • Simple queries ran quickly
  • No storage bottlenecks were obvious
  • No crashes or fatal errors

However, under sustained load, data movement queries slowed significantly. Subtle warnings began appearing in the database logs. They were not fatal errors, so they were easy to overlook.

The system was not failing outright.

It was retrying internally.

And those retries add up.
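Here is a rough back-of-the-envelope sketch of why they add up. The numbers are invented for illustration; the point is that even a 1% loss rate can dominate run time once every lost packet has to wait out a retransmit timeout that is orders of magnitude longer than a normal round trip:

```python
# Illustrative sketch: how a small loss rate compounds at ETL scale.
# All numbers are assumptions, not measurements from any real cluster.

def effective_slowdown(loss_rate: float, rtt_ms: float, retransmit_timeout_ms: float) -> float:
    """Approximate per-packet time inflation when a fraction of packets
    must be retransmitted after a timeout (simple single-retry model)."""
    normal = rtt_ms
    retried = rtt_ms + retransmit_timeout_ms
    expected = (1 - loss_rate) * normal + loss_rate * retried
    return expected / normal

# A 1% loss rate with a 250 ms retransmit timeout on a 0.5 ms RTT:
print(round(effective_slowdown(0.01, 0.5, 250.0), 1))  # prints 6.0, i.e. ~6x slower
```

Nothing in that scenario looks "down." Every packet eventually arrives. The job just takes several times longer.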


Why ETL workloads feel this first

Large ETL jobs are often the first to expose backend instability in distributed databases.

They:

  • Pull large result sets
  • Trigger redistribution across segments
  • Stress internal communication paths much more than simple queries

When the backend network is even slightly unstable, ETL is usually the first workload to show it.
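To put a rough number on the redistribution point above: when a join or aggregation forces rows to be rehashed across N segments, an evenly distributed table sends roughly (N-1)/N of its rows over the interconnect. A minimal Python sketch, with illustrative segment counts:

```python
# Sketch: why redistribution stresses the interconnect.
# Under an even hash distribution, a row's new segment matches its
# current one only 1/N of the time, so ~(N-1)/N of rows cross the network.

def fraction_moved(num_segments: int) -> float:
    """Expected fraction of rows leaving their current segment."""
    return (num_segments - 1) / num_segments

for n in (4, 16, 64):
    print(n, round(fraction_moved(n), 3))  # 0.75, 0.938, 0.984
```

The larger the cluster, the closer a redistribution comes to moving the entire table across the backend network, which is why simple point queries can look fine while ETL suffers.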


The layer most teams do not monitor

Most monitoring tools focus on:

  • Client to database latency
  • CPU and memory
  • Disk I/O

What they often do not clearly expose is:

  • Segment to segment communication
  • Packet handling inside the interconnect network
  • Internal retries that stretch execution time

That is where this issue lived.
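One place this layer does leave fingerprints is the kernel's UDP counters. As a sketch (the sample text below is made up; on a real Linux segment host you would read /proc/net/snmp itself), rising InErrors or RcvbufErrors alongside slow ETL is a strong hint that the interconnect is under pressure:

```python
# Sketch: surface backend UDP trouble that host-level dashboards often hide.
# Parses the Udp lines from Linux's /proc/net/snmp. SAMPLE is illustrative;
# on a segment host, pass open("/proc/net/snmp").read() instead.

SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 8172634 12 5301 7921554 5117 0
"""

def udp_counters(snmp_text: str) -> dict:
    """Return the Udp counter row as a {name: value} dict."""
    lines = [l for l in snmp_text.splitlines() if l.startswith("Udp:")]
    header, values = lines[0].split()[1:], lines[1].split()[1:]
    return dict(zip(header, (int(v) for v in values)))

stats = udp_counters(SAMPLE)
# Receive-buffer overruns mean the host dropped datagrams it received intact.
print(stats["RcvbufErrors"])  # prints 5117
```

Sampling these counters before and after a heavy ETL window, on every segment host, tells you whether drops grow with load even when CPU, disk, and memory all look fine.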


“Disabling the monitoring agent fixed it”

One interesting detail was that disabling an observability agent appeared to improve performance.

It did not fix the root cause.

What it did was reduce system pressure just enough to hide the underlying weakness.

That distinction is important. If the architecture remains sensitive, the problem can return during:

  • Peak ETL windows
  • Higher data volumes
  • Future infrastructure changes

The real takeaway

Distributed databases like Greenplum are highly sensitive to backend network behavior, especially when:

  • Hosts have multiple network interfaces
  • Internal traffic uses UDP
  • Firewalls or security tooling are introduced
  • Routing or MTU settings change

These platforms do not always fail loudly.

Sometimes they just get slower.
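MTU drift is a good example of a change that fails quietly. As a sketch, assuming you have already collected the interconnect interface MTU from each segment host (the hostnames and values below are invented), flagging the odd host out takes only a few lines:

```python
# Sketch: flag MTU mismatches across interconnect hosts. The inventory is
# made up; in practice you would gather /sys/class/net/<iface>/mtu (or
# `ip link` output) from every segment host.
from collections import Counter

def mtu_mismatches(inventory: dict) -> dict:
    """Return hosts whose MTU differs from the cluster's most common value."""
    expected, _ = Counter(inventory.values()).most_common(1)[0]
    return {host: mtu for host, mtu in inventory.items() if mtu != expected}

inventory = {
    "sdw1": 9000,
    "sdw2": 9000,
    "sdw3": 1500,  # one host quietly reset after the "minor" change
    "sdw4": 9000,
}
print(mtu_mismatches(inventory))  # prints {'sdw3': 1500}
```

A single host like this will not take the cluster down. It will fragment or drop large interconnect packets, and the cluster will simply get slower.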


Does this sound familiar?

If you are seeing any of the following, it may be worth a deeper review:

  • ETL jobs slowing after infrastructure changes
  • Performance issues that disappear when load is reduced
  • Interconnect warnings in logs that are not fatal
  • Monitoring tools that report healthy infrastructure while applications disagree
  • Problems that appear only under concurrency

Need a second set of eyes?

If you are running Greenplum or a Greenplum-compatible MPP platform and performance does not align with system metrics, we help customers diagnose and correct exactly these types of issues.

Contact Mugnano Data Consulting to discuss your environment.

