27.09.2025

Simplify Data Movement with ingestr: Zero-Code Replication for Modern Teams


Data teams often wait on custom scripts or brittle ELT jobs just to land a new table in analytics. ingestr answers that with a self-contained CLI: point it at a source, choose a destination, and let it manage batching, retries, and incremental syncs. You can explore the code and roadmap on GitHub.

What is ingestr?

ingestr is an open-source command-line tool from Bruin Data that wraps the dlt and SQLAlchemy ecosystems into a single, opinionated workflow. The project is developed in the open on GitHub and focuses on:

  • Connector breadth: databases like Postgres, BigQuery, ClickHouse, DuckDB, and Elasticsearch; SaaS APIs such as Salesforce, Shopify, Notion, and GitHub; flat files from S3 or local CSV/Parquet.
  • Incremental strategies: choose append, merge, or delete+insert semantics per table without writing pipeline code.
  • Backend-free operation: run it locally, in CI, or from a cron container—no control plane to babysit.
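Because every connector is addressed by a URI, swapping sources or destinations is just a string change. The sketch below uses placeholder hosts, paths, and table names; the scheme names follow ingestr's connector docs, but verify them against your installed version before relying on them:

```shell
# Same command shape, different connectors (all URIs are placeholders):
#   Postgres  : postgresql://user:pass@host:5432/dbname
#   DuckDB    : duckdb:///./local.db
#   BigQuery  : bigquery://project-id?credentials_path=/path/sa.json
ingestr ingest \
  --source-uri 'duckdb:///./local.db' \
  --source-table 'main.events' \
  --dest-uri 'postgresql://user:pass@host:5432/dbname' \
  --dest-table 'public.events'
```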

Installation

The maintainers recommend uv for the fastest setup:

pip install uv
uvx ingestr

If you prefer a global install:

uv pip install --system ingestr

While plain pip install ingestr also works, uv resolves and installs the dependency tree dramatically faster.

Quickstart: Replicate Postgres to BigQuery

ingestr ingest \
  --source-uri 'postgresql://admin:admin@localhost:8837/web?sslmode=disable' \
  --source-table 'public.some_data' \
  --dest-uri 'bigquery://demo-project?credentials_path=/tmp/sa.json' \
  --dest-table 'landing.some_data'

This single command fetches public.some_data from Postgres, stages it, and loads it into landing.some_data in BigQuery. ingestr infers column types, batches the transfer, and emits a run log you can forward to observability tooling.

Operating Modes You Should Know

  • Incremental flags: Use --incremental-strategy append, merge, or delete+insert to control deduplication and idempotency, and --incremental-key to mark the cursor column.
  • Schema evolution: ingestr inspects column metadata on each run and propagates new fields downstream, reducing manual DDL.
  • Credential flexibility: connection strings handle secrets; for cloud warehouses supply key files via query params (credentials_path, aws_profile, etc.).
  • Dry runs and validation: pair --record-limit with a sandbox destination to sample data before the first full sync.
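Putting the incremental options together, a merge-style sync might look like the sketch below. The URIs, table names, and key columns are placeholders, and the flag names follow ingestr's documented options; double-check them with ingestr ingest --help on your version:

```shell
# Hypothetical merge sync: upsert rows by primary key, using updated_at as the cursor.
ingestr ingest \
  --source-uri 'postgresql://admin:admin@localhost:8837/web?sslmode=disable' \
  --source-table 'public.some_data' \
  --dest-uri 'bigquery://demo-project?credentials_path=/tmp/sa.json' \
  --dest-table 'landing.some_data' \
  --incremental-strategy merge \
  --incremental-key 'updated_at' \
  --primary-key 'id'
```

Re-running this command only moves rows whose cursor column advanced, and rewrites rows whose primary key already exists downstream.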

Production Tips

  • Schedule safely: wrap ingestr inside GitHub Actions, Airflow, or a Kubernetes CronJob. Treat each command invocation as a stateless batch.
  • Log retention: stream STDOUT/STDERR to your log stack to capture load metrics and retry hints.
  • Catalog hygiene: split large workloads into multiple invocations so you can tune load methods per table.
  • Community support: join the Bruin Data Slack to request new connectors or report edge cases.
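The "stateless batch" advice above can be packaged as a small wrapper script for cron or CI. This is a sketch under assumptions: WEB_DB_URI and BQ_URI are hypothetical environment variables, the table names are placeholders, and DRY_RUN defaults to printing the command so you can inspect it offline before flipping DRY_RUN=0 in the scheduler:

```shell
#!/bin/sh
# Hypothetical wrapper for scheduling one ingestr run as a stateless batch.
# All state lives in the source and destination; the script itself holds none.
set -eu

SRC_URI="${WEB_DB_URI:-postgresql://admin:admin@localhost:8837/web?sslmode=disable}"
DEST_URI="${BQ_URI:-bigquery://demo-project?credentials_path=/tmp/sa.json}"

# Build the argument list once; quoting stays intact via "$@".
set -- ingestr ingest \
  --source-uri "$SRC_URI" \
  --source-table 'public.some_data' \
  --dest-uri "$DEST_URI" \
  --dest-table 'landing.some_data'

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: print the command for inspection instead of executing it.
  echo "$@"
else
  # STDOUT/STDERR flow to the scheduler's log stream (see the log-retention tip).
  exec "$@"
fi
```

Because each invocation is self-contained, the same script drops into a crontab line, a GitHub Actions step, or a Kubernetes CronJob container unchanged.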

Why SRE and Platform Teams Care

Simpler ingestion shortens incident response loops. When an incident demands historical context from an operational database, ingestr can hydrate an analytics store in minutes, with no code deploy in the way. It also lowers the barrier to replicating production metrics into staging environments for chaos drills or load tests.
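For example, during an incident a responder could land an operational table into a throwaway DuckDB file for ad-hoc SQL. A sketch, where OPS_DB_URI and the table names are placeholders:

```shell
# Hypothetical incident drill: copy an ops table into a local scratch database.
ingestr ingest \
  --source-uri "$OPS_DB_URI" \
  --source-table 'public.audit_log' \
  --dest-uri 'duckdb:///./incident_scratch.db' \
  --dest-table 'scratch.audit_log'
```

Once the copy lands, the file can be queried locally without touching the production database again.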

Conclusion

ingestr collapses weeks of pipeline plumbing into a reproducible CLI workflow. Install it with uv, point it at your source and target, and let the tool manage sync semantics while you focus on insights.

For efficient incident management and to prevent on-call burnout, consider using Akmatori. Akmatori automates incident response, reduces downtime, and simplifies troubleshooting.

Additionally, for reliable virtual machines and bare metal servers worldwide, check out Gcore.

Maximize your website or application's performance and reliability!