I'm Michael, a data engineer from Darlington in the north-east of England. I come from a farming family (yes, that's where the surname comes from), which probably explains why I ended up in a job that's mostly about getting things to grow reliably in unpredictable conditions. Just with data pipelines instead of barley fields.
I've spent my career so far building the infrastructure that moves data from where it lands to where it's useful. I like the work because it sits at the intersection of software engineering and problem-solving at scale, and because when you get it right, everything downstream just works.
What I Do
At Salesfire, I engineer the data infrastructure behind an ecommerce personalisation platform: real-time pipelines ingesting billions of behavioural events across 700+ retailers, a ClickHouse data warehouse powering analytics, and an identity resolution system built on Neptune ML that stitches anonymous shoppers into unified profiles. Day to day that means SQL, TypeScript, and a lot of AWS and Terraform, with Python for data analysis.
Before this I worked across ecommerce, fintech, and ad-tech, starting as a software engineer building Python scrapers and product feed platforms. I gravitated toward data problems early: high-volume ingestion, messy source systems, the gap between raw data and something a business can actually use. I eventually made the switch to data engineering full time.
How I Think About Engineering
Every pipeline breaks eventually. I build for idempotency, at-least-once delivery, and clear failure modes so that re-running is always safe and the blast radius is small.
If a pipeline runs without metrics, quality checks, and alerting, you don't know if it's working. You just know it hasn't visibly broken yet. I instrument everything.
If it's not in a Terraform file or a Docker image, it doesn't exist. Manual setup is a liability. Reproducibility is a feature.
Postgres for transactional work, ClickHouse for analytical queries, Kafka for event streaming. I pick tools based on the workload, not the hype cycle.
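To make the idempotency point above concrete, here is a minimal TypeScript sketch of a consumer that stays correct under at-least-once delivery by de-duplicating on an event ID. The event shape, the `IdempotentCounter` class, and the in-memory `Set` standing in for a durable store are all assumptions made for illustration, not a description of any production system.

```typescript
// Hypothetical event shape; field names are illustrative only.
interface BehaviouralEvent {
  eventId: string; // unique per event, assigned by the producer
  shopperId: string;
  type: "view" | "add_to_cart" | "purchase";
}

// In-memory stand-in for a durable store keyed by event ID.
class IdempotentCounter {
  private seen = new Set<string>();
  private counts = new Map<string, number>();

  // Returns true if the event was applied, false if it was a duplicate.
  // Applying the same event twice leaves the state unchanged.
  apply(event: BehaviouralEvent): boolean {
    if (this.seen.has(event.eventId)) {
      return false; // redelivery: safe to ignore
    }
    this.seen.add(event.eventId);
    const current = this.counts.get(event.shopperId) ?? 0;
    this.counts.set(event.shopperId, current + 1);
    return true;
  }

  countFor(shopperId: string): number {
    return this.counts.get(shopperId) ?? 0;
  }
}

const batch: BehaviouralEvent[] = [
  { eventId: "e1", shopperId: "s1", type: "view" },
  { eventId: "e2", shopperId: "s1", type: "add_to_cart" },
  { eventId: "e3", shopperId: "s2", type: "purchase" },
];

const counter = new IdempotentCounter();

// Simulate at-least-once delivery: the whole batch arrives twice.
let applied = 0;
for (const event of [...batch, ...batch]) {
  if (counter.apply(event)) {
    applied += 1;
  }
}
```

Re-running the batch changes nothing after the first pass, which is the property that makes a retry or a full replay safe. In a real pipeline the `seen` set would need to live somewhere durable and be updated atomically with the state it protects.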