Infrastructure for the Energy Transition
Building systems that keep solar farms running when the weather doesn't cooperate.
Started as an intern building CI pipelines and hardening security. Stayed as a full-time engineer architecting the systems that keep utility-scale solar plants online — including a failover controller that prevents outages during storms, and an API optimization that cut response times by 65%.
What it is
Nextpower (formerly NEXTracker) makes the hardware and software that controls utility-scale solar farms — the systems that physically rotate panels to track the sun across thousands of acres. When those systems go down, energy output drops. When they go down during a storm, the damage can run into the millions.
I joined as an intern in May 2024. By February 2025 I was a full-time engineer. The work spans reliability engineering, API performance, and the kind of infrastructure decisions that matter more than they first appear to.
What I built — internship
As an intern, I built a CI pipeline using Azure DevOps that automated builds across multiple projects, cutting developer workload by 10 hours a week. I also led static analysis and software composition analysis across our dependencies, reducing high- and medium-severity security vulnerabilities by 92%. The interesting part wasn't the tooling; it was understanding why those vulnerabilities existed and designing checks that would catch them structurally, not just reactively.
What I built — full-time
The most consequential thing I've built here is an active–standby site controller system in Go. Solar plants can lose their primary controller during storms. The failover needs to be seamless — no manual intervention, no data loss, no gap in control. Getting that right required thinking carefully about state replication, split-brain scenarios, and what "seamless" actually means in a system where the hardware doesn't stop moving.
I also optimized API performance using OpenTelemetry and Azure Monitor Application Insights, reducing average response time from 1100ms to 385ms — a 65% improvement. The gains came from understanding where the actual latency was hiding, not from blanket caching.
A third project: enabling hardware reuse across deployments. Solar operators move equipment between sites. Previously that meant manual decommissioning workflows. I built a feature that lets users disconnect a Datahub from one site and reconnect it to another, saving thousands of dollars per reuse and removing a meaningful amount of operational friction.
What I'm taking from this
Working on physical infrastructure — systems where failures have real-world consequences measured in energy and dollars — sharpens your thinking about reliability in a way that pure software problems don't. You can't roll back a storm.
I've also learned that performance work is mostly detective work. The 65% API improvement wasn't from clever code — it was from finding the three places where we were doing work twice and stopping.