Senior Software Engineer — Distributed Systems

Date Posted
Today
Remote Work Level
Hybrid Remote
Location
New York, NY
Salary
$130,000 - $230,000
Job Schedule
Full-time
Benefits
New-hire stock equity (RSUs)
Employee Stock Purchase Plan (ESPP)
Generous global benefits
Continuous professional development
Intradepartmental mentor program

Requirements & tools

Education
BS, MS, or PhD in Computer Science or equivalent industry experience
Tools & systems
Go, Kafka, Cassandra, Elasticsearch, Datadog observability

Datadog

The Senior Software Engineer, Distributed Systems at Datadog joins the team building the foundational platforms that power Datadog’s observability products. Day-to-day work involves designing fault-tolerant, horizontally scalable solutions that run in multi-tenant environments; writing production code in Go, Java, Rust, or C++; and working with open-source infrastructure, including Kafka for streaming, Redis and Cassandra for storage, Elasticsearch for search, and other systems components. The role requires the ability to dig into low-level details when debugging production issues and a focus on simple designs that perform well at internet scale.

Required: 6+ years of backend programming experience; a BS, MS, or PhD in computer science or equivalent industry experience; significant backend programming in one or more of Go, Java, Rust, or C++; exposure to high-durability or low-latency problem spaces; and a demonstrated ability to use AI coding tools in day-to-day workflows.

Bonus: motivation to push the boundaries of how AI improves software engineering practices.

Datadog operates a hybrid workplace with three coordinated in-office days per week in Boston or NYC. Compensation: $130,000-$230,000 base plus equity (RSUs), ESPP, and generous global benefits.

Role context

Senior Distributed Systems Engineers at observability platforms design the foundational backbones that ingest, store, and query billions of telemetry events per second from companies worldwide. At Datadog, these systems must be optimized for durability, high availability, low latency, internet-scale footprint, and operability: the kind of infrastructure that breaks in interesting ways at multi-tenant scale. The role works in Go, Java, Rust, or C++, alongside open-source components like Kafka, Redis, Cassandra, and Elasticsearch. Six-plus years of backend programming experience is the baseline. The schedule is hybrid, with three coordinated days per week in the Boston or NYC office. Compensation reflects Datadog’s standard distributed systems engineering pay banding.

Quick facts

State employment
105,000
Min experience
6 years
Hiring cycle
35 days
Top skills
Distributed systems design and fault tolerance
Backend programming (Go, Java, Rust, C++)
Streaming systems (Kafka) and storage (Cassandra, Redis)
High-availability and low-latency optimization
AI-augmented engineering workflows

Frequently Asked Questions

What does "distributed systems" mean in practice at Datadog?

Datadog's distributed systems power telemetry ingestion (billions of events per second from millions of hosts), storage (petabyte-scale time-series and log data), and query (sub-second latency on massive datasets). Engineers work on services like the metrics ingestion pipeline, log indexing infrastructure, distributed tracing storage, and the query engines that span them. Concrete problems: multi-region replication, consistency-vs-availability tradeoffs at scale, capacity planning for spiky workloads, and managing operational complexity across thousands of hosts.

Is AI coding tools experience required at the senior level?

Datadog now explicitly lists AI coding tool fluency as a hiring requirement for senior engineers. The expectation is not just occasional Copilot use but a demonstrated ability to validate, critique, and refine AI-generated output, which means knowing when AI suggestions are wrong and how to course-correct. Engineers without active AI-augmented workflows are at a disadvantage; those who have integrated Claude, ChatGPT, or similar tools into their daily work have an edge.

What languages does the Distributed Systems team primarily use?

Go is the most common language for new services. Java still powers many legacy core systems, especially storage layers. Rust is increasingly used for performance-critical infrastructure (e.g., the metrics ingestion pipeline). C++ appears in select high-performance components. The team values polyglot engineers comfortable across the stack rather than language specialists.

This listing aggregates publicly posted role information and adds market context. AIJobSearch.us operates in a commercial relationship with its partner platform.

About Datadog

Datadog
Operating in:
Tech, Software Engineering, Observability
Location: New York, NY
Listing on AIJobSearch.us · partner platform