Our client is a technology startup with offices in New York, Connecticut, and San Francisco. Its platform helps data science teams accelerate research, increase collaboration, and rapidly deploy predictive models.
Responsibilities:
- Engineer reliability and performance into our product and services
- Instrument and monitor service health
- Manage and secure our cloud-based infrastructure
- Diagnose and fix issues in a distributed, containerized application
- Respond to incidents (on-call) and perform root cause analysis
- Implement and manage access control and security services
- Collaborate with developers and PMs to continuously improve the platform
- Develop tools and processes to improve efficiency and reduce toil
Qualifications:
- Experience managing cloud environments (AWS, GCP, Azure)
- Strong coding ability (Python, Go, Bash)
- Systems fluency (Linux, storage, networking)
- Experience with container management (Kubernetes, Docker)
- Experience with observability systems (New Relic, Prometheus)
- Experience operating stacks based on modern software components (Redis, Elasticsearch, RabbitMQ, MongoDB, PostgreSQL, Play)
- Infrastructure and configuration automation (Terraform, SaltStack)
- Exceptional problem-solving acumen
Base Salary Range: $100,000 – $150,000