Senior Director of Platform Engineering

Remote, USA Full-time
DeepSee delivers an open and flexible agentic platform to accelerate AI adoption for financial services in front, middle, and back-office operations. Our cloud-based platform integrates with existing bank architectures, whether they're just starting their AI transformation journey or looking to enhance existing in-house capabilities with Agentic AI solutions. With DeepSee's pre-trained and pre-configured agents, banking and capital markets firms can automate and orchestrate manual, repetitive tasks, freeing domain experts for strategic work, reducing risk, and streamlining operations to drive greater efficiency.

We are looking for a Senior Director of Platform Engineering to lead our backend, frontend, infrastructure, and MLOps/DevOps/CI/CD teams. You'll scale our Kubernetes platform across AKS, EKS, and on-prem, ensure high availability and performance, and evolve our agentic AI and MCP-based integrations for bank-grade reliability. You'll partner tightly with the Chief Architect and our Product team to deliver a secure, observable, auditable platform for regulated clients.

Job Responsibilities:
• Own and drive the platform roadmap and strategy for multi-cloud/on-prem Kubernetes (AKS, EKS, vanilla K8s), compute, data, networking, ML serving, and high availability/performance.
• Lead, build, and develop multiple teams (Backend, Frontend, Infrastructure, MLOps/DevOps), including leadership, career ladders, and operational rhythms.
• Scale Kubernetes reliably: capacity planning, autoscaling (HPA/VPA/Cluster Autoscaler/KEDA), cost controls for mixed CPU/GPU workloads.
• Advance and mature GitOps, IaC, and observability practices (Argo CD, Terraform, Helm, OpenTelemetry, Datadog, Prometheus), including rollout strategies, standardization, monitoring, incident response, and post-mortems.
• Advance MLOps for LLMs/SLMs/ML/DL (KServe, MLflow pipelines, model governance, inference patterns, GPU scheduling, canary rollouts).
• Evolve and operate eventing and stateful architecture at scale (Kafka/ZooKeeper/KRaft, Postgres, S3/Blob, protobuf, schema evolution/versioning, resilient data planes).
• Directly contribute technically via coding, reviews, and debugging distributed systems.
• Partner closely with Chief Architect, Principal AI, Product, and other leads to deliver secure, observable, auditable, regulated banking solutions, supporting agentic AI and workflow automation.

Must Haves:
• Significant leadership experience: 10 years on distributed platforms and 5 years leading multi-disciplinary platform teams.
• Deep, hands-on Kubernetes expertise (networking, security, tenancy, upgrades; AKS/EKS operations).
• Proven hands-on expertise with GitOps, IaC, change management, rollout safety, and production observability (Argo CD, Terraform, Helm, OpenTelemetry, Datadog/Prometheus, SLOs/on-call).
• Advanced MLOps experience (KServe, MLflow, model registry/governance, GPU scheduling, cost tuning, canary rollouts, safe rollouts).
• Experience designing and operating event streaming, stateful data, and resilient architecture at scale (Kafka/ZooKeeper/KRaft, Postgres, S3/Blob, protobuf, schema/versioning).
• Deep proficiency in core languages (Java, Python, Go) and cloud SDKs, and strong architectural communication with executives and clients.
• Regulated FinServ experience (SOC 2/ISO 27001, SR 11-7, SEC/FINRA, model governance, OpenTelemetry, trace-driven perf, KServe ModelMesh or similar tools).

Nice to Haves:
• Hands-on skills with most listed technologies: Kubernetes (vanilla, AKS, EKS), Docker, Argo CD, Helm, Terraform, Kafka/ZooKeeper/KRaft, KServe, MLflow, OpenTelemetry, Datadog, Prometheus, protobuf, HPA, VPA, Karpenter or Cluster Autoscaler, LightRAG, Milvus, Postgres, S3/Blob, Redis, Airflow/dbt, Java, Python, Go.
• Experience working alongside a variety of engineering leaders and principal engineers (Chief Architect, CISO, Principal Knowledge Graph Engineer, AI Engineer, Lead BE, Principal FE, Product).
• Platform-as-a-product advocacy and developer experience focus, CNCF platform engineering guidance.

Finally, it is important that you align with our Stuff That Matters:
• Knowledge Over Noise: We prioritize actionable insights
• One Team, One Dream: We collaborate seamlessly across functions
• Be a Seeker: We constantly pursue innovation and learning
• Stay Human: We keep our solutions people-centric
• Act Boldly: We take calculated risks to drive progress
• Believe: We're passionate about our mission
• Own It: We take responsibility for our work and its impact

Why DeepSee.ai?
• Competitive compensation package including equity, with remote work options
• 100% company-paid premiums on health, dental, and vision insurance
• Opportunity to work on cutting-edge AI technology with real impact
• Collaborative and innovative work environment

Join us in shaping the future of AI-powered automation and make a significant impact in a rapidly growing startup. If you're a hands-on problem solver who thrives in fast-paced environments and is excited about leveraging AI to solve complex problems, we want to hear from you!