Senior Principal Software Engineer - IAG Platform / Reliability & DevOps Engineering
**Job Description**
We are seeking a **Senior Principal Software Engineer** to provide senior technical leadership for our **Identity & Access Governance (IAG)** services, initially focused on IAG and evolving into broader **cross-organization technical leadership**. This is a **Software Engineer role first**, with strong **DevOps/SRE-grade** capabilities. You build robust software systems, deliver major features into production, and take full ownership of reliability, operability, and secure-by-default engineering.
You have deep experience building and operating distributed cloud services and understand control plane architecture, service-to-service communication, and production-grade operational design. You will drive the design of major service components, partner closely with Engineering Managers, Architects, and TPMs, and provide direct technical guidance to engineers across levels.
You are equally comfortable writing architecture documentation and leading peer reviews as you are prototyping, writing production code, reviewing pull requests, improving build/deploy pipelines, and leading incident response when needed. You balance speed and quality through iteration and leave systems, and teams, meaningfully better through automation, instrumentation, and clear engineering standards.
**Responsibilities**
+ Lead the architecture and implementation of major capabilities across IAG services and critical platform dependencies, building software that is **scalable, secure, and operationally excellent**.
+ Set technical direction on reliability patterns, service maturity, and delivery standards, including **SLIs/SLOs**, error budgets, safe rollout strategies, backward-compatible changes, operational readiness expectations, and clear ownership boundaries between services.
+ Improve the end-to-end developer-to-production lifecycle by building and evolving:
  + CI/CD pipelines
  + Automated testing and validation
  + Infrastructure-as-code patterns
  + Deployment strategies (canary and progressive delivery)
+ Drive **observability by design** (metrics, logs, traces) and improve alerting quality, runbooks, and on-call effectiveness by reducing toil and ensuring teams have the right signals and tools to operate what they build.
+ Serve as a **technical escalation resource** and **first responder** for emergent operational work. For high-severity or technically complex production issues, lead real-time triage, mitigation, and coordination through stabilization.
+ Drive root cause analysis and durable remediation, turning incidents into engineering outcomes through fixes, automation, and a reliability backlog that measurably reduces recurrence.
+ Mentor and enable development teams by helping design operable systems, bootstrap new services, and raise the engineering bar through strong code reviews, reference implementations, and practical coaching.
+ Support security and compliance needs, including threat modeling, security reviews, and operational controls/audit readiness for regulated environments.
IC5 Career Level
**Qualifications**
+ BS in Computer Science or related field (MS preferred), or equivalent practical experience
+ **10+ years** of software development experience building and operating distributed services in production
+ Strong proficiency in one or more modern programming languages (e.g., **Java, Go, C++, Python**) with a proven record of shipping production code
+ Proven ability to lead design and delivery of major service capabilities from concept through launch and sustained operations
+ Deep understanding of distributed systems fundamentals (data structures/algorithms, networking, concurrency, failure modes)
+ Strong knowledge of cloud architecture patterns, including **control plane and service-to-service operational design**
+ Demonstrated experience building **DevOps capabilities**: CI/CD pipelines, automated testing, deployment automation, and infrastructure-as-code
+ Strong production debugging skills across networking and persistence layers; understanding of databases and distributed persistence (SQL/NoSQL, replication, consistency tradeoffs)
+ Demonstrated experience leading **high-severity incident response** as a technical lead/escalation engineer, including rapid diagnosis, mitigation, and post-incident corrective actions
+ Strong Linux knowledge (or demonstrated ability to learn quickly in Linux-based production environments)
+ Experience partnering closely with Architects, Principals, Engineering Managers, Product, and Program/TPM leaders to deliver outcomes on time and with high quality
**Preferred Qualifications**
+ Hands-on experience developing and operating services on a public cloud platform (**OCI strongly preferred**; AWS/Azure also valuable)
+ Experience with container orchestration and cloud-native patterns (e.g., Kubernetes/OKE or equivalent), service mesh/API gateways, and modern identity/security patterns
+ Experience operating services across **multi-AD/multi-AZ** and/or **multi-region** footprints; strong understanding of regional resiliency strategies
+ Track record driving reliability programs such as SLO adoption, error budgets, production readiness reviews, game days, and resilience testing
+ Experience building mature CI/CD pipelines with robust testing and safe deployment strategies (canary/blue-green/progressive delivery)
+ Experience in regulated/compliance environments (e.g., **FedRAMP, PCI DSS**, or similar) and supporting audit requirements with strong operational controls
+ Expertise applying threat modeling or other risk identification techniques and translating findings into practical engineering changes
+ **Ability to obtain and maintain a U.S. Government security clearance (or currently cleared) strongly preferred**, for work in regulated environments where applicable
**Responsibilities**
As a member of the software engineering division, you will take an active role in the definition and evolution of standard practices and procedures. You will define specifications for significant new projects and specify, design, and develop software according to those specifications. You will perform professional software development tasks associated with developing, designing, and debugging software applications or operating systems.
Disclaimer:
**Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.**
**Range and benefit information provided in this posting is specific to the stated locations only.**
US: Hiring Range in USD from: $96,800 to $251,600 per annum. May be eligible for bonus, equity, and compensation deferral.
Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.
Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short-term disability and long-term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
**About Us**
Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.
True innovation starts when everyone is empowered to contribute. That's why we're committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling 1-888-404-2494 in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
Job #NLX288612902