Manager, IT Operations Analytics Problem Management
Description
JOB TITLE: Manager, IT Operations Analytics Problem Management
SALARY RANGE: $146,330 - $172,718
DEPT/DIV: Information Technology
SUPERVISOR: Senior Director, Portfolio/Project Management, I&OE
LOCATION: 2 Broadway
HOURS OF WORK: 9:00 am - 5:30 pm (7.5 hours/day) or as required
This position is eligible for teleworking, which is currently 2 days per week. New hires are eligible to apply 30 days after their effective date of hire.
Opening
The Metropolitan Transportation Authority is North America's largest transportation network, serving a population of 15.3 million people across a 5,000-square-mile travel area surrounding New York City, Long Island, southeastern New York State, and Connecticut. The MTA network comprises the nation’s largest bus fleet and more subway and commuter rail cars than all other U.S. transit systems combined. MTA strives to provide a safe and reliable commute, excellent customer service, and rewarding opportunities.
Summary
The Manager, IT Operations Analytics & Problem Management provides leadership, vision, and strategic direction for MTA's enterprise Problem Management function and IT Operations Analytics program. This role ensures recurring and systemic issues impacting IT services are rapidly identified, root causes are clearly owned and validated, and corrective actions are effectively tracked and driven to resolution in collaboration with accountable teams.
The Manager oversees the end-to-end Problem Management process across the organization, ensuring strong integration with other IT service management practices and leading cross-functional root cause analysis efforts. This role drives continuous improvement initiatives that proactively identify and mitigate operational risk, strengthening service stability, availability, and reliability. Additionally, the position leads the development and delivery of executive-level risk analytics, dashboards, and performance metrics, providing senior leadership with actionable insights across key domains including service availability, talent capacity, financial posture, State of Good Repair (SOGR), disaster recovery readiness, and cybersecurity risk. Working across technical, operational, and business domains, the Manager strengthens MTA's operational resilience through data-driven decision-making, governance, and collaboration.
This role requires deep expertise in ITIL Problem Management, operational risk assessment, and advanced analytics, along with strong leadership and stakeholder engagement skills to improve service performance, reduce downtime, and support enterprise reliability objectives.
Responsibilities
- Establishes proactive problem detection and operational risk identification practices using trend analysis, monitoring insights, cross-system data correlation, and advanced Excel modeling.
- Establishes and enforces standards for analytics architecture, semantic modeling, DAX optimization, ETL design, performance tuning, and dashboard governance, ensuring solutions align with enterprise data architecture and operational objectives.
- Oversees the integration of analytics solutions across multiple IT domains (Availability, SOGR, Disaster Recovery, Cybersecurity, Finance, Talent), ensuring alignment with enterprise data architecture, ServiceNow/CMDB dependencies, and reporting requirements.
- Facilitates and oversees cross-functional root cause analysis (RCA) sessions with technical teams, service owners, and business stakeholders to ensure comprehensive, sustainable remediation strategies.
- Oversees and leads the enterprise-wide Problem Management process, ensuring recurring and major incidents are investigated, root causes are owned, and corrective actions are driven to completion.
- Oversees the design, implementation, and continuous improvement of problem management processes and automation, ensuring integration with IT operations analytics, cloud-based solutions, and alignment with enterprise IT goals.
- Communicates operational risk trends, insights, and improvement opportunities to senior leadership and business stakeholders to support enterprise decision-making and strategic alignment, ensuring visualizations and dashboards are designed for clarity and executive consumption using UI/UX principles and Excel outputs.
- Reviews and prioritizes problem records based on business impact, operational risk, and recurrence patterns with relevant teams.
- Leads communication with IT executives, business stakeholders, and vendors to provide updates on problem status, RCA progress, corrective actions, and long-term remediation plans.
- Partners with cross-functional IT teams to gather, validate, and present monthly availability metrics and other IT risk metrics for dashboards covering availability, SOGR, disaster recovery readiness, cybersecurity, financial, and talent risks.
- Drives responsible adoption of AI-assisted analytics and development tools (e.g., Copilot) to improve productivity, enable governed self-service analytics, and enhance insight generation.
- Manages and oversees the design, development, architecture, and maintenance of analytics solutions built with business intelligence tools such as Power BI, leveraging programming languages such as SQL and Python, formula languages such as DAX and Power Query (M), Power Pivot, and governed semantic models to ensure accuracy, scalability, high performance, and compliance.
- Functions as developer and architect using programming languages, including but not limited to SQL and Python, wherever applicable.
- Designs, develops, and manages enterprise-grade Power BI dashboards applying UI/UX principles for layout, interaction, accessibility, and visualization across IT domains, including Availability, Workforce Capacity, Financial Health, SOGR, Disaster Recovery, and Cybersecurity posture.
- Oversees and guides technical teams to function as developers using programming languages, including but not limited to SQL and Python, applying best practices in ETL, data modeling, integration, and solution engineering leveraging Azure/cloud platforms.
- Oversees and applies advanced Microsoft Excel capabilities (including Power Query, Power Pivot, PivotTables with calculated fields and measures, VBA macros, and advanced formulas) alongside SQL/Python analytics to support ETL validation, reconciliation, data analysis, and executive reporting.
- Partners with IT and business stakeholders to define reporting strategies, prioritize analytics initiatives, and ensure adherence to security, privacy, and governance requirements.
- Conducts regular problem review meetings with stakeholders to prioritize high-impact issues, track remediation progress, and resolve systemic risks.
- Manages, mentors, and develops staff responsible for problem management analytics and reporting, fostering technical skill growth, Power BI/UI/UX and Excel excellence, and collaborative problem-solving.
- Ensures problem management and risk analytics processes align with ITIL best practices and enterprise service management objectives, driving continuous improvement, adoption of best practices, and integration of known errors and workarounds across IT teams.
- Participates in vendor and tool evaluations for problem management, providing direction on functionality, integration, and reporting needs to support enterprise IT risk analytics.
- Drives continuous process improvement and ensures workarounds and known errors are documented, communicated, and integrated across IT teams and business stakeholders.
- Develops and maintains process documentation, standard operating procedures, and knowledge base articles to support consistent problem resolution.
- Identifies opportunities for automation and self-service solutions within problem management and risk reporting workflows.
- Manages resource allocation, team priorities, and workload distribution to meet service levels and project objectives.
- Oversees quality assurance to ensure accuracy, reliability, and usability of problem management data, analytics outputs, and dashboards.
- Establishes and enforces policies for problem ticket prioritization, ownership, and closure, ensuring accountability across resolver groups.
- Provides senior leadership with consolidated risk and impact analysis of outstanding problem tickets, highlighting resource requirements, SLA adherence, and business benefits achieved.
- Ensures problem management metrics, CSFs, and KPIs (e.g., MTTR, recurrence rates, KEDB utilization, downtime reduction) are defined, tracked, and reported to drive continual service improvement.
- Identifies process, skills, and tool gaps across resolver groups and develops plans for training, process redesign, and capability uplift.
- Champions adoption of problem management best practices and ITIL-aligned methodologies across the IT organization, driving a culture of prevention, resilience, and data-driven decision-making.
Required for All Jobs
- Performs other duties as assigned
- Complies with all policies and standards
- May be required to work hours outside regular work hours, as applicable
- Observes the work performed by contractors, as applicable
- Reviews invoices and approves them if the work meets contractual standards, as applicable
- Addresses performance issues with the contractor when possible, as applicable
- Escalates issues to other parties when needed, as applicable
Required Qualifications
- Bachelor’s degree, preferably in Computer Science, Engineering, or Information Services. An equivalent combination of education and experience may be considered in lieu of a degree.
- Minimum of 5 years of relevant experience with at least 4 years in a managerial/supervisory role and a demonstrated ability to inspire, motivate, and empower people to achieve organizational goals.
Technical Skills
- Experience with IT service management and operational platforms, including ServiceNow (reporting and CMDB analysis), IBM Maximo, and performance monitoring tools such as SolarWinds, to support IT risk, availability, and problem management analytics.
- Working knowledge of ITIL-based practices for problem management, root cause analysis, service level management, and continuous improvement, applied within enterprise reporting and risk analytics contexts.
- Designs and implements Power BI backend architecture leveraging cloud platforms (e.g., Azure), applying solution engineering principles to ensure scalable, governed, and high-performing analytics solutions.
- Ability to design, architect, and oversee frameworks, workflows, and cloud-based data models supporting enterprise dashboards, data integration, and data-driven decision-making.
- Experience using project and portfolio management tools (JIRA, SmartSheet, MS Project, Monday.com) and collaboration platforms (Confluence, SharePoint, Teams) to support governance, cross-functional coordination, and delivery oversight.
- Broad understanding of current and emerging technologies.
- Proven leadership experience in managing technical teams, establishing analytics and data governance standards, and driving enterprise adoption of reporting, risk analytics, and performance management capabilities.
- Expert-level knowledge of Power BI (Desktop & Service), including semantic modeling, DAX, Power Query (M), Power Pivot, SQL, Python, and metadata management, with the ability to function as developer and architect wherever applicable to design, review, guide, and govern complex, multi-source analytics solutions across enterprise IT domains (Availability, SOGR, Disaster Recovery, Cybersecurity, Finance, Talent).
- Ability to design and implement Power BI dashboards applying UI/UX principles for layout, interaction, accessibility, and visualization, ensuring usability, clarity, and actionable insights across executive and operational dashboards.
- Advanced proficiency in Microsoft Excel, including Power Query, Power Pivot, PivotTables with calculated fields and measures, advanced formulas (XLOOKUP, INDEX/MATCH), data modeling, and what-if analysis, with strong alignment to Power BI and DAX-style measure logic, applied to ETL validation, financial analysis, multi-source data reconciliation, and executive reporting.
- Familiarity with AI-assisted analytics and development tools, with the ability to evaluate enterprise suitability, governance considerations, and productivity benefits.
- Strong understanding of data modeling, ETL architecture, integration patterns, performance optimization, and security models, sufficient to establish standards, approve solutions, and mentor senior technical staff.
- Proficiency in data integration, automation, and visualization using Microsoft Fabric components and Power BI, integrating data from diverse sources including SQL Server, Oracle, Azure, and ServiceNow.
- Preferred Certifications: ITIL and PMP
Leadership Skills
- Expert leadership in leading change by developing inter/intra team communication and cohesiveness, sustaining culture, and supporting staff during organizational growth and change.
- Expert leadership in leading people by working with staff to develop systems to ensure consistent, high-quality project management discipline for all technology related initiatives and endeavors.
- Expert leadership in driving results by meeting organizational goals and customer expectations and making decisions that produce high-quality results by applying technical knowledge, analyzing problems, and calculating risks.
- Expert leadership in business acumen by providing direction on evaluation, selection, implementation, and maintenance of information systems, ensuring appropriate investment in strategic and operational systems.
- Expert leadership in building coalitions by internally and externally building partnerships with key stakeholders to help achieve the MTA’s mission or common goals through influence or negotiations.
Behavioral Skills
- Demonstrated ability to lead teams, provide coaching and direct feedback.
- Expert in active listening, attention to detail, customer service, prioritization, and problem-solving skills.
- Expert in working independently and strategically.
- Expert in identifying and analyzing risks and developing effective mitigation strategies.
- Expert technical knowledge and diverse skillset to understand various technologies, systems, and potential risks.
- Expert in critical thinking, problem-solving, and decision-making skills.
- Expert in interpersonal and verbal and written communication skills, with the ability to effectively collaborate with both technical and non-technical peers.
- Expert in managing multiple projects simultaneously and prioritizing tasks based on urgency and impact.
- Extensive hands-on experience with related tools.
- Expert experience with working under pressure and meeting deadlines individually and collaboratively. Thinks logically, assesses problems, and is results oriented.
- Expert in identifying complex business and technology risks and associated vulnerabilities.
- Expert in communicating effectively, both orally and in writing, to interact with team members, customers, management, and support personnel (technical and non-technical).
- Expert in establishing and maintaining effective working relationships with employees at all levels within the organization, and with both internal and external customers.
Other Information
Pursuant to the New York State Public Officers Law & the MTA Code of Ethics, all employees who hold a policymaking position must file an Annual Statement of Financial Disclosure (FDS) with the NYS Commission on Ethics and Lobbying in Government (the “Commission”).
Equal Employment Opportunity
MTA and its subsidiary and affiliated agencies are Equal Opportunity Employers, including with respect to veteran status and individuals with disabilities.
The MTA encourages qualified applicants from diverse backgrounds, experiences, and abilities, including military service members, to apply.