Laurent Giraid is a seasoned technologist who has spent years navigating the intersection of machine learning, natural language processing, and the ethical architecture of artificial intelligence. As financial institutions increasingly lean into automated decision-making, Giraid has become a leading voice on how to balance rapid innovation with the heavy regulatory demands of the banking and treasury sectors. His perspective is particularly relevant now, following the US Treasury’s release of a structured Guidebook for managing AI risks, a document born from the collaboration of over 100 financial institutions and technical bodies. In this discussion, we explore the shift from generic AI standards to sector-specific controls, the nuances of evaluating non-deterministic models, and the roadmap for firms moving from experimental AI to fully embedded, mission-critical systems.
The following conversation examines the evolution of the Financial Services AI Risk Management Framework and how it addresses unique sector challenges like algorithmic bias, cyber vulnerabilities, and the inherent unpredictability of large language models. We delve into the practicalities of the four stages of AI maturity—initial, minimal, evolving, and embedded—and discuss why a centralized repository for tracking incidents is no longer optional for firms aiming to maintain public trust.
General risk frameworks often lack the granular detail required for highly regulated sectors. How do these finance-specific controls improve upon broader standards, and what are the first steps for a firm to integrate them into existing compliance workflows?
While the NIST AI Risk Management Framework provides a solid foundational philosophy, it often feels too abstract for a compliance officer at a major bank who needs to answer to specific regulatory expectations. The Financial Services AI Risk Management Framework (FS AI RMF) bridges this gap by introducing 230 specific control objectives that are explicitly tailored to the operational realities of banking. These controls are organized into four primary functions (govern, map, measure, and manage) that allow a firm to move beyond high-level “safety” goals and into actionable technical requirements. For a firm looking to integrate these, the very first step is utilizing the AI adoption stage questionnaire provided in the Guidebook. This isn’t just a checklist; it’s a strategic diagnostic tool that evaluates business impact, data sensitivity, and third-party dependencies to determine where the firm actually stands. By doing this, an institution can avoid the “noise” of irrelevant controls and focus strictly on the subset of those 230 objectives that corresponds to its specific level of deployment and risk exposure.
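To make the diagnostic concrete, here is a minimal sketch in Python of how a firm might encode the questionnaire's stage-determination logic. The field names and decision rules are illustrative assumptions for this article, not the Guidebook's actual questionnaire, which weighs many more dimensions.

```python
from dataclasses import dataclass

# Illustrative adoption-stage diagnostic. These fields and rules are
# assumptions for this sketch, not the Guidebook's own questionnaire.
@dataclass
class AdoptionProfile:
    any_deployment: bool    # is any AI system in production at all?
    core_operations: bool   # does AI drive high-stakes business decisions?
    customer_facing: bool   # does AI interact with customers directly?
    sensitive_data: bool    # does AI handle highly sensitive customer data?
    third_party_ai: bool    # are third-party AI providers in the loop?

def adoption_stage(p: AdoptionProfile) -> str:
    """Map questionnaire answers to one of the four maturity stages."""
    if not p.any_deployment:
        return "initial"    # considering AI, nothing operational yet
    if p.core_operations:
        return "embedded"   # AI woven into core, high-stakes decisions
    if p.customer_facing or p.sensitive_data or p.third_party_ai:
        return "evolving"   # meaningful exposure beyond isolated pilots
    return "minimal"        # AI isolated in low-risk areas
```

A firm that answers honestly here gets a defensible starting point for deciding which control objectives are actually in scope, rather than attempting all 230 at once.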
Unlike traditional software, AI systems produce variable outputs based on context, which complicates risk assessment. How should institutions adapt their validation processes for these non-deterministic behaviors, and what specific metrics can prove a model is consistently reliable?
Traditional software validation is built on the idea of determinism—if you provide “Input A,” you must always get “Output B.” AI, particularly large language models, shatters this paradigm because the behavior can be difficult to interpret or predict even with identical inputs. To manage this, institutions must shift their focus toward a lifecycle evaluation that prioritizes “trustworthy AI” principles such as validity and reliability. This involves stress-testing the model under diverse contexts to see how the variance in output affects safety and security. We look at metrics that measure the degree of “drift” in decision-making and the resilience of the system against adversarial cyber threats that might exploit this inherent variability. Essentially, the framework pushes teams to document evidence of explainability, ensuring that even if the output varies, the underlying logic remains transparent and defensible to regulators.
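As an illustration of what lifecycle evaluation can look like in practice, the sketch below repeatedly queries a model with an identical input and quantifies how much its outputs vary. The `model` callable, the metrics, and the sign-off threshold mentioned in the comments are assumptions chosen for this article; the framework itself does not prescribe a specific stability measure.

```python
from collections import Counter
from typing import Callable

def output_stability(model: Callable[[str], str], prompt: str,
                     runs: int = 20) -> dict:
    """Query a non-deterministic model repeatedly with the same prompt
    and quantify how much its outputs vary across runs."""
    outputs = [model(prompt) for _ in range(runs)]
    counts = Counter(outputs)
    modal_share = counts.most_common(1)[0][1] / runs
    return {
        "distinct_outputs": len(counts),  # 1 would mean fully deterministic
        "modal_agreement": modal_share,   # share of runs matching the modal answer
    }

# A validation pipeline might, for example, require modal_agreement >= 0.95
# across a suite of regulator-relevant test prompts before sign-off; that
# threshold is an illustrative assumption, not a framework requirement.
```

Run over a broad suite of prompts, statistics like these give teams documented evidence of where variance is benign and where it poses a safety or security concern.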
Organizations often move through distinct stages of AI adoption, from isolated experiments to deeply embedded systems. How can a firm accurately assess its current maturity level, and what specific governance changes must be prioritized as AI moves into core business operations?
The Guidebook categorizes maturity into four distinct buckets: initial, minimal, evolving, and embedded. A firm at the “initial” stage might only be considering AI without any operational deployment, whereas an “embedded” firm has AI playing a significant role in core business operations and high-stakes decision-making. To assess where they sit accurately, leadership needs to look at factors such as whether they rely on third-party AI providers and whether AI handles highly sensitive customer data in customer-facing roles. As a firm migrates from the “minimal” stage, where AI is isolated in low-risk areas, to the “evolving” or “embedded” stages, its governance must become much more robust. The priority shifts from simple experimentation to rigorous data quality management and operational resilience. At these higher levels, the framework introduces additional, more stringent controls, because the reputational and financial damage of a failure is exponentially higher when AI is woven into the very fabric of the institution.
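Once the stage is known, the escalating control burden can be expressed as a simple filter over the control catalog. The sketch below assumes each objective is tagged with the earliest stage at which it becomes mandatory; the tagging scheme and the sample identifiers are hypothetical, not the framework's own encoding.

```python
STAGE_ORDER = ["initial", "minimal", "evolving", "embedded"]

def applicable_controls(catalog: list[dict], stage: str) -> list[dict]:
    """Return the control objectives that apply at or below the firm's
    current stage. Assumes each entry carries a hypothetical 'min_stage'
    tag marking when it becomes mandatory."""
    rank = STAGE_ORDER.index(stage)
    return [c for c in catalog if STAGE_ORDER.index(c["min_stage"]) <= rank]

# Usage with hypothetical control identifiers:
catalog = [
    {"id": "GOV-1", "min_stage": "minimal"},
    {"id": "MEAS-7", "min_stage": "embedded"},
]
print([c["id"] for c in applicable_controls(catalog, "evolving")])  # ['GOV-1']
```

The design point is that the control set grows monotonically with maturity: promotion to a higher stage never relaxes an obligation, it only adds new ones.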
Maintaining transparency and fairness is critical when AI affects customer outcomes or regulatory standing. What specific methods can teams use to make complex model decisions explainable, and how do you prevent algorithmic bias from creeping into sensitive data sets?
Preventing algorithmic bias is a continuous battle that requires both technical monitoring and proactive governance. The FS AI RMF emphasizes fairness and bias monitoring as core control objectives, suggesting that institutions must maintain a clear trail of evidence regarding how their models process sensitive data. One of the most effective methods is to implement explainability protocols that break down a complex model’s “black box” decisions into understandable factors that have regulatory relevance. This involves scrutinizing the data sets used for training to ensure they aren’t reinforcing historical inequities, which is where “creeping” bias often starts. By aligning these efforts with the “measure” and “manage” functions of the framework, teams can detect failures in fairness early. It is about creating a system where a decision that impacts a customer can be traced back to a logical, unbiased data point, satisfying both the customer’s need for fairness and the regulator’s demand for accountability.
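One widely used fairness check, offered here purely as an illustration, is the demographic parity gap: the spread in approval rates across protected groups. The framework does not mandate this particular metric, but a sketch like the one below shows the kind of quantitative evidence trail the “measure” function calls for.

```python
from collections import defaultdict

def demographic_parity_gap(decisions: list[bool], groups: list[str]) -> float:
    """Largest difference in approval rates between any two groups.
    decisions[i] is True if applicant i was approved; groups[i] is the
    protected attribute value for applicant i. A value near 0 suggests
    parity; larger gaps warrant investigation."""
    approved: dict = defaultdict(int)
    total: dict = defaultdict(int)
    for decision, group in zip(decisions, groups):
        total[group] += 1
        approved[group] += int(decision)
    rates = [approved[g] / total[g] for g in total]
    return max(rates) - min(rates)

# Example: a 0.18 gap in loan approval rates between two groups would be
# a monitoring flag worth escalating; the alert threshold itself is a
# policy choice, not something this sketch can decide.
```

Demographic parity is only one lens, and it can conflict with other fairness definitions, so in practice teams track several such metrics and document why the chosen set fits the product and its regulatory context.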
Effective governance requires collaboration between technology teams, risk officers, and compliance specialists. What strategies help align these diverse units, and what are the practical benefits of maintaining a centralized repository for tracking and analyzing AI-related incidents?
Alignment between these diverse units is often the biggest hurdle, as developers want to move fast while risk officers are paid to be cautious. The strategy here is to use the FS AI RMF as a “common language” that everyone—from the technical teams to senior leaders—can understand and use to evaluate risk consistently. One of the most practical recommendations in the Guidebook is the creation of a central repository for tracking AI-related incidents. This isn’t just for record-keeping; it acts as a diagnostic hub that allows technology and compliance teams to analyze failures in real-time and improve governance over time. Having a single source of truth for every incident helps the organization detect patterns of failure that might be missed if departments are working in silos. Ultimately, this centralized approach builds institutional confidence, allowing the firm to innovate more aggressively because they know they have the infrastructure to catch and learn from mistakes.
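A central repository need not be elaborate to be useful. The following sketch, built on a hypothetical schema, shows how even a minimal single source of truth lets technology and compliance teams query for recurring failure patterns across systems instead of leaving them buried in departmental logs.

```python
import sqlite3

def init_repository(path: str = "ai_incidents.db") -> sqlite3.Connection:
    """Create (or open) a shared store for AI-related incidents.
    The schema is an illustrative assumption for this article."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS incidents (
            id          INTEGER PRIMARY KEY,
            occurred_at TEXT NOT NULL,     -- ISO 8601 timestamp
            system      TEXT NOT NULL,     -- which AI system failed
            category    TEXT NOT NULL,     -- e.g. 'bias', 'drift', 'security'
            severity    INTEGER NOT NULL,  -- 1 (minor) .. 5 (critical)
            description TEXT
        )
    """)
    conn.commit()
    return conn

def failure_patterns(conn: sqlite3.Connection) -> list:
    """Aggregate incidents by system and category so cross-silo patterns
    surface during governance reviews."""
    return conn.execute("""
        SELECT system, category, COUNT(*) AS n, MAX(severity) AS worst
        FROM incidents
        GROUP BY system, category
        ORDER BY n DESC
    """).fetchall()
```

The value is less in the storage than in the shared vocabulary: once every team files incidents against the same categories, the repository becomes the diagnostic hub Giraid describes.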
What is your forecast for AI risk management in the financial sector?
I believe we are entering an era where AI risk management will no longer be treated as a separate IT function, but as a foundational pillar of institutional stability, much like liquidity or credit risk. As AI technologies continue to develop at a breakneck pace, the financial institutions that thrive will be those that keep their risk governance in a state of constant evolution, moving in lockstep with technological gains. We will see a shift toward more automated compliance, where the 230 control objectives mentioned in the framework are monitored by AI systems themselves, creating a self-correcting loop of governance. However, the human element will remain the ultimate backstop. For our readers, my advice is to embrace the structure provided by frameworks like the FS AI RMF early on; don’t wait for a regulatory failure to build your governance. Those who build transparent, explainable, and accountable systems today will not only avoid the scrutiny of tomorrow but will also win the long-term trust of their customers in an increasingly automated world.
