AI in Nepali Banking: A Compliance Guide for NRB-Regulated Deployments
AI in Nepal's banking sector means working inside NRB regulations. Data residency, on-prem LLMs, audit trails, and the architectures that pass a regulator review.
Deploying AI in a Nepali bank looks deceptively similar to deploying it anywhere else — until you hit the Nepal Rastra Bank (NRB) compliance review and discover the architecture you’ve designed sends customer data through three jurisdictions before producing an answer. This is the angle most cloud-first AI advice misses for the Nepal banking sector. Here’s the version that survives the review.
What NRB cares about (and what it doesn’t)
NRB’s directives are not specifically about AI. They’re about banking data. The rules that apply to AI implementation in a bank are the same rules that have always applied:
- Customer data residency. Customer data and transaction records must remain within agreed jurisdictions. Cross-border processing is restricted and requires specific approvals.
- Auditability. Every decision that affects a customer — credit, AML flag, fraud block — must be traceable to a defined process and a defined approver.
- Data minimisation. Only the data necessary for the function should be processed.
- Outsourcing controls. Material outsourcing (including to cloud providers and AI vendors) requires governance, contracts, and the right to audit.
None of these are AI-specific. They’re regulator basics. The AI part is hard because most AI tooling — public LLM APIs, hosted vector DBs, cloud GPU services — assumes you can ship data anywhere. In a regulated Nepali bank, you can’t.
The architecture decision that drives everything else
There’s one decision at the start of any AI banking project in Nepal that determines the rest of the build: where does the model run?
Three options:
1. Public cloud LLM API (OpenAI, Anthropic, Google)
Simplest to build, hardest to defend in a regulator review. Customer data leaves the country, leaves the bank’s control, and lands in a vendor’s data center subject to that vendor’s policies. Defensible only with: (a) explicit NRB approval; (b) contractual data-handling guarantees; (c) data minimisation so heavy that what crosses the border isn’t really customer data anymore.
This works for narrow use cases — public-facing customer service chat on public product information, marketing copy generation, internal employee productivity tools — where no real customer data ever reaches the model.
2. Cloud-hosted but in-region
Major cloud providers offer in-region inference (AWS Bedrock in Mumbai, Azure OpenAI in Singapore). Closer to compliance, but Nepal is not yet a region for any major cloud, so the data still crosses a border. Worth pursuing if NRB has approved the specific region for the specific bank.
3. On-prem or sovereign-cloud deployment
The defensible default for material AI applications in Nepali banking. Run open-weight models (Llama 3.x, Mistral, DeepSeek, or specialised smaller models) on hardware located in Nepal or in an NRB-approved jurisdiction. The bank controls the inference path end-to-end. Compliance is provable; auditability is straightforward; data never leaves a known boundary.
On-prem AI used to be impractical. As of 2026 it isn’t. The open-weight model landscape has matured to the point where a 70B-class model running on a single 4×H100 box can serve most retail-banking use cases at acceptable latency.
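Option 1's "heavy data minimisation" is engineering work, not a policy statement: identifiers have to be stripped before a prompt ever crosses the border. A minimal sketch of such a redaction pass — the patterns and placeholder labels here are illustrative, and a production redactor would cover far more identifier formats (citizenship numbers, passport numbers, card PANs):

```python
import re

# Illustrative patterns only; order matters because the more specific
# PHONE pattern must run before the generic digit-run ACCOUNT pattern.
PATTERNS = {
    "PHONE":   re.compile(r"\b9[78]\d{8}\b"),       # Nepali mobile numbers
    "ACCOUNT": re.compile(r"\b\d{10,16}\b"),        # account-like digit runs
    "EMAIL":   re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    """Replace customer identifiers with typed placeholders so the text
    that leaves the bank's network is no longer customer data."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Customer 9841234567 (acct 0123456789012, a@bank.com.np) asked about fees."
print(redact(prompt))
```

Whether redaction alone satisfies NRB for a given use case is a question for the bank's compliance team, not the engineering team — the sketch only shows where the control sits in the pipeline.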
Where AI actually earns its keep in a Nepali bank
Five use cases where AI implementation creates measurable value inside the regulatory boundary:
1. Document processing. Loan applications, KYC docs, account-opening forms — pulling structured data out of unstructured paperwork. An on-prem vision-language model with a small fine-tune is the right tool. Reduces processing time per application from hours to minutes.
2. AML and fraud triage. A classifier (not a generative model) ranks transactions by suspicion score, and an LLM summarises the case for the analyst. The analyst still makes the call — the AI just removes the busywork.
3. Customer service triage. Internal-only: incoming complaint emails get classified, summarised, and routed. The agent gets a draft response. The customer never talks to the model directly.
4. Internal knowledge retrieval. RAG over policy documents, product specs, regulatory circulars. Branch staff and operations teams get instant answers to “what’s the current process for X?” instead of digging through PDFs.
5. Operational forecasting. Cash demand at branches, ATM stocking, call-center staffing. These are time-series problems and they don’t need the latest LLM — but they’re where Operational Automation creates the most measurable cost savings.
Notice what’s not on the list: AI making customer-facing credit decisions. That’s where the regulatory bar is highest and the model risk is real. Get the operational wins first; build toward AI-augmented credit slowly, with explicit NRB engagement.
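The AML triage pattern above — deterministic scoring, human decision — can be sketched with a toy rule-based scorer. The features, thresholds, and weights are invented for illustration; a real system would use a trained classifier over far richer features:

```python
from dataclasses import dataclass

@dataclass
class Txn:
    amount_npr: float
    is_cross_border: bool
    txns_last_hour: int

def suspicion_score(t: Txn) -> int:
    """Toy 0-100 suspicion score; weights are invented for illustration."""
    score = 0
    if t.amount_npr >= 1_000_000:   # large-value transaction
        score += 40
    if t.is_cross_border:           # cross-border flows draw NRB scrutiny
        score += 30
    if t.txns_last_hour > 10:       # velocity / structuring signal
        score += 30
    return score

queue = [
    Txn(1_500_000, True, 2),
    Txn(20_000, False, 15),
    Txn(5_000, False, 1),
]
# Analysts review the highest-scoring cases first; the model never
# blocks a transaction on its own.
ranked = sorted(queue, key=suspicion_score, reverse=True)
print([suspicion_score(t) for t in ranked])
```

The LLM's role in this pipeline is downstream of the score: summarising the flagged case for the analyst, not deciding it.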
The audit trail every AI deployment in banking needs
Whatever architecture you pick, the audit requirement is non-negotiable:
- Every inference call must be logged with: input, output, model version, timestamp, and user/system context.
- Every decision an AI system contributes to must be traceable: which inference influenced which decision, what the human reviewer concluded, what the final action was.
- Logs must be retained per NRB requirements and protected from tampering.
This isn’t optional. When NRB asks “show me how this credit decision was made,” you need to produce the full chain — including any AI input — within minutes. If you can’t, the AI system is a liability, not an asset.
The right pattern: per-inference observability from day one (the same discipline we apply to any AI in production), with an additional banking-specific audit layer that links inferences to downstream decisions and the human in the loop.
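One way to make the inference log tamper-evident is hash-chaining: each record stores the hash of the previous record, so editing any entry breaks every hash after it. A minimal sketch, assuming a simple in-memory list — field names and the model identifier are illustrative, and durable storage and retention are a separate concern:

```python
import hashlib
import json

def append_record(log: list, record: dict) -> None:
    """Append an inference record, chained to the previous entry's hash."""
    prev = log[-1]["hash"] if log else "genesis"
    entry = dict(record, prev_hash=prev)
    # Canonical serialisation (sorted keys) so the hash is reproducible.
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    log.append(entry)

def verify(log: list) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev = "genesis"
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if entry["prev_hash"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log = []
append_record(log, {"model": "llama-3-70b", "input": "...", "output": "...",
                    "ts": "2026-01-15T10:00:00Z", "user": "analyst-42"})
append_record(log, {"model": "llama-3-70b", "input": "...", "output": "...",
                    "ts": "2026-01-15T10:01:00Z", "user": "analyst-42"})
assert verify(log)
log[0]["output"] = "edited"   # tampering is now detectable
assert not verify(log)
```

In production the same chain would live in an append-only store, with the decision linkage (which inference fed which credit or AML outcome, and who signed off) recorded alongside each entry.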
Vendor and outsourcing controls
If you’re using any external AI service — even just an embedding API, even just a vector database — it’s an outsourcing arrangement under NRB rules. Which means:
- Written contracts specifying data handling, security, and audit rights.
- Risk assessment documented before onboarding.
- Annual review.
- Right to terminate and migrate without penalty.
Most “free tier” or “self-serve” AI SaaS contracts don’t meet these requirements out of the box. Either move to enterprise contracts that do, or self-host.
The architecture pattern we deploy
For Nepali banks evaluating AI seriously, the pdpspectra default architecture is:
- Inference layer: on-prem GPU server(s) running open-weight models, optionally with a smaller in-region cloud fallback for burst.
- Data layer: all PII stays in the bank’s existing data warehouse (Postgres or ClickHouse), with vector embeddings stored alongside. No customer data leaves the bank’s network for inference.
- Orchestration layer: Airflow or equivalent for batch jobs; FastAPI or a managed gateway for real-time inference. All calls go through a logging and policy layer.
- Observability: per-inference logging into an audit-grade store, with retention policies aligned to NRB requirements.
The pattern is boring on purpose. Boring survives audits.
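The "logging and policy layer" in the orchestration tier can start as something very boring indeed: a gate that refuses to route anything containing PII to a destination outside the bank's network. A minimal sketch — the model names and the crude PII check are illustrative stand-ins:

```python
import re

ON_PREM = {"llama-onprem"}              # inference endpoints inside the bank's network
PII_RE = re.compile(r"\b\d{10,16}\b")   # crude stand-in for a real PII detector

def route(prompt: str, model: str) -> str:
    """Allow a call only if it stays on-prem or carries no detectable PII."""
    if model not in ON_PREM and PII_RE.search(prompt):
        raise PermissionError(f"PII may not leave the network via {model!r}")
    return model  # a real gateway would dispatch the inference and log it

print(route("summarise the overdraft policy", "cloud-fallback"))  # allowed
print(route("acct 0123456789012 balance?", "llama-onprem"))       # allowed: stays on-prem
try:
    route("acct 0123456789012 balance?", "cloud-fallback")
except PermissionError as e:
    print("blocked:", e)
```

Every call through the gate also lands in the audit store, so the policy decision itself is part of the evidence you hand NRB.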
If you’re scoping AI for a Nepali bank and the architecture has to pass NRB review before it can ship, we’ve done this before. Talk to us about Banking Automation in Nepal, read our five AI use cases for Nepali banks, or tell us what you’re building.