How To Control Access in LLM Data Plus Distributed Authorization

Oso explains how to use a vector database and retrieval-augmented generation to tie data in LLMs to permissions, and how it decouples authorization data and logic.

Apr 30th, 2024 9:49am by Susan Hall

Image from MaximP on Shutterstock.

If you build a large language model (LLM) to use specifically with your company data, how do you filter results so that the person querying the data only receives results that they’re authorized to see?

Filtering to protect sensitive data is one of the many emerging questions surrounding the authorization and use of LLMs.

But with a retrieval-augmented generation (RAG)-based architecture that can pull from a vector database, you can write rules similar to other authorization policies to limit who can see what, according to Graham Neray, CEO and co-founder of authorization startup Oso.

In a previous TNS article, Badrul Sarwar, co-founder and CTO of AIOps service CloudAEye, explained that though generally used to add to an LLM’s training data, RAG can harness an application’s internal data and augment an LLM’s knowledge to find the specific answer to a question.

Explained Neray: “The same mechanism that Oso uses for filtering out data, like let’s say we wanted to show the top 10 pages that you’re an owner of … with that same mechanism, you can integrate with vector databases and do the exact same kind of filtering.”

For instance, with an LLM-based app operating over Salesforce and other customer data, as part of a RAG-based architecture, developers can write rules that return only the information for accounts a salesperson is allowed to access.
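
A rough sketch of that idea, assuming a toy in-memory store in place of a real vector database: the names `Chunk`, `retrieve` and `allowed` are hypothetical, and the two-dimensional vectors stand in for real embeddings. The permission filter runs before the similarity ranking, so text from accounts the salesperson cannot access never reaches the prompt.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    account: str            # account the text belongs to
    text: str
    embedding: list[float]  # toy 2-D vector standing in for a real embedding


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def retrieve(query_vec, chunks, allowed_accounts, k=2):
    """Return the k most similar chunks among those the user may see."""
    visible = [c for c in chunks if c.account in allowed_accounts]
    ranked = sorted(visible, key=lambda c: cosine(query_vec, c.embedding), reverse=True)
    return ranked[:k]


chunks = [
    Chunk("acme", "Acme renewal is due in June.", [0.9, 0.1]),
    Chunk("globex", "Globex asked for a discount.", [0.8, 0.2]),
    Chunk("initech", "Initech churn risk is high.", [0.1, 0.9]),
]

# Hypothetical result of an authorization lookup for one salesperson.
allowed = {"acme", "initech"}
results = retrieve([1.0, 0.0], chunks, allowed)

# Only authorized text is passed on to the LLM as context.
context = "\n".join(c.text for c in results)
```

Although the Globex chunk is the second-closest match to the query, it is dropped before ranking because the account is not in the allowed set.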

This is generating a lot of interest in the authorization world, Neray said.

In a blog post, Oso explains how this works, supposing someone used a chatbot to ask, “Tell me what coworker X’s medical waiver says” or some other query for sensitive data. That requires the chatbot to have the same access controls as the rest of your app. The lookup entails a vector similarity search, but the prompt must be evaluated with the same LLM that was used for that vector search. That comes back to authorization.

The RAG system doesn’t know which data is considered sensitive.

Rather than create a custom database for each user, you can restrict results to only the data that the user has access to. For example, the SQL query that retrieves the content might include `WHERE folder IN <folders_user_is_allowed_to_access>` so that it returns only the relevant pieces of content that the user can see.
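
As a minimal illustration of that kind of clause, here is a sketch using Python's built-in `sqlite3` module. The `documents` table, the `visible_documents` helper and the folder names are assumptions made for the example, and the similarity ordering that a real vector search would add is left out; only the permission filter is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, folder TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO documents (folder, content) VALUES (?, ?)",
    [
        ("engineering", "Deployment runbook"),
        ("hr", "Medical waiver for coworker X"),
        ("sales", "Q2 pipeline review"),
    ],
)


def visible_documents(conn, allowed_folders):
    """Fetch only rows whose folder is in the caller's allowed list."""
    if not allowed_folders:
        return []  # nothing is visible, and an empty IN () is not portable SQL
    # One "?" per folder keeps the values parameterized rather than interpolated.
    placeholders = ", ".join("?" for _ in allowed_folders)
    sql = f"SELECT content FROM documents WHERE folder IN ({placeholders}) ORDER BY id"
    return [row[0] for row in conn.execute(sql, list(allowed_folders))]


# Hypothetical folder list returned by the authorization layer for this user.
docs = visible_documents(conn, ["engineering", "sales"])
```

Because the filter is part of the query itself, the HR document is never fetched for this user, so it cannot end up in the context handed to the LLM.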

“Part of the challenge in this market is that no one really understands what they’re doing. It’s not like the database market, which has been around forever,” Neray said.

“Well, if you think no one knows how to do authorization in the real world, let me tell you about the level of knowledge in the LLM world. Everyone’s still trying to figure it out. … Everyone’s trying to figure out how to do it in a way that’s not going to get them into trouble. … And it’s not some exotic solution. We’re basically saying, ‘Hey, the same mechanism that you use for filtering stuff from databases you can use for a vector database.’”

In a previous article, NGINX’s Liam Crilly described that kind of restriction on LLMs as something AI gateways will take care of, as well as a whole lot more.

“The gateway handles both authentication and zero trust, serving as the gatekeeper for AI services and API access. It also provides an authorization layer to make sure that only approved users can access specific services or that services are approved to be consumed according to defined policies. Policies might restrict use based on geography, business unit, role, infrastructure provider or type of infrastructure,” he explained.
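
To make the idea of attribute-based policies concrete, here is a small sketch of a deny-by-default check. It is not NGINX's API or configuration format; `Caller`, `Policy`, `POLICIES` and the service name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Caller:
    role: str
    region: str
    business_unit: str


@dataclass(frozen=True)
class Policy:
    # An empty set means "no restriction on this attribute".
    roles: frozenset = field(default_factory=frozenset)
    regions: frozenset = field(default_factory=frozenset)
    business_units: frozenset = field(default_factory=frozenset)

    def permits(self, caller: Caller) -> bool:
        checks = [
            (self.roles, caller.role),
            (self.regions, caller.region),
            (self.business_units, caller.business_unit),
        ]
        # Every restricted attribute must match; unrestricted ones pass.
        return all(not allowed or value in allowed for allowed, value in checks)


# Hypothetical policy table keyed by AI service name.
POLICIES = {
    "summarize-contracts": Policy(
        roles=frozenset({"legal"}),
        regions=frozenset({"eu", "us"}),
    ),
}


def authorize(service: str, caller: Caller) -> bool:
    """Deny by default: a service with no policy is not reachable."""
    policy = POLICIES.get(service)
    return policy is not None and policy.permits(caller)
```

With this table, a caller with the `legal` role in the `eu` region is allowed to use `summarize-contracts`, while a `sales` caller or a request for an unlisted service is refused.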

An AI gateway will have implications for RAG and vector databases, according to Alex Salkever, author and adviser to NGINX.

“Some of it might also be governing or protecting against AI, security issues and policing prompts from the outside. Some is centralizing API governance and simplifying for developers,” he said in an email.

NGINX is expected to contribute more articles on those topics.

Decoupling Authorization Data and Logic

Oso originally set out to unbundle authentication and authorization and provide developers with an easy-to-use authorization service in the vein of Stripe for payments or Twilio for communications.

Authorization remains a hard problem, especially in a microservices world where there are so many moving parts. While Google’s Zanzibar approach, which requires companies to centralize and copy all relevant authorization data into a single authorization service in advance, has been de rigueur for authorization, Oso has called it out as too much work and too error-prone.

Oso recently released what it’s calling Distributed Authorization. Rather than having to sync and centralize all your data, with Distributed Authorization you can centralize only your common authorization data and leave the rest in the databases where it currently resides. Oso then integrates with those databases and stitches all the data together on the backend.

Neray likened that centralized approach to putting together a jigsaw puzzle.

“In a monolithic world, all those puzzle pieces are in one database. In a microservices world or world where we’re using LLMs, they tend to be in multiple different services and multiple different databases,” he explained. And most companies don’t have the resources of Google.

“So companies outside of Google that want to implement this model where they’re centralizing all this data, they set up all these complicated systems to sync and copy and reconcile and dedupe and manage drift between their core services and the central authorization systems,” he said.

“The average engineer working on this problem wants to set up that infrastructure like they want a hole in the head. So what we’ve built is a system that allows those engineers to centralize only the data that is shared across all these services … then leave the rest of the data in all these microservices, and we stitch it all together for them in the background.”

Oso developer advocate Greg Sarjeant explained in a blog post:

“Our application got brittle because the application logic and the authorization logic were intertwined.”

“… What if, instead, you could centralize your logic and your common authorization data (things like roles) in the authorization service, and then distribute the evaluation of authorization questions between the server and the client? That’s Distributed Authorization. Instead of only answering `yes` or `no`, Oso Cloud can now respond to an authorization question with `yes, if`.”

A list of conditions follows that `yes, if`. Those conditions are evaluated by the client using your local data. Basically, the Oso Cloud service evaluates as much of the request as it knows, then hands off the request to the client in your application to make a final determination.
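
The flow can be sketched roughly as follows. This is an illustration of the `yes, if` pattern only, not Oso Cloud's actual API: `central_decide`, `Condition` and the sample data are hypothetical. The central side holds shared data such as roles, and the client finishes the decision against data that never leaves its own database.

```python
from dataclasses import dataclass

# Centralized, shared authorization data (hypothetical): user -> role.
CENTRAL_ROLES = {"alice": "admin", "bob": "member"}

# Data that stays in the service's own database (hypothetical): document -> owner.
LOCAL_DOC_OWNERS = {"doc-1": "bob", "doc-2": "carol"}


@dataclass(frozen=True)
class Condition:
    """A check the central service cannot resolve with the data it holds."""
    field: str   # name of a local attribute, e.g. "owner"
    equals: str  # value that attribute must have


def central_decide(user: str, action: str):
    """Return True ("yes"), False ("no"), or a list of conditions ("yes, if")."""
    role = CENTRAL_ROLES.get(user)
    if role == "admin":
        return True
    if role == "member" and action == "read":
        return [Condition(field="owner", equals=user)]
    return False


def authorize(user: str, action: str, doc_id: str) -> bool:
    """Client side: finish the decision using local data."""
    decision = central_decide(user, action)
    if isinstance(decision, bool):
        return decision
    local_row = {"owner": LOCAL_DOC_OWNERS.get(doc_id)}
    return all(local_row.get(c.field) == c.equals for c in decision)
```

Here an admin gets an outright `yes`, while a member gets `yes, if` they own the document, and ownership is checked locally without ever being copied to the central service.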

Neray maintains that integrating with data closer to its source is more secure than moving and trying to sync it.

“By moving to a model like this, you get what is effectively a deterministic approach. So now people can actually test their authorization logic, they can log it, they can audit it, they can see that it’s working. And it becomes this like discrete part of their stack in the same way that a database would be, instead of it being some sort of 30 nested IF statements of code,” he said, adding that it’s a boon for platform engineering teams.

“These teams are under an immense amount of pressure; they’re being asked to do more for the business. They’re definitely not getting an additional headcount. They need to cut their Datadog bill. They’re on call and they’re doing all these things. And the last thing that they want to be doing is also setting up a bespoke data syncing infrastructure, which is notoriously challenging and error-prone. … [To get] the ability to provide this enterprise-grade authorization without having to set up all this syncing infrastructure is a huge win for them. It’s a huge win in terms of time, it’s a huge win in terms of risk. And it’s a huge win in terms of what they can provide to their internal customers.”

Customer service technology vendor Intercom has been using Oso since before it had a commercial product. Senior principal engineer Brian Scanlan praised Oso as “building for builders, and going after one specific problem.”

“Most of the adjacent technology stacks were either trying to do too much (all of authorization and authentication for your application — we didn’t want to have to do a large overhaul of all of our user management), didn’t have compelling developer documentation or didn’t have easy ways to get started with their library/tool,” he said in an email.

“The promise of Oso’s approach to authorization has delivered [for us]. We have removed practically all sources of authorization bugs in our application, removing a lot of high-pressure work from our teams and giving us more confidence to build more features based on strict authorization controls. We haven’t really touched Oso in a couple of years. It is a largely solved problem for us, which is exactly what you want from a fundamental building block of modern multitenant SaaS applications with sophisticated authorization needs.”

Susan Hall is the Sponsor Editor for The New Stack. Her job is to help sponsors attain the widest readership possible for their contributed content. She has written for The New Stack since its early days, as well as sites...