Adding OAuth2 to Jupyter Notebooks on Kubernetes
TrueFoundry users can deploy Jupyter Notebooks on their personal cloud accounts, such as AWS, Azure, or GCP. This feature allows them to conduct machine learning experiments and training jobs on their own machines with ease. Initially, notebooks deployed through TrueFoundry were secured using a username-password combination. However, in response to widespread client requests, we have integrated Single Sign-On. This means users can now conveniently access their notebooks with the same login they use for TrueFoundry. This blog post delves into the specifics of how we implemented this feature.
Notebooks on TrueFoundry
TrueFoundry internally uses a fork of the Kubeflow Notebook Controller to orchestrate the deployment of notebooks. The controller provides several features that we rely on:
- Simplified notebook spec: The Kubeflow Notebook APIs are simple and the controller orchestrates the creation of the Jupyter Notebook deployments.
- Automatic Culling: The controller automatically shuts down the notebook after a certain period of inactivity. This is incredibly useful to our clients who run experiments on notebooks backed by GPU machines.
- Persistent Home Directory: The controller takes care of creating a persistent volume that saves user progress on the notebook across sessions.
- Extensible base images: The controller supports a suite of base notebook images of Jupyter Notebook and VS Code maintained by TrueFoundry. The user can extend the features on these Docker images by adding a startup script or installing specific libraries.
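To make the culling behaviour concrete, here is a minimal Python sketch of an idle check. It is not the controller's actual code: the function name `should_cull`, the 30-minute timeout, and the timestamps are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def should_cull(last_activity: datetime, now: datetime, idle_timeout: timedelta) -> bool:
    """Return True when the notebook has been idle for at least idle_timeout."""
    return now - last_activity >= idle_timeout


# Hypothetical values: a 30-minute idle timeout and two notebooks.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
timeout = timedelta(minutes=30)

busy_notebook = now - timedelta(minutes=5)   # last activity 5 minutes ago
idle_notebook = now - timedelta(minutes=45)  # last activity 45 minutes ago

print(should_cull(busy_notebook, now, timeout))  # False
print(should_cull(idle_notebook, now, timeout))  # True
```

A check like this only decides whether a notebook counts as idle; scaling the workload down is left to the controller.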
For context, here’s what a simple Kubeflow Notebook object looks like:
apiVersion: kubeflow.org/v1
kind: Notebook
metadata:
  name: my-notebook
spec:
  template:
    spec:
      containers:
        - name: my-notebook
          image: kubeflownotebookswg/jupyter:master
          args:
            [
              "start.sh",
              "lab",
              "--LabApp.token=''",
              "--LabApp.allow_remote_access='True'",
              "--LabApp.allow_root='True'",
              "--LabApp.ip='*'",
              "--LabApp.base_url=/test/my-notebook/",
              "--port=8888",
              "--no-browser",
            ]
Basic Auth for Notebooks
Before implementing OAuth2, TrueFoundry let users protect their public notebooks with basic authentication, so that only authorized individuals could access the sensitive content of these notebooks. To implement this feature, we used WebAssembly (Wasm) plugins within Istio's proxy, Envoy.
Istio, an open-source service mesh, offers a framework for managing network communication between service workloads. It let us inject custom logic directly into the network layer, which is managed by the Envoy proxy, and so control and secure the traffic flowing to and from the Jupyter Notebooks. The key to the basic auth implementation was the WasmPlugin, an Istio resource that deploys WebAssembly modules into the Envoy proxy.
The basic authentication WasmPlugin is inserted into the chain of filters within the Envoy proxy. These filters perform higher-level functions such as access control, transformation, data enrichment, and auditing. Here’s a simplified version of the spec that adds the basic auth filter to the Envoy filter chain:
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: basic-auth
  namespace: istio-ingress
spec:
  phase: AUTHN
  pluginConfig:
    basic_auth_rules:
      - credentials:
          - user:pass
        hosts: www.example.com
        prefix: /secret/
  selector:
    matchLabels:
      istio: ingressgateway
  url: oci://ghcr.io/istio-ecosystem/wasm-extensions/basic_auth:1.12.0
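As a rough illustration of what a rule like the one above asks the proxy to do, here is a minimal Python sketch of a basic auth check. It is not the plugin's implementation: the function `is_authorized` is hypothetical, host matching is left out, and only the `prefix` and `credentials` values are taken from the spec.

```python
import base64
import hmac
from typing import List, Optional


def is_authorized(path: str, authorization_header: Optional[str],
                  prefix: str, credentials: List[str]) -> bool:
    """Allow the request unless it targets the protected prefix without valid credentials."""
    if not path.startswith(prefix):
        return True  # the rule does not apply to this path
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    encoded = authorization_header[len("Basic "):]
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False  # malformed base64 or non-UTF-8 credentials
    # compare_digest avoids leaking where two strings first differ.
    return any(hmac.compare_digest(decoded.encode(), c.encode()) for c in credentials)


header = "Basic " + base64.b64encode(b"user:pass").decode("ascii")
print(is_authorized("/secret/data", header, "/secret/", ["user:pass"]))  # True
print(is_authorized("/secret/data", None, "/secret/", ["user:pass"]))    # False
print(is_authorized("/public", None, "/secret/", ["user:pass"]))         # True
```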
OAuth2 for Notebooks
To implement OAuth2 for our notebooks, we again used an Envoy filter, but the approach differed from basic authentication. Unlike basic auth, where we could conveniently insert a pre-built WasmPlugin into the filter chain, OAuth2 required a more tailored solution. To achieve this, we employed an HTTP filter specifically designed for OAuth2. At TrueFoundry, our Single Sign-On system integrates with FusionAuth, which serves as our OAuth provider.
Here’s what the EnvoyFilter spec looks like; refer to the comments in the file for more details:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: truefoundry-notebook-tfy-oauth2 # Name of the EnvoyFilter
  namespace: auth-test # Namespace where the EnvoyFilter is deployed
spec:
  workloadSelector:
    labels:
      truefoundry.com/application: truefoundry-notebook # Selector targeting workloads with specific labels
  configPatches:
    - applyTo: CLUSTER
      match:
        context: SIDECAR_OUTBOUND
      patch:
        operation: ADD
        value:
          name: tfy-oauth2 # Name of the cluster for OAuth2 authentication service
          type: LOGICAL_DNS # Type of service discovery (DNS)
          connect_timeout: 5s # Timeout for establishing a connection
          lb_policy: ROUND_ROBIN # Load balancing policy
          # other load balancing config
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_INBOUND
        listener:
          filterChain:
            filter:
              name: "envoy.filters.network.http_connection_manager"
              subFilter:
                name: envoy.filters.http.jwt_authn
      patch:
        operation: INSERT_BEFORE # Inserting this filter before the JWT auth filter
        value:
          name: envoy.filters.http.tfy-oauth # Name of the OAuth filter
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.oauth2.v3.OAuth2
            config:
              use_refresh_token: false # Whether to use a refresh token
              pass_through_matcher:
                - name: Authorization
                  present_match: true # Pass through if Authorization header is present
              forward_bearer_token: true # Forward bearer token to upstream
              auth_type: BASIC_AUTH # Type of authentication used
              token_endpoint:
                cluster: tfy-oauth2 # Cluster for token endpoint
                uri: <token-endpoint-uri-of-oauth-provider>
                timeout: 5s # Timeout for token endpoint
              authorization_endpoint: <authorization-endpoint-uri-of-oauth-provider>
              redirect_uri: https://%REQ(:authority)%/truefoundry-notebook/_auth/callback # Redirect URI for callback
              redirect_path_matcher:
                path:
                  exact: /truefoundry-notebook/_auth/callback # Path for redirect URI
              signout_path:
                path:
                  exact: /truefoundry-notebook/_auth/signout # Path for signout
              credentials:
                client_id: <client-id-for-oauth>
                token_secret:
                  # configuration to fetch token secret
                  # read more about how we fetch secrets here:
                  # https://www.envoyproxy.io/docs/envoy/latest/configuration/security/secret
                hmac_secret:
                  # configuration to fetch hmac
When a user attempts to access a service protected by the OAuth2 filter for the first time, they are redirected to the authorization_endpoint. This endpoint is the URL of our external OAuth provider, which, in our implementation, is the FusionAuth-based TrueFoundry login modal. This redirection is a critical step in the OAuth process, guiding users to a secure location where they can authenticate and consequently grant the necessary permissions for access to the service.
Once the login is complete, FusionAuth redirects the user to the redirect_uri (configured in the filter specification), appending a secret, temporary authorization code. The filter intercepts this request and calls the token_endpoint, exchanging the code for a JWT. Finally, the filter sets cookies containing the JWT.
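The redirect and code exchange described above follow the standard OAuth2 authorization code flow. The Python sketch below walks through the three steps without making any network calls. The endpoint, client ID, and host name are hypothetical placeholders, and the helper functions are illustrative rather than a picture of how the Envoy filter is implemented.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical placeholder values; real ones come from the OAuth provider.
AUTHORIZATION_ENDPOINT = "https://auth.example.com/oauth2/authorize"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://notebooks.example.com/truefoundry-notebook/_auth/callback"


def build_authorization_url(state: str) -> str:
    """Step 1: the URL an unauthenticated user is redirected to."""
    query = urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "state": state,
    })
    return f"{AUTHORIZATION_ENDPOINT}?{query}"


def extract_code(callback_url: str, expected_state: str) -> str:
    """Step 2: read the authorization code from the callback request."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    return params["code"][0]


def build_token_request_body(code: str) -> str:
    """Step 3: form body sent to the token endpoint to exchange the code.

    Assumption: the client ID and secret travel separately, for example
    in an HTTP Basic Authorization header, so they are not included here.
    """
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
    })


state = secrets.token_urlsafe(16)
print(build_authorization_url(state))
callback = f"{REDIRECT_URI}?code=abc123&state={state}"
print(build_token_request_body(extract_code(callback, state)))
```

The `state` value ties the callback to the request that started the login, which is why `extract_code` rejects a callback whose state does not match.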
Subsequent requests to the service pass through the HTTP filter, since the cookie results in an Authorization header that carries the JWT, and the filter is configured to let such requests through (see pass_through_matcher in the spec). To confirm that the JWT is valid, we create a RequestAuthentication policy that verifies it against the OAuth provider's keys:
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  # ...
spec:
  selector:
    # ...
  jwtRules:
    - issuer: "truefoundry.com"
      fromHeaders:
        - name: Authorization
          prefix: "Bearer "
      audiences:
        - <client-id>
      jwksUri: <oauth-provider-jwks-uri>
      forwardOriginalToken: true
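To show which claims a rule like this constrains, here is a minimal Python sketch that decodes a JWT payload and checks the issuer, audience, and expiry. It deliberately skips signature verification, which a real check performs using the keys published at the JWKS URI, so it must not be used for authentication. The token, client ID, and timestamps are made up for the example.

```python
import base64
import json


def decode_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def claims_are_acceptable(claims: dict, issuer: str, audience: str, now: float) -> bool:
    """Check the issuer, audience, and expiry claims."""
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return (
        claims.get("iss") == issuer
        and audience in audiences
        and claims.get("exp", 0) > now
    )


def _b64(data: dict) -> str:
    """Helper for the demo: base64url-encode a JSON object without padding."""
    return base64.urlsafe_b64encode(json.dumps(data).encode()).rstrip(b"=").decode()


# Hypothetical unsigned token, built only to exercise the checks above.
token = ".".join([
    _b64({"alg": "none"}),
    _b64({"iss": "truefoundry.com", "aud": "example-client-id", "exp": 2_000_000_000}),
    "",
])
claims = decode_payload(token)
print(claims_are_acceptable(claims, "truefoundry.com", "example-client-id", now=1_700_000_000))  # True
print(claims_are_acceptable(claims, "other-issuer", "example-client-id", now=1_700_000_000))     # False
```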
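Finally, we add an AuthorizationPolicy that specifies which requests must carry a valid token. On its own, RequestAuthentication rejects only requests with an invalid JWT and still lets through requests that carry no token at all, so the policy denies every request on port 8888 that has no authenticated request principal: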
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: best-notebook-tfy-oauth2
  namespace: auth-test
spec:
  selector:
    matchLabels:
      truefoundry.com/application: best-notebook
  action: DENY
  rules:
    - from:
        - source:
            notRequestPrincipals: ["*"]
      to:
        - operation:
            ports:
              - "8888"
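The DENY rule reads as a double negative, so a small Python sketch of its logic may help. The function `is_denied` and the sample principal are hypothetical. The sketch only mirrors the two conditions in the rule above and assumes a request principal is present exactly when a valid JWT was supplied.

```python
from typing import Optional


def is_denied(request_principal: Optional[str], port: int) -> bool:
    """Mirror the DENY rule: no request principal AND destination port 8888."""
    has_no_principal = not request_principal  # notRequestPrincipals: ["*"]
    targets_notebook_port = port == 8888      # ports: ["8888"]
    return has_no_principal and targets_notebook_port


print(is_denied(None, 8888))                        # True
print(is_denied("truefoundry.com/user-123", 8888))  # False
print(is_denied(None, 9090))                        # False
```

Requests to other ports do not match the rule, so this policy leaves them alone.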