Deepgram SageMaker Transport
SageMaker transport for the Deepgram Python SDK. Uses AWS SageMaker's HTTP/2 bidirectional streaming API as an alternative to WebSocket, allowing transparent switching between Deepgram Cloud and Deepgram on SageMaker.
Requires Python 3.12+ (due to AWS SDK dependencies).
Installation
pip install deepgram-sagemaker
This installs aws-sdk-sagemaker-runtime-http2 and boto3 automatically.
Usage
The SageMaker transport is async-only and must be used with AsyncDeepgramClient:
import asyncio
from deepgram import AsyncDeepgramClient
from deepgram.core.events import EventType
from deepgram_sagemaker import SageMakerTransportFactory
factory = SageMakerTransportFactory(
    endpoint_name="my-deepgram-endpoint",
    region="us-west-2",
)
# SageMaker uses AWS credentials (not Deepgram API keys)
client = AsyncDeepgramClient(api_key="unused", transport_factory=factory)
async def main():
    async with client.listen.v1.connect(model="nova-3") as connection:
        connection.on(EventType.MESSAGE, lambda msg: print(msg))
        await connection.start_listening()
asyncio.run(main())
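Because the transport is passed in as a constructor argument, switching between Deepgram Cloud and SageMaker can be reduced to choosing keyword arguments at startup. The sketch below is one possible way to do that. The helper client_kwargs and the environment variable names DEEPGRAM_SAGEMAKER_ENDPOINT and DEEPGRAM_SAGEMAKER_REGION are assumptions made for illustration and are not part of either package. It also assumes the client uses its default transport when transport_factory is omitted. The SageMaker import is deferred so the Cloud path works without deepgram-sagemaker installed.

```python
import os
from typing import Any, Mapping


def client_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Pick AsyncDeepgramClient keyword arguments from the environment.

    DEEPGRAM_SAGEMAKER_ENDPOINT and DEEPGRAM_SAGEMAKER_REGION are hypothetical
    variable names chosen for this sketch.
    """
    endpoint = env.get("DEEPGRAM_SAGEMAKER_ENDPOINT")
    if not endpoint:
        # Cloud path: a real Deepgram API key and the default transport.
        return {"api_key": env["DEEPGRAM_API_KEY"]}

    # SageMaker path: import lazily so the Cloud path has no AWS dependency.
    from deepgram_sagemaker import SageMakerTransportFactory

    factory = SageMakerTransportFactory(
        endpoint_name=endpoint,
        region=env.get("DEEPGRAM_SAGEMAKER_REGION", "us-west-2"),
    )
    # The API key is not used for authentication on this path.
    return {"api_key": "unused", "transport_factory": factory}
```

With this helper the client would be built as AsyncDeepgramClient(**client_kwargs()), and the rest of the application code stays the same on either backend.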
Configuration
For burst-tuned timeouts and retry behavior, build a SageMakerConfig and pass
it via config=:
from deepgram_sagemaker import SageMakerConfig, SageMakerTransportFactory
config = SageMakerConfig(
    endpoint_name="my-deepgram-endpoint",
    region="us-east-2",
    connection_timeout=5.0,
    connection_acquire_timeout=15.0,
)
factory = SageMakerTransportFactory(config=config)
All time-based fields are float seconds (matching asyncio convention).
| Parameter | Required | Default | Description |
|---|---|---|---|
| endpoint_name | Yes | (none) | SageMaker endpoint name |
| region | No | us-west-2 | AWS region |
| connection_timeout | No | 30.0 | Max time for the underlying TCP/TLS connect. The AWS default is ~2 s; it is raised here so cold-start endpoints under burst load have time to accept TLS handshakes. |
| connection_acquire_timeout | No | 60.0 | Max time to acquire a connection from the underlying HTTP/2 pool. The AWS default is ~10 s; it is raised here so a 200–500-stream burst doesn't drain the acquire pool. |
| subscription_timeout | No | 60.0 | Max time the transport waits for the SageMaker bidi stream to open before failing. A timeout here is treated as a transient connect failure and counts against max_retries / retry_budget. |
| max_concurrency | No | 500 | Cap on simultaneous in-flight HTTP/2 streams. Advisory in Python today (the underlying smithy HTTP/2 stack does not expose a hard cap), but kept for surface parity with the Java transport. |
| max_retries | No | 5 | Max retries on transient AWS errors (throttling, pool-exhausted, transient connect/timeout). Set to 0 to disable internal retry. Terminal errors (auth, validation) bypass this. |
| initial_backoff | No | 0.1 | First backoff delay applied after the initial failure. |
| max_backoff | No | 5.0 | Cap on the per-attempt backoff delay regardless of multiplier. |
| backoff_multiplier | No | 2.0 | Exponential growth factor between retry attempts. Must be >= 1.0. |
| retry_budget | No | 30.0 | Total wall-clock cap across all retry attempts before giving up and surfacing the error to the application. |
| max_replay_buffer_bytes | No | 8 * 1024 * 1024 | Cap on the in-memory replay buffer that holds sent-but-unacked stream events. Set to 0 to disable replay (sent events are dropped on internal reset). |
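The max_replay_buffer_bytes setting describes a byte-capped buffer of sent-but-unacked events. The sketch below shows one way such a buffer could behave. The class ReplayBuffer, its method names, and the drop-oldest eviction policy are assumptions for illustration; the transport's actual internals and overflow behavior are not documented here.

```python
from collections import deque


class ReplayBuffer:
    """Byte-capped FIFO of sent-but-unacked payloads (illustrative only)."""

    def __init__(self, max_bytes: int = 8 * 1024 * 1024) -> None:
        self.max_bytes = max_bytes
        self._events: deque[bytes] = deque()
        self._size = 0

    def record(self, payload: bytes) -> None:
        """Remember a payload that was sent but not yet acknowledged."""
        if self.max_bytes == 0:
            return  # replay disabled: nothing is retained
        self._events.append(payload)
        self._size += len(payload)
        # Assumed policy: evict the oldest payloads once over the cap.
        while self._size > self.max_bytes and self._events:
            self._size -= len(self._events.popleft())

    def ack(self, count: int = 1) -> None:
        """Forget the `count` oldest payloads once they are confirmed."""
        for _ in range(min(count, len(self._events))):
            self._size -= len(self._events.popleft())

    def drain(self) -> list[bytes]:
        """Return everything still unacked, oldest first, for replay."""
        pending = list(self._events)
        self._events.clear()
        self._size = 0
        return pending
```

After an internal reset, a transport built this way would call drain() and resend the returned payloads on the new stream before accepting fresh audio.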
High-concurrency notes
The transport's defaults are tuned for high-burst workloads (large numbers of streams opened in a tight loop against an endpoint that may need to scale up). If you're opening 200–500 streams simultaneously against a cold endpoint, the AWS SDK's general-purpose defaults (~2 s connect, ~10 s acquire) will fire before the load balancer has accepted all of the inbound TLS handshakes — you will see a wave of acquire / connect timeouts that look like server-side problems but are really client-side fail-fast tripping early.
This transport ships with more lenient defaults (30 s / 60 s) so the common high-concurrency path works out of the box. Tighten them if you need fail-fast behavior in low-latency pipelines:
config = SageMakerConfig(
    endpoint_name="my-deepgram-endpoint",
    region="us-east-2",
    connection_timeout=5.0,
    connection_acquire_timeout=15.0,
)
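Since max_concurrency is described as advisory in Python, an application that wants a hard cap on in-flight streams can enforce one itself. The following is a minimal sketch using asyncio.Semaphore. Here run_burst is a hypothetical helper, run_stream is a placeholder for opening one connection and streaming audio over it, and the numbers are arbitrary.

```python
import asyncio


async def run_burst(stream_ids, run_stream, limit: int = 500):
    """Run one coroutine per stream id, with at most `limit` in flight."""
    gate = asyncio.Semaphore(limit)

    async def guarded(stream_id):
        async with gate:  # waits here once `limit` streams are active
            return await run_stream(stream_id)

    # return_exceptions=True collects failures as values instead of
    # raising on the first one, so every stream gets a result slot.
    return await asyncio.gather(
        *(guarded(s) for s in stream_ids), return_exceptions=True
    )


async def demo():
    active = peak = 0

    async def run_stream(stream_id):  # hypothetical placeholder workload
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return stream_id

    results = await run_burst(range(20), run_stream, limit=5)
    return results, peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Results come back in the same order as stream_ids, so a failed stream can be matched to its id by position.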
Retry & storm absorption
Transient AWS-side failures (ThrottlingException, connection-pool exhaustion,
transient connect/timeout failures) are absorbed by the transport itself. Each
is classified as retryable and retried with jittered exponential backoff, up to
max_retries attempts and within retry_budget seconds. Messages buffered during
the reset window are replayed onto the new stream so audio isn't dropped. Only
terminal errors (auth, validation, resource-not-found) and retryable errors that
exhaust the budget propagate to the application.
config = SageMakerConfig(
    endpoint_name="my-deepgram-endpoint",
    max_retries=10,
    initial_backoff=0.2,
    max_backoff=10.0,
    retry_budget=60.0,
)
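To check that a retry configuration is internally consistent, it helps to add up the delays it implies. The helper below computes an un-jittered schedule under the assumption that the delay before retry n is initial_backoff * backoff_multiplier**n, capped at max_backoff. The transport's real schedule is jittered, so treat the result as an estimate; backoff_schedule is a hypothetical helper, not part of deepgram_sagemaker.

```python
def backoff_schedule(max_retries, initial_backoff, max_backoff, backoff_multiplier=2.0):
    """Un-jittered delay (seconds) before each retry attempt."""
    return [
        min(max_backoff, initial_backoff * backoff_multiplier**attempt)
        for attempt in range(max_retries)
    ]


# Values from the configuration above; the multiplier is left at its default.
delays = backoff_schedule(max_retries=10, initial_backoff=0.2, max_backoff=10.0)
total = sum(delays)
print([round(d, 1) for d in delays], round(total, 1))
```

Under these assumptions the delays grow from 0.2 s to the 10 s cap and sum to about 52.6 s. That fits inside retry_budget=60.0 only if the failed attempts themselves return quickly, because the budget is a wall-clock cap.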
Set max_retries=0 to disable internal retry entirely (every transient AWS
error then surfaces immediately to the application).
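The behavior described in this section (classify, back off with jitter, stop at max_retries or retry_budget, let terminal errors through) follows a common pattern. The sketch below is a generic version of that pattern, not the transport's code. TransientError, call_with_retry, and the full-jitter strategy are assumptions chosen for illustration.

```python
import asyncio
import random
import time


class TransientError(Exception):
    """Stand-in for a retryable failure such as throttling (hypothetical)."""


async def call_with_retry(
    op,
    *,
    max_retries=5,
    initial_backoff=0.1,
    max_backoff=5.0,
    backoff_multiplier=2.0,
    retry_budget=30.0,
):
    """Await op(), retrying TransientError with jittered exponential backoff."""
    deadline = time.monotonic() + retry_budget
    for attempt in range(max_retries + 1):
        try:
            return await op()
        except TransientError:
            # Any other exception type is terminal and propagates untouched.
            if attempt == max_retries:
                raise  # retries exhausted (immediately so when max_retries=0)
            cap = min(max_backoff, initial_backoff * backoff_multiplier**attempt)
            delay = random.uniform(0, cap)  # "full jitter" (assumed strategy)
            if time.monotonic() + delay > deadline:
                raise  # sleeping would overrun the wall-clock budget
            await asyncio.sleep(delay)
```

Jitter spreads retries from many streams over time, so a burst that was throttled together does not retry together.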
AWS Credentials
The transport resolves AWS credentials using boto3's credential chain:
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
- Shared credentials file (~/.aws/credentials)
- IAM role (EC2, ECS, Lambda)
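When a connection fails with an authentication error, a quick preflight check of the first two sources in the list can save time. The sketch below uses only the standard library and is a simplification. boto3's real chain checks more locations and honors settings such as AWS_PROFILE and AWS_SHARED_CREDENTIALS_FILE, and the helper name local_credential_sources is made up for this example.

```python
import os
from pathlib import Path


def local_credential_sources(env=os.environ, home=None):
    """List which locally visible credential sources appear to be present."""
    home = Path(home) if home is not None else Path.home()
    found = []
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found.append("environment variables")
    if (home / ".aws" / "credentials").is_file():
        found.append("shared credentials file")
    # An IAM role on EC2 or ECS is resolved at runtime from a metadata
    # endpoint, so an empty result does not mean credentials are missing.
    return found


if __name__ == "__main__":
    print(local_credential_sources())
```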