FEAT: Add ActiveDirectoryServicePrincipal support for bulk copy#576
FEAT: Add ActiveDirectoryServicePrincipal support for bulk copy#576bewithgaurav wants to merge 13 commits into
Conversation
2e81b7f to
cd80d9c
Compare
There was a problem hiding this comment.
Pull request overview
This PR adds bulk copy support for Authentication=ActiveDirectoryServicePrincipal by registering an Entra ID token-factory callback (invoked mid-handshake) so the tenant can be derived from the STS URL returned by the server, aligning bulk copy behavior with ODBC’s native Service Principal handling for the normal query path.
Changes:
- Added
AuthType.SERVICE_PRINCIPALand updated auth-type extraction/processing to recognize Service Principal mode. - Implemented
ServicePrincipalAuth.make_token_factory()and_parse_tenant_id()to generate UTF-16LE JWTs during FedAuth handshake. - Updated
cursor.bulkcopy()to registerentra_id_token_factoryfor Service Principal auth; added tests and a changelog entry.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
mssql_python/constants.py |
Adds SERVICE_PRINCIPAL enum value for connection-string auth detection. |
mssql_python/auth.py |
Adds tenant parsing + token-factory callback for Service Principal; updates auth detection/mapping. |
mssql_python/cursor.py |
Registers the callback for bulkcopy when _auth_type == "serviceprincipal"; preserves credentials for py-core auth resolution. |
tests/test_008_auth.py |
Adds unit tests for tenant parsing, factory behavior, and new auth-type detection. |
CHANGELOG.md |
Documents new bulk copy Service Principal support and notes py-core requirement. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/auth.pyLines 198-207 198 from urllib.parse import urlparse
199
200 try:
201 parsed = urlparse(sts_url)
! 202 except (ValueError, AttributeError):
! 203 return None
204 # Reject anything that isn't an https URL with a netloc. ``urlparse`` will
205 # happily put a bare string like ``"tenant-guid"`` into ``path``, which
206 # would then look like a valid tenant. Azure AD STS URLs are always https.
207 if parsed.scheme != "https" or not parsed.netloc:Lines 210-218 210 if not path:
211 return None
212 first_segment = path.split("/", 1)[0]
213 if not first_segment:
! 214 return None
215 if first_segment.lower() in _RESERVED_TENANTS:
216 return None
217 return first_segmentLines 249-258 249 try:
250 from azure.identity import ClientSecretCredential
251 from azure.core.exceptions import ClientAuthenticationError
252 from azure.core.pipeline.transport import RequestsTransport
! 253 except ImportError as e:
! 254 raise RuntimeError(
255 "Azure authentication libraries are not installed. "
256 "Please install with: pip install azure-identity azure-core"
257 ) from e📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.pybind.ddbc_bindings.h: 59.9%
mssql_python.pybind.logger_bridge.hpp: 70.8%
mssql_python.pybind.connection.connection.cpp: 76.2%
mssql_python.pybind.ddbc_bindings.cpp: 76.2%
mssql_python.row.py: 76.9%
mssql_python.__init__.py: 77.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.logging.py: 85.5%
mssql_python.connection.py: 85.6%🔗 Quick Links
|
Wires a Python token-factory callback into the mssql-py-core connection context so bulk copy can authenticate with `Authentication=ActiveDirectoryServicePrincipal`. The callback is invoked by mssql-tds mid-handshake (FedAuth workflow 0x02), receives the STS URL from the server, parses the tenant_id from it, and uses `azure.identity.ClientSecretCredential` to acquire a JWT. Necessary because tenant_id is not known client-side until the server returns it during the handshake, so the pre-acquired-token model (Model A) used by Default / DeviceCode / Interactive / MSI cannot be used here. Builds on the shared module-level `_credential_cache` introduced for MSI in #573, keyed by ("serviceprincipal", tenant_id, client_id), so SP gets the same per-instance token reuse semantics as the other AD methods. `client_secret` is intentionally not in the cache key; credentials are looked up by identity, not secret. Other behaviors: - `_parse_tenant_id` rejects non-https / non-URL inputs so a malformed STS URL cannot silently become a tenant id. - Empty SPN from the server is rejected early with a clear message rather than producing scope="/.default". - The surfaced `RuntimeError` is intentionally generic; provider message stays in `logger.error` only so any sensitive provider text does not reach the user-facing exception chain. - Bulk copy `finally` cleanup also pops `entra_id_token_factory` from pycore_context. Tests (in tests/test_008_auth.py): - 7 `TestParseTenantId` cases: GUID/domain tenants, query string, extra path segments, empty, no path, bare-string rejection, path-only URL rejection, http scheme rejection. - 11 `TestServicePrincipalAuth` cases: factory shape, missing client_id/secret, UTF-16LE return, credential kwargs forwarded to `ClientSecretCredential`, scope construction with/without `/.default` suffix, unparseable STS URL, authentication error propagation, no provider-message leak into chained exception, empty-SPN rejection, per-tenant credential caching (asserts 1 construction across 3 calls for one tenant, 2 across two tenants). - `TestProcessAuthParameters` cases: SP leaves ODBC path alone, case-insensitive recognition. - `TestExtractAuthType.test_serviceprincipal`. Requires mssql-py-core 0.1.5+ for the `entra_id_token_factory` dict key wiring (ADO mssql-rs PR 7542). py-core pin bump will land in the release bundle. Partial fix for #534. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
8dcff3a to
28f9569
Compare
Without this, a rotated client_secret was silently masked for up to ~1 hour: the (tenant_id, client_id) cache key matched the previously cached ClientSecretCredential, and azure-identity's internal token cache kept returning the token issued under the OLD secret until expiry. A user updating PWD in the connection string after an external secret rotation would observe bulkcopy continuing to work with the stale credential. Fix: hash client_secret (SHA-256) into the cache key so rotation produces a fresh ClientSecretCredential instance, defeating azure-identity's per-instance token cache. Hashing avoids storing the raw secret in the dict key. Tests: - test_factory_rotates_credential_when_secret_changes: two factories for same tenant+client_id with different secrets produce 2 distinct ClientSecretCredential instances. Re-calling either factory uses its own cache entry (does not construct again). - test_factory_cache_key_does_not_contain_raw_secret: asserts the raw secret is not present in any cache key (only its hash). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts: # mssql_python/auth.py # tests/test_008_auth.py
…ifiers
Replace free-floating short-string literals ("serviceprincipal",
"interactive", "msi", etc.) with a new _AuthInternal class in
constants.py. The strings now live in exactly one place; _AUTH_TYPE_MAP
references them, and all five comparison sites in auth.py,
connection.py, and cursor.py use the named constants.
Prevents drift between the map and the comparisons (a typo would only
surface at runtime on the affected auth path), and removes the per-PR
temptation to add another bare string literal when wiring a new auth
mode.
No behavior change. 85 auth tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two defensive hardening changes on the ServicePrincipal token factory introduced in #534, plus a TODO-style comment documenting one known limitation that is out of scope for this PR. 1. Bound the AAD network round-trip. The factory runs on a mssql-py-core blocking-pool worker (tokio spawn_blocking). Without explicit timeouts, azure-identity's defaults can let a slow / unreachable STS endpoint block that worker for tens of seconds. Pass a RequestsTransport with 10s connection / 15s read timeouts. SP is non-interactive so token issuance is typically <1s; these limits are generous and still bounded. 2. Reject multi-tenant aliases (common / organizations / consumers) in _parse_tenant_id. Confidential clients (SP) cannot authenticate against them; AAD returns AADSTS50194 which is cryptic. Failing fast in the factory surfaces a clearer error than the AAD round-trip. 3. Document the sovereign-cloud limitation in-line: ClientSecretCredential defaults to login.microsoftonline.com because authority= is not passed. Sovereign clouds (Azure US Gov, Azure China) will fail with 'tenant not found'. Tracked as a follow-up; the fix is to derive authority from urlparse(sts_url).netloc. Tests (+5, total 90): - 4 new TestParseTenantId cases for reserved aliases (common, organizations, consumers, case-insensitive) - 1 new TestServicePrincipalAuth case asserting RequestsTransport is constructed with finite connection_timeout + read_timeout and is passed through to ClientSecretCredential - Existing test_factory_forwards_credentials loosened from exact-dict equality to per-field assertions so future kwargs additions don't break it Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Note in cursor.py that cert-based ServicePrincipal is not supported on the callback branch (msodbcsql also can't do it without msodbcsqlmsqa.dll, which we don't ship; see #480). - Update the missing-UID/PWD error to say 'currently supports client-secret only' so cert users get a clearer signal. - Replace 'Model A' / 'Model B' shorthand with descriptive labels ('pre-acquired token' / 'callback-based') in cursor.py and auth.py. The A/B naming was internal-vocabulary only. No behavior change. 90 auth tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The _AuthInternal import addition pushed the existing import line over the 100-char limit. Black splits it into a parenthesized multi-line import. No behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jahnvi480
left a comment
There was a problem hiding this comment.
PR lgtm!
I have just posted few questions and wanted clarification and understanding on others.
Comments are non-blocking, hence approving the PR.
| raise ValueError("ServicePrincipal auth requires a non-empty client_secret (PWD)") | ||
|
|
||
| def _factory(spn: str, sts_url: str, auth_method: str) -> bytes: | ||
| # pylint: disable=import-outside-toplevel,unused-argument |
There was a problem hiding this comment.
auth_method is silenced via unused-argument. Are we expecting it to be used in furture?
| client_secret=client_secret, | ||
| transport=transport, | ||
| ) | ||
| _credential_cache[cache_key] = credential |
There was a problem hiding this comment.
_credential_cache is never evicted. After a client_secret rotation, a fresh entry is added under the new secret_hash, but the old ClientSecretCredential stays in the dict for the life of the process - holding the old secret in memory plus its internal token cache. For long-lived processes that rotate secrets periodically this is a slow leak.
Consider evicting any prior entries matching (serviceprincipal, tenant_id, client_id, *) when inserting a new secret_hash, or documenting the long-running-process caveat. Not blocking.
| str(e), | ||
| ) | ||
| raise RuntimeError( | ||
| "ServicePrincipal authentication failed; " "see debug logs for provider details" |
There was a problem hiding this comment.
message says "see debug logs for provider details" but the provider error is logged at logger.error(...) a few lines up, not debug. Either drop the log to debug or update the message to "see error logs" so the user knows where to actually look.
| # would keep returning the previously-issued token (good for | ||
| # up to ~1 hour) until expiry, masking the rotation. Hashing | ||
| # avoids storing the raw secret in the dict key. | ||
| secret_hash = hashlib.sha256(client_secret.encode("utf-8")).hexdigest() |
There was a problem hiding this comment.
secret_hash is recomputed on every factory invocation, but client_secret is fixed for the lifetime of the closure returned by make_token_factory. Compute the hash once in the outer scope and capture it:
secret_hash = hashlib.sha256(client_secret.encode("utf-8")).hexdigest()
def _factory(spn, sts_url, auth_method):
...
cache_key = _credential_cache_key(_AuthInternal.SERVICE_PRINCIPAL,
{"tenant_id": tenant_id, "client_id": client_id, "secret_hash": secret_hash})Save the SHA-256 per call (tens of μs each) and remove a redundant import of hashlib from the hot path.
| "ServicePrincipal token factory: token acquired, length=%d chars", | ||
| len(token), | ||
| ) | ||
| return token.encode("utf-16-le") |
There was a problem hiding this comment.
AADAuth.get_token_struct (used for the access_token path) returns a <I length-prefixed UTF-16LE struct. This factory returns bare token.encode("utf-16-le") with no length prefix.
one line in the docstring like "py-core handles the FedAuth length-prefix wrapping; do not pre-wrap" so a future reader doesn't try to "fix" this to match get_token_struct.
Work Item / Issue Reference
Summary
Adds
Authentication=ActiveDirectoryServicePrincipalsupport for bulk copy. Partial fix for #534. Addresses the exact error in the issue report ("Authentication method 'ActiveDirectoryServicePrincipal' is not supported. No token provider was registered for this method.").Why a callback rather than pre-acquire
The other Entra ID modes already supported in bulk copy (Default, DeviceCode, Interactive) pre-acquire a JWT in Python and pass it as
access_token. ServicePrincipal can't work that way:ClientSecretCredential(tenant_id, client_id, client_secret)requirestenant_idas a constructor argument.tenant_idon the connection string).So bulk copy must register a callback that's invoked mid-handshake. mssql-py-core hands the callback
(spn, sts_url, auth_method)and we parse the tenant from the STS URL on every call.Flow
Test
27 new tests cover:
TestAuthType::test_auth_type_constantsextended forSERVICE_PRINCIPALTestProcessAuthParameters::test_service_principal_auth_leaves_odbc_path_alone,test_service_principal_auth_case_insensitiveTestExtractAuthType::test_serviceprincipalTestParseTenantId: GUID, GUID without trailing slash, domain, query string, extra path segments, empty string, no pathTestServicePrincipalAuth: factory returns callable, requires non-emptyclient_idandclient_secret, returns UTF-16LE bytes, forwards credentials toClientSecretCredential, builds scope from SPN (and keeps/.defaultsuffix when already present), errors on unparseable STS URL, propagatesClientAuthenticationErrorasRuntimeErrorConnection string example
Same connection string keeps working for regular queries (ODBC handles ServicePrincipal natively in the non-bulk-copy path).
Out of scope
ActiveDirectoryPasswordneeds the same callback shape but a different azure-identity credential class. Separate PR; py-core still rejects it today with a clear error.ActiveDirectoryIntegratedneeds SSPI/Kerberos wiring at the Rust layer, not a Python callback. Separate work.Dependencies
mssql-py-core0.1.5+. This Python PR will fail at import or runtime against earlier py-core versions because theentra_id_token_factorydict key is silently ignored there. Will update the dependency pin once a new mssql-py-core wheel is published.