Skip to content

FEAT: Add ActiveDirectoryServicePrincipal support for bulk copy#576

Open
bewithgaurav wants to merge 13 commits into
mainfrom
bewithgaurav/gh534-bulkcopy-service-principal
Open

FEAT: Add ActiveDirectoryServicePrincipal support for bulk copy#576
bewithgaurav wants to merge 13 commits into
mainfrom
bewithgaurav/gh534-bulkcopy-service-principal

Conversation

@bewithgaurav

@bewithgaurav bewithgaurav commented May 13, 2026

Copy link
Copy Markdown
Collaborator

Work Item / Issue Reference

AB#45137

GitHub Issue: #534


Summary

Adds Authentication=ActiveDirectoryServicePrincipal support for bulk copy. Partial fix for #534. Addresses the exact error in the issue report ("Authentication method 'ActiveDirectoryServicePrincipal' is not supported. No token provider was registered for this method.").

Why a callback rather than pre-acquire

The other Entra ID modes already supported in bulk copy (Default, DeviceCode, Interactive) pre-acquire a JWT in Python and pass it as access_token. ServicePrincipal can't work that way:

  • ClientSecretCredential(tenant_id, client_id, client_secret) requires tenant_id as a constructor argument.
  • The connection string doesn't carry tenant_id; azure-identity doesn't discover it from anywhere.
  • The only place tenant is available client-side is the STS URL the server hands back during the FedAuth pre-login (this is what ODBC does internally; that's why ODBC has never needed tenant_id on the connection string).

So bulk copy must register a callback that's invoked mid-handshake. mssql-py-core hands the callback (spn, sts_url, auth_method) and we parse the tenant from the STS URL on every call.

Flow

mssql-python                              mssql-py-core (Rust)              mssql-tds (Rust)

cursor.bulkcopy()
  parses conn string (UID/PWD)
  ServicePrincipalAuth
    .make_token_factory(uid, pwd)
  returns _factory closure

  pycore_context = {
      "authentication": "AD-SP",
      "user_name":  client_id,
      "password":   client_secret,
      "entra_id_token_factory": _factory,
      ...
  }
                                          PyCoreConnection(ctx)
                                            validate_auth / transform_auth
                                            register factory in
                                              auth_method_map[SP]
                                                                                TDS pre-login, open TCP/TLS
                                                                                server: FedAuthInfo
                                                                                  (spn, sts_url)
                                            context.entra_id_
                                              token_factory()
                                            factory.create_token invoked
  _factory(spn, sts_url, method)        <-- (spn, sts_url, m)
    _parse_tenant_id(sts_url)
    ClientSecretCredential(
        tenant_id, client_id,
        client_secret)
    .get_token(scope=spn + "/.default")
    return jwt.encode("utf-16-le")
                                                                                FedAuth token packet
                                                                                SQL Server validates

Test

$ python -m pytest tests/test_008_auth.py -q
65 passed in 0.77s

27 new tests cover:

  • TestAuthType::test_auth_type_constants extended for SERVICE_PRINCIPAL
  • TestProcessAuthParameters::test_service_principal_auth_leaves_odbc_path_alone, test_service_principal_auth_case_insensitive
  • TestExtractAuthType::test_serviceprincipal
  • TestParseTenantId: GUID, GUID without trailing slash, domain, query string, extra path segments, empty string, no path
  • TestServicePrincipalAuth: factory returns callable, requires non-empty client_id and client_secret, returns UTF-16LE bytes, forwards credentials to ClientSecretCredential, builds scope from SPN (and keeps /.default suffix when already present), errors on unparseable STS URL, propagates ClientAuthenticationError as RuntimeError

Connection string example

import mssql_python

conn = mssql_python.connect(
    "Server=tcp:myserver.database.windows.net;"
    "Database=mydb;"
    "Authentication=ActiveDirectoryServicePrincipal;"
    "UID=<client-id-guid>;"
    "PWD=<client-secret>;"
    "Encrypt=yes;"
)
cur = conn.cursor()
cur.bulkcopy("mytable", [(1, "a"), (2, "b")])  # uses the callback under the hood

Same connection string keeps working for regular queries (ODBC handles ServicePrincipal natively in the non-bulk-copy path).

Out of scope

  • ActiveDirectoryPassword needs the same callback shape but a different azure-identity credential class. Separate PR; py-core still rejects it today with a clear error.
  • ActiveDirectoryIntegrated needs SSPI/Kerberos wiring at the Rust layer, not a Python callback. Separate work.

Dependencies

  • Requires mssql-py-core 0.1.5+. This Python PR will fail at import or runtime against earlier py-core versions because the entra_id_token_factory dict key is silently ignored there. Will update the dependency pin once a new mssql-py-core wheel is published.

Copilot AI review requested due to automatic review settings May 13, 2026 12:14
@bewithgaurav bewithgaurav changed the title Add ActiveDirectoryServicePrincipal support for bulk copy FEAT: Add ActiveDirectoryServicePrincipal support for bulk copy May 13, 2026
@bewithgaurav bewithgaurav marked this pull request as draft May 13, 2026 12:16
@bewithgaurav bewithgaurav force-pushed the bewithgaurav/gh534-bulkcopy-service-principal branch from 2e81b7f to cd80d9c Compare May 13, 2026 12:17

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds bulk copy support for Authentication=ActiveDirectoryServicePrincipal by registering an Entra ID token-factory callback (invoked mid-handshake) so the tenant can be derived from the STS URL returned by the server, aligning bulk copy behavior with ODBC’s native Service Principal handling for the normal query path.

Changes:

  • Added AuthType.SERVICE_PRINCIPAL and updated auth-type extraction/processing to recognize Service Principal mode.
  • Implemented ServicePrincipalAuth.make_token_factory() and _parse_tenant_id() to generate UTF-16LE JWTs during FedAuth handshake.
  • Updated cursor.bulkcopy() to register entra_id_token_factory for Service Principal auth; added tests and a changelog entry.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
mssql_python/constants.py Adds SERVICE_PRINCIPAL enum value for connection-string auth detection.
mssql_python/auth.py Adds tenant parsing + token-factory callback for Service Principal; updates auth detection/mapping.
mssql_python/cursor.py Registers the callback for bulkcopy when _auth_type == "serviceprincipal"; preserves credentials for py-core auth resolution.
tests/test_008_auth.py Adds unit tests for tenant parsing, factory behavior, and new auth-type detection.
CHANGELOG.md Documents new bulk copy Service Principal support and notes py-core requirement.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mssql_python/cursor.py
Comment thread mssql_python/cursor.py Outdated
Comment thread mssql_python/auth.py Outdated
Comment thread mssql_python/cursor.py
@github-actions github-actions Bot added the pr-size: medium Moderate update size label May 14, 2026
Comment thread tests/test_008_auth.py Fixed
Comment thread mssql_python/auth.py Fixed
@github-actions

github-actions Bot commented May 14, 2026

Copy link
Copy Markdown

📊 Code Coverage Report

🔥 Diff Coverage

92%


🎯 Overall Coverage

80%


📈 Total Lines Covered: 6737 out of 8350
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

  • mssql_python/auth.py (91.2%): Missing lines 202-203,214,253-254
  • mssql_python/connection.py (100%)
  • mssql_python/constants.py (100%)

Summary

  • Total: 66 lines
  • Missing: 5 lines
  • Coverage: 92%

mssql_python/auth.py

Lines 198-207

  198     from urllib.parse import urlparse
  199 
  200     try:
  201         parsed = urlparse(sts_url)
! 202     except (ValueError, AttributeError):
! 203         return None
  204     # Reject anything that isn't an https URL with a netloc. ``urlparse`` will
  205     # happily put a bare string like ``"tenant-guid"`` into ``path``, which
  206     # would then look like a valid tenant. Azure AD STS URLs are always https.
  207     if parsed.scheme != "https" or not parsed.netloc:

Lines 210-218

  210     if not path:
  211         return None
  212     first_segment = path.split("/", 1)[0]
  213     if not first_segment:
! 214         return None
  215     if first_segment.lower() in _RESERVED_TENANTS:
  216         return None
  217     return first_segment

Lines 249-258

  249             try:
  250                 from azure.identity import ClientSecretCredential
  251                 from azure.core.exceptions import ClientAuthenticationError
  252                 from azure.core.pipeline.transport import RequestsTransport
! 253             except ImportError as e:
! 254                 raise RuntimeError(
  255                     "Azure authentication libraries are not installed. "
  256                     "Please install with: pip install azure-identity azure-core"
  257                 ) from e


📋 Files Needing Attention

📉 Files with overall lowest coverage (click to expand)
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.pybind.ddbc_bindings.h: 59.9%
mssql_python.pybind.logger_bridge.hpp: 70.8%
mssql_python.pybind.connection.connection.cpp: 76.2%
mssql_python.pybind.ddbc_bindings.cpp: 76.2%
mssql_python.row.py: 76.9%
mssql_python.__init__.py: 77.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.logging.py: 85.5%
mssql_python.connection.py: 85.6%

🔗 Quick Links

⚙️ Build Summary 📋 Coverage Details

View Azure DevOps Build

Browse Full Coverage Report

@github-actions github-actions Bot added pr-size: large Substantial code update and removed pr-size: medium Moderate update size labels May 15, 2026
@bewithgaurav bewithgaurav changed the title FEAT: Add ActiveDirectoryServicePrincipal support for bulk copy FEAT: [WIP] Add ActiveDirectoryServicePrincipal support for bulk copy May 18, 2026
Wires a Python token-factory callback into the mssql-py-core connection
context so bulk copy can authenticate with
`Authentication=ActiveDirectoryServicePrincipal`. The callback is
invoked by mssql-tds mid-handshake (FedAuth workflow 0x02), receives the
STS URL from the server, parses the tenant_id from it, and uses
`azure.identity.ClientSecretCredential` to acquire a JWT. Necessary
because tenant_id is not known client-side until the server returns it
during the handshake, so the pre-acquired-token model (Model A) used by
Default / DeviceCode / Interactive / MSI cannot be used here.

Builds on the shared module-level `_credential_cache` introduced for
MSI in #573, keyed by ("serviceprincipal", tenant_id, client_id), so SP
gets the same per-instance token reuse semantics as the other AD
methods. `client_secret` is intentionally not in the cache key;
credentials are looked up by identity, not secret.

Other behaviors:
- `_parse_tenant_id` rejects non-https / non-URL inputs so a malformed
  STS URL cannot silently become a tenant id.
- Empty SPN from the server is rejected early with a clear message
  rather than producing scope="/.default".
- The surfaced `RuntimeError` is intentionally generic; provider
  message stays in `logger.error` only so any sensitive provider text
  does not reach the user-facing exception chain.
- Bulk copy `finally` cleanup also pops `entra_id_token_factory` from
  pycore_context.

Tests (in tests/test_008_auth.py):
- 7 `TestParseTenantId` cases: GUID/domain tenants, query string,
  extra path segments, empty, no path, bare-string rejection, path-only
  URL rejection, http scheme rejection.
- 11 `TestServicePrincipalAuth` cases: factory shape, missing
  client_id/secret, UTF-16LE return, credential kwargs forwarded to
  `ClientSecretCredential`, scope construction with/without
  `/.default` suffix, unparseable STS URL, authentication error
  propagation, no provider-message leak into chained exception,
  empty-SPN rejection, per-tenant credential caching (asserts 1
  construction across 3 calls for one tenant, 2 across two tenants).
- `TestProcessAuthParameters` cases: SP leaves ODBC path alone,
  case-insensitive recognition.
- `TestExtractAuthType.test_serviceprincipal`.

Requires mssql-py-core 0.1.5+ for the `entra_id_token_factory` dict
key wiring (ADO mssql-rs PR 7542). py-core pin bump will land in the
release bundle.

Partial fix for #534.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bewithgaurav bewithgaurav force-pushed the bewithgaurav/gh534-bulkcopy-service-principal branch from 8dcff3a to 28f9569 Compare May 25, 2026 04:35
bewithgaurav and others added 4 commits May 25, 2026 12:21
Without this, a rotated client_secret was silently masked for up to
~1 hour: the (tenant_id, client_id) cache key matched the previously
cached ClientSecretCredential, and azure-identity's internal token
cache kept returning the token issued under the OLD secret until
expiry. A user updating PWD in the connection string after an external
secret rotation would observe bulkcopy continuing to work with the
stale credential.

Fix: hash client_secret (SHA-256) into the cache key so rotation
produces a fresh ClientSecretCredential instance, defeating
azure-identity's per-instance token cache. Hashing avoids storing the
raw secret in the dict key.

Tests:
- test_factory_rotates_credential_when_secret_changes: two factories
  for same tenant+client_id with different secrets produce 2 distinct
  ClientSecretCredential instances. Re-calling either factory uses
  its own cache entry (does not construct again).
- test_factory_cache_key_does_not_contain_raw_secret: asserts the raw
  secret is not present in any cache key (only its hash).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts:
#	mssql_python/auth.py
#	tests/test_008_auth.py
@bewithgaurav bewithgaurav changed the title FEAT: [WIP] Add ActiveDirectoryServicePrincipal support for bulk copy FEAT: Add ActiveDirectoryServicePrincipal support for bulk copy Jun 5, 2026
@bewithgaurav bewithgaurav marked this pull request as ready for review June 5, 2026 12:46
bewithgaurav and others added 2 commits June 10, 2026 07:17
…ifiers

Replace free-floating short-string literals ("serviceprincipal",
"interactive", "msi", etc.) with a new _AuthInternal class in
constants.py. The strings now live in exactly one place; _AUTH_TYPE_MAP
references them, and all five comparison sites in auth.py,
connection.py, and cursor.py use the named constants.

Prevents drift between the map and the comparisons (a typo would only
surface at runtime on the affected auth path), and removes the per-PR
temptation to add another bare string literal when wiring a new auth
mode.

No behavior change. 85 auth tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread mssql_python/auth.py
bewithgaurav and others added 3 commits June 10, 2026 15:01
Two defensive hardening changes on the ServicePrincipal token factory
introduced in #534, plus a TODO-style comment documenting one known
limitation that is out of scope for this PR.

1. Bound the AAD network round-trip.
   The factory runs on a mssql-py-core blocking-pool worker (tokio
   spawn_blocking). Without explicit timeouts, azure-identity's defaults
   can let a slow / unreachable STS endpoint block that worker for tens
   of seconds. Pass a RequestsTransport with 10s connection / 15s read
   timeouts. SP is non-interactive so token issuance is typically <1s;
   these limits are generous and still bounded.

2. Reject multi-tenant aliases (common / organizations / consumers) in
   _parse_tenant_id. Confidential clients (SP) cannot authenticate
   against them; AAD returns AADSTS50194 which is cryptic. Failing fast
   in the factory surfaces a clearer error than the AAD round-trip.

3. Document the sovereign-cloud limitation in-line: ClientSecretCredential
   defaults to login.microsoftonline.com because authority= is not
   passed. Sovereign clouds (Azure US Gov, Azure China) will fail with
   'tenant not found'. Tracked as a follow-up; the fix is to derive
   authority from urlparse(sts_url).netloc.

Tests (+5, total 90):
- 4 new TestParseTenantId cases for reserved aliases
  (common, organizations, consumers, case-insensitive)
- 1 new TestServicePrincipalAuth case asserting RequestsTransport is
  constructed with finite connection_timeout + read_timeout and is
  passed through to ClientSecretCredential
- Existing test_factory_forwards_credentials loosened from exact-dict
  equality to per-field assertions so future kwargs additions don't
  break it

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Note in cursor.py that cert-based ServicePrincipal is not supported
  on the callback branch (msodbcsql also can't do it without
  msodbcsqlmsqa.dll, which we don't ship; see #480).
- Update the missing-UID/PWD error to say 'currently supports
  client-secret only' so cert users get a clearer signal.
- Replace 'Model A' / 'Model B' shorthand with descriptive labels
  ('pre-acquired token' / 'callback-based') in cursor.py and auth.py.
  The A/B naming was internal-vocabulary only.

No behavior change. 90 auth tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread tests/test_008_auth.py Dismissed
The _AuthInternal import addition pushed the existing import line over
the 100-char limit. Black splits it into a parenthesized multi-line
import. No behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@jahnvi480 jahnvi480 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR lgtm!
I have just posted few questions and wanted clarification and understanding on others.
Comments are non-blocking, hence approving the PR.

Comment thread mssql_python/auth.py
raise ValueError("ServicePrincipal auth requires a non-empty client_secret (PWD)")

def _factory(spn: str, sts_url: str, auth_method: str) -> bytes:
# pylint: disable=import-outside-toplevel,unused-argument

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auth_method is silenced via unused-argument. Are we expecting it to be used in furture?

Comment thread mssql_python/auth.py
client_secret=client_secret,
transport=transport,
)
_credential_cache[cache_key] = credential

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_credential_cache is never evicted. After a client_secret rotation, a fresh entry is added under the new secret_hash, but the old ClientSecretCredential stays in the dict for the life of the process - holding the old secret in memory plus its internal token cache. For long-lived processes that rotate secrets periodically this is a slow leak.

Consider evicting any prior entries matching (serviceprincipal, tenant_id, client_id, *) when inserting a new secret_hash, or documenting the long-running-process caveat. Not blocking.

Comment thread mssql_python/auth.py
str(e),
)
raise RuntimeError(
"ServicePrincipal authentication failed; " "see debug logs for provider details"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message says "see debug logs for provider details" but the provider error is logged at logger.error(...) a few lines up, not debug. Either drop the log to debug or update the message to "see error logs" so the user knows where to actually look.

Comment thread mssql_python/auth.py
# would keep returning the previously-issued token (good for
# up to ~1 hour) until expiry, masking the rotation. Hashing
# avoids storing the raw secret in the dict key.
secret_hash = hashlib.sha256(client_secret.encode("utf-8")).hexdigest()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

secret_hash is recomputed on every factory invocation, but client_secret is fixed for the lifetime of the closure returned by make_token_factory. Compute the hash once in the outer scope and capture it:

secret_hash = hashlib.sha256(client_secret.encode("utf-8")).hexdigest()
def _factory(spn, sts_url, auth_method):
    ...
    cache_key = _credential_cache_key(_AuthInternal.SERVICE_PRINCIPAL,
        {"tenant_id": tenant_id, "client_id": client_id, "secret_hash": secret_hash})

Save the SHA-256 per call (tens of μs each) and remove a redundant import of hashlib from the hot path.

Comment thread mssql_python/auth.py
"ServicePrincipal token factory: token acquired, length=%d chars",
len(token),
)
return token.encode("utf-16-le")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AADAuth.get_token_struct (used for the access_token path) returns a <I length-prefixed UTF-16LE struct. This factory returns bare token.encode("utf-16-le") with no length prefix.

one line in the docstring like "py-core handles the FedAuth length-prefix wrapping; do not pre-wrap" so a future reader doesn't try to "fix" this to match get_token_struct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-size: large Substantial code update

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants