| Scenario / Requirement | Correct Feature | Why | Common Trap |
|---|---|---|---|
| Users must only see their own rows | Row-Level Security (RLS) โ FILTER predicate | Transparent to app. Policy enforced in DB regardless of query source. | Using WHERE in every query โ dev can forget it. |
| Show masked SSN to most users, real value to admins | Dynamic Data Masking (DDM) | db_owner and UNMASK-privileged users see real data. Others see mask. | DDM is NOT encryption โ app-tier only. Privileged queries bypass it. |
| Encrypt a column. App needs to search/join on it (equality). | Always Encrypted โ DETERMINISTIC | Same plaintext โ same ciphertext โ equality comparisons work. | Choosing RANDOMIZED blocks any filtering on the column. |
| Encrypt highly sensitive column. No filtering needed. | Always Encrypted โ RANDOMIZED | Each encryption unique โ no pattern leakage. Most secure. | Using DETERMINISTIC unnecessarily exposes patterns. |
| Azure resource must connect to Azure SQL โ no passwords in code | Managed Identity + Azure AD token | No secret to rotate/leak. Integrates with Azure RBAC. | Storing connection string in app config or pipeline YAML. |
| Prevent user UPDATE from changing row to one they can't see | RLS BEFORE UPDATE BLOCK predicate | Checks the new values against the predicate before allowing UPDATE. | Using FILTER only โ doesn't prevent exfiltration via UPDATE. |
| Store API key for external REST calls in T-SQL | DATABASE SCOPED CREDENTIAL | Encrypted in DB, referenced by name. Never appears in plain T-SQL. | Hardcoding in stored procedure or T-SQL variable. |
| Secure DAB GraphQL/REST endpoint in production | Managed Identity + Azure AD JWT validation | Roles map to permissions in dab-config.json. No key management. | Using anonymous role for non-public data; API key in URL params. |
| Requirement | Search Type | Implementation | Key Functions |
|---|---|---|---|
| Find documents containing exact phrase "product recall" | Full-Text | Full-text index + catalog on content column | CONTAINS(col, '"product recall"') |
| Find documents about "car accident" even if they say "vehicle collision" | Semantic Vector | Embedding column + VECTOR_DISTANCE cosine | VECTOR_DISTANCE('cosine', emb, @qEmb) |
| Find documents matching both keyword and semantic meaning โ production RAG | Hybrid (Full-text + Vector + RRF) | Both indexes. FREETEXTTABLE + VECTOR_SEARCH. Merge via RRF. | RRF: 1.0/(60+rank) summed, ORDER BY combined score |
| Multilingual search โ users query in French, docs in English | Semantic Vector | Multilingual embedding model (e.g., text-embedding-3-large) | Embeddings capture cross-lingual semantics |
| 100% recall required on 5,000 rows, latency not critical | ENN (Exact Nearest Neighbor) | ORDER BY VECTOR_DISTANCE โ no index needed | Full table scan No index |
| Sub-50ms search on 5 million rows | ANN (DiskANN index) | CREATE VECTOR INDEX, VECTOR_SEARCH() | Index required DiskANN |
| Stemming โ "runs" matches "running", "ran" | Full-Text with FORMSOF | Full-text index + CONTAINS FORMSOF(INFLECTIONAL) | CONTAINS(col, 'FORMSOF(INFLECTIONAL, run)') |
| Merge full-text (RANK 0-1000) and vector (distance 0-1) scores | RRF (Reciprocal Rank Fusion) | Rank each list separately, apply 1/(60+rank), sum scores | Scale-invariant: works regardless of score ranges |
| Method | What It Captures | Overhead | Best For | Not For |
|---|---|---|---|---|
| Change Tracking | Row identity + change type (INSERT/UPDATE/DELETE) since a sync version. No column values. | Sync scenarios. Lightweight embedding trigger where you re-read the row. | When you need before/after column values without re-querying. | |
| CDC (Change Data Capture) | Before AND after column values for each change, from transaction log. | ETL pipelines. Embedding refresh when you need exact new column values. Audit with history. | Very high-write tables (log read overhead). Doesn't work with memory-optimized tables. | |
| CES (Change Event Streaming) | Real-time event stream of changes as they happen. | Event-driven AI pipelines. Sub-second embedding freshness. Stream processing. | Simple batch sync. Teams without stream processing expertise. | |
| Azure Functions SQL Trigger | CDC-based row change events delivered to a serverless function. | Event-driven embedding refresh. Serverless, auto-scales. No polling overhead on DB. | When you need before-values (gets only new values). Long-running processing per row. | |
| AFTER Trigger (SQL) | DML operation. Synchronous โ fires within the transaction. | Simple audit logging. Small tables with infrequent writes. Cascading updates. | Calling external APIs (embedding) โ blocks write transaction, risk of failure. | |
| Azure Logic Apps | Connector-based polling or event reception from SQL/storage. | Low-code integration. Connecting SQL to non-technical downstream systems. | High-volume, low-latency scenarios. Cost-sensitive high-frequency polling. |
| Scenario | Index Type | Key Characteristics |
|---|---|---|
| OLTP: point lookups, frequent single-row reads by PK | Clustered Index (B-tree) | Row store. Fast single row / range scans. One per table. Defines physical row order. |
| Support queries that filter on non-PK columns, avoid key lookups | Non-Clustered Index + INCLUDE | Leaf level includes extra columns. Eliminates key lookup. Multiple per table. |
| Analytics: aggregations, GROUP BY, large table scans | Clustered Columnstore Index (CCI) | Column storage. Compression. Vectorized processing. Replaces row store. ~10x read perf for analytics. |
| Mixed OLTP + Analytics on same table | Non-Clustered Columnstore Index (NCCI) | Added alongside clustered row-store index. Analytics benefit without losing row-store OLTP perf. |
| Partitioned table โ enforce partition elimination | Aligned Index (same partition scheme) | Index uses same partition function. Optimizer can skip irrelevant partitions. |
| Semantic vector similarity search โ production scale (millions) | DiskANN Vector Index | ANN graph index. O(log n). Used by VECTOR_SEARCH(). Supports cosine/euclidean/dot_product. |
| Full-text keyword search with stemming/proximity | Full-Text Index | Requires full-text catalog. Linguistic processing. CONTAINS/FREETEXT predicates. |
| View needs to return aggregated data fast without recomputing | Indexed View (Clustered Index on View) | Materializes the view result. SCHEMABINDING required. Auto-maintained on DML. Enterprise: auto-used; Standard: NOEXPAND hint. |