Debugging value length with SQL
Silent string truncation is the bug that finds you weeks later — emails merging into the wrong subscriber, codes losing their suffix, names cut at character N. Three queries to find truncated values, audit destination DE column widths, and prevent the next one.
You're sure the email addresses you imported were correct. The CSV had german.medina@reallylongdomain.example.com. The destination DE has EmailAddress stored. A week later, two unrelated subscribers are receiving each other's mail because both their stored email addresses are now german.medina@reallylongdomain.example. The destination DE column is VARCHAR(40) and Marketing Cloud truncated both addresses to the same 40-character prefix.
This is silent length truncation. It's gotcha #5 at scale, and it shows up most often in three scenarios: address fields after CRM imports, custom code/SKU fields with longer-than-expected suffixes, and SubscriberKey fields when the source is a UUID or a hashed identifier. The diagnostic queries below find truncated values that have already landed and the columns at risk of truncating future writes.
The diagnostic flow
1. Audit destination DE column widths (what's the cap on each column?)
2. Measure source MAX(LEN()) (what would be inserted, untruncated?)
3. Find rows already truncated (LEN() == max for the destination column)
4. Find downstream collisions (duplicates created by truncation)Step 1 — Audit destination DE column widths
Marketing Cloud's UI tells you each column's Length setting in the DE properties, but if you have many DEs to audit, query the source-of-truth _DataExtension and _DataExtensionField System Data Views once and snapshot:
-- One-time audit: schema + length for every column in every DE you own.
-- Snapshot to a de_log_ so you can compare against future runs.
INSERT INTO de_log_de_field_widths
SELECT
de.Name AS DEName,
def.Name AS ColumnName,
def.FieldType AS FieldType,
def.MaxLength AS MaxLength,
def.IsNullable AS IsNullable,
def.IsPrimaryKey AS IsPrimaryKey,
GETDATE() AS SnapshotAt
FROM _DataExtensionField def
INNER JOIN _DataExtension de
ON def.DataExtensionCustomerKey = de.CustomerKey
WHERE def.FieldType IN ('Text', 'EmailAddress')
AND de.Name LIKE 'DE\_%' ESCAPE '\'; -- only your master DEsNow you have a snapshot of every text column's MaxLength. Diff the snapshot week-over-week to surface columns whose length was changed by someone editing the DE in the UI.
Step 2 — Measure source MAX(LEN())
For every text column that's about to be inserted, run a single query against the source to find the longest value:
-- Replace master_subscribers with your source DE
SELECT
COUNT(*) AS TotalRows,
MAX(LEN(EmailAddress)) AS MaxEmailLen,
MAX(LEN(FirstName)) AS MaxFirstNameLen,
MAX(LEN(LastName)) AS MaxLastNameLen,
MAX(LEN(LoyaltyTier)) AS MaxTierLen,
MAX(LEN(SubscriberKey)) AS MaxSubKeyLen
FROM master_subscribers;Compare these numbers against the destination DE's MaxLength from step 1. Every column where the source max exceeds the destination cap is going to truncate on the next insert.
The defense:
-- Diagnostic: which rows would be truncated, and by how much?
SELECT
SubscriberKey,
EmailAddress,
LEN(EmailAddress) AS SourceLen,
50 AS DestinationLen, -- the destination cap
LEN(EmailAddress) - 50 AS WouldTruncateBy
FROM master_subscribers
WHERE LEN(EmailAddress) > 50
ORDER BY LEN(EmailAddress) DESC;Run this against production volume before an INSERT INTO ... SELECT that targets a narrower destination. Either resize the destination column or write a documented LEFT() call so the truncation is intentional.
Step 3 — Find rows already truncated
If the destination DE has been receiving truncated values for a while, the bug is already in production. The signature of a truncated value is LEN(col) = destination_max_length.
-- Find every row in the production DE where the email address is
-- exactly the destination column's max length. These are the rows
-- where the original value was AT LEAST that long; some are legitimate,
-- many are truncated.
SELECT
SubscriberKey,
EmailAddress,
LEN(EmailAddress) AS Len
FROM destination_de
WHERE LEN(EmailAddress) = 50 -- the destination's max length
ORDER BY EmailAddress;A row where EmailAddress ends in .com, .org, .net etc. is most likely not truncated — the domain ended cleanly. A row where EmailAddress ends mid-word ('german.medina@reallylongdomain.example', with no TLD) is almost certainly truncated.
For numeric IDs / SubscriberKeys, the signature is the same: LEN(SubscriberKey) = destination_max_length plus the value not having the structural separators you'd expect (a UUID without its trailing characters, an integer ID that's an exact multiple of the column width).
Step 4 — Find downstream collisions
The damage of truncation isn't usually the truncated value itself — it's the duplicates truncation creates. Two distinct source values that share the same first N characters become the same value in the destination, and downstream joins / dedup logic now treats them as the same subscriber.
-- Find truncated emails that collide with another email in the DE
SELECT
EmailAddress,
COUNT(*) AS DuplicateCount
FROM destination_de
WHERE LEN(EmailAddress) = 50
GROUP BY EmailAddress
HAVING COUNT(*) > 1
ORDER BY DuplicateCount DESC;Each row with DuplicateCount > 1 represents N distinct source subscribers that are now indistinguishable in the destination. If the Send Definition deduplicates by EmailAddress, only one of them will receive each email. If your loyalty calculations group by EmailAddress, their points are now pooled.
Common causes ranked by frequency
| Cause | Where to look |
|---|---|
| Destination DE was created with a default VARCHAR(50) length when the source allows longer | Step 1 snapshot vs Step 2 max measurements |
| CSV import added a new column and the destination wasn't resized | Diff de_log_de_field_widths week-over-week |
| LEFT(col, n) was added to the SELECT to "fit" the column without checking the cost | Audit recent commits to the SQL Activity |
| Source was migrated and now contains UUIDs / hashes / longer formats | Step 2 against source vs prior week's snapshot |
| Custom code field was sized for a 6-char SKU but suppliers now send 12-char composite codes | Step 3 against the SKU column |
Remediation playbook
Once truncation is confirmed:
- Stop the bleed. Disable the Activity that's writing the truncated values until the destination is resized.
- Resize the destination DE column in the UI (DE properties → column → Length). Note that resizing is non-destructive for existing data — already-truncated rows stay truncated; only new writes get the new width.
- Re-import the truncated rows from the source if you can identify them. The Step 4 query gives you the SubscriberKeys; an
INSERT INTO ... SELECTfrom the original source withUpdatemode rewrites them with the full values. - Add the column-width audit (step 1) to a scheduled diagnostics Automation so any future DE-property edit triggers a row in
de_log_de_field_widthsyou can review. - Document the new column-width policy in your team's Style Guide — for
EmailAddresscolumns we use 254 (the RFC max), forSubscriberKeywe use 254 minimum, etc.
Related
- INSERT INTO — the write path where truncation happens silently
- String functions —
LEN()for measuring source values,LEFT()for intentional truncation - SELECT — explicit column projection so destination width changes are visible
- FROM —
_DataExtension/_DataExtensionFieldSDVs for column-width audits - MC SQL gotchas — see #5 for the underlying length-truncation behavior
- Style Guide — the column-width policy that prevents this
- Debugging email sends — the related how-to for send-vs-expected discrepancies (often caused by truncated emails)