Transforms
Transforms normalise a plaintext value before it is passed to the HMAC function. Normalisation ensures that logically equivalent inputs - differing only in case or whitespace - produce the same blind index fingerprint.
Without transforms, searching for "jane@example.com" would not match a record saved with "Jane@Example.com", even though they refer to the same address.
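The mismatch described above can be reproduced with plain HMAC-SHA256 from the .NET base class library. This is a minimal sketch, not the library's own hashing routine: `Fingerprint` and the demo key are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration; the library's real HMAC routine may differ.
static string Fingerprint(string value, byte[] key)
{
    using var hmac = new HMACSHA256(key);
    byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(value));
    return Convert.ToHexString(hash);
}

// Demo key only; a real key would come from a key store.
byte[] demoKey = Encoding.UTF8.GetBytes("demo-key-not-for-production");

// Raw values that differ only in case hash to different fingerprints.
string stored = Fingerprint("Jane@Example.com", demoKey);
string searched = Fingerprint("jane@example.com", demoKey);
Console.WriteLine(stored == searched); // False

// Normalising both sides first makes the fingerprints agree.
string storedNormalised = Fingerprint("Jane@Example.com".Trim().ToLowerInvariant(), demoKey);
string searchedNormalised = Fingerprint(" jane@example.com ".Trim().ToLowerInvariant(), demoKey);
Console.WriteLine(storedNormalised == searchedNormalised); // True
```

The same normalisation has to run at write time and at query time, which is why transforms are declared once on the property rather than applied by callers.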
Configuring Transforms
Transforms are declared on the [BlindIndex] attribute as an ordered array of strings:
```csharp
[BlindIndex(
    CompanionProperty = nameof(EmailHash),
    Transforms = ["lowercase", "trim"])]
public string Email { get; set; } = "";
```
Transforms are applied left to right. The example above first converts to lowercase, then strips surrounding whitespace.
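Left-to-right application can be pictured as a fold over the declared names. The sketch below assumes a simple name-to-function lookup; `registry` and `Apply` are hypothetical and say nothing about how the library resolves transform names internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical registry: maps transform names to functions for illustration only.
var registry = new Dictionary<string, Func<string, string>>
{
    ["lowercase"] = v => v.ToLowerInvariant(),
    ["trim"] = v => v.Trim(),
};

// Applies each named transform in declaration order (left to right).
string Apply(string value, params string[] names) =>
    names.Aggregate(value, (current, name) => registry[name](current));

Console.WriteLine(Apply("  Jane@Example.COM ", "lowercase", "trim")); // jane@example.com
```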
Built-In Transforms
lowercase
Converts the entire value to lowercase using the invariant culture.
| Input | Output |
|---|---|
| "Jane@Example.COM" | "jane@example.com" |
| "ACME Corp" | "acme corp" |
| "already lower" | "already lower" |
Use on: email addresses, usernames, domain names.
trim
Removes leading and trailing whitespace (spaces, tabs, newlines).
| Input | Output |
|---|---|
| " jane@example.com " | "jane@example.com" |
| "\tjane\n" | "jane" |
| "no change" | "no change" |
Use on: any field where leading or trailing whitespace might appear from user input or data imports.
alphanumeric
Removes all characters that are not ASCII letters (a-z, A-Z) or digits (0-9). Useful for normalising names or identifiers that might contain punctuation.
| Input | Output |
|---|---|
| "O'Brien" | "OBrien" |
| "Smith-Jones" | "SmithJones" |
| "+1 (555) 867-5309" | "15558675309" |
Combine with lowercase for case-insensitive matching
alphanumeric alone does not change case. Use ["lowercase", "alphanumeric"] if you want case-insensitive matching.
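A rough equivalent of the filter, written against the description above, shows why case survives unless lowercase runs as well. `Alphanumeric` is a hypothetical stand-in, not the library's implementation.

```csharp
using System;
using System.Linq;

// Assumed equivalent of "alphanumeric": keep ASCII letters and digits, drop the rest.
static string Alphanumeric(string value) =>
    new string(value.Where(c =>
        c is (>= 'a' and <= 'z') or (>= 'A' and <= 'Z') or (>= '0' and <= '9')).ToArray());

string caseSensitive = Alphanumeric("O'Brien");
string caseInsensitive = Alphanumeric("O'BRIEN".ToLowerInvariant());
Console.WriteLine(caseSensitive);   // OBrien
Console.WriteLine(caseInsensitive); // obrien
```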
digits
Retains only ASCII digit characters (0-9). All other characters are removed. Designed for phone numbers, tax IDs, and other numeric identifiers.
| Input | Output |
|---|---|
| "+1 (555) 867-5309" | "15558675309" |
| "SSN: 123-45-6789" | "123456789" |
| "GB VAT 123 456 789" | "123456789" |
last4
Retains only the last 4 characters of the value as it stands after any preceding transforms have run. Commonly used for partial credit card or SSN matching.
| Input | Output |
|---|---|
| "4111111111111111" | "1111" |
| "123-45-6789" | "6789" |
| "AB12" | "AB12" |
| "AB" | "AB" (shorter than 4 - returned as-is) |
Combine last4 with digits for card numbers
Use ["digits", "last4"] to strip formatting characters before taking the last four digits. This ensures "4111-1111-1111-1111" and "4111111111111111" produce the same result.
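The combination can be sketched with two small functions written from the descriptions above. `Digits` and `Last4` are hypothetical stand-ins for the built-in transforms, including the pass-through behaviour for short values.

```csharp
using System;
using System.Linq;

// Assumed equivalent of "digits": keep ASCII 0-9 only.
static string Digits(string value) =>
    new string(value.Where(c => c is >= '0' and <= '9').ToArray());

// Assumed equivalent of "last4": values of 4 characters or fewer pass through unchanged.
static string Last4(string value) =>
    value.Length <= 4 ? value : value.Substring(value.Length - 4);

Console.WriteLine(Last4(Digits("4111-1111-1111-1111"))); // 1111
Console.WriteLine(Last4(Digits("4111111111111111")));    // 1111
Console.WriteLine(Last4("AB"));                          // AB
```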
first_char
Retains only the first character of the value. Useful for bucketed or initial-based lookups.
| Input | Output |
|---|---|
| "Jane" | "J" |
| "jane" | "j" |
| "" | "" (empty string is preserved) |
Low cardinality warning
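first_char yields only a few dozen distinct values for typical text: about 26 when combined with lowercase on alphabetic data, more if values differ in case or start with digits or symbols. Such a low-cardinality blind index is susceptible to frequency analysis. See Security Considerations.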
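The skew that makes frequency analysis possible is easy to see by bucketing a handful of values. The sample surnames and the `FirstChar` helper below are made up for illustration; real data will have its own distribution.

```csharp
using System;
using System.Linq;

// Assumed equivalent of "first_char": empty input is preserved.
static string FirstChar(string value) =>
    value.Length == 0 ? value : value.Substring(0, 1);

// Made-up sample data.
string[] surnames = { "Smith", "Stone", "Sato", "Jones", "Shaw", "Quinn" };

// Count how many values land in each first-character bucket.
var buckets = surnames
    .GroupBy(s => FirstChar(s))
    .OrderByDescending(g => g.Count())
    .Select(g => $"{g.Key}: {g.Count()}");

string summary = string.Join(", ", buckets);
Console.WriteLine(summary); // S: 4, J: 1, Q: 1
```

Anyone who can see the index column learns the same bucket sizes, even without the HMAC key.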
Transform Ordering
Transforms are applied in the order they are declared. Order matters.
Example: ["trim", "lowercase", "digits"]
```
Input: " +1 (555) 867-5309 "
trim → "+1 (555) 867-5309"
lowercase → "+1 (555) 867-5309" (no letters, no change)
digits → "15558675309"
```
Example: ["digits", "last4"]
```
Input: "4111-1111-1111-1111"
digits → "4111111111111111"
last4 → "1111"
```
Reversing the order would give last4 the formatted string first, which could produce a different result depending on the trailing characters.
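An input whose last four characters include a separator makes the difference concrete. The sketch below uses hypothetical stand-ins for digits and last4, written from their descriptions, and a made-up input.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the built-in transforms.
Func<string, string> digits = v => new string(v.Where(c => c is >= '0' and <= '9').ToArray());
Func<string, string> last4 = v => v.Length <= 4 ? v : v.Substring(v.Length - 4);

// Runs the steps in the order given.
static string Run(string value, params Func<string, string>[] steps) =>
    steps.Aggregate(value, (current, step) => step(current));

string raw = "867-53 09";
string digitsThenLast4 = Run(raw, digits, last4);
string last4ThenDigits = Run(raw, last4, digits);
Console.WriteLine(digitsThenLast4); // 5309
Console.WriteLine(last4ThenDigits); // 309
```

With last4 first, the slice "3 09" still contains a space, so digits leaves only three characters.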
Custom Transforms
Use WithTransform() to add inline custom transforms in the fluent API:
```csharp
// Inline custom transforms - no class or registration needed
var transformServices = new ServiceCollection();
var transformBuilder = transformServices.AddTayra(opts => opts.LicenseKey = licenseKey);
transformBuilder.Entity<IndexedCustomer>(e =>
{
    e.DataSubjectId(c => c.CustomerId);
    e.PersonalData(c => c.Email);
    e.BlindIndex(c => c.Email)
        .WithTransform(value => value.Split('@')[0]) // extract local part
        .WithLowercase()
        .StoredIn(c => c.EmailIndex);
});
```
Custom transforms are just functions - no class or registration needed. They compose naturally with built-in transforms in the pipeline.
Custom Transform Rules
- The function must be a pure function - same input always produces the same output.
- The function must not throw on an empty string.
- Transforms should be fast (no I/O, no allocations if avoidable).
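A function that satisfies these rules might look like the sketch below. `LocalPart` is a hypothetical name; it is assumed that a method like this could be passed to WithTransform() in place of the inline lambda, which the example above suggests but does not confirm.

```csharp
using System;

// Hypothetical custom transform: extracts the local part of an email address.
// Pure (depends only on its argument), no I/O, and safe on empty input.
static string LocalPart(string value)
{
    int at = value.IndexOf('@');
    return at < 0 ? value : value.Substring(0, at);
}

Console.WriteLine(LocalPart("jane@example.com")); // jane
Console.WriteLine(LocalPart("no-at-sign"));       // no-at-sign
Console.WriteLine(LocalPart("").Length);          // 0
```

Using IndexOf with Substring avoids the array that Split would allocate on every call.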
Transform Reference Summary
| Name | Effect | Typical Use |
|---|---|---|
| lowercase | Converts to invariant lowercase | Email, username |
| trim | Removes leading/trailing whitespace | Any user-input field |
| alphanumeric | Keeps only [a-zA-Z0-9] | Names, identifiers |
| digits | Keeps only [0-9] | Phone numbers, tax IDs |
| last4 | Keeps last 4 characters | Card numbers, SSN suffix |
| first_char | Keeps first character only | Bucketed lookups |
See Also
- Blind Indexes Overview - How HMAC blind indexes work
- Security Considerations - Cardinality and frequency analysis risks
- Querying - Using blind indexes in EF Core and Marten queries
