Sedna Personal Data Redaction in Shipping Email Workflows: A Technical Guide

Every day, your operational inboxes receive passport scans, medical certificates, crew lists, and immigration documents. They arrive as attachments in forwarded threads, are copied across shared inboxes, and pass between port agents, crewing managers, and charterers. None of this is systematically controlled.

Personal Data Redaction is the control layer that changes that. This blog article explains how it works technically, what it detects, how it integrates into your existing Sedna Email environment, and what your team actually needs to do to configure and manage it.

Detection: How Personal Data Is Identified

Personal Data Redaction uses a layered detection architecture built specifically for the document types and data patterns common in shipping operations.

Named Entity Recognition (NER)

The primary detection mechanism is Named Entity Recognition, a machine learning approach that identifies named entities in text by understanding context, not just pattern-matching. NER distinguishes between a name in a crew list context and the same string appearing in a company name or vessel reference. This is critical in a shipping environment, where the same words carry different meanings depending on the surrounding content.

Rule-based pattern matching

Alongside NER, the system uses rule-based logic for high-confidence structured data: passport numbers, national ID formats, IBAN and account numbers, visa and immigration reference numbers. These follow known patterns across jurisdictions and can be detected with high precision without contextual analysis.
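As a rough illustration of what rule-based detection can look like, here is a minimal Python sketch. It is not Sedna's implementation: the two regular expressions and the function names are assumptions made for this example, and real passport formats vary by issuing country. The IBAN check uses the standard mod-97 checksum, which is why structured identifiers like this can be matched with high precision without any contextual analysis.

```python
import re

# Illustrative patterns only (assumptions); real formats vary by jurisdiction.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}(?:\s?[A-Z0-9]{4}){2,7}(?:\s?[A-Z0-9]{1,3})?\b")
PASSPORT_RE = re.compile(r"\b[A-Z]{1,2}\d{6,8}\b")


def iban_checksum_ok(candidate: str) -> bool:
    """Validate an IBAN candidate with the standard mod-97 check."""
    compact = re.sub(r"\s", "", candidate).upper()
    # Move the country code and check digits to the end, then map A-Z to 10-35.
    rearranged = compact[4:] + compact[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_structured_pii(text: str) -> list:
    """Return (detector, matched_text) pairs for structured identifiers."""
    findings = []
    for match in IBAN_RE.finditer(text):
        # The checksum filters out strings that merely look like an IBAN.
        if iban_checksum_ok(match.group()):
            findings.append(("iban", match.group()))
    for match in PASSPORT_RE.finditer(text):
        findings.append(("passport_number", match.group()))
    return findings
```

Calling `find_structured_pii` on a sentence containing the widely published example IBAN `GB82 WEST 1234 5698 7654 32` reports it, while the same string with its final digit changed fails the checksum and is ignored.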

Severity scoring

Every detection is assigned a severity level (critical, high, or low) based on data type and context. This matters operationally: not every email containing a name should trigger the same response as one containing a passport scan alongside medical data. You configure which severity thresholds trigger automatic redaction. This gives your team precise control over the balance between protection and workflow continuity.
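To make the threshold idea concrete, the sketch below shows one way severity scoring could be modelled. The detector names, the base severities, and the escalation rule (a passport number together with medical data becomes critical) are all assumptions for illustration; the actual scoring logic in the product is not described in this article.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    HIGH = 2
    CRITICAL = 3


# Hypothetical base severity per detector type.
BASE_SEVERITY = {
    "name": Severity.LOW,
    "passport_number": Severity.HIGH,
    "medical_data": Severity.HIGH,
}


def score(detected_types: set) -> Severity:
    """Score one email from the set of data types detected in it."""
    # Assumed context rule: identity document plus medical data escalates.
    if {"passport_number", "medical_data"} <= detected_types:
        return Severity.CRITICAL
    return max(
        (BASE_SEVERITY.get(t, Severity.LOW) for t in detected_types),
        default=Severity.LOW,
    )


def should_auto_redact(detected_types: set, threshold: Severity = Severity.HIGH) -> bool:
    """Auto-redact only when something was found and it meets the threshold."""
    return bool(detected_types) and score(detected_types) >= threshold
```

With the threshold at `Severity.HIGH`, an email containing only a name is left alone, while one containing a passport number is queued for automatic redaction.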

30+ personal data types covered

The standard detector library includes:

Passport numbers and national identity documents
Medical data and health certificate references
Crew details and seafarer identification
Payroll and financial data
Immigration and visa documentation
Names, in configurable scope: names can be excluded from redaction while other PII remains controlled, which preserves searchability by crew member name

400+ file formats, including Optical Character Recognition (OCR)

With Sedna’s Personal Data Redaction (PDR), detection runs across email body text and every attachment type your inboxes receive: PDFs, Word documents, Excel sheets, and scanned images. For scanned documents, which are common in shipping, the system applies optical character recognition (OCR) before running the detection pipeline. A passport scan arriving as a JPEG is treated identically to the same data arriving in a structured PDF.
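The "OCR first, then the same detectors" flow can be sketched as a small routing function. This is an assumption-laden outline rather than the product's pipeline: `ocr`, `parse`, and `detect` are hypothetical callables supplied by the caller, and the list of image suffixes is illustrative.

```python
from pathlib import PurePath
from typing import Callable

# Illustrative set of suffixes treated as scanned images (assumption).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def scan_attachment(
    filename: str,
    payload: bytes,
    ocr: Callable[[bytes], str],
    parse: Callable[[bytes], str],
    detect: Callable[[str], list],
) -> list:
    """Extract text (via OCR for images), then run one shared detector."""
    suffix = PurePath(filename).suffix.lower()
    text = ocr(payload) if suffix in IMAGE_SUFFIXES else parse(payload)
    # The detector never needs to know where the text came from.
    return detect(text)
```

Because both branches end in the same `detect` call, a scanned image and a structured document that carry the same text produce the same findings.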

Processing: When and How Scanning Happens

Forward-looking by design

Personal Data Redaction scans every inbound and outbound Sedna email and attachment from the point of activation. It does not retroactively process historical email. This is by design, not a limitation: retroactive scanning of large email archives introduces reliability issues, processing overhead, and difficult questions about what you do with findings you can't remediate.

Scanning on send and receive

Every email that enters or leaves Sedna Email is scanned at the point of transit. The detection pipeline runs against the email body and all accessible attachments. The PII Scan Card (the interface element surfaced to compliance and IT users) shows findings in real time.

Configurable redaction intervals

Detection and redaction are decoupled. When personal data is found, it is not immediately hidden from view. Instead, it enters a configurable redaction queue. Your team sets the interval, that is, how long sensitive data remains visible before it is redacted from standard user view. After that window, redaction runs automatically.

This decoupling means that Personal Data Redaction fits around how your teams actually work, rather than requiring your teams to change workflows to fit the tool.
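One way to picture the decoupling is a queue of findings with a timestamp each, plus a periodic pass that redacts whatever has aged past the configured interval. The sketch below assumes that model; the `Finding` fields and the `run_redaction_pass` function are hypothetical names, not part of any Sedna API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Finding:
    email_id: str
    data_type: str
    detected_at: datetime
    redacted: bool = False


def run_redaction_pass(findings: list, interval: timedelta, now: datetime) -> list:
    """Redact findings whose visibility window has elapsed; return those newly redacted."""
    newly_redacted = []
    for finding in findings:
        if not finding.redacted and now - finding.detected_at >= interval:
            finding.redacted = True
            newly_redacted.append(finding)
    return newly_redacted


# Usage: a seven-day visibility window, evaluated one week after the first detection.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
queue = [
    Finding("msg-1", "passport_number", detected_at=start),
    Finding("msg-2", "medical_data", detected_at=start + timedelta(days=5)),
]
done = run_redaction_pass(queue, timedelta(days=7), now=start + timedelta(days=7))
```

Detection writes to the queue at scan time; only the scheduled pass changes what standard users see, so the interval can be adjusted without touching the detectors.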

Redaction: What Happens to Detected Data

Redaction is NOT deletion

This distinction matters both operationally and legally. When data is redacted, it is hidden from the view of standard users in the Sedna interface. The underlying data is preserved in the system. Authorised users (defined by role-based permissions) retain access to unredacted content when needed.

Standard users will see that redaction has occurred; they cannot see what was redacted. The email context (including the thread, the sender, the subject, the surrounding communication) remains intact. Only the identified personal data is removed from view.
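The difference between hiding and deleting can be shown with a view-time mask: the stored body is never modified, and the role of the viewer decides whether the mask is applied. This is a simplified sketch under stated assumptions: the role name `compliance_admin`, the `[REDACTED]` placeholder, and the representation of findings as non-overlapping character spans are all invented for the example.

```python
def render_for(role: str, body: str, spans: list,
               privileged_roles: frozenset = frozenset({"compliance_admin"})) -> str:
    """Return the email body as a given role would see it.

    `spans` holds (start, end) character offsets of detected personal data
    and is assumed to be non-overlapping.
    """
    if role in privileged_roles:
        return body  # authorised users see the preserved original
    parts, cursor = [], 0
    for start, end in sorted(spans):
        parts.append(body[cursor:start])
        parts.append("[REDACTED]")  # the reader can tell redaction occurred
        cursor = end
    parts.append(body[cursor:])
    return "".join(parts)


# Usage: one detected passport number in an otherwise ordinary message.
body = "Crew: J. Doe, passport P1234567, joining 3 May."
offset = body.index("P1234567")
spans = [(offset, offset + len("P1234567"))]
standard_view = render_for("operator", body, spans)
admin_view = render_for("compliance_admin", body, spans)
```

The surrounding sentence stays readable in the standard view, and the original text is still available to a privileged role because nothing was removed from storage.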

Scheduled deletion

Separately from redaction, you can configure scheduled deletion: the permanent removal of identified personal data after a defined interval. This supports data minimisation obligations under GDPR, which require that personal data is not stored longer than necessary for the purpose for which it was collected. Redaction and deletion operate on independent schedules, giving compliance teams granular control over retention policy.
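Two independent schedules mean each finding moves through up to three states over time. The sketch below models that lifecycle with a hypothetical `RetentionPolicy`; the field names and the example intervals are assumptions, not product defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetentionPolicy:
    redact_after: timedelta   # hidden from standard users after this age
    delete_after: timedelta   # permanently removed after this age


def state_at(detected_at: datetime, now: datetime, policy: RetentionPolicy) -> str:
    """Return 'visible', 'redacted', or 'deleted' for a finding at time `now`."""
    age = now - detected_at
    if age >= policy.delete_after:
        return "deleted"
    if age >= policy.redact_after:
        return "redacted"
    return "visible"


# Usage: redact after one week, delete after one year (illustrative values).
policy = RetentionPolicy(redact_after=timedelta(days=7), delete_after=timedelta(days=365))
found = datetime(2024, 1, 1, tzinfo=timezone.utc)
```

Changing `delete_after` has no effect on when data becomes hidden, and vice versa, which is the practical meaning of independent schedules.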

What teams actually see

For operational users, the only visible change after Personal Data Redaction is activated is that certain data fields in emails become inaccessible once their redaction interval has passed. The PII Scan Card interface is surfaced to compliance and IT administrators, not to operational users. Most operational teams never interact with the system directly; it runs transparently in their existing Sedna workflow.

Integration: How It Fits Into Your Existing Environment

No new sub-processors

Personal Data Redaction is self-hosted within the Sedna environment. Detection and redaction processing does not involve external scanning services.

NO data leaves your Sedna instance for analysis.

This matters for two reasons: it satisfies the data residency requirements that many shipping companies operate under, and it removes the need to update your data processing agreements or your sub-processor list.

When customers ask us "Does this introduce new third-party data processors?", the answer is no.

No new tools for operational teams

There is no separate application to install, no new login credentials to manage, and no interface for operational users to learn. Personal Data Redaction extends Sedna Email. Teams send, receive, and manage email exactly as they do today. The administration interface is available within Sedna for IT and compliance users; operational users are unaffected.

Relationship to Microsoft Purview

Sedna’s Personal Data Redaction complements Microsoft 365’s Purview. Purview operates within the Microsoft ecosystem. It has no access to the Sedna Email environment, no awareness of Sedna's shared inbox structures, and no coverage of attachments processed within Sedna. If your organisation uses Purview for M365 data governance, Personal Data Redaction fills the gap that Purview cannot reach: the place where your most sensitive operational documents actually flow.

Deployment timeline

Standard deployment is quick, normally less than two weeks. There is no IT project, no infrastructure change, and no change management programme for operational teams. Sedna configures the module; your team defines the redaction schedules, permission levels, and any custom classifiers. The first two to three weeks after activation are used to tune the system: adjusting severity thresholds, disabling detectors that generate false positives in your specific environment (email address detection in footers is a common one to tune out), and building any custom classifiers for document types specific to your operations.

After that initial tuning period, the system requires minimal ongoing management.

Configuration and Admin Controls

Role-based permissions

Access to unredacted content is controlled by role. You define which users can view redacted data and which can manually override a redaction decision. This creates a clear chain of accountability that satisfies customer due diligence questionnaires and regulatory audit requirements.

Manual controls

Administrators can manually tag content for early redaction (ahead of the scheduled interval) or deferred redaction (extending the visibility window for a specific item). When a false positive occurs, namely when personal data is detected in a context where it shouldn't be redacted, an authorised user can restore visibility immediately. Every manual action is logged.
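The three manual actions and the audit trail can be sketched as follows. Everything here is hypothetical naming for illustration (`QueuedItem`, `AuditLog`, and the three functions); the point is only that each override changes the item's schedule or visibility and leaves a log entry behind.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class QueuedItem:
    item_id: str
    redact_at: datetime   # when scheduled redaction is due
    redacted: bool = False


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, actor: str, action: str, item_id: str, at: datetime) -> None:
        self.entries.append({"actor": actor, "action": action, "item": item_id, "at": at})


def redact_early(item: QueuedItem, actor: str, now: datetime, log: AuditLog) -> None:
    """Redact immediately, ahead of the scheduled interval."""
    item.redact_at, item.redacted = now, True
    log.record(actor, "redact_early", item.item_id, now)


def defer_redaction(item: QueuedItem, actor: str, extra: timedelta,
                    now: datetime, log: AuditLog) -> None:
    """Extend the visibility window for this one item."""
    item.redact_at += extra
    log.record(actor, "defer_redaction", item.item_id, now)


def restore_visibility(item: QueuedItem, actor: str, now: datetime, log: AuditLog) -> None:
    """Undo a redaction, for example after a false positive."""
    item.redacted = False
    log.record(actor, "restore_visibility", item.item_id, now)


# Usage: defer one item by three days, with the action recorded.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
log = AuditLog()
item = QueuedItem("msg-1", redact_at=now + timedelta(days=7))
defer_redaction(item, "admin@example.com", timedelta(days=3), now, log)
```

Routing every override through a function that also writes to the log is what makes the record complete; there is no code path that changes an item without an entry.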

Custom classifiers (Enterprise tier)

The standard 30+ detector library covers the data types common across shipping operations. Sedna Enterprise accounts can define additional classifiers for data types specific to their workflows: proprietary vessel management codes, jurisdiction-specific document formats, or internal data categories not covered by standard PII definitions. Custom classifiers are built collaboratively with us during the initial configuration period.

What to Evaluate Before Implementation

For IT and security teams assessing Personal Data Redaction, we recommend resolving the following questions before implementation:

Scope confirmation: Which mailboxes will be in scope for scanning? Sedna Email scans accessible shared inboxes. Define which mailboxes are included, and whether all or only a subset of your Sedna environment will be covered initially.

Redaction interval design: What are the operational requirements for data visibility windows? Work with your crewing and operations leads to define sensible intervals before activation. The system will be far less disruptive if the intervals are set correctly from day one.

Permission structure: Who needs access to unredacted content? Define this before go-live rather than retroactively, to avoid access requests creating operational bottlenecks.

Outlook visibility: If your teams also access email through Outlook alongside Sedna, note that redaction applies to the Sedna environment only. Data is redacted in the Sedna view; whether the same email in an Outlook client reflects the Sedna state depends on your integration configuration. This is worth validating for your specific environment.

False positive calibration: Plan for a two-to-three-week tuning window after activation. This is normal and expected; the system is designed to be refined for your specific data patterns. The initial configuration period is where you trade off sensitivity (catching more PII, with more false positives) against precision (fewer false positives, with potentially lower recall on edge cases).
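During tuning it helps to put numbers on the sensitivity versus precision trade-off. The sketch below computes the two standard metrics from a manually reviewed sample of detections. The counts in the usage lines are made up purely to show the arithmetic and do not describe any real deployment.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple:
    """Precision = TP / (TP + FP); recall (sensitivity) = TP / (TP + FN)."""
    flagged = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Invented review counts for two candidate configurations (illustration only).
sensitive = precision_recall(true_pos=90, false_pos=30, false_neg=10)
strict = precision_recall(true_pos=80, false_pos=5, false_neg=20)
```

In this invented sample the more sensitive configuration finds 90% of the real PII but a quarter of its flags are false positives, while the stricter one flags far less noise at the cost of missing more items. Reviewing a sample like this each week of the tuning window gives a concrete basis for moving thresholds.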

See Sedna’s Personal Data Redaction in action: Book a demo on the PDR product page →
