Security¶
PyNydus enforces a strict principle: no real secret or PII value should exist in the Egg or be visible to the LLM. This page covers the full security model: redaction, placeholder linking, signing, and trust boundaries.
Redaction boundary (secrets OUT)¶
During spawn, redaction happens before any parsing or LLM calls:
Source files (contain real secrets + PII)
│
▼
┌──────────────────────┐
│ Gitleaks │ Credentials → {{SECRET_NNN}}
│ Presidio │ PII → {{PII_NNN}}
└──────────────────────┘
│
▼
Redacted files (only placeholders)
│
├──> Spawner.parse() (sees placeholders only)
├──> LLM refinement (sees placeholders only)
├──> Egg archive (stores placeholders only)
├──> LLM hatch polish (sees placeholders only)
│
▼
┌──────────────────────┐
│ Secret injection │ {{SECRET_NNN}} → real values from .env
└──────────────────────┘
│
▼
Output files (real values restored)
Gitleaks (credentials)¶
Gitleaks scans source files for API
keys, tokens, and passwords, replacing each match with a {{SECRET_NNN}}
placeholder. Each finding becomes a SecretRecord with kind credential.
How the scan works:
Scannable files (non-binary) are written to a temporary directory.
Gitleaks runs against the temp dir using its built-in rule set.
Findings are mapped back to the in-memory file dict.
Each matched span is replaced with a unique {{SECRET_NNN}} token.
Ignored (binary) files pass through unchanged.
Gitleaks must be installed when spawning with REDACT true and SOURCE
directives. See Installation for setup.
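The span replacement step can be sketched as follows. This is a minimal illustration, not PyNydus's actual code: the `Finding` class, the `redact_spans` function, and the rule IDs are hypothetical, and the sketch assumes findings for a file do not overlap.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical stand-in for one scanner finding in a single file."""
    start: int    # inclusive character offset of the match
    end: int      # exclusive character offset of the match
    rule_id: str  # hypothetical rule identifier


def redact_spans(text: str, findings: list, first_index: int = 1):
    """Replace each matched span with a numbered {{SECRET_NNN}} token.

    Returns the redacted text and a mapping of placeholder -> original value.
    """
    mapping = {}
    ordered = sorted(findings, key=lambda f: f.start)
    tokens = [f"{{{{SECRET_{first_index + i:03d}}}}}" for i in range(len(ordered))]
    # Replace from the end of the text so earlier offsets stay valid.
    for finding, token in reversed(list(zip(ordered, tokens))):
        mapping[token] = text[finding.start:finding.end]
        text = text[:finding.start] + token + text[finding.end:]
    return text, mapping


source = "API_KEY=abc123\nDB_PASS=hunter2\n"
findings = [Finding(8, 14, "example-key-rule"), Finding(23, 30, "example-password-rule")]
redacted, mapping = redact_spans(source, findings)
print(redacted)
```

Replacing from the last span backwards avoids recomputing offsets after each substitution. The mapping holds the real values and would have to stay outside anything stored in the Egg.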
Presidio (PII)¶
Presidio scans the
(already gitleaks-processed) files for PII and replaces each match with a
{{PII_NNN}} placeholder (kind pii).
NLP model: en_core_web_lg (spaCy). Loaded once and cached as a singleton.
Confidence threshold: 0.40. Detections below this score are discarded.
Built-in entity types detected:
| Entity type | Examples |
|---|---|
| | Full names |
| | email@example.com |
| | +1-555-0123 |
| | 4111-1111-1111-1111 |
| | GB82 WEST 1234 5698 7654 32 |
| | 192.168.1.1 |
| | “123 Main St, Springfield” |
| | Medical license numbers |
Custom recognizers (added by PyNydus):
| Entity type | Pattern | Context keywords |
|---|---|---|
| | | ssn, social security |
| | | passport, travel document |
| | | driver, license, dl |
Suppressed entity types (too noisy): URL, DATE_TIME, NRP.
Overlap resolution: When multiple detections overlap the same text span, the highest-scoring, longest match wins. Others are discarded.
Deduplication: Identical PII values across files reuse the same placeholder.
For example, every occurrence of “John Smith” across three files becomes the same {{PII_001}}.
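The threshold, overlap resolution, and deduplication rules can be sketched in plain Python without calling Presidio. The `Detection` and `PlaceholderRegistry` names are hypothetical. The ranking used here (higher score first, longer span as tie-breaker) is one reading of "highest-scoring, longest match wins" and should be treated as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record: a span, an entity type, and a score."""
    start: int
    end: int
    entity_type: str
    score: float


THRESHOLD = 0.40  # confidence threshold stated above


def resolve_overlaps(detections: list) -> list:
    """Drop low-confidence hits, then keep one winner per overlapping group."""
    candidates = [d for d in detections if d.score >= THRESHOLD]
    # Assumed ranking: higher score first, longer span breaks ties.
    candidates.sort(key=lambda d: (-d.score, -(d.end - d.start)))
    kept = []
    for d in candidates:
        # Keep d only if it does not intersect any already accepted span.
        if all(d.end <= k.start or d.start >= k.end for k in kept):
            kept.append(d)
    return sorted(kept, key=lambda d: d.start)


class PlaceholderRegistry:
    """Hands out {{PII_NNN}} tokens, reusing the token for a repeated value."""

    def __init__(self) -> None:
        self._by_value = {}

    def token_for(self, value: str) -> str:
        if value not in self._by_value:
            self._by_value[value] = f"{{{{PII_{len(self._by_value) + 1:03d}}}}}"
        return self._by_value[value]


text = "Contact John Smith today"
hits = [
    Detection(8, 18, "PERSON", 0.85),     # "John Smith"
    Detection(8, 12, "PERSON", 0.60),     # "John", overlaps the hit above
    Detection(19, 24, "DATE_TIME", 0.30)  # below the threshold
]
winners = resolve_overlaps(hits)
registry = PlaceholderRegistry()
first = registry.token_for(text[winners[0].start:winners[0].end])
again = registry.token_for("John Smith")
```

Because the registry is keyed by the detected value, sharing one instance across all files is what makes identical values map to the same placeholder.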
File classification¶
Binary files are skipped during scanning. Classification is by extension:
Ignored (binary): png, jpg, jpeg, gif, webp, ico, svg,
pdf, zip, egg, gz, tar, bz2, xz, 7z, woff, woff2,
ttf, otf, eot, mp3, mp4, wav, ogg, webm, avi, bin,
exe, dll, so, dylib, pyc, pyo, class.
Scannable (everything else): Markdown, JSON, YAML, plain text, Python, etc.
REDACT directive¶
| Value | Secrets (gitleaks) | PII (Presidio) |
|---|---|---|
| true | Redacted | Redacted |
| false | Kept | Kept |
When REDACT false is set, a warning is logged. Use only for testing.
Placeholder linking¶
Every redacted value gets a unique placeholder token. The Egg’s SecretsModule
stores a SecretRecord for each:
| Field | Description |
|---|---|
| | Stable ID ( |
| | The token ( |
| | |
| | Entity type for PII records (e.g., |
| | Human-readable name (e.g., |
| | If |
| | How the value is substituted ( |
| | Optional description of what was redacted |
| | List of source files containing this placeholder |
nydus env agent.egg generates a template .env by reading all
SecretRecord entries and listing their names as keys to fill in.
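A template of this kind could be produced roughly as follows. The exact layout that `nydus env` writes is not specified here, so the `KEY=` format, the function name, and the example secret names are all assumptions.

```python
def render_env_template(names: list) -> str:
    """Build a .env template with one empty KEY= line per unique name."""
    # dict.fromkeys drops duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(names))
    return "".join(f"{name}=\n" for name in unique)


# Hypothetical SecretRecord names collected from an egg.
template = render_env_template(["OPENAI_API_KEY", "OWNER_EMAIL", "OPENAI_API_KEY"])
print(template)
```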
Injection boundary (secrets IN)¶
During hatch, secret injection is the last transformation before writing to disk:
Connector renders files (placeholders intact)
LLM polishes files (placeholders intact)
Secret injection: {{SECRET_NNN}} / {{PII_NNN}} replaced with real values from .env
Files written to disk
The .env file maps SecretRecord.name -> real value. If a record has
required_at_hatch=True and its name is missing, hatch fails with an error
listing the missing secrets.
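The injection step, including the failure on missing required secrets, can be sketched like this. `Record` is a hypothetical, simplified stand-in for SecretRecord (only `name` and `required_at_hatch` come from the text above; the `placeholder` field name is assumed). Leaving optional, unset placeholders untouched is also an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical, simplified view of a SecretRecord."""
    placeholder: str                # e.g. "{{SECRET_001}}"
    name: str                       # key looked up in the .env mapping
    required_at_hatch: bool = True


PLACEHOLDER_RE = re.compile(r"\{\{(?:SECRET|PII)_\d+\}\}")


def inject(text: str, records: list, env: dict) -> str:
    """Swap placeholders for real values, failing if a required one is missing."""
    missing = [r.name for r in records if r.required_at_hatch and r.name not in env]
    if missing:
        raise ValueError("missing secrets: " + ", ".join(missing))
    values = {r.placeholder: env[r.name] for r in records if r.name in env}
    # Assumption: placeholders with no value (optional and unset) stay as-is.
    return PLACEHOLDER_RE.sub(lambda m: values.get(m.group(0), m.group(0)), text)


records = [
    Record("{{SECRET_001}}", "API_KEY"),
    Record("{{PII_001}}", "OWNER_EMAIL", required_at_hatch=False),
]
rendered = inject("key={{SECRET_001}} mail={{PII_001}}", records, {"API_KEY": "abc123"})
```

Checking for missing names before any substitution means a failed hatch never produces partially injected output.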
Spawn log security¶
The spawn log captures detailed structured events from every pipeline step. It is stored in the Egg and forwarded to the hatch LLM. No real secret values or PII appear in the log:
Secret entries log only the placeholder name and gitleaks rule ID. The actual secret value is never recorded.
PII entries log only the entity type (e.g., PERSON) and the placeholder. The actual PII value is never recorded.
Text content is logged as character lengths, never as raw text. For example, a memory record logs text_length: 245, not the text itself.
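The logging rules above amount to building events from metadata only. In the sketch below, the function names, the `event` values, and every key other than `text_length` are hypothetical.

```python
def log_secret(placeholder: str, rule_id: str) -> dict:
    """Event for a redacted credential: placeholder and rule ID only."""
    return {"event": "secret_redacted", "placeholder": placeholder, "rule_id": rule_id}


def log_pii(placeholder: str, entity_type: str) -> dict:
    """Event for redacted PII: entity type and placeholder only."""
    return {"event": "pii_redacted", "placeholder": placeholder, "entity_type": entity_type}


def log_text(record_id: str, text: str) -> dict:
    """Event for a text record: its length is logged, never the text itself."""
    return {"event": "memory_record", "id": record_id, "text_length": len(text)}


body = "x" * 245
entry = log_text("mem-1", body)
```

None of these helpers accepts the real secret or PII value as an argument, so the value cannot leak into the log by accident.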
Egg signing (Ed25519)¶
Eggs can be signed with Ed25519 keys for integrity and authenticity.
Key generation¶
nydus keygen
Creates private.pem (permissions 600) and public.pem in ~/.nydus/keys/.
Custom directory: nydus keygen --dir ./my-keys/.
How signing works¶
When a private key exists at ~/.nydus/keys/private.pem, or when
NYDUS_PRIVATE_KEY is set to the PEM text of that key (not a file path),
nydus spawn automatically signs the egg:
Serialize manifest, skills, memory, and secrets into ordered byte arrays
Compute SHA-256 over the canonical content
Sign with Ed25519 private key
Store in manifest.signature and signature.json inside the archive
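The first two steps (canonical serialization and hashing) can be sketched with the standard library. The real byte layout PyNydus uses is not described here, so JSON with sorted keys and a length prefix per section are assumptions, as is the function name. The Ed25519 signing step needs a cryptography library and is left out of the sketch.

```python
import hashlib
import json


def canonical_digest(manifest, skills, memory, secrets) -> bytes:
    """SHA-256 over the four sections, serialized in a fixed order."""
    hasher = hashlib.sha256()
    for section in (manifest, skills, memory, secrets):
        # Assumption: sorted-key, whitespace-free JSON as the canonical form.
        blob = json.dumps(section, sort_keys=True, separators=(",", ":")).encode("utf-8")
        # Length-prefix each section so section boundaries are unambiguous.
        hasher.update(len(blob).to_bytes(8, "big"))
        hasher.update(blob)
    return hasher.digest()


digest_a = canonical_digest({"name": "agent", "version": 1}, [], [], [])
digest_b = canonical_digest({"version": 1, "name": "agent"}, [], [], [])
```

A canonical form matters because the signer and the verifier must hash exactly the same bytes. Without a fixed key order, two equivalent manifests could yield different digests and a valid egg would fail verification.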
Verification¶
Happens automatically during nydus hatch:
Valid: hatching proceeds
Invalid: hatching is rejected
Unsigned: hatching proceeds silently
Check status anytime: nydus inspect agent.egg.
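The three verification outcomes reduce to a small decision rule. The enum and function names in this sketch are hypothetical.

```python
from enum import Enum


class SignatureStatus(Enum):
    VALID = "valid"
    INVALID = "invalid"
    UNSIGNED = "unsigned"


def may_hatch(status: SignatureStatus) -> bool:
    """Only an invalid signature blocks hatching; unsigned eggs are accepted."""
    return status is not SignatureStatus.INVALID


decisions = {status.value: may_hatch(status) for status in SignatureStatus}
```

Since unsigned eggs hatch without any notice, a missing signature gives no assurance either way. Only a valid signature says anything about integrity.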
SDK signing¶
ny.save(egg, Path("agent.egg"), sign=True)
LLM trust boundary¶
The LLM is an untrusted component. It can modify content but never sees real secrets:
Redaction runs before the LLM sees content
Injection runs after the LLM finishes
Unknown file paths or record IDs in LLM output are dropped with a warning
If the LLM fails, the pipeline falls back to unrefined content
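The last two rules can be sketched as a merge step that treats LLM output as untrusted. The function name, the logger name, and the dict-of-files shape are assumptions made for illustration.

```python
import logging
from typing import Optional

logger = logging.getLogger("nydus.sketch")  # hypothetical logger name


def merge_llm_output(original: dict, llm_output: Optional[dict]) -> dict:
    """Accept LLM edits only for paths that already exist in the input."""
    if llm_output is None:
        # The LLM call failed: fall back to the unrefined content.
        return dict(original)
    merged = dict(original)
    for path, content in llm_output.items():
        if path not in original:
            logger.warning("dropping unknown path from LLM output: %s", path)
            continue
        merged[path] = content
    return merged


files = {"SOUL.md": "draft {{PII_001}}"}
refined = merge_llm_output(files, {"SOUL.md": "polished {{PII_001}}", "evil.sh": "rm -rf /"})
fallback = merge_llm_output(files, None)
```

Restricting output to known paths means the LLM can rewrite existing content but cannot add new files to the result.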
Recommendations¶
Always use REDACT true (the default) for real agent projects
Generate signing keys for eggs shared between teams (nydus keygen)
Use nydus env to create .env templates rather than guessing secret names
Review spawn logs (nydus inspect --logs) to verify redaction coverage
Keep gitleaks updated for broader credential pattern detection