pynydus.engine.pipeline¶
Spawning pipeline.
Resolves Nydusfile directives, loads sources, runs gitleaks and Presidio on file
text, invokes the platform spawner, optionally runs LLM refinement, then
builds manifest and Egg records.
Pipeline steps:
1. Resolve base egg (FROM directive)
2. Read source files
3. Redaction (file filtering, secret scan, PII redaction)
4. Parse sources via spawner connector
5. Build structured records (skills, memory, secrets)
6. Merge with base egg (FROM + SOURCE)
7. LLM refinement (optional)
8. Post-processing (custom labels, memory exclusions)
9. Package egg
Module Contents¶
Classes¶
| PipelineContext | Mutable context passed through each pipeline phase. |
Functions¶
| spawn | Run the spawning pipeline. |
| ensure_gitleaks_if_needed | Raise if gitleaks is required but not installed. |
| _resolve_base_egg | If FROM is present, load and merge the base egg. |
| _is_registry_ref | Check if a base egg reference looks like a registry ref (name:version). |
| _pull_registry_egg | Pull a registry egg to a temp file and return its path. |
| _read_source_files | Read source files into independent per-group dicts. |
| _read_files_from_path | Read text files matching patterns from a directory. |
| _filter_files_by_patterns | Remove files whose keys match any exclude glob. |
| _scan_secrets_gitleaks | Replace secrets with {{SECRET_NNN}} placeholders via gitleaks. |
| _redact_pii | Replace PII with {{PII_NNN}} placeholders via Presidio. |
| _get_spawner | Return the spawner connector for the given agent type. |
| _parse_sources | Parse redacted files, dispatching each source group to its own spawner. |
| _build_skills_module_from_parse | Convert ParseResult skills into AgentSkill objects. |
| _build_mcp_module_from_parse | Build McpModule from ParseResult’s raw MCP config dicts. |
| _build_memory_module_from_parse | Convert ParseResult memory into MemoryRecord objects. |
| _merge_skills | Combine base egg skills with freshly extracted skills, re-numbering IDs. |
| _merge_mcp | Merge MCP server configs from base egg and parsed source. |
| _merge_memory | Combine base egg memory with freshly extracted memory, re-numbering IDs. |
| _merge_secrets | Combine base egg secrets with extracted secrets, deduplicating by name. |
| _apply_custom_labels | Override memory record labels based on source_store pattern matching. |
| _drop_memory_records_with_excluded_labels | Remove memory records whose label is listed in excluded. |
| _package_egg | Construct the final Egg with manifest and neutral metadata fields. |
| _stash_apm | Find and return apm.yml content from source files (passthrough). |
| _resolve_a2a_card | Return A2A card: passthrough from source or generate from egg. |
| _embed_spec_snapshots | Load spec markdown files and build the snapshots dict with manifest.json. |
| _generate_standards_artifacts | Generate A2A card, AGENTS.md, spec snapshots, and stash apm.yml. |
Data¶
API¶
- pynydus.engine.pipeline.logger¶
‘getLogger(…)’
- class pynydus.engine.pipeline.PipelineContext¶
Mutable context passed through each pipeline phase.
All Nydusfile fields are front-loaded here at the start of the pipeline. No phase should reach back into NydusfileConfig.
- nydusfile_dir: pathlib.Path¶
None
- sources: list[pynydus.engine.nydusfile.SourceDirective]¶
‘field(…)’
- merge_ops: list[pynydus.engine.nydusfile.MergeOp]¶
‘field(…)’
- excluded_memory_labels: list[pynydus.common.enums.MemoryLabel]¶
‘field(…)’
- agent_type: pynydus.common.enums.AgentType | None¶
None
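The front-loading rule can be illustrated with a minimal sketch. SketchConfig, SketchContext, and build_context are hypothetical stand-ins invented for this example, not the real NydusfileConfig or PipelineContext, and the field types are simplified to strings.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a parsed Nydusfile."""
    sources: List[str] = field(default_factory=list)
    excluded_memory_labels: List[str] = field(default_factory=list)


@dataclass
class SketchContext:
    """Hypothetical stand-in for PipelineContext."""
    nydusfile_dir: Path
    sources: List[str] = field(default_factory=list)
    excluded_memory_labels: List[str] = field(default_factory=list)
    agent_type: Optional[str] = None  # filled in by a later phase


def build_context(config: SketchConfig, nydusfile_dir: Path) -> SketchContext:
    # Copy every config field once, up front; phases only ever see the context.
    return SketchContext(
        nydusfile_dir=nydusfile_dir,
        sources=list(config.sources),
        excluded_memory_labels=list(config.excluded_memory_labels),
    )


ctx = build_context(SketchConfig(sources=["./agent"]), Path("."))
ctx.agent_type = "example"  # a phase records what it learned on the shared context
```

Copying the lists means a phase that mutates the context cannot accidentally change the parsed configuration.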
- pynydus.engine.pipeline.spawn(config: pynydus.engine.nydusfile.NydusfileConfig, *, nydusfile_dir: pathlib.Path, llm_config: pynydus.llm.LLMTierConfig | None = None) tuple[pynydus.api.schemas.Egg, dict[str, str], dict[str, list[dict]]]¶
Run the spawning pipeline.
This is the single entry point for spawn: it enforces prerequisites such as ensure_gitleaks_if_needed before any file reads or redaction.
Args:
    config: Parsed Nydusfile (sources, FROM, merge ops, redaction flags).
    nydusfile_dir: Directory containing the Nydusfile (resolves relative paths).
    llm_config: Optional LLM tier for spawn Step 7 refinement.
Returns:
    (egg, raw_artifacts, logs): the spawned Egg, redacted source file contents, and pipeline log entries (e.g. {"spawn_log": [...]}).
Raises:
    NydusfileError: If the Nydusfile is invalid (e.g. multiple SOURCE lines).
    GitleaksNotFoundError: When redaction requires gitleaks but it is missing.
- pynydus.engine.pipeline.ensure_gitleaks_if_needed(config: pynydus.engine.nydusfile.NydusfileConfig) None¶
Raise if gitleaks is required but not installed.
Secret scanning is required when REDACT is true (the default) and at least one SOURCE directive is present. FROM-only spawns and REDACT false pipelines skip file-level scanning entirely.
Args:
    config: Parsed Nydusfile configuration.
Raises:
    GitleaksNotFoundError: When scanning is required but gitleaks is not found.
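The condition described above can be sketched as follows. SketchConfig and GitleaksMissing are hypothetical stand-ins, and looking the binary up on PATH with shutil.which is an assumption: this page does not say how pynydus locates gitleaks.

```python
import shutil
from dataclasses import dataclass, field
from typing import List


class GitleaksMissing(RuntimeError):
    """Hypothetical stand-in for GitleaksNotFoundError."""


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the parsed Nydusfile."""
    redact: bool = True  # REDACT defaults to true
    sources: List[str] = field(default_factory=list)


def scanning_required(config: SketchConfig) -> bool:
    # Scanning is needed only when redaction is on AND there is a SOURCE.
    return config.redact and len(config.sources) > 0


def ensure_gitleaks(config: SketchConfig, which=shutil.which) -> None:
    # `which` is injectable so the check can be exercised without gitleaks installed.
    if scanning_required(config) and which("gitleaks") is None:
        raise GitleaksMissing("gitleaks is required for redaction but was not found")


# FROM-only spawn: no sources, so no check is performed.
ensure_gitleaks(SketchConfig(sources=[]), which=lambda name: None)
```

Running the check before any file is read means a missing tool fails fast instead of leaving a half-processed spawn.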
- pynydus.engine.pipeline._resolve_base_egg(ctx: pynydus.engine.pipeline.PipelineContext) tuple[pynydus.api.schemas.EggPartial | None, pynydus.common.enums.AgentType | None]¶
If FROM is present, load and merge the base egg.
Returns (partial, agent_type): partial is the merged base egg, and agent_type is the base egg’s manifest agent type.
- pynydus.engine.pipeline._is_registry_ref(ref: str) bool¶
Check if a base egg reference looks like a registry ref (name:version).
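One way such a check could work is a pattern match on the name:version shape. The exact rule pynydus applies is not documented here; the character classes below are an assumption chosen so that filesystem paths (including Windows drive paths) are rejected.

```python
import re

# Assumed shape: a name (letters, digits, ".", "_", "/", "-"), one colon,
# then a version (letters, digits, ".", "_", "-"). Hypothetical rule.
_REF_PATTERN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._/-]*:[A-Za-z0-9][A-Za-z0-9._-]*$"
)


def looks_like_registry_ref(ref: str) -> bool:
    """Return True when ref has the name:version shape."""
    return _REF_PATTERN.match(ref) is not None


examples = {
    "my-agent:1.2.0": looks_like_registry_ref("my-agent:1.2.0"),
    "./base.egg": looks_like_registry_ref("./base.egg"),
    "C:/eggs/base.egg": looks_like_registry_ref("C:/eggs/base.egg"),
}
```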
- pynydus.engine.pipeline._pull_registry_egg(ref: str) str¶
Pull a registry egg to a temp file and return its path.
- pynydus.engine.pipeline._read_source_files(ctx: pynydus.engine.pipeline.PipelineContext) list[tuple[pynydus.common.enums.AgentType, pathlib.Path, dict[str, str]]]¶
Read source files into independent per-group dicts.
Returns a list with at most one (agent_type, source_root, files) tuple, since at most one SOURCE is allowed. Each group’s dict has bare filename keys and is independent: no merging is performed here.
- pynydus.engine.pipeline._read_files_from_path(root: pathlib.Path, patterns: list[str]) dict[str, str]¶
Read text files matching patterns from a directory.
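A minimal sketch of pattern-based reading with pathlib. The function name read_text_files is hypothetical, and two behaviours are assumptions rather than documented facts: keys are paths relative to the root, and files that fail UTF-8 decoding are skipped.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def read_text_files(root: Path, patterns: List[str]) -> Dict[str, str]:
    """Collect {relative_path: text} for files under root matching any glob."""
    files: Dict[str, str] = {}
    for pattern in patterns:
        for path in sorted(root.glob(pattern)):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except UnicodeDecodeError:
                continue  # assumed: non-text files are skipped
            files[path.relative_to(root).as_posix()] = text
    return files


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.md").write_text("hello", encoding="utf-8")
    (root / "blob.bin").write_bytes(b"\xff\xfe\x00")  # not valid UTF-8
    found = read_text_files(root, ["*.md", "*.bin"])
```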
- pynydus.engine.pipeline._filter_files_by_patterns(files: dict[str, str], patterns: list[str]) dict[str, str]¶
Remove files whose keys match any exclude glob.
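Exclude-glob filtering can be sketched with the standard fnmatch module. The helper name drop_excluded is hypothetical, and fnmatchcase is used here for platform-independent, case-sensitive matching; whether pynydus matches case-sensitively is not stated.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def drop_excluded(files: Dict[str, str], patterns: List[str]) -> Dict[str, str]:
    """Return a new dict without entries whose key matches any exclude glob."""
    return {
        name: text
        for name, text in files.items()
        if not any(fnmatchcase(name, pattern) for pattern in patterns)
    }


files = {"notes.md": "a", ".env": "b", "logs/run.log": "c"}
kept = drop_excluded(files, [".env", "*.log"])
```

Note that in fnmatch a `*` also matches path separators, so `*.log` excludes `logs/run.log` as well as a top-level log file.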
- pynydus.engine.pipeline._scan_secrets_gitleaks(files: dict[str, str], ctx: pynydus.engine.pipeline.PipelineContext, *, start_index: int = 1) tuple[dict[str, str], list[pynydus.api.schemas.SecretRecord], int]¶
Replace secrets with {{SECRET_NNN}} placeholders via gitleaks.
Writes scannable files to a temp directory, runs gitleaks, and maps findings back to in-memory dict keys. Ignored (binary) files pass through unchanged.
Returns (redacted_files, credential_records, next_index).
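The step that maps findings back to in-memory keys can be sketched without invoking gitleaks at all. Here findings are given as plain (file_key, secret_text) pairs, which is a simplification of whatever gitleaks actually reports; the record shape and the three-digit zero padding of NNN are also assumptions.

```python
from typing import Dict, List, Tuple


def apply_findings(
    files: Dict[str, str],
    findings: List[Tuple[str, str]],
    start_index: int = 1,
) -> Tuple[Dict[str, str], List[dict], int]:
    """Swap each found secret for a numbered placeholder."""
    redacted = dict(files)  # leave the caller's dict untouched
    records: List[dict] = []
    index = start_index
    for key, secret in findings:
        if key not in redacted or secret not in redacted[key]:
            continue  # finding does not map to an in-memory file
        placeholder = "{{SECRET_%03d}}" % index
        redacted[key] = redacted[key].replace(secret, placeholder)
        records.append({"placeholder": placeholder, "source": key})
        index += 1
    return redacted, records, index


files = {"config.txt": "token=abc123\n", "notes.md": "nothing here"}
redacted, records, next_index = apply_findings(files, [("config.txt", "abc123")])
```

Files with no findings, like notes.md above, come back unchanged.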
- pynydus.engine.pipeline._redact_pii(files: dict[str, str], ctx: pynydus.engine.pipeline.PipelineContext, *, start_index: int = 1) tuple[dict[str, str], list[pynydus.api.schemas.SecretRecord], int]¶
Replace PII with {{PII_NNN}} placeholders via Presidio.
Returns (redacted_files, pii_records, next_index) so callers can chain the counter across multiple groups.
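Counter chaining can be shown with a small stand-in detector. The regex below replaces Presidio purely for illustration and only catches simple email addresses; redact_emails is a hypothetical name, and giving every occurrence its own number is an assumption.

```python
import re
from typing import Dict, Tuple

# Stand-in detector: a deliberately simple email pattern, NOT Presidio.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_emails(files: Dict[str, str], start_index: int = 1) -> Tuple[Dict[str, str], int]:
    """Replace each match with {{PII_NNN}} and return the next free index."""
    index = start_index
    redacted: Dict[str, str] = {}

    def substitute(match: "re.Match[str]") -> str:
        nonlocal index
        placeholder = "{{PII_%03d}}" % index
        index += 1
        return placeholder

    for key, text in files.items():
        redacted[key] = EMAIL.sub(substitute, text)
    return redacted, index


group_a = {"a.md": "mail ann@example.com"}
group_b = {"b.md": "mail bob@example.org and cy@example.net"}

# Feed the returned counter into the next call so numbering never restarts.
out_a, next_index = redact_emails(group_a)
out_b, next_index = redact_emails(group_b, start_index=next_index)
```

Because the second call starts where the first ended, placeholders stay unique across groups.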
- pynydus.engine.pipeline._get_spawner(agent_type: pynydus.common.enums.AgentType)¶
Return the spawner connector for the given agent type.
- pynydus.engine.pipeline._parse_sources(source_groups: list[tuple[pynydus.common.enums.AgentType, pathlib.Path, dict[str, str]]], ctx: pynydus.engine.pipeline.PipelineContext) pynydus.api.raw_types.ParseResult¶
Parse redacted files, dispatching each source group to its own spawner.
Each group’s dict is already redacted: it is passed directly to the spawner with bare filename keys.
- pynydus.engine.pipeline._build_skills_module_from_parse(parse_result: pynydus.api.raw_types.ParseResult, agent_type: pynydus.common.enums.AgentType) pynydus.api.schemas.SkillsModule¶
Convert ParseResult skills into AgentSkill objects.
- pynydus.engine.pipeline._build_mcp_module_from_parse(parse_result: pynydus.api.raw_types.ParseResult) pynydus.api.schemas.McpModule¶
Build McpModule from ParseResult’s raw MCP config dicts.
- pynydus.engine.pipeline._build_memory_module_from_parse(parse_result: pynydus.api.raw_types.ParseResult, agent_type: pynydus.common.enums.AgentType) pynydus.api.schemas.MemoryModule¶
Convert ParseResult memory into MemoryRecord objects.
- pynydus.engine.pipeline._merge_skills(base: pynydus.api.schemas.SkillsModule, extracted: pynydus.api.schemas.SkillsModule) pynydus.api.schemas.SkillsModule¶
Combine base egg skills with freshly extracted skills, re-numbering IDs.
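Re-numbering after a merge avoids ID collisions between the base egg and the new source. The sketch below uses plain dicts and an invented "skill_NNN" ID format; the real AgentSkill fields and ID scheme are not shown on this page.

```python
from typing import List


def merge_and_renumber(base: List[dict], extracted: List[dict], prefix: str = "skill") -> List[dict]:
    """Concatenate base then extracted, assigning fresh sequential IDs."""
    merged = []
    for position, item in enumerate([*base, *extracted], start=1):
        # Copy each item so neither input list is mutated.
        merged.append({**item, "id": "%s_%03d" % (prefix, position)})
    return merged


base = [{"id": "skill_001", "name": "search"}]
extracted = [{"id": "skill_001", "name": "summarize"}]  # same ID as the base item
merged = merge_and_renumber(base, extracted)
```

The same approach would apply to memory records with a different prefix.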
- pynydus.engine.pipeline._merge_mcp(base: pynydus.api.schemas.McpModule, extracted: pynydus.api.schemas.McpModule) pynydus.api.schemas.McpModule¶
Merge MCP server configs from base egg and parsed source.
- pynydus.engine.pipeline._merge_memory(base: pynydus.api.schemas.MemoryModule, extracted: pynydus.api.schemas.MemoryModule) pynydus.api.schemas.MemoryModule¶
Combine base egg memory with freshly extracted memory, re-numbering IDs.
- pynydus.engine.pipeline._merge_secrets(base: pynydus.api.schemas.SecretsModule, extracted: pynydus.api.schemas.SecretsModule) pynydus.api.schemas.SecretsModule¶
Combine base egg secrets with extracted secrets, deduplicating by name.
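Deduplicating by name can be sketched with an insertion-ordered dict. Which side wins on a name clash is not stated on this page; the sketch assumes the base egg's record is kept, and the record fields are hypothetical.

```python
from typing import Dict, List


def merge_by_name(base: List[dict], extracted: List[dict]) -> List[dict]:
    """Merge two record lists, keeping one record per name."""
    merged: Dict[str, dict] = {}
    for record in [*base, *extracted]:
        # Assumption: the first occurrence (the base egg's) wins.
        merged.setdefault(record["name"], record)
    return list(merged.values())


base = [{"name": "API_KEY", "origin": "base"}]
extracted = [
    {"name": "API_KEY", "origin": "source"},
    {"name": "DB_PASSWORD", "origin": "source"},
]
secrets = merge_by_name(base, extracted)
```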
- pynydus.engine.pipeline._apply_custom_labels(memory: pynydus.api.schemas.MemoryModule, custom_labels: dict[str, str], spawn_log: list[dict] | None = None) None¶
Override memory record labels based on source_store pattern matching.
- pynydus.engine.pipeline._drop_memory_records_with_excluded_labels(memory: pynydus.api.schemas.MemoryModule, excluded: list[pynydus.common.enums.MemoryLabel], spawn_log: list[dict] | None = None) pynydus.api.schemas.MemoryModule¶
Remove memory records whose label is listed in excluded.
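The two post-processing helpers above can be sketched together. Records are plain dicts here, glob matching on source_store is an assumption (this page only says "pattern matching"), and the log entry shape is invented. The in-place versus new-object split mirrors the signatures: the label override returns None, while the exclusion returns a module.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


def apply_labels(records: List[dict], custom_labels: Dict[str, str],
                 log: Optional[List[dict]] = None) -> None:
    """Override labels in place; the first matching pattern wins."""
    for record in records:
        for pattern, label in custom_labels.items():
            if fnmatchcase(record["source_store"], pattern):
                if log is not None:
                    log.append({"source_store": record["source_store"], "label": label})
                record["label"] = label
                break


def drop_excluded_labels(records: List[dict], excluded: List[str]) -> List[dict]:
    """Return a new list without records carrying an excluded label."""
    return [record for record in records if record["label"] not in excluded]


records = [
    {"source_store": "memory/diary.md", "label": "note"},
    {"source_store": "memory/facts.md", "label": "fact"},
]
log: List[dict] = []
apply_labels(records, {"memory/diary*": "private"}, log)
remaining = drop_excluded_labels(records, ["private"])
```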
- pynydus.engine.pipeline._package_egg(ctx: pynydus.engine.pipeline.PipelineContext, skills: pynydus.api.schemas.SkillsModule, mcp: pynydus.api.schemas.McpModule, memory: pynydus.api.schemas.MemoryModule, secrets: pynydus.api.schemas.SecretsModule, parse_result: pynydus.api.raw_types.ParseResult | None = None) pynydus.api.schemas.Egg¶
Construct the final Egg with manifest and neutral metadata fields.
- pynydus.engine.pipeline._stash_apm(groups: list[tuple[pynydus.common.enums.AgentType, pathlib.Path, dict[str, str]]], ctx: pynydus.engine.pipeline.PipelineContext) str | None¶
Find and return apm.yml content from source files (passthrough).
- pynydus.engine.pipeline._resolve_a2a_card(groups: list[tuple[pynydus.common.enums.AgentType, pathlib.Path, dict[str, str]]], egg: pynydus.api.schemas.Egg, ctx: pynydus.engine.pipeline.PipelineContext) dict | None¶
Return A2A card: passthrough from source or generate from egg.
- pynydus.engine.pipeline._embed_spec_snapshots(ctx: pynydus.engine.pipeline.PipelineContext) dict[str, str] | None¶
Load spec markdown files and build the snapshots dict with manifest.json.
- pynydus.engine.pipeline._generate_standards_artifacts(egg: pynydus.api.schemas.Egg, groups: list[tuple[pynydus.common.enums.AgentType, pathlib.Path, dict[str, str]]], ctx: pynydus.engine.pipeline.PipelineContext) pynydus.api.schemas.Egg¶
Generate A2A card, AGENTS.md, spec snapshots, and stash apm.yml.
Mutates nothing. Returns a new Egg via model_copy().
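The copy-instead-of-mutate pattern can be illustrated with the standard library. pydantic's model_copy(update=...) is the call named above; dataclasses.replace stands in for it here so the sketch has no third-party dependency. SketchEgg and its fields are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchEgg:
    """Hypothetical stand-in for Egg; frozen so mutation is impossible."""
    name: str
    agents_md: Optional[str] = None


def with_artifacts(egg: SketchEgg, agents_md: str) -> SketchEgg:
    # Build a new object with the extra field set; the input stays as it was.
    return replace(egg, agents_md=agents_md)


original = SketchEgg(name="demo")
enriched = with_artifacts(original, "# AGENTS\n")
```

Returning a fresh object lets the caller keep the pre-artifact egg for comparison or logging.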