Code Style¶
Rules for docstrings, comments, formatting, and lint in the pynydus/ tree.
Humans and automation should follow these conventions.
Typography¶
Avoid these in docstrings, # comments, Markdown prose meant for readers, or
user-visible runtime strings (CLI output, raised errors, log messages to humans):
Em dash (Unicode U+2014). Prefer a colon (label: explanation), a comma, parentheses, or two short sentences.
Semicolon (
;) to join clauses. Use a period, comma, or parentheses. Google-styleArgs:lines use a colon after the parameter name only. Do not chain clauses with;inside descriptions.
Do not change semicolons required by Python syntax or by wire-format literals
that must match an external protocol. When joining human-readable fragments, use
| or newlines instead of "; ".
Docstring style: Google¶
Use Google-style docstrings for all modules, classes, and functions.
Module-level docstrings¶
Every .py file must start with a module docstring. State what the module
does, not what it contains. If the module implements a multi-step process, list
the steps.
"""Hatching pipeline: transforms an Egg into a target runtime.
Two modes:
rebuild (default): render from structured egg modules via connector
passthrough replay redacted raw/ snapshot verbatim
Pipeline steps:
1. Version check
2. Build file dict
3. LLM polish
4. Secrets IN
5. Write to disk
6. Hatch log
"""
Function and method docstrings¶
Every public function, every private function longer than about five lines, and every method on a public class must have a docstring.
Required sections (include only those that apply):
Section |
When to include |
|---|---|
Summary line |
Always. One imperative sentence. |
Args |
Any function with parameters beyond |
Returns |
Any function that returns a non-None value. |
Raises |
Any function that intentionally raises. |
Format:
def refine_hatch(
file_dict: dict[str, str],
egg: Egg,
llm_config: LLMTierConfig,
*,
log: list[dict] | None = None,
) -> dict[str, str]:
"""Adapt reconstructed files for the target platform via LLM.
Sends each file through the configured LLM tier for polishing or
cross-platform adaptation. Retries up to REFINEMENT_RETRY_LIMIT
times if redaction placeholders are dropped, and falls back to the
original content when all attempts fail.
Args:
file_dict: Mapping of filename to file content (placeholder'd).
egg: The Egg being hatched (provides manifest and secrets list).
llm_config: LLM provider, model, and API key.
log: If set, hatch log entries are appended here.
Returns:
Updated file dict. Files whose placeholders could not be
preserved are returned unchanged.
"""
Bad (do not do this):
def refine_skills(skills, llm_config, spawn_log=None):
"""Standalone skill refinement: delegates to _refine_skills via EggPartial."""
This restates internal wiring. It tells the caller nothing about what the function does, what it expects, or what it returns.
Class docstrings¶
One-line summary of purpose. If the class has important invariants or lifecycle notes, add them after the summary.
class PipelineContext:
"""Mutable state threaded through each spawn pipeline step.
All Nydusfile fields are front-loaded here at construction time.
No step should reach back into NydusfileConfig after this point.
"""
Docstrings for trivial functions¶
Functions that are genuinely one-line (e.g. a property, a thin wrapper) can use a single-line docstring. Do not pad with empty Args/Returns sections when the signature is self-documenting:
def _extract_placeholders(text: str) -> set[str]:
"""Return all {{SECRET_NNN}} / {{PII_NNN}} tokens in *text*."""
return set(_PLACEHOLDER_RE.findall(text))
In-function comments¶
When to comment¶
Comment why, not what. The code already says what it does. A comment should explain:
Non-obvious intent or constraints (“we check version compat before any file I/O so we fail fast on stale eggs”)
Domain rules that are not obvious from the code (“the LLM never sees real secrets: only placeholder tokens”)
Security boundaries (“secrets IN is the last transform before disk. Nothing after this point should touch file content”)
The reason for a specific ordering or grouping
Edge cases and gotchas (“Pydantic model_copy creates a shallow copy of lists, so we extend the caller’s spawn_log explicitly”)
When NOT to comment¶
Do not write comments that narrate the code:
# BAD: restates the code
secret_counter = 1 # Initialize secret counter to 1
# BAD: obvious from the function name
group_files = _filter_files_by_patterns(group_files, patterns) # Filter files
# BAD: labels a block that is already self-explanatory
url = url.rstrip("/") # Strip trailing slash for clean URL joining
Step comments in pipelines¶
When a function implements a numbered pipeline (spawn steps 1-10, hatch steps 1-7), use step-header comments to mark boundaries. These are structural, not narrating. Keep them short and match the documented step names:
# --- Step 3: Redaction (secret scan + PII) ---
After the header, add a brief “why” comment only if the step has non-obvious behavior:
# --- Step 4: Secrets IN (last transform before disk) ---
# Real values are injected here so the LLM never sees them.
if placeholder_map:
file_dict = _substitute_secrets(file_dict, placeholder_map)
Section separators¶
Use the project’s existing separator style for grouping related functions:
# ---------------------------------------------------------------------------
# Step 1 helpers: Base egg resolution
# ---------------------------------------------------------------------------
Do not over-section. One separator per logical group is enough. Do not add a separator for a single function.
Project-specific API expectations¶
Every module needs a file-level docstring (what the module does).
pynydus/anddocs/conf.py: public functions with parameters need anArgs:section (andReturns:/Raises:when applicable).tests/still need module docstrings but are not required to addArgs:on everytest_*function.Private helpers longer than about eight lines should have at least a one-line summary.
Prefer step headers and short why comments at security boundaries. Avoid comments that only restate the next line of code.
Formatting and lint¶
Local CI-style checks live in the repository root (the nydus/ directory
that contains Makefile, pynydus/, and tests/).
make check¶
Runs Ruff format check and lint on pynydus/ and tests/. There are
no writes. Use make fmt to auto-format and apply safe Ruff fixes.
Ruff¶
Rules: E, F, I, UP (pyflakes, pycodestyle, isort, pyupgrade). Line
length: 100. Target: Python 3.10.
Run manually:
uv run ruff format --check pynydus tests
uv run ruff check pynydus tests