DCFN - Research

Changelog

Currently v0.3.10.


What's changed in the engine's user-visible output, in reverse chronological order. Pre-1.0 versioning convention:


v0.3.10 — 2026-04-30

Citation walk: hybrid two-pass for multi-source corpora

v0.3.5's bidirectional citation walk shipped assuming all reference IDs were Semantic Scholar paperIds (40-char hex). In practice, the merged corpus pulls from 4-6 sources and references[] is mixed-format: OpenAlex Work IDs, PubMed UIDs, and arXiv IDs. v0.3.8's S2-only filter prevented the resulting 400 Bad Request crash, but at the cost of near-zero expansion on multi-source corpora (one local test: 749 non-S2 IDs filtered, 0 added).

v0.3.10 closes the gap with a hybrid two-pass design: Pass 1 batches native S2 IDs alongside non-S2 IDs translated into S2's prefix syntax; Pass 2 falls back to per-source fetches for anything still unresolved. Highlights:

Graceful degradation: if the S2 batch returns 429 (the free-tier rate limit, which is common) or any other error, the IDs that came in via DOI translation are re-attempted via per-source fetch, so the walk as a whole doesn't depend on S2 cooperating.

Metadata richness: the expand_via_citation_walk return dict now includes ids_resolved_via_s2_native, ids_resolved_via_doi_translation, ids_resolved_via_per_source, ids_unresolved, and per_source_breakdown, so the operator (and Z reading the report) can see exactly where each neighbor came from and which sources contributed.

Empirical validation (local Single-Cell corpus, 100 OpenAlex neighbors, S2 rate-limited): 25 articles added in 8.6s via Pass 2 OpenAlex fallback alone. With production S2 API key cooperating, Pass 1 + Pass 2 combined would land substantially more.

The new id_translation.py module centralizes paper-ID source recognition and DOI prefix formatting, so downstream code doesn't have to sprinkle prefix-matching logic.

Open follow-on (tracked separately): some Pass 2 OpenAlex fetches return papers without abstracts (filtered out by the existing _to_article_record schema requirement). Worth a future Bio/Research investigation — abstract-less papers may still carry useful metadata if the engine is willing to operate without abstracts on those nodes.


v0.3.9 — 2026-04-30

Pipeline timing instrumentation in autonomous-scheduler path (Charter §16)

main.py's user-driven path already had per-stage timing via stage_timings. The autonomous-scheduler path (scheduler.py:_run_autonomous_pipeline) was missing it; only total elapsed time was logged. Added per-stage capture for: qeb_encoding, concept_graph, cte_traversal, apriori, svw, hypothesis_generation, calibration, bridge_detection_and_rerank. Timings surface as a single [PIPELINE_TIMING] log line per run and are persisted to the report's stage_timings field for downstream tooling.
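One common way to capture per-stage timings like this is a small context manager. A minimal sketch, assuming the `stage_timings` dict and `[PIPELINE_TIMING]` log-line shape from the entry above; the helper name and the stand-in stage bodies are illustrative, not the scheduler's real code:

```python
import time
from contextlib import contextmanager

# Accumulates {stage_name: seconds}; persisted to the report in the real path.
stage_timings: dict[str, float] = {}

@contextmanager
def timed_stage(name: str):
    """Record wall-clock seconds for the wrapped stage, even on error."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_timings[name] = round(time.perf_counter() - start, 3)

# Usage inside a hypothetical pipeline run:
with timed_stage("qeb_encoding"):
    time.sleep(0.01)   # stand-in for the real stage
with timed_stage("concept_graph"):
    time.sleep(0.01)

print("[PIPELINE_TIMING] " + " ".join(f"{k}={v}s" for k, v in stage_timings.items()))
```

The `finally` clause matters: a stage that raises still records its elapsed time, so a crashed run leaves partial timing data behind.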

Triggered by Charter §16 codification (Patents L1 ran 50 min and we had no per-step data to answer "should we upgrade Render tier?"). This closes the gap on the Research autonomous path so the same question is answerable empirically there too.

Note: the multi-source citation walk hybrid (DOI translation + per-source fanout) flagged in v0.3.8 is now tracked as v0.3.10 (the next release).


v0.3.8 — 2026-04-30

Critical fix: every autonomous run since v0.3.7 was crashing

Local Research validation surfaced two bugs in code I shipped earlier today.

Honest caveat on citation-walk effectiveness

The S2 ID filter prevents the crash but reveals a deeper architecture gap: in corpora dominated by non-S2 sources (OpenAlex, PubMed), most references are non-S2-format and get dropped, so the walk produces near-zero expansion. Tested against a real Single-Cell corpus: 749 non-S2 IDs filtered, 0 S2 IDs walked, 0 articles added. The v0.3.5 architectural value (bidirectional cross-source neighbor expansion) doesn't yet land for multi-source corpora.

Follow-on work tracked separately for v0.3.9: translate non-S2 IDs to S2's prefix syntax (DOI:10.x, PMID:NNN) before batching, OR fan out per-source (OpenAlex API for openalex: IDs, PubMed E-utilities for pmid: IDs). Not a v0.3.8 ship — needs design.


v0.3.7 — 2026-04-30

Hypothesis-target granularity + trajectory anti-drift

Two coupled fixes for Perplexity's 2026-04-30 broad-vocabulary findings.


v0.3.6 — 2026-04-30

Vocabulary-bleed suppression in convergence detection


v0.3.5 — 2026-04-30

Engine depth: bidirectional citation-walk corpus expansion (Charter §12 Pattern B)


v0.3.4 — 2026-04-30

Quality fix


v0.3.3 — 2026-04-30

Quality fix


v0.3.2 — 2026-04-30

Patent attribution accuracy

The footer's "Built on" line was undercounting: it said "6 U.S. Patents Pending" and named only CTE + QECO. The actual total since the Tesseract Composition supplemental landed (2026-04-20) is 8, and the engine rides more than two substrate patents. Updated to:

Same correction applied to the Firebase brand site's DCFN-Research card.


v0.3.1 — 2026-04-30

Quality fix


v0.3.0 — 2026-04-30

Discovery-driven autonomous runs

The Research engine's autonomous-run path now drives from a discovery-agent-fed queue instead of cycling fixed domains. A discovery agent identifies new research topics worth running by querying Semantic Scholar (with PubMed fallback) for substantive recent activity in curated seed areas, derives a topic configuration from the top results, and proposes it for human review. After a 7-day cooldown without rejection, the proposal auto-promotes into the live run queue, where the engine executes the full pipeline against it once or twice before going dormant.
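The proposal lifecycle above (propose, human review window, auto-promote after a 7-day cooldown without rejection) can be sketched as a small state check. A minimal illustration: the class name, fields, and helper are assumptions for this sketch, not the engine's real schema:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical sketch of the cooldown auto-promotion rule described above.
COOLDOWN = timedelta(days=7)

@dataclass
class TopicProposal:
    topic: str
    proposed_at: datetime
    rejected: bool = False

    def promotable(self, now: datetime) -> bool:
        """Auto-promote once the cooldown elapses without a rejection."""
        return not self.rejected and now - self.proposed_at >= COOLDOWN

def promote_ready(queue: list[TopicProposal], now: datetime) -> list[TopicProposal]:
    """Select cooldown-expired, unrejected proposals for the live run queue."""
    return [p for p in queue if p.promotable(now)]
```

The key property is that promotion is passive: no human action is required to ship a proposal, only to stop one, which is what lets the autonomous path keep moving.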

Why this matters: it converts the autonomous path from "run the same three domains every day" (which produces noise) into "surface new research territory worth exploring" (which produces signal). Each run feeds the Bridge Inbox + LEF Ai Upstream telemetry channels — autonomous runs are the substrate's input.

Engine output

Three-layer report architecture


v0.2.0 — 2026-04-08

Engine output

Sources


v0.1.0 — 2026-03-15

Initial deployment. Single-page intake → multi-source ingest → concept graph construction with typed edges → Cognitive Traversal Engine (5 operations: backward / forward / branch cataloging / entropy / golden token) → SVW convergence detection → Apriori pattern mining → Article + Technical Report generation. Free tier: 5 runs / month / browser; $15 unlock for Layer 2 + Layer 3 deeper traversal.
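The stage chain above can be sketched as a simple function pipeline. Stage names come from the changelog entry; the placeholder bodies and the `run_pipeline` helper are illustrative assumptions, not the engine's real implementations:

```python
from typing import Any, Callable

Stage = Callable[[dict[str, Any]], dict[str, Any]]

def run_pipeline(intake: dict[str, Any], stages: list[tuple[str, Stage]]) -> dict[str, Any]:
    """Thread the intake state through each named stage in order."""
    state = intake
    for _name, stage in stages:
        state = stage(state)
    return state

# Placeholder stages mirroring the v0.1.0 flow; each one enriches the state.
stages: list[tuple[str, Stage]] = [
    ("multi_source_ingest", lambda s: {**s, "articles": []}),
    ("concept_graph",       lambda s: {**s, "graph": {}}),
    ("cte_traversal",       lambda s: {**s, "traversal": []}),
    ("svw_convergence",     lambda s: {**s, "convergence": {}}),
    ("apriori_mining",      lambda s: {**s, "patterns": []}),
    ("report_generation",   lambda s: {**s, "report": "..."}),
]

result = run_pipeline({"query": "single-cell"}, stages)
```

Each stage only adds to the shared state dict, so later stages (and the final report) can see every upstream artifact.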