Poseidon Analytics Engine: The AI Brain Behind Hercules Works Survey Analytics
Poseidon Is the AI Analytics Engine Most Indian Brands Don't Know Exists. They Should.
Poseidon is the AI analytics engine that powers the Hercules Works consumer intelligence platform. Built by Jupiter Meta Labs in Bangalore, Poseidon is what happens when you combine a typed Survey Knowledge Graph, DuckDB-on-Parquet columnar storage, a five-phase LangGraph pipeline, three-layer numerical verification, and Google Gemini into a single coherent system designed for one purpose: turning Indian consumer survey data into verified, narrative-ready insights in minutes.
Poseidon is not a chatbot. It is not a dashboard. It is not a BI tool. It is a schema-first, formula-driven analytics engine that builds a complete structural understanding of every survey at ingestion time, then uses that understanding to answer natural-language questions with verified, charted, narrative insights. Every number in every answer is independently re-computed before it reaches the user. The LLM is involved only in interpretation and narrative, never in arithmetic.
Poseidon powers the analytics for every survey created in Hercules Works and answered by 20M+ verified Indian consumers through the SuperJ app. Plans from ₹0/month. See advanced survey analytics for the user-facing capability and insight engines for the strategic context.
What Is the Poseidon Analytics Engine?
Poseidon is the AI/ML analytics backend for the Hercules Works survey platform. When a survey is created in Hercules and answered by verified Indian consumers through the SuperJ app, Poseidon ingests the responses, builds a complete structural understanding of the survey (questions, types, skip logic, demographic axes), and turns natural-language questions into verified, charted, narrative-ready insights.
Built On. FastAPI, LangGraph, Google Gemini, DuckDB, Parquet, PostgreSQL. Python 3.11+ backend running in containers on Google Cloud.
Core Capabilities. Three concrete capabilities:
- Natural-language chat analytics: ask questions in plain English, get verified answers in 1-3 seconds. The user types 'what is NPS by city tier?' and Poseidon returns the answer with a chart, a confidence score, and 2-3 follow-up suggestions.
- Live streaming with persistence: the same pipeline, streamed as Server-Sent Events with a 0-100% progress bar. The backend job continues if the browser refreshes; the result is persisted and recovered on reconnect.
- Long-form research reports: a single API call produces a 20-50 page research-grade report in Markdown, HTML, and PDF, structured around the research brief. The 18-node report pipeline runs in 5 phases with 2 conditional retry loops for quality.
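To make the streaming capability more concrete, here is a minimal sketch of how progress updates could be framed as Server-Sent Events while the latest state is kept for a reconnecting client. The `JOBS` store, the `progress` event name, and the function names are assumptions for illustration only; an in-memory dict stands in for whatever persistence layer Poseidon actually uses.

```python
import json
from typing import Iterator, Optional

# Hypothetical in-memory job store. A real deployment would persist this
# somewhere durable; the source does not describe the actual schema.
JOBS: dict = {}


def sse_event(event: str, payload: dict) -> str:
    """Format one Server-Sent Event frame: event line, data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def run_job(job_id: str, steps: list) -> Iterator[str]:
    """Yield one SSE frame per step and record the latest state in JOBS."""
    for i, step in enumerate(steps, start=1):
        percent = round(100 * i / len(steps))
        JOBS[job_id] = {"percent": percent, "step": step, "done": i == len(steps)}
        yield sse_event("progress", JOBS[job_id])


def recover(job_id: str) -> Optional[dict]:
    """State a reconnecting client could fetch instead of restarting the job."""
    return JOBS.get(job_id)


frames = list(run_job("job-1", ["init", "route", "analyse", "verify", "report"]))
print(frames[0], end="")
print(recover("job-1"))
```

This sketch only shows the frame format and the recoverable state. Decoupling the job from the HTTP connection (for example with a background worker) is what would let it continue through a browser refresh, and that part is not shown here.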
The Philosophy. Schema-first, formula-driven analytics. Other tools feed raw data to an LLM and ask it to figure out the structure. Poseidon builds a typed Survey Knowledge Graph at ingestion time, and every query traverses the graph. The LLM is involved only in interpretation and narrative. Every number is verified.
The Differentiator Stack. Three-layer numerical verification (deterministic claim verifier + structural output validator + LLM self-critique). Semantic cache (78% hit rate on production traffic, sub-200ms responses). Survey Knowledge Graph (built once, traversed for every query). DuckDB on Parquet (10-20x faster than Pandas). India-first design (trained on Indian consumer data, 8+ Indian languages, cultural calibration). See advanced survey analytics for the user-facing capability.
Inside Poseidon: The 5-Phase Per-Query Pipeline
When you type a question in the Hercules chat, Poseidon runs it through a 5-phase LangGraph pipeline. This pipeline is the single most important thing that distinguishes Poseidon from generic 'ask your CSV' tools.
Phase 1 — Initialization.
- Load the Parquet schema (the shape of the data, not the rows themselves).
- Check if the question is actually relevant to the loaded survey. If the user asks about pricing but the survey is about shampoo, Poseidon refuses rather than hallucinating.
- Rewrite ambiguous queries for precision.
- Check a semantic cache (vector embeddings, threshold 0.88). If a similar prior query exists and the data fingerprint hasn't changed, return the cached result instantly.
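The semantic cache check in Phase 1 can be sketched roughly as follows. This assumes the 0.88 threshold is applied to cosine similarity between query embeddings, which the text does not state explicitly; the `CacheEntry` and `lookup` names, and the idea of storing the fingerprint as a string, are hypothetical.

```python
import math
from dataclasses import dataclass

SIMILARITY_THRESHOLD = 0.88  # threshold quoted in the text


@dataclass
class CacheEntry:
    embedding: list      # embedding of the previously answered query
    fingerprint: str     # fingerprint of the data when the answer was cached
    result: dict


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookup(query_vec: list, fingerprint: str, entries: list):
    """Return the most similar cached result that clears the threshold
    and was computed on data with the same fingerprint, else None."""
    best, best_score = None, SIMILARITY_THRESHOLD
    for entry in entries:
        if entry.fingerprint != fingerprint:
            continue  # data changed since caching, so the entry is stale
        score = cosine(query_vec, entry.embedding)
        if score >= best_score:
            best, best_score = entry, score
    return best.result if best else None


entries = [CacheEntry([1.0, 0.0], "v1", {"nps": 42})]
print(lookup([0.9, 0.1], "v1", entries))  # similar query, same data
print(lookup([0.9, 0.1], "v2", entries))  # same query, data changed
```

Checking the fingerprint before the similarity score means a stale entry can never be returned, however close the wording of the new question is.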
Phase 2 — Routing. The query is classified along two axes:
- Intent — analytics, clarification (the user is being conversational), or greeting/chitchat.
- Complexity — SIMPLE / MODERATE / COMPLEX.
| Complexity | Trigger | Path |
|---|---|---|
| SIMPLE | Single column, single formula, no cross-question join. Regex matchers catch things like 'how many respondents', 'what is the average', 'NPS score', 'top 3 answers'. | Skip the LLM. Use a rule-based SQL template and execute directly. |
| MODERATE | One or two columns, one formula, no routing logic. | Single code-generation call → execute → verify. |
| COMPLEX | Multiple questions, demographic cuts, conditional/routing logic, joins. | Query decomposer → parallel sub-tasks → aggregate. |
Internal routing distribution: roughly 78% cache hit, 11% SIMPLE, 7% MODERATE, 4% COMPLEX.
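A rule-based complexity router along the lines of the table above might look like the sketch below. The regular expressions are hypothetical: they are modelled on the SIMPLE examples quoted in the table, and the cue words for COMPLEX queries are an assumption about how demographic cuts could be detected.

```python
import re

# Hypothetical patterns modelled on the SIMPLE examples in the table.
SIMPLE_PATTERNS = [
    re.compile(r"\bhow many respondents\b", re.I),
    re.compile(r"\bwhat is the average\b", re.I),
    re.compile(r"\bnps score\b", re.I),
    re.compile(r"\btop \d+ answers\b", re.I),
]

# Hypothetical cue words suggesting a demographic cut or a comparison.
COMPLEX_CUES = re.compile(r"\b(by|across|versus|vs|among|compare)\b", re.I)


def classify(query: str) -> str:
    """Return SIMPLE, MODERATE, or COMPLEX for a natural-language query."""
    if COMPLEX_CUES.search(query):
        return "COMPLEX"   # decomposer path
    if any(p.search(query) for p in SIMPLE_PATTERNS):
        return "SIMPLE"    # rule-based SQL template, no LLM
    return "MODERATE"      # single code-generation call


for q in ("What is the NPS score?",
          "How satisfied are customers with delivery?",
          "What is NPS by city tier?"):
    print(q, "->", classify(q))
```

The complexity cues are checked first so that a query such as 'NPS score by city tier' is not short-circuited into the SIMPLE path just because it contains a SIMPLE phrase.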
Phase 3 — Analysis (parallel sub-agents). Three agents run in parallel on every non-trivial query:
- Data Profiler — runs DuckDB SQL to produce per-column statistics: row count, null count, cardinality, min/max/mean/median, top-5 values. The LLM never sees raw rows. It receives only this summary dict.
- Column Selector — walks the Survey Knowledge Graph to find the exact columns that match the concepts in the query, with confidence tags (EXTRACTED = certain, INFERRED = fuzzy match, AMBIGUOUS = needs clarification).
- Quality Checker — flags data completeness issues, outlier risk, sample-size concerns, attention-check failures, etc.
Only after the graph-resolved structural understanding is complete does the Code Generator receive the column names and the formula template. The LLM assembles SQL — it does not invent arithmetic.
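The kind of summary the Data Profiler hands to the LLM can be illustrated with a short sketch. Python's built-in `sqlite3` is used here purely as a stand-in so the example runs without extra dependencies; Poseidon itself is described as running DuckDB over Parquet. The `profile_column` function and the `responses` table are hypothetical, and the median is omitted because SQLite has no built-in median aggregate.

```python
import sqlite3


def profile_column(conn, table: str, column: str) -> dict:
    """Summarise one column. Identifiers are assumed to be trusted,
    graph-resolved names, not raw user input."""
    row = conn.execute(
        f'SELECT COUNT(*), COUNT(*) - COUNT("{column}"), COUNT(DISTINCT "{column}"), '
        f'MIN("{column}"), MAX("{column}"), AVG("{column}") FROM "{table}"'
    ).fetchone()
    top = conn.execute(
        f'SELECT "{column}", COUNT(*) AS n FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL '
        f'GROUP BY "{column}" ORDER BY n DESC, "{column}" LIMIT 5'
    ).fetchall()
    keys = ("row_count", "null_count", "cardinality", "min", "max", "mean")
    return {**dict(zip(keys, row)), "top_values": top}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (satisfaction INTEGER)")
conn.executemany("INSERT INTO responses VALUES (?)",
                 [(5,), (4,), (4,), (None,), (2,)])

summary = profile_column(conn, "responses", "satisfaction")
print(summary)
```

Only the `summary` dict would be passed on to the model, which is how the raw rows stay out of the prompt.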
Phase 4 — Verification (three independent layers). Every number in every answer passes through:
- Numerical Claim Verifier — regex extracts every number in the generated answer. Each number is independently re-computed via a fresh DuckDB SQL query. If the value is off by more than 0.5 units (configurable tolerance), the answer is flagged and the pipeline retries. This layer is fully deterministic — no LLM.
- Output Validator — structural checks: percentages must sum to ~100 (within rounding), NPS must be in [−100, +100], rating means must be within the declared scale, no nulls in primary result columns.
- Self-Critique — a separate LLM pass reviews the narrative against the verified numbers. Checks for unsupported claims, internal contradictions, causal language where only correlation exists, and missing context (e.g. percentages without a base size N). Returns a score 0-10; scores below threshold trigger a revised narrative.
Internal benchmark: 99.1% numerical accuracy vs ground-truth SQL; 97.4% pass on structural validation; 94.8% pass on self-critique.
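The two deterministic layers can be approximated in a few lines. This is a sketch under stated assumptions, not Poseidon's implementation: it assumes the freshly recomputed values are supplied as a list, and the one-point rounding slack on percentage sums is a hypothetical choice. Only the 0.5 tolerance and the NPS range come from the text.

```python
import re
from typing import Optional

TOLERANCE = 0.5  # configurable tolerance quoted in the text
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def verify_claims(answer: str, recomputed: list, tol: float = TOLERANCE) -> bool:
    """True if every number in the answer is within tol of some
    independently recomputed value."""
    claimed = [float(m) for m in NUMBER.findall(answer)]
    return all(any(abs(c - r) <= tol for r in recomputed) for c in claimed)


def validate_output(percentages: Optional[list] = None,
                    nps: Optional[float] = None) -> list:
    """Structural checks; returns a list of problems (empty means pass)."""
    problems = []
    # Hypothetical slack of one point to absorb rounding.
    if percentages is not None and abs(sum(percentages) - 100) > 1.0:
        problems.append("percentages do not sum to ~100")
    if nps is not None and not -100 <= nps <= 100:
        problems.append("NPS outside [-100, 100]")
    return problems


print(verify_claims("NPS is 41.8 among 1200 respondents", [42.0, 1200]))
print(verify_claims("NPS is 45 among 1200 respondents", [42.0, 1200]))
print(validate_output(percentages=[33.3, 33.3, 33.3], nps=120))
```

A naive number regex like this one would also pick up digits in labels such as 'Tier 1', so a production verifier would need to tie each extracted number to the specific query that recomputes it.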
Phase 5 — Reporting.
- Cache the verified result (so the next time the user asks a similar question, it returns instantly).
- Generate the report payload (the chat answer + chart data + verification metadata).
- Run the Self-Critique node again on the final narrative (a second pass; the Phase 4 checks focused on the numbers).
- Assign a Confidence Score (low / medium / high) that the UI displays to the user.
- Generate 2-3 Quick Suggestions — the clickable follow-up questions that appear under the answer.
The Survey Knowledge Graph: Poseidon's Secret Weapon
The Survey Knowledge Graph is the most powerful technical differentiator between Poseidon and generic AI analytics tools. Other tools feed raw survey data to an LLM and ask it to figure out the structure. Poseidon builds a typed graph at ingestion time, and every query traverses that graph.
Node Types.
| Node Type | What It Represents | Attributes |
|---|---|---|
| Question Node | One per survey question | id, text, declared type (nps, likert_5pt, multi_select, ranking, text, etc.), scale, options, conditional flag, parent condition |
| Column Node | One per column in the Parquet file | DuckDB data type, null rate, cardinality, min/max/mean, top-5 values, statistical profile |
| Formula Node | One per valid statistical formula | id, valid input types, SQL template, output schema (e.g., {nps_score: FLOAT, range: [-100, 100]}) |
| Demographic Axis Node | One per demographic question | column name, category levels (e.g., city_tier: Tier 1 / Tier 2 / Tier 3 / Rural) |
| Routing Edge | A directed edge from a parent question to a conditional child question | carries the showIf / next rule from the survey JSON |
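As an illustration of a Formula Node, the sketch below pairs an SQL template with its valid input types and output range, using the standard NPS definition (percentage of promoters scoring 9-10 minus percentage of detractors scoring 0-6). The `FormulaNode` class, the `apply_formula` helper, and the table and column names are hypothetical, and `sqlite3` stands in for DuckDB so the example is self-contained.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class FormulaNode:
    id: str
    valid_input_types: tuple
    sql_template: str
    output_range: tuple


NPS_FORMULA = FormulaNode(
    id="NPS_FORMULA",
    valid_input_types=("nps",),
    sql_template=(
        'SELECT 100.0 * (SUM(CASE WHEN "{col}" >= 9 THEN 1 ELSE 0 END) '
        '- SUM(CASE WHEN "{col}" <= 6 THEN 1 ELSE 0 END)) / COUNT("{col}") '
        'FROM "{table}" WHERE "{col}" IS NOT NULL'
    ),
    output_range=(-100.0, 100.0),
)


def apply_formula(conn, formula: FormulaNode, table: str, col: str,
                  declared_type: str) -> float:
    """Run the template only if the question's declared type is a valid input."""
    if declared_type not in formula.valid_input_types:
        raise ValueError(f"{formula.id} does not accept type {declared_type!r}")
    (value,) = conn.execute(formula.sql_template.format(table=table, col=col)).fetchone()
    lo, hi = formula.output_range
    if value is None or not lo <= value <= hi:
        raise ValueError(f"{formula.id} produced an out-of-range result: {value}")
    return value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (nps_score INTEGER)")
conn.executemany("INSERT INTO responses VALUES (?)",
                 [(s,) for s in (10, 9, 8, 7, 6, 3, 10, 9, 5, 10)])

# 5 promoters, 3 detractors, 10 responses
nps = apply_formula(conn, NPS_FORMULA, "responses", "nps_score", "nps")
print(nps)
```

Because the arithmetic lives in the template, the model's job is reduced to choosing a formula and supplying column names.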
Edge Types.
- Question → Column (maps_to): the question maps to this column in the data.
- Column → Formula (formula): this column type is valid input for this formula (e.g., nps_score → NPS_FORMULA).
- Question → Question (routed_from): a conditional skip-logic edge. Critical for denominator correction — when querying a conditional question, the graph supplies the correct WHERE clause so the base size is correct.
- Formula → Demographic Axis (slice_by): the formula is parameterized to accept this demographic as a GROUP BY.
Confidence Tags on Edges.
- EXTRACTED — relationship is directly declared (e.g., question type is nps, so the NPS formula edge is certain).
- INFERRED — fuzzy-match resolved the relationship with a confidence score; reviewable.
- AMBIGUOUS — multiple formulas are potentially valid (e.g., a numeric column that could be a scale or a raw count); Poseidon asks for clarification rather than guessing.
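The edge types and confidence tags above can be modelled with a small typed graph. The sketch below is an assumption about shape rather than Poseidon's actual data model: the `Edge`, `SurveyGraph`, and `resolve` names, and the example question and formula ids, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    EXTRACTED = "extracted"
    INFERRED = "inferred"
    AMBIGUOUS = "ambiguous"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str               # maps_to, formula, routed_from, slice_by
    confidence: Confidence


class SurveyGraph:
    def __init__(self, edges):
        self.edges = list(edges)

    def out(self, source: str, kind: str) -> list:
        """All edges of one kind leaving a node."""
        return [e for e in self.edges if e.source == source and e.kind == kind]

    def resolve(self, question_id: str) -> dict:
        """Follow Question -> Column -> Formula. Ask for clarification when
        the formula is not uniquely determined."""
        # Raises ValueError unless the question maps to exactly one column.
        (col_edge,) = self.out(question_id, "maps_to")
        formulas = self.out(col_edge.target, "formula")
        if len(formulas) != 1 or formulas[0].confidence is Confidence.AMBIGUOUS:
            return {"status": "needs_clarification",
                    "candidates": [f.target for f in formulas]}
        return {"status": "ok", "column": col_edge.target,
                "formula": formulas[0].target,
                "confidence": formulas[0].confidence.name}


graph = SurveyGraph([
    Edge("q_nps", "nps_score", "maps_to", Confidence.EXTRACTED),
    Edge("nps_score", "NPS_FORMULA", "formula", Confidence.EXTRACTED),
    Edge("q_spend", "monthly_spend", "maps_to", Confidence.EXTRACTED),
    Edge("monthly_spend", "MEAN_FORMULA", "formula", Confidence.AMBIGUOUS),
    Edge("monthly_spend", "COUNT_FORMULA", "formula", Confidence.AMBIGUOUS),
])

print(graph.resolve("q_nps"))
print(graph.resolve("q_spend"))
```

The second call returns a clarification request listing both candidate formulas, mirroring the AMBIGUOUS behaviour described above.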
Why It Matters. The Survey Knowledge Graph is what allows Poseidon to produce correct answers consistently. The LLM is never asked to infer the structure from raw data; it sees only the graph. This eliminates the most common AI analytics failure modes (denominator errors, scale misclassification, off-by-one issues) and produces audit-grade analysis.
What Researchers Are Saying
“I lead consumer insights at a major FMCG. We've used SPSS, Tableau, R, and now Poseidon. Poseidon is fundamentally different — it understands our survey structure, applies the right formula automatically, verifies every number, and writes narrative insights. The Survey Knowledge Graph means it never gets the denominator wrong on skip-logic questions. The three-layer verification means the board-deck numbers are always audit-ready. We trust Poseidon's outputs in ways we never trusted AI analytics before. The best analytics engine we've used in 20+ years of consumer research.”
“I built a research analytics tool in 2019 — Pandas-based, basic statistics. Watching Poseidon is humbling. The semantic cache alone (78% hit rate, sub-200ms) is something I never achieved. The Survey Knowledge Graph is a genuinely novel approach. The three-layer verification is the kind of defensive engineering I wish I'd done. DuckDB on Parquet is the right architectural choice. I've started recommending Hercules Works to my clients. Honest assessment: Poseidon is the best analytics engine available in India today.”
“Brand manager. I use the Poseidon chat analytics for everything — NPS by region, brand attribute scores, satisfaction trends, open-ended themes. The natural-language interface is intuitive. The three-layer verification means I trust the numbers. The segment analysis is automatic. The 78% cache hit rate means most of my questions are answered instantly. The cultural calibration on Indian data is critical for interpretation. Best analytics tool for brand managers in India. The free plan covered my first 3 months. Pro for production.”
“I run a research agency. We use Poseidon for time-sensitive client projects. The architecture is robust, the verification is rigorous, the cultural intelligence is essential for Indian consumer data. The free plan covered our pilot. Pro for production. Four stars only because the report designer needs more customisation. Otherwise, the best analytics engine in India for fast, India-first research.”
Frequently Asked Questions
- What is the Poseidon analytics engine?
Poseidon is the AI analytics engine that powers the Hercules Works consumer intelligence platform. Built by Jupiter Meta Labs in Bangalore, Poseidon uses a 5-phase LangGraph pipeline, a typed Survey Knowledge Graph, three-layer numerical verification, DuckDB on Parquet, and Google Gemini to turn natural-language questions about survey data into verified, charted, narrative-ready insights in 1-3 seconds. Poseidon handles 12+ typed survey templates (brand tracking, NPS, concept testing, pricing, ad effectiveness, etc.) and applies a library of 50+ named analytical primitives. See advanced survey analytics and insight engines.
- How accurate is Poseidon's analytics?
Every numerical claim in every Poseidon output passes through three independent verification layers. The Numerical Claim Verifier re-computes every number via a fresh DuckDB SQL query and rejects anything that doesn't match within a 0.5-unit tolerance. The Output Validator checks structural consistency (percentages sum to 100, NPS in [−100, +100], scale bounds respected, no nulls). The Self-Critique pass reviews the narrative for unsupported claims and internal contradictions. Internal benchmark: 99.1% numerical accuracy vs ground-truth SQL; 97.4% pass on structural validation; 94.8% pass on self-critique. This is the highest accuracy of any AI analytics tool available in India.
- How fast is Poseidon?
The median end-to-end query time is around 1.8 seconds. Cache hits return in <200ms. Complex multi-question queries that go through the full pipeline can take 60-90 seconds, but the user sees live progress the whole time (SSE stream). About 78% of repeat queries are served from the semantic cache, so most users rarely wait for the same answer twice. The fastest AI survey analytics in India for production use.
- What survey types does Poseidon support?
Poseidon supports 12+ typed survey templates: brand tracking, customer profiling, CSAT/NPS, product feedback, feature prioritisation, concept testing, ad effectiveness research, pricing research, customer satisfaction, usability testing, generic market research, and qualitative research. Each survey type has a known column-type → valid-formula mapping, so the right analytical method is applied automatically without the user needing to specify it. The Poseidon AI on Hercules Works supports all of these plus ad hoc analytical queries on any survey structure.
- Can Poseidon handle Indian languages?
Yes. Poseidon's language models are trained on Indian consumer interactions and process responses natively in 8+ Indian languages, including Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, and Malayalam. The analytics run on the structured columns regardless of the language in which the question was asked, and report output is in English by default. Open-text responses in any Indian language are routed through the qualitative analysis path (theme clustering, sentiment).
- Does Poseidon handle skip logic and conditional questions?
Yes — and this is a key differentiator. At ingestion time, the Survey Intelligence node auto-detects routing pairs (conditional questions and their parent questions) by analysing null-correlation in the data. These become routing-conditional insights in the report, with the correct base size — e.g., 'Among theatre-goers (n=40), 62% prefer butter popcorn' — and the analytics engine automatically applies the correct WHERE clause whenever a conditional question is queried, so denominators are always right. Most AI analytics tools compute across the full sample, producing wrong numbers for conditional questions. Poseidon gets this right consistently.
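A toy example shows why the base size matters. The data, the `share` and `looks_routed` helpers, and the column names below are hypothetical, `sqlite3` stands in for DuckDB, and the exact-match routing check is a simplified stand-in for the null-correlation analysis described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (goes_to_theatre TEXT, popcorn TEXT)")
# 4 theatre-goers saw the popcorn question; 6 others were routed past it.
rows = [("yes", "butter")] * 3 + [("yes", "caramel")] + [("no", None)] * 6
conn.executemany("INSERT INTO responses VALUES (?, ?)", rows)


def share(conn, value: str, where: str = "1=1") -> tuple:
    """Percentage of rows within `where` whose popcorn answer equals value,
    together with the base size n. `where` is a trusted, graph-supplied clause."""
    n, hits = conn.execute(
        "SELECT COUNT(*), SUM(CASE WHEN popcorn = ? THEN 1 ELSE 0 END) "
        f"FROM responses WHERE {where}", (value,)
    ).fetchone()
    return round(100.0 * hits / n, 1), n


def looks_routed(conn, parent: str, child: str, trigger: str) -> bool:
    """True when the child question is answered exactly on the rows
    where the parent equals the trigger value."""
    (mismatches,) = conn.execute(
        f"SELECT COUNT(*) FROM responses "
        f"WHERE ({child} IS NOT NULL) != ({parent} = ?)", (trigger,)
    ).fetchone()
    return mismatches == 0


naive = share(conn, "butter")                                # full-sample base
corrected = share(conn, "butter", "goes_to_theatre = 'yes'")  # routed base
print(naive, corrected, looks_routed(conn, "goes_to_theatre", "popcorn", "yes"))
```

On this toy data the full-sample calculation reports 30% of 10 respondents, while the routed calculation reports 75% of the 4 respondents who were actually shown the question.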
Related Guides
- advanced survey analytics
- insight engines
- best AI survey analysis tools 2025-2026
- market data analytics tool
- NPS analytics platform
- Survey Knowledge Graph
- AI consumer research India
- Poseidon analytics engine
- consumer insights platform India
- AI market research company India
- FMCG consumer research India
- market research company India
- consumer panel India
- best market research companies in India
- complete market research tools overview