Chisel
CDP Demo
v0.1.0
—— capability 4 of 6 · Ingestion

Drop a file. We parse, tag, store.

Upload CSV exports, PDF reports, or scanned notes. The pipeline classifies the source, extracts a structured summary, applies LLM-style tags, and persists the row — all in one round-trip.

accepts: .csv (auto-parsed) · .pdf (LLM-tagged) · .png / .jpg (OCR-queued) · others (stored)

Recent uploads

20 rows · most recent first
career_pathways_handbook_2026.pdf
170.4 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
stream_choice_framework_v3.pdf
217.2 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
career_tree_diagram.png
579.4 KB · today

Image OCR'd: 12 lines of handwritten notes recovered. Tagged: stream-choice, parent-discussion.

scanned-notes ocr manual-review-pending
DAT_V_batch_export_oct.csv
90.0 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
RIASEC_grade11_responses.csv
38.5 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
Aanya_Iyer_assessment_report.pdf
144.4 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
whiteboard_session_notes_oct12.jpg
846.5 KB · today

Image OCR'd: 12 lines of handwritten notes recovered. Tagged: stream-choice, parent-discussion.

scanned-notes ocr manual-review-pending
DAT_V_batch_export_oct.csv
67.4 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
career_pathways_handbook_2026.pdf
442.0 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
parent_meeting_template.pdf
375.9 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
grade_10_aptitude_results_q3.csv
96.2 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
stream_choice_framework_v3.pdf
112.0 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
whiteboard_session_notes_oct12.jpg
1.20 MB · today

Image OCR'd: 12 lines of handwritten notes recovered. Tagged: stream-choice, parent-discussion.

scanned-notes ocr manual-review-pending
whiteboard_session_notes_oct12.jpg
822.7 KB · today

Image OCR'd: 12 lines of handwritten notes recovered. Tagged: stream-choice, parent-discussion.

scanned-notes ocr manual-review-pending
career_pathways_handbook_2026.pdf
532.2 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
RIASEC_grade11_responses.csv
103.2 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
Aanya_Iyer_assessment_report.pdf
380.1 KB · today

PDF parsed via embedded LLM. Extracted: 1 student profile, 6 assessment scores, 4 counsellor notes.

report student-profile llm-tagged
DAT_V_batch_export_oct.csv
64.4 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
RIASEC_grade11_responses.csv
25.3 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
DAT_V_batch_export_oct.csv
106.4 KB · today

Auto-parsed: 47 student rows, 6 RIASEC dimensions detected, 3 missing values flagged.

aptitude grade-10 auto-parsed
—— how Ingestion works

CSV, PDF, or transcript — land structured, every time

1
Upload

Plain multipart form, no JS framework. The file hits the Axum handler and is stored on disk under uploads/ with a SHA-256-keyed name.
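
A minimal sketch of digest-keyed naming, assuming the SHA-256 digest has already been computed elsewhere (for instance by a hashing crate) and arrives as a hex string. The function name stored_path and the "digest plus original extension" layout are hypothetical, not taken from the Chisel code.

```rust
use std::path::{Path, PathBuf};

/// Build the on-disk path for an upload from its SHA-256 digest (hex)
/// and the client-supplied filename.
/// Hypothetical layout: <upload_dir>/<digest>.<ext>
fn stored_path(upload_dir: &Path, sha256_hex: &str, original_name: &str) -> Option<PathBuf> {
    // A SHA-256 digest is 32 bytes, i.e. exactly 64 hex characters.
    let valid = sha256_hex.len() == 64 && sha256_hex.bytes().all(|b| b.is_ascii_hexdigit());
    if !valid {
        return None;
    }

    let mut name = sha256_hex.to_ascii_lowercase();

    // Keep only the extension of the client-supplied name. The rest of the
    // name is never used as a path, so "../" segments cannot escape upload_dir.
    if let Some(ext) = Path::new(original_name).extension().and_then(|e| e.to_str()) {
        if ext.chars().all(|c| c.is_ascii_alphanumeric()) {
            name.push('.');
            name.push_str(&ext.to_ascii_lowercase());
        }
    }

    Some(upload_dir.join(name))
}
```

Keying by digest means two uploads with identical bytes map to the same file on disk, whatever the original filenames were.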

2
Detect

Type is sniffed from extension + magic bytes. CSV → row parser. PDF → pdf-extract. Transcript → plain-text branch.
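
The detection step could look roughly like this. It is a sketch under stated assumptions: the Kind enum and function names are hypothetical, the .txt extension for transcripts is a guess, and magic bytes take precedence over the extension, which the text above does not specify. The signatures themselves are the standard ones (%PDF- for PDF, the 8-byte PNG header, FF D8 FF for JPEG). CSV has no magic bytes, so it falls back to the extension.

```rust
use std::path::Path;

/// Hypothetical classification of an upload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Csv,
    Pdf,
    Image,
    Text,
    Other,
}

/// Recognise a file type from its leading bytes, if it has a known signature.
fn sniff_magic(head: &[u8]) -> Option<Kind> {
    const PDF: &[u8] = b"%PDF-";
    const PNG: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    const JPEG: &[u8] = &[0xFF, 0xD8, 0xFF];

    if head.starts_with(PDF) {
        Some(Kind::Pdf)
    } else if head.starts_with(PNG) || head.starts_with(JPEG) {
        Some(Kind::Image)
    } else {
        None
    }
}

/// Combine magic bytes and extension. Magic bytes win when present
/// (an assumption); text formats without a signature use the extension.
fn detect_kind(filename: &str, head: &[u8]) -> Kind {
    if let Some(kind) = sniff_magic(head) {
        return kind;
    }
    let ext = Path::new(filename)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("csv") => Kind::Csv,
        Some("txt") => Kind::Text, // assumed extension for transcripts
        // A .pdf/.png/.jpg name without the matching signature is not trusted.
        _ => Kind::Other,
    }
}
```

With this ordering, a PDF renamed to report.csv is still routed to the PDF branch, and a text file named fake.pdf is only stored.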

3
Tag

An LLM extracts a one-line summary plus 4–6 topical tags (e.g. cs, analytical, commerce-aspirational). In demo mode, tags are derived deterministically from the filename.
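
One way demo-mode tagging could be made deterministic is to hash the filename with a fixed algorithm and use the hash to pick tags. The sketch below is an illustration, not the demo's actual rule: demo_tags, the tag pool (assembled from tags that appear on this page), and the choice of FNV-1a are all assumptions. A hand-written hash is used because its output is fixed by the algorithm, so the same filename yields the same tags on every run and build.

```rust
/// Hypothetical tag vocabulary, assembled from tags shown on this page.
const TAG_POOL: [&str; 8] = [
    "cs",
    "analytical",
    "commerce-aspirational",
    "stream-choice",
    "parent-discussion",
    "aptitude",
    "student-profile",
    "report",
];

/// 64-bit FNV-1a hash of a byte string.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Pick 4 to 6 tags for a filename, the same ones every time.
fn demo_tags(filename: &str) -> Vec<&'static str> {
    let seed = fnv1a64(filename.as_bytes());
    let count = 4 + (seed % 3) as usize; // 4, 5, or 6
    let start = ((seed >> 8) % TAG_POOL.len() as u64) as usize;
    // Take `count` consecutive tags, wrapping around the pool.
    // count <= 6 < 8, so no tag is picked twice.
    (0..count)
        .map(|i| TAG_POOL[(start + i) % TAG_POOL.len()])
        .collect()
}
```

Calling demo_tags twice with the same filename returns identical vectors, which keeps screenshots and demos reproducible without calling a model.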

4
Index

The row is written to the uploads table, and Postgres refreshes the full-text search (FTS) index automatically. Discovery picks it up on the next search.
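
One common way to get an automatically refreshed FTS index in Postgres is a stored generated tsvector column with a GIN index on it. The sketch below keeps the SQL as Rust string constants so it can sit next to the handler code. Only the uploads table name comes from the text above; every column name, the index name, and the constant names are assumptions. It relies on generated columns (Postgres 12 or later) and websearch_to_tsquery (Postgres 11 or later).

```rust
/// Hypothetical schema for the uploads table.
const CREATE_UPLOADS: &str = r#"
CREATE TABLE uploads (
    id       BIGSERIAL PRIMARY KEY,
    filename TEXT NOT NULL,
    summary  TEXT NOT NULL,
    tags     TEXT[] NOT NULL DEFAULT '{}',
    -- Postgres recomputes this column on every INSERT and UPDATE,
    -- so application code never touches it.
    search   TSVECTOR GENERATED ALWAYS AS (
        to_tsvector('english', filename || ' ' || summary)
    ) STORED
);

-- The GIN index is maintained by Postgres as rows change.
CREATE INDEX uploads_search_idx ON uploads USING GIN (search);
"#;

/// Hypothetical search query; $1 is the user's search text.
const SEARCH_UPLOADS: &str = "\
SELECT id, filename, summary \
FROM uploads \
WHERE search @@ websearch_to_tsquery('english', $1) \
ORDER BY id DESC";
```

With a schema like this, the ingestion handler only inserts filename, summary, and tags; the search column and its index follow without an explicit refresh step.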