Tuteliq processes content through a multi-stage AI pipeline designed specifically for child safety. Every input — whether text, voice, or image — passes through detection, contextual analysis, and age-calibrated scoring before returning an actionable result.

The detection pipeline

1. Content ingestion & language detection

When you send a request to any Safety endpoint, Tuteliq first normalizes the input and detects the content language. Text is analyzed directly. Audio files are transcribed via Whisper and then analyzed as text with timestamped segments preserved. Images are processed through Vision AI for visual classification and OCR text extraction simultaneously — so a screenshot of a harmful conversation is caught by both the visual and textual classifiers. Language detection uses a layered approach for maximum reliability:
  1. Explicit code — If you pass a language parameter, it is used directly.
  2. Trigram detection — If no explicit language is given, the API runs trigram-based analysis on the input text.
  3. LLM confirmation — The LLM also identifies the content language during analysis. When the LLM’s detection is a supported language, it takes precedence — this ensures correct detection for closely related languages like Norwegian, Swedish, and Danish.
The detected language is used to inject culture-specific analysis guidelines (local slang, idioms, harmful terms, grooming indicators, self-harm coded vocabulary, and filter evasion techniques) into the classification prompts. 27 languages are supported: all 24 EU official languages (English is stable; the others are beta), plus Ukrainian, Norwegian, and Turkish (beta). See Language Support for the full list.
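The three-layer precedence can be sketched as a small client-side resolver. This is an illustration of the ordering rules only, not Tuteliq's implementation, and the supported-language set shown is a truncated subset:

```python
# Illustrative subset of supported ISO 639-1 codes (see Language Support).
SUPPORTED = {"en", "sv", "no", "da", "de", "fr", "uk", "tr"}

def resolve_language(explicit=None, trigram_guess=None, llm_guess=None):
    """Mirror the three-layer precedence: explicit code, then LLM
    confirmation (when supported), then the trigram guess."""
    if explicit:                    # 1. Explicit code is used directly.
        return explicit
    if llm_guess in SUPPORTED:      # 3. A supported LLM detection wins over
        return llm_guess            #    the trigram guess (e.g. no vs. da).
    return trigram_guess            # 2. Otherwise keep the trigram result.
```

Note how the LLM result only overrides the trigram guess when it names a supported language, which is what disambiguates closely related languages such as Norwegian, Swedish, and Danish.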

2. Multi-model classification

Rather than relying on a single model, Tuteliq runs content through specialized classifiers for each harm category in parallel:
  • Grooming Detection — Trust escalation, secrecy requests, isolation attempts, boundary testing, gift/reward patterns
  • Bullying & Harassment — Direct insults, social exclusion, intimidation, cyberstalking, identity-based attacks
  • Self-Harm & Suicidal Ideation — Crisis language, passive ideation, planning indicators, self-injury references
  • Substance Use — Promotion, solicitation, normalization of drug/alcohol use toward minors
  • Eating Disorders — Pro-anorexia/bulimia content, body dysmorphia triggers, dangerous diet promotion
  • Depression & Anxiety — Persistent mood indicators, hopelessness patterns, withdrawal signals
  • Compulsive Usage — Engagement manipulation, addiction-pattern reinforcement, dark patterns targeting minors
  • Sexual Exploitation — Explicit solicitation, sextortion patterns, inappropriate sexual content directed at minors
  • Social Engineering — Pretexting, impersonation, urgency manipulation, authority exploitation targeting minors
  • App Fraud — Fake app promotion, malicious download links, clone app distribution, fraudulent reviews
  • Romance Scam — Love-bombing, financial requests, identity fabrication, isolation from support networks
  • Mule Recruitment — Easy money offers, account sharing requests, laundering language, recruitment pressure
  • Gambling Harm — Underage gambling promotion, addiction patterns, predatory odds, bet pressure tactics
  • Coercive Control — Isolation tactics, financial control, monitoring/surveillance, threat patterns in relationships
  • Vulnerability Exploitation — Targeting based on age, disability, emotional state, or financial hardship, with cross-endpoint vulnerability profiling
  • Radicalisation — Extremist rhetoric, us-vs-them framing, recruitment patterns, dehumanisation of outgroups
  • Age Verification — Document-based age verification, biometric age estimation, age assurance for platform compliance (Beta, available on Pro tier and above)
  • Identity Verification — Document verification, liveness detection, identity confirmation to prevent impersonation (Beta, available on Business tier and above)
Each classifier produces an independent confidence score. When multiple classifiers fire on the same content (e.g., grooming + sexual exploitation), Tuteliq combines the signals to produce a holistic risk assessment. The /analyse/multi endpoint lets you run up to 10 classifiers on a single piece of content in one API call. When vulnerability exploitation detection is included, it produces a cross-endpoint vulnerability modifier that automatically adjusts severity scores across all other results — amplifying risk when the content targets vulnerable individuals. Valid endpoint values for /analyse/multi:
  • bullying — Bullying & Harassment
  • grooming — Grooming Detection
  • unsafe — Unsafe Content (KOSA categories)
  • social-engineering — Social Engineering
  • app-fraud — App-based Fraud
  • romance-scam — Romance Scam
  • mule-recruitment — Mule Recruitment
  • gambling-harm — Gambling Harm
  • coercive-control — Coercive Control
  • vulnerability-exploitation — Vulnerability Exploitation
  • radicalisation — Radicalisation
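A client can validate endpoint IDs and the 10-classifier limit before calling /analyse/multi. This is a minimal sketch; the request field names ("text", "endpoints", "context") are assumptions for illustration, not confirmed API schema:

```python
# Endpoint IDs accepted by /analyse/multi, as listed above.
VALID_ENDPOINTS = {
    "bullying", "grooming", "unsafe", "social-engineering", "app-fraud",
    "romance-scam", "mule-recruitment", "gambling-harm", "coercive-control",
    "vulnerability-exploitation", "radicalisation",
}

def build_multi_payload(text, endpoints, context=None):
    """Validate endpoint IDs client-side and assemble a request body."""
    unknown = set(endpoints) - VALID_ENDPOINTS
    if unknown:
        raise ValueError(f"unknown endpoint IDs: {sorted(unknown)}")
    if len(endpoints) > 10:
        raise ValueError("/analyse/multi accepts at most 10 classifiers per call")
    payload = {"text": text, "endpoints": list(endpoints)}
    if context is not None:
        payload["context"] = context
    return payload
```

Including "vulnerability-exploitation" in the list is what activates the cross-endpoint vulnerability modifier described above.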

3. Context engine

This is where Tuteliq diverges from keyword-based filters. The context engine evaluates:
  • Linguistic intent — Is “I want to kill myself” an expression of frustration over a video game, or a genuine crisis signal? Tuteliq analyzes surrounding context, tone, and conversational history to distinguish the two.
  • Relationship dynamics — A single message may appear harmless. The context engine tracks multi-turn escalation patterns — compliments, then secrecy requests, then isolation attempts, then boundary violations — that only become visible across a conversation. Every conversation-aware endpoint returns a message_analysis array that shows exactly how risk escalates message by message, with individual risk scores and detected tactics for each entry.
  • Platform norms — Teen slang, gaming culture, and social media language evolve fast. The context engine recognizes that “I’m literally dead” in a group chat has a fundamentally different risk profile than the same phrase in a private message to a younger child.
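A conversation-aware request might look like the following. The message content is invented, and the shape of conversation_history entries (plain strings here) is an assumption for illustration; the context field names come from the Context fields section below:

```python
# A hypothetical multi-turn escalation passed as conversation_history.
request_body = {
    "text": "Don't tell your parents we talk, okay? It's our secret.",
    "context": {
        "age_group": "10-12",
        "platform": "Discord",
        "conversation_history": [
            "You're so mature for your age, not like other kids.",  # compliment
            "You can talk to me about anything, I get you.",        # trust building
            "Your parents just don't understand you like I do.",    # isolation
        ],
    },
}
```

Individually, each message could pass a keyword filter; together they form the compliment, trust, isolation, secrecy sequence the context engine is built to surface in message_analysis.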

4. Age-calibrated scoring

The same content carries different risk depending on the child’s age. Tuteliq adjusts severity across four brackets:
  • Under 10 — Highest sensitivity. Almost any exposure to harmful content is flagged at elevated severity.
  • 10–12 — High sensitivity. Beginning to encounter peer conflict; distinguishes normal friction from targeted harassment.
  • 13–15 — Moderate sensitivity. Accounts for typical teen communication patterns while remaining alert to genuine risk.
  • 16–17 — Adjusted sensitivity. Recognizes greater autonomy while maintaining protection against grooming, exploitation, and crisis signals.
You specify the age_group in your request context. If omitted, Tuteliq defaults to the most protective bracket.
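The fallback rule can be mirrored client-side. The bracket identifier strings here are assumptions (see the age_group context field for accepted values):

```python
# Bracket identifiers are illustrative, ordered from most to least protective.
BRACKETS = ("under-10", "10-12", "13-15", "16-17")

def effective_bracket(age_group=None):
    """If age_group is omitted or unrecognised, fall back to the most
    protective bracket, mirroring the API's default."""
    return age_group if age_group in BRACKETS else BRACKETS[0]
```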

5. Response generation

Context fields

You can pass a context object with any detection request to improve accuracy:
  • age_group / ageGroup (string) — Triggers age-calibrated scoring (e.g., "10-12", "13-15", "under 18")
  • language (string) — ISO 639-1 code. Auto-detected if omitted.
  • platform (string) — Platform name (e.g., "Discord", "Roblox"). Adjusts for platform-specific norms.
  • conversation_history (array) — Prior messages for context-aware analysis. Returns per-message message_analysis.
  • sender_trust (string) — "verified", "trusted", or "unknown". Verified senders suppress impersonation flags.
  • sender_name (string) — Sender identifier (used with sender_trust).
  • country (string) — ISO 3166-1 alpha-2 code (e.g., "GB", "US", "SE"). Enables geo-localised crisis helpline data. Falls back to the user profile country if omitted.
When sender_trust is "verified", the API fully suppresses AUTH_IMPERSONATION — a verified sender cannot be impersonating an authority by definition. Routine urgency (schedules, deadlines) is also suppressed. Only genuinely malicious content (credential theft, phishing links, financial demands) will be flagged.
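For example, a school account marked as verified can send routine urgent reminders without tripping impersonation checks. The message and sender name below are invented for illustration:

```python
# Hypothetical request marking a known school account as verified, so
# routine urgency ("due Friday") is not flagged as AUTH_IMPERSONATION.
request_body = {
    "text": "Reminder: permission slips are due Friday!",
    "context": {
        "sender_trust": "verified",
        "sender_name": "Riverdale High Office",  # invented identifier
        "age_group": "13-15",
    },
}
```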

Crisis support resources (support_threshold)

Detection responses can include country-specific crisis helplines and response guidance. The support_threshold parameter controls when these are included:
  • low — Include for Low severity and above
  • medium — Include for Medium severity and above
  • high (default) — Include for High severity and above
  • critical — Include only for Critical severity
Critical severity always includes support resources regardless of the threshold setting.
Pass support_threshold in the options object or as a top-level request field:
{
  "text": "I don't want to be here anymore",
  "context": { "ageGroup": "13-15" },
  "options": { "support_threshold": "medium" }
}
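The inclusion rule, including the always-on behavior for critical severity, can be expressed in a few lines. This is a client-side mirror of the documented behavior, not the server implementation:

```python
SEVERITY_ORDER = ("low", "medium", "high", "critical")

def includes_support_resources(severity, support_threshold="high"):
    """True when crisis helplines are included in the response:
    at or above the threshold, and always for critical severity."""
    if severity == "critical":
        return True
    return SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(support_threshold)
```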
Every response includes:
  • unsafe (boolean, legacy endpoints) or detected (boolean, newer detection endpoints) — Clear yes/no for immediate routing decisions.
  • categories (array) — Which KOSA harm categories were triggered.
  • severity (string) — low, medium, high, or critical, calibrated to the age group.
  • risk_score (float, 0.0–1.0) — Granular score for threshold-based automation.
  • confidence (float) — Model confidence in the classification.
  • rationale (string) — Human-readable explanation of why the content was flagged. Useful for trust & safety review and audit trails.
  • message_analysis (array, conversation-aware endpoints) — Per-message risk breakdown, returned when conversation_history is provided. Each entry contains message_index, risk_score, flags, and summary, making the escalation sequence visible for dashboards and reporting. Available on grooming, social engineering, app fraud, romance scam, mule recruitment, gambling harm, coercive control, vulnerability exploitation, and radicalisation endpoints.
  • recommended_action (string) — Suggested next step, such as “Escalate to counselor” or “Block and report.”
  • language (string) — Resolved language code (ISO 639-1) used for analysis, auto-detected or explicit.
  • language_status (string) — "stable" for English, "beta" for all other supported languages.
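A typical consumer routes on these fields. The thresholds and action names below are illustrative choices, not Tuteliq recommendations; the handler accepts both the legacy unsafe flag and the newer detected flag:

```python
def route(result):
    """Map a detection response to a next step for a moderation queue."""
    flagged = result.get("detected", result.get("unsafe", False))
    if not flagged:
        return "allow"
    if result["severity"] == "critical":
        return "block_and_escalate"
    if result["severity"] == "high" or result.get("risk_score", 0.0) >= 0.8:
        return "escalate_for_review"
    return "log_and_monitor"
```

The rationale field pairs well with whichever action is chosen, since it gives reviewers a ready-made explanation for the audit trail.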

Beyond detection

Tuteliq doesn’t stop at “this content is unsafe.” Two additional endpoints complete the workflow:

Action plan generation

The /guidance/action-plan endpoint takes a detection result and generates age-appropriate guidance tailored to the audience:
  • For children — Gentle, reading-level-appropriate language explaining what happened and what to do next.
  • For parents — Clear explanation of the detected risk with suggested conversations and resources.
  • For trust & safety teams — Technical summary with recommended platform actions and escalation paths.
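A request to this endpoint might be assembled as follows. The body field names ("detection_result", "audience") and the audience identifiers are assumptions for illustration, not confirmed API schema:

```python
AUDIENCES = ("child", "parent", "trust_and_safety")  # assumed identifiers

def build_action_plan_request(detection_result, audience):
    """Assemble a /guidance/action-plan request body from a prior
    detection result and the intended audience."""
    if audience not in AUDIENCES:
        raise ValueError(f"audience must be one of {AUDIENCES}")
    return {"detection_result": detection_result, "audience": audience}
```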

Incident reports

The /reports/incident endpoint converts raw conversation data into structured, professional reports suitable for school counselors responding to bullying incidents, platform moderators documenting patterns of abuse, and compliance teams maintaining audit trails for KOSA reporting.

Architecture principles

  • Fully stateless. Every API call is independent — Tuteliq never stores conversation text, context, or session state between requests. This is a deliberate privacy-by-design decision: when processing children’s data under GDPR/COPPA, the safest data is data you never store. Pass conversation history with each request that needs it; results are returned and content is discarded.
  • No training on your data. Content sent to Tuteliq is used solely for real-time analysis and is not retained for model training. See the GDPR section for data retention details.
  • Parallel processing. All harm classifiers run simultaneously, not sequentially. This is how Tuteliq maintains sub-400ms response times even when checking against all nine KOSA categories.
  • Policy-configurable. Use the /policy/ endpoint to adjust detection thresholds, category weights, and moderation rules for your specific use case — without changing your integration code.
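Statelessness means the client owns conversation history. One workable pattern is a rolling buffer attached to each request; the entry shape ({"sender", "text"}) is an assumption for illustration:

```python
from collections import deque

class ConversationBuffer:
    """Client-side rolling history. Because the API is stateless, the
    caller attaches prior messages to every request that needs them."""

    def __init__(self, max_messages=50):
        self._messages = deque(maxlen=max_messages)

    def add(self, sender, text):
        self._messages.append({"sender": sender, "text": text})

    def as_context(self):
        # Oldest messages are evicted automatically once maxlen is hit,
        # bounding both payload size and how much content the client holds.
        return {"conversation_history": list(self._messages)}
```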

Next steps

Quickstart

Make your first detection call in under 5 minutes.

KOSA Compliance

See how each harm category maps to regulatory requirements.