Flick Knowledge Base
Repository docs from .qoder/repowiki
Content Moderation
Referenced Files in This Document
moderator.service.ts, aho-corasick.ts, normalize.ts, words-moderation.repo.ts, words-moderation.controller.ts, report-moderation.service.ts, content-moderation.service.ts, user-moderation.service.ts, user-moderation.controller.ts, ReportsPage.tsx, BannedWordsPage.tsx, ReportPost.tsx, report.ts, ocr.ts, extract.controller.ts, ModeratedText.tsx, notification.service.ts, Notification.ts
Introduction
This document describes the content moderation system: automated filtering with the Aho-Corasick algorithm, reporting mechanisms, admin review workflows, banned-word management, content flagging, user reporting, moderation queues, and OCR integration for text extraction. It also covers moderation policies, appeals, administrative oversight, and how moderation workflows integrate with the notification system.
Project Structure
The moderation system spans three layers:
- Backend services and repositories under server/src/modules/moderation and server/src/infra/services/moderator
- Admin UI under admin/src/pages and admin/src/components
- Web UI under web/src/components and web/src/services/api
```mermaid
graph TB
subgraph "Web Frontend"
W_Report["report.ts"]
W_MText["ModeratedText.tsx"]
W_OCR["ocr.ts"]
end
subgraph "OCR Service"
O_Extract["extract.controller.ts"]
end
subgraph "Server Backend"
S_ModSvc["moderator.service.ts"]
S_Aho["aho-corasick.ts"]
S_Normalize["normalize.ts"]
S_WordsRepo["words-moderation.repo.ts"]
S_WordsCtrl["words-moderation.controller.ts"]
S_ReportsSvc["report-moderation.service.ts"]
S_ContentSvc["content-moderation.service.ts"]
S_UserSvc["user-moderation.service.ts"]
S_UserCtrl["user-moderation.controller.ts"]
S_Notif["notification.service.ts"]
end
subgraph "Admin UI"
A_Reports["ReportsPage.tsx"]
A_Banned["BannedWordsPage.tsx"]
A_ReportComp["ReportPost.tsx"]
end
W_MText --> S_ModSvc
W_Report --> S_ReportsSvc
W_OCR --> O_Extract
O_Extract --> S_ModSvc
S_ModSvc --> S_Aho
S_ModSvc --> S_Normalize
S_ModSvc --> S_WordsRepo
S_WordsCtrl --> S_WordsRepo
A_Reports --> S_ReportsSvc
A_Banned --> S_WordsCtrl
A_ReportComp --> S_ContentSvc
A_ReportComp --> S_UserSvc
S_ReportsSvc --> S_ContentSvc
S_ReportsSvc --> S_UserSvc
S_ModSvc --> S_Notif
```

Diagram sources
moderator.service.ts, aho-corasick.ts, normalize.ts, words-moderation.repo.ts, words-moderation.controller.ts, report-moderation.service.ts, content-moderation.service.ts, user-moderation.service.ts, user-moderation.controller.ts, ReportsPage.tsx, BannedWordsPage.tsx, ReportPost.tsx, report.ts, ocr.ts, extract.controller.ts, notification.service.ts
Section sources
moderator.service.ts, ReportsPage.tsx, BannedWordsPage.tsx
Core Components
- Aho-Corasick-based dynamic moderation engine with normalization and boundary checks
- Perspective API-based policy validator for toxicity and related attributes
- Banned word repository and CRUD APIs for managing global blocklists
- Reporting subsystem for user-flagged content with admin workflows
- Content moderation actions (ban/shadow ban/unban) and user moderation controls
- OCR integration pipeline for extracting text from documents
- Moderation-aware UI components and admin dashboards
Section sources
moderator.service.ts, words-moderation.repo.ts, report-moderation.service.ts, content-moderation.service.ts, user-moderation.service.ts, extract.controller.ts
Architecture Overview
The moderation pipeline routes both real-time text and OCR-extracted text through a unified moderation service. Dynamic filtering runs Aho-Corasick over normalized text, while policy validation leverages the Perspective API. Admins manage banned words and review reports; users can report content. Notification hooks exist for moderation outcomes, though they are not actively wired up in the reviewed code.
```mermaid
sequenceDiagram
participant Client as "Web Client"
participant API as "Web API Layer"
participant Mod as "ModeratorService"
participant Repo as "BannedWordsRepo"
participant AC as "AhoCorasick"
participant Norm as "Normalize"
participant Persp as "Perspective API"
Client->>API : Submit content
API->>Mod : moderateText({text, contextText, runValidator})
Mod->>Repo : listBannedWords()
Repo-->>Mod : words[]
Mod->>AC : build matchers from words
Mod->>Norm : normalize(normal/strict)
Mod->>AC : search(normalized text)
AC-->>Mod : matches[]
alt matches found
Mod-->>API : allowed=false, violation=CONTENT_MODERATION_VIOLATION
else no matches
Mod->>Persp : analyze(text, languages, attributes)
Persp-->>Mod : scores/spans
alt scores exceed thresholds
Mod-->>API : allowed=false, violation=CONTENT_POLICY_VIOLATION
else acceptable
Mod-->>API : allowed=true, violation=null
end
end
```

Diagram sources
moderator.service.ts, normalize.ts, aho-corasick.ts, words-moderation.repo.ts
Detailed Component Analysis
Automated Content Filtering with Aho-Corasick
The dynamic moderation engine compiles banned words into three Aho-Corasick automata:
- Strict matcher: normalized strict mode words
- Normal matcher: normalized normal mode words
- Normal variants matcher: strict-normal variants of normal words
Normalization supports Unicode decomposition, leet-speak mapping, and strict character filtering. Boundary checks ensure matches occur at word boundaries. Wildcard patterns are supported via DFS over normalized tokens.
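As a rough sketch of the core technique (not the project's implementation, which adds normalization modes, boundary checks, and wildcard handling), a minimal Aho-Corasick matcher in TypeScript builds a trie over the banned words, wires failure links via BFS, and reports every occurrence in one pass over the text:

```typescript
// Minimal Aho-Corasick sketch: trie + failure links + single-pass search.
// Illustrative only; the real matcher also normalizes text, checks word
// boundaries, and keeps separate strict/normal/variant automata.

interface Match { word: string; start: number; end: number } // end is exclusive

class AhoCorasick {
  private next: Array<Map<string, number>> = [new Map()];
  private fail: number[] = [0];
  private out: string[][] = [[]];

  constructor(words: string[]) {
    for (const w of words) this.addWord(w);
    this.buildFailureLinks();
  }

  private addWord(word: string): void {
    let state = 0;
    for (const ch of word) {
      if (!this.next[state].has(ch)) {
        this.next.push(new Map());
        this.fail.push(0);
        this.out.push([]);
        this.next[state].set(ch, this.next.length - 1);
      }
      state = this.next[state].get(ch)!;
    }
    this.out[state].push(word);
  }

  private buildFailureLinks(): void {
    // BFS from the root so each node's failure link is resolved before
    // its children's links are computed.
    const queue: number[] = [...this.next[0].values()];
    while (queue.length > 0) {
      const state = queue.shift()!;
      for (const [ch, child] of this.next[state]) {
        queue.push(child);
        let f = this.fail[state];
        while (f !== 0 && !this.next[f].has(ch)) f = this.fail[f];
        this.fail[child] = this.next[f].get(ch) ?? 0;
        // Inherit outputs of the suffix state so overlaps are reported.
        this.out[child].push(...this.out[this.fail[child]]);
      }
    }
  }

  search(text: string): Match[] {
    const matches: Match[] = [];
    let state = 0;
    for (let i = 0; i < text.length; i++) {
      const ch = text[i];
      while (state !== 0 && !this.next[state].has(ch)) state = this.fail[state];
      state = this.next[state].get(ch) ?? 0;
      for (const word of this.out[state]) {
        matches.push({ word, start: i - word.length + 1, end: i + 1 });
      }
    }
    return matches;
  }
}

// Finds the overlapping matches "bad", "badge", and "dge" in one scan.
const demo = new AhoCorasick(['bad', 'badge', 'dge']).search('a badge');
```

Because failure links let the automaton reuse partial progress, the scan stays linear in the input length regardless of how many banned words are compiled in.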
```mermaid
classDiagram
class ModeratorService {
-compiled : CompiledModerationSet
-loadingPromise : Promise
-cachedVersion : Date
-lastVersionCheck : number
+moderateText(input) IntegratedModerationResult
+rebuildMatcher() Promise
-ensureCompiled() Promise
-buildMatcherFromDatabase() Promise
}
class CompiledModerationSet {
+strictMatcher : AhoCorasick
+normalMatcher : AhoCorasick
+normalVariantsMatcher : AhoCorasick
+strictWords : CompiledWord[]
+normalWords : CompiledWord[]
+normalVariantWords : CompiledWord[]
+wildcardPatterns : PatternPayload[]
}
class AhoCorasick {
+search(text) AhoMatch[]
}
class Normalize {
+normalizeText(input, mode) NormalizedText
+isBoundaryMatch(text, start, end) boolean
}
ModeratorService --> CompiledModerationSet : "compiles"
CompiledModerationSet --> AhoCorasick : "contains"
ModeratorService --> Normalize : "uses"
ModeratorService --> AhoCorasick : "search()"
```

Diagram sources
moderator.service.ts, aho-corasick.ts, normalize.ts
Section sources
moderator.service.ts, normalize.ts, aho-corasick.ts
Policy Validation with Perspective API
After dynamic filtering, the validator performs toxicity and related attribute scoring. It detects spam heuristics, self-harm encouragement, and applies language detection. Scores exceeding configured thresholds produce policy violations.
```mermaid
flowchart TD
Start(["validateContent"]) --> Spam["Spam heuristics"]
Spam --> |Match| DenySpam["Deny: SPAM"]
Spam --> |No Match| SelfHarm["Self-harm regex"]
SelfHarm --> |Match| DenySH["Deny: SELF_HARM_ENCOURAGEMENT"]
SelfHarm --> |No Match| Lang["Detect language"]
Lang --> Fetch["Fetch Perspective API"]
Fetch --> |OK| Scores["Parse scores and spans"]
Scores --> Thresholds{"Exceed thresholds?"}
Thresholds --> |Yes| Reasons["Build reasons list"]
Reasons --> DenyPolicy["Deny: CONTENT_POLICY_VIOLATION"]
Thresholds --> |No| Allow["Allow"]
Fetch --> |Error| FailClosed["Fail-closed: service unavailable"]
```

Diagram sources
moderator.service.ts
Section sources
moderator.service.ts
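The threshold step can be sketched as follows. The attribute names mirror Perspective API conventions; the threshold values and the shape of the `scores` object are assumptions for illustration, not the project's configuration:

```typescript
// Hypothetical threshold check over Perspective-style attribute scores.
// Any attribute at or above its limit contributes a reason, and one or
// more reasons yields a CONTENT_POLICY_VIOLATION-style denial.

type AttributeScores = Record<string, number>;

const THRESHOLDS: AttributeScores = {
  TOXICITY: 0.8,
  INSULT: 0.8,
  IDENTITY_ATTACK: 0.7,
  THREAT: 0.7,
  PROFANITY: 0.85,
};

interface PolicyResult {
  allowed: boolean;
  reasons: string[];
}

function evaluateScores(scores: AttributeScores): PolicyResult {
  const reasons = Object.entries(THRESHOLDS)
    .filter(([attr, limit]) => (scores[attr] ?? 0) >= limit)
    .map(([attr]) => attr);
  return { allowed: reasons.length === 0, reasons };
}

// One attribute over its limit is enough to deny.
const verdict = evaluateScores({ TOXICITY: 0.92, INSULT: 0.4 });
```

Missing attributes default to 0 here, which is a fail-open choice for individual scores; note the diagram above shows the service failing closed when the API call itself errors.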
Banned Word Management
Administrators maintain the global blocklist via CRUD endpoints backed by a repository. Words support strict vs normal modes and severity levels. The moderator service reloads matchers when the database version changes.
```mermaid
sequenceDiagram
participant Admin as "Admin UI"
participant Ctrl as "WordsModerationController"
participant Svc as "WordsModerationService"
participant Repo as "BannedWordsRepo"
Admin->>Ctrl : GET /moderation/config
Ctrl->>Svc : getConfig()
Svc->>Repo : getModerationConfigWords()
Repo-->>Svc : {strictWords, normalWords}
Svc-->>Ctrl : config
Ctrl-->>Admin : config
Admin->>Ctrl : POST /moderation/words (create)
Ctrl->>Svc : createWord(payload)
Svc->>Repo : createBannedWord()
Repo-->>Svc : BannedWordRecord
Svc-->>Ctrl : record
Ctrl-->>Admin : created
```

Diagram sources
words-moderation.controller.ts, words-moderation.repo.ts, moderator.service.ts
Section sources
words-moderation.controller.ts, words-moderation.repo.ts, BannedWordsPage.tsx
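The reload-when-the-version-changes behavior described above might look like this in outline (all names and the version representation are hypothetical; the real service caches a `Date` version and throttles version checks):

```typescript
// Hypothetical sketch of version-gated matcher rebuilding: the compiled
// automata are reused until the banned-words table reports a newer
// version, at which point they are rebuilt exactly once.

interface Compiled { builtFrom: number }

class MatcherCache {
  private compiled: Compiled | null = null;
  private cachedVersion = -1;
  public rebuilds = 0;

  constructor(private readonly getDbVersion: () => number) {}

  ensureCompiled(): Compiled {
    const version = this.getDbVersion();
    if (this.compiled === null || version !== this.cachedVersion) {
      this.compiled = { builtFrom: version }; // stand-in for automaton build
      this.cachedVersion = version;
      this.rebuilds++;
    }
    return this.compiled;
  }
}

let dbVersion = 1;
const cache = new MatcherCache(() => dbVersion);
cache.ensureCompiled();
cache.ensureCompiled(); // no rebuild: version unchanged
dbVersion = 2;          // admin edits the blocklist
cache.ensureCompiled(); // rebuild triggered
```

This keeps the expensive automaton build off the hot path: ordinary moderation calls pay only a version comparison.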
Content Flagging and Reporting
Users submit reports for posts or comments. Reports are stored with status and metadata, enabling admin review and bulk actions. Content moderation actions update report statuses accordingly.
```mermaid
sequenceDiagram
participant User as "User"
participant Web as "Web API"
participant Reports as "ContentReportService"
participant Audit as "Audit Logger"
User->>Web : POST /reports
Web->>Reports : createReport(values)
Reports->>Reports : persist report
Reports->>Audit : record audit event
Reports-->>Web : report
Web-->>User : created
participant Admin as "Admin UI"
participant AdminSvc as "ContentModerationService/UserManagementService"
Admin->>Admin : Review reports (ReportsPage)
Admin->>AdminSvc : Moderate content (ban/unban/shadow ban)
AdminSvc->>Reports : updateReportsByTargetId()
Reports->>Audit : record audit event
```

Diagram sources
report.ts, report-moderation.service.ts, ReportsPage.tsx, ReportPost.tsx, content-moderation.service.ts, user-moderation.service.ts
Section sources
report.ts, report-moderation.service.ts, ReportsPage.tsx, ReportPost.tsx
Admin Review Workflows
Admins can filter reports by status, refresh lists, and apply actions such as marking pending or undoing actions across content and user moderation states. Bulk operations are coordinated via admin components.
```mermaid
flowchart TD
Open["Open Reports Page"] --> Filter["Filter by status (pending/resolved/ignored)"]
Filter --> Load["Load reports with pagination"]
Load --> Actions{"Admin Action"}
Actions --> |Mark Pending| Mark["updateSingleReportStatus(pending)"]
Actions --> |Undo All Actions| Undo["updateContentModerationState + updateUserModerationState + updateSingleReportStatus"]
Mark --> Refresh["onRefresh()"]
Undo --> Refresh
Refresh --> Load
```

Diagram sources
ReportsPage.tsx, ReportPost.tsx
Section sources
ReportsPage.tsx, ReportPost.tsx
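A minimal guard over the statuses the admin UI filters on (pending/resolved/ignored) might look like the following; the validation rule and function names are assumptions, not the project's actual service code:

```typescript
// Illustrative status-update guard: rejects unknown statuses before
// persisting, mirroring the pending/resolved/ignored states shown in
// the admin workflow above.

type ReportStatus = 'pending' | 'resolved' | 'ignored';

const VALID_STATUSES: ReadonlySet<string> = new Set(['pending', 'resolved', 'ignored']);

interface Report { id: string; status: ReportStatus }

function updateReportStatus(report: Report, next: string): Report {
  if (!VALID_STATUSES.has(next)) {
    throw new Error(`Invalid report status: ${next}`);
  }
  return { ...report, status: next as ReportStatus };
}

const updated = updateReportStatus({ id: 'r1', status: 'pending' }, 'resolved');
```

Centralizing the status whitelist in one place keeps the UI filter options and the backend's accepted values from drifting apart.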
OCR Integration for Document Text Extraction
The OCR service recognizes text from images and extracts structured details (e.g., email, branch). The web client sends multipart form data to the OCR endpoint, and the backend initializes a Tesseract worker to process the image.
```mermaid
sequenceDiagram
participant Client as "Web Client"
participant OCR as "OCR Extract Controller"
participant Tesseract as "Tesseract Worker"
Client->>OCR : POST /extract (FormData)
OCR->>Tesseract : recognize(imagePath)
Tesseract-->>OCR : {text}
OCR->>OCR : extractEmail(text), extractBranch(text)
OCR-->>Client : {success, data : {email, branch}}
```

Diagram sources
ocr.ts, extract.controller.ts
Section sources
ocr.ts, extract.controller.ts
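The post-recognition extraction step can be illustrated with a regex-based stand-in. The real controller's extractEmail/extractBranch helpers are project-specific; this version only demonstrates pulling a structured field out of raw OCR text:

```typescript
// Hypothetical post-OCR field extraction: scan recognized text for the
// first email-shaped substring. Branch extraction would follow the same
// pattern with an institution-specific regex or lookup table.

function extractEmail(text: string): string | null {
  const match = text.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/);
  return match ? match[0] : null;
}

// Typical OCR output is noisy multi-line text; extraction tolerates that.
const ocrText = 'Name: Jane Doe\nEmail: jane.doe@example.edu\nBranch: CSE';
const email = extractEmail(ocrText); // "jane.doe@example.edu"
```

Because OCR output often contains misread characters, production extraction usually pairs a permissive regex like this with downstream validation of the captured value.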
Moderation Policies and Appeals
- Dynamic policy: banned words via Aho-Corasick with strict/normal modes and severity
- Validator policy: Perspective API thresholds for toxicity, insult, identity attack, threat, profanity; spam detection; self-harm detection
- Appeals: not implemented in the reviewed code; administrators can adjust banned words and review reports
Section sources
moderator.service.ts, words-moderation.repo.ts
Administrative Tools for Oversight
- Banned words management: add, edit, delete, search, and toggle strict mode
- Reports dashboard: filter by status, paginate, refresh
- Bulk moderation actions: content and user moderation state updates
Section sources
BannedWordsPage.tsx, ReportsPage.tsx, ReportPost.tsx
Moderation Queue and Workflows
- Reports are paginated and filtered by status
- Admins can update report statuses and trigger moderation actions on targets
- Content moderation actions propagate to related reports
Section sources
report-moderation.service.ts, content-moderation.service.ts
Integration with Notification System
The notification module is present but largely disabled: its emission and bundling logic is commented out in the reviewed code. The defined types and structure nonetheless indicate planned integration, so moderation outcomes could be surfaced as notifications in the future.
Section sources
notification.service.ts, Notification.ts
Dependency Analysis
The moderation system exhibits clear separation of concerns:
- Dynamic filtering depends on normalized text and Aho-Corasick automata
- Policy validation depends on external API and language detection
- Admin endpoints depend on repositories for CRUD operations
- Reports depend on content and user moderation services
- UI components depend on backend APIs and local moderation utilities
```mermaid
graph LR
Mod["moderator.service.ts"] --> Repo["words-moderation.repo.ts"]
Mod --> AC["aho-corasick.ts"]
Mod --> Norm["normalize.ts"]
Mod --> Persp["Perspective API"]
Reports["report-moderation.service.ts"] --> Content["content-moderation.service.ts"]
Reports --> User["user-moderation.service.ts"]
AdminWords["words-moderation.controller.ts"] --> Repo
AdminReports["ReportsPage.tsx"] --> Reports
AdminBanned["BannedWordsPage.tsx"] --> AdminWords
AdminReview["ReportPost.tsx"] --> Content
AdminReview --> User
WebReport["report.ts"] --> Reports
WebOCR["ocr.ts"] --> OCR["extract.controller.ts"]
OCR --> Mod
```

Diagram sources
moderator.service.ts, words-moderation.repo.ts, words-moderation.controller.ts, report-moderation.service.ts, content-moderation.service.ts, user-moderation.service.ts, ReportsPage.tsx, BannedWordsPage.tsx, ReportPost.tsx, report.ts, ocr.ts, extract.controller.ts
Section sources
moderator.service.ts, report-moderation.service.ts
Performance Considerations
- Aho-Corasick search runs in linear time relative to input length with constant per-state transitions; ensure minimal redundant builds by leveraging versioned caching
- Normalization and boundary checks add overhead proportional to input length; consider pre-normalizing and caching repeated validations
- Perspective API calls are rate-limited and timeout-bound; implement retry/backoff and consider batching
- Wildcard matching uses DFS with memoization; avoid excessive wildcard patterns to prevent combinatorial blow-up
- OCR worker initialization should be lazy and reused to minimize startup costs
[No sources needed since this section provides general guidance]
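The memoized-DFS point above can be made concrete with a small example. This is a generic wildcard matcher, not the project's pattern engine; it shows why caching (i, j) states matters when a pattern contains several wildcards:

```typescript
// Wildcard matching via DFS with memoization: '*' matches any run of
// characters within a token. Without the memo, patterns like "a*a*a*a"
// against long tokens revisit the same (i, j) states exponentially often;
// with it, each state is solved once.

function wildcardMatch(pattern: string, token: string): boolean {
  const memo = new Map<string, boolean>();
  const dfs = (i: number, j: number): boolean => {
    const key = `${i},${j}`;
    const cached = memo.get(key);
    if (cached !== undefined) return cached;
    let result: boolean;
    if (i === pattern.length) {
      result = j === token.length;
    } else if (pattern[i] === '*') {
      // Either the '*' consumes nothing, or it consumes one more character.
      result = dfs(i + 1, j) || (j < token.length && dfs(i, j + 1));
    } else {
      result = j < token.length && pattern[i] === token[j] && dfs(i + 1, j + 1);
    }
    memo.set(key, result);
    return result;
  };
  return dfs(0, 0);
}

console.log(wildcardMatch('b*d', 'baaad')); // true
console.log(wildcardMatch('b*d', 'bat'));   // false
```

The memo bounds work at O(|pattern| × |token|) states, which is why keeping the number and length of wildcard patterns modest matters more than the text volume itself.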
Troubleshooting Guide
- Dynamic moderation returns allowed=true despite flagged content: verify strict vs normal mode and boundary checks; confirm banned word normalization and wildcards
- Policy violations without dynamic matches: inspect Perspective API thresholds and language detection; check spam/self-harm heuristics
- Admin banned word changes not reflected: ensure matcher rebuild triggers on version change and that caches are invalidated
- Report status not updating: verify report ID correctness and valid status values; check audit logging for errors
- OCR extraction fails: confirm worker initialization, file upload presence, and Tesseract model availability
Section sources
moderator.service.ts, report-moderation.service.ts, extract.controller.ts
Conclusion
The moderation system combines efficient Aho-Corasick-based filtering with robust policy validation and a comprehensive admin toolkit. OCR integration enables automated analysis of document-based content. Administrators can manage banned words, review reports, and enforce moderation actions. Future enhancements could include formal appeals workflows and active notification emission.
[No sources needed since this section summarizes without analyzing specific files]
Appendices
Example Moderation Workflow
- User submits content
- Backend validates with Aho-Corasick; if matches found, deny with dynamic violation
- Else, validate with Perspective API; if thresholds exceeded, deny with policy violation
- Else, allow and optionally notify
Section sources
moderator.service.ts
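The workflow above can be sketched end-to-end as a single decision function. All names here are hypothetical stand-ins (the real service is asynchronous and carries more context); only the two violation codes come from the pipeline described earlier:

```typescript
// Hypothetical end-to-end moderation pipeline: dynamic word filtering
// first, then the policy validator, mirroring the workflow steps above.

type Violation = 'CONTENT_MODERATION_VIOLATION' | 'CONTENT_POLICY_VIOLATION' | null;

interface ModerationResult { allowed: boolean; violation: Violation }

function moderate(
  text: string,
  hasBannedWord: (t: string) => boolean, // stand-in for Aho-Corasick search
  policyAllows: (t: string) => boolean,  // stand-in for Perspective validation
): ModerationResult {
  if (hasBannedWord(text)) {
    return { allowed: false, violation: 'CONTENT_MODERATION_VIOLATION' };
  }
  if (!policyAllows(text)) {
    return { allowed: false, violation: 'CONTENT_POLICY_VIOLATION' };
  }
  return { allowed: true, violation: null };
}

// Toy predicates for demonstration only.
const banned = (t: string) => t.includes('badword');
const policy = (t: string) => !t.includes('toxic');
console.log(moderate('hello', banned, policy));        // allowed
console.log(moderate('badword here', banned, policy)); // dynamic violation
```

Ordering matters: the cheap local automaton short-circuits before the rate-limited external API is ever called.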
UI Integration Notes
- ModeratedText component loads configuration, subscribes to updates, and splits text by matches for highlighting
- Web report API posts user-generated reports
- Admin pages provide CRUD and review capabilities
Section sources
ModeratedText.tsx, report.ts, ReportsPage.tsx, BannedWordsPage.tsx