# AI Tools Suite - Product Manual

> A comprehensive collection of AI/ML operational tools for monitoring, security, compliance, and cost management.

---

## Table of Contents

1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Tool Catalog](#tool-catalog)
4. [Product Roadmap](#product-roadmap)
5. [Installation](#installation)
6. [Quick Start](#quick-start)
7. [User Guide](#user-guide)
8. [Detailed Tool Documentation](#detailed-tool-documentation)
9. [Directory Structure](#directory-structure)
10. [Version History](#version-history)

---

## Overview

This suite provides 14 essential tools for managing AI/ML systems in production environments. Each tool addresses a specific operational need, from cost tracking to security testing.

### Target Users

- ML Engineers
- Data Scientists
- DevOps/MLOps Teams
- Product Managers
- Compliance Officers

---

## Architecture

### System Overview

The AI Tools Suite uses a modern web architecture with a unified SvelteKit frontend and FastAPI backend.

```
┌───────────────────────────────────────────────────────────────────┐
│                        SVELTEKIT FRONTEND                         │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │                       UNIFIED DASHBOARD                       │ │
│ │ ┌──────────────────┐ ┌─────────────────────────────────────┐ │ │
│ │ │   Sidebar Nav    │ │          Main Content Area          │ │ │
│ │ │   ────────────   │ │          ────────────────           │ │ │
│ │ │ Dashboard        │ │                                     │ │ │
│ │ │ Drift Monitor    │ │        [Selected Tool View]         │ │ │
│ │ │ Cost Tracker     │ │                                     │ │ │
│ │ │ Security Test    │ │        - Interactive Charts         │ │ │
│ │ │ Data History     │ │        - Data Tables                │ │ │
│ │ │ Model Compare    │ │        - Configuration Forms        │ │ │
│ │ │ Privacy Scan     │ │        - Real-time Updates          │ │ │
│ │ │ Label Quality    │ │        - Export Options             │ │ │
│ │ │ Cost Estimate    │ │                                     │ │ │
│ │ │ Data Audit       │ │                                     │ │ │
│ │ │ Content Perf     │ │                                     │ │ │
│ │ │ Bias Checks      │ │                                     │ │ │
│ │ │ Profitability    │ │                                     │ │ │
│ │ │ Emergency Ctrl   │ │                                     │ │ │
│ │ │ Reports          │ │                                     │ │ │
│ │ └──────────────────┘ └─────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
                      │ REST API / WebSocket
                      ▼
┌───────────────────────────────────────────────────────────────────┐
│                         FASTAPI BACKEND                           │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │                          API ROUTERS                          │ │
│ │  /api/drift  /api/costs  /api/security  /api/privacy  /api/… │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │                         SERVICE LAYER                         │ │
│ │  DriftDetector │ CostAggregator │ PIIScanner │ BiasAnalyzer  │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │                        SHARED SERVICES                        │ │
│ │ Authentication │ Database ORM │ File Storage │ Background Jobs│ │
│ └───────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
                      │
                      ▼
┌───────────────────────────────────────────────────────────────────┐
│                            DATA LAYER                             │
│ ┌──────────────────┐ ┌──────────────────┐ ┌────────────────────┐ │
│ │ PostgreSQL/      │ │ Redis            │ │ File Storage       │ │
│ │ SQLite           │ │ (Cache/Queue)    │ │ (Uploads/Reports)  │ │
│ │ - Users          │ │ - Session cache  │ │ - CSV/JSON uploads │ │
│ │ - Audit logs     │ │ - Job queue      │ │ - Generated reports│ │
│ │ - Metrics        │ │ - Real-time      │ │ - Model artifacts  │ │
│ └──────────────────┘ └──────────────────┘ └────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
```
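The layering above keeps routers thin and pushes logic into the service layer. As a rough sketch of that shape (the names `drift_router` and `DriftService` are illustrative assumptions, not the shipped code), a tool endpoint might delegate like this:

```python
# Illustration of the router -> service layering shown above.
# Names and return values are assumptions for the sketch.
from fastapi import APIRouter, UploadFile

drift_router = APIRouter(prefix="/drift", tags=["drift"])


class DriftService:
    """Service-layer object that owns the statistics; the router stays thin."""

    def analyze(self, baseline: bytes, production: bytes) -> dict:
        # A real implementation would run the KS/PSI tests described
        # in the Model Drift Monitor documentation below.
        return {"drift_score": 0.12, "status": "ok"}


service = DriftService()


@drift_router.post("/analyze")
async def analyze(baseline: UploadFile, production: UploadFile) -> dict:
    # Router only handles transport concerns; the service does the work.
    return service.analyze(await baseline.read(), await production.read())
```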
### Tech Stack

| Layer | Technology | Purpose |
|-------|------------|---------|
| **Frontend** | SvelteKit + TypeScript | Unified single-page application |
| **UI Components** | Tailwind CSS + shadcn-svelte | Modern, accessible components |
| **Charts** | Apache ECharts | Interactive data visualizations |
| **Backend** | FastAPI (Python 3.11+) | REST API + WebSocket support |
| **ORM** | SQLAlchemy 2.0 | Database abstraction |
| **Database** | SQLite (dev) / PostgreSQL (prod) | Persistent storage |
| **Cache** | Redis (optional) | Session cache, job queue |
| **Deployment** | Docker Compose | Container orchestration |

### API Design

All tools are accessible via a RESTful API:

```
Base URL: http://localhost:8000/api/v1

Endpoints:
├── /drift/           # Model Drift Monitor
├── /costs/           # Vendor Cost Tracker
├── /security/        # Security Tester
├── /history/         # Data History Log
├── /compare/         # Model Comparator
├── /privacy/         # Privacy Scanner
├── /labels/          # Label Quality Scorer
├── /estimate/        # Inference Estimator
├── /audit/           # Data Integrity Audit
├── /content/         # Content Performance
├── /bias/            # Safety/Bias Checks
├── /profitability/   # Profitability Analysis
├── /emergency/       # Emergency Control
└── /reports/         # Result Interpretation
```
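A plausible way `main.py` could assemble this tree is to mount each tool router under a versioned parent router. The module names follow the `routers/` layout shown in the Directory Structure section, but treat this as a sketch of the wiring rather than the actual entry point:

```python
# Minimal sketch of a versioned API surface, assuming each module in
# routers/ exposes a `router` attribute (an assumption, not confirmed code).
from fastapi import APIRouter, FastAPI

from routers import costs, drift, privacy, security  # , history, compare, ...

app = FastAPI(title="AI Tools Suite")
api_v1 = APIRouter(prefix="/api/v1")

for tool_router in (drift.router, costs.router, security.router, privacy.router):
    api_v1.include_router(tool_router)

app.include_router(api_v1)


@app.get("/api/v1/health")
def health() -> dict:
    # Matches the health check used in the Quick Start examples.
    return {"status": "ok"}
```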
---

## Tool Catalog

| # | Tool Name | Deliverable | Description | Status |
|---|-----------|-------------|-------------|--------|
| 1 | **Model Drift Monitor** | Dashboard | Tracks prediction confidence over time to detect when AI accuracy begins to decline. | Pending |
| 2 | **Vendor Cost Tracker** | API spend aggregator | Provides a single view of all API expenses across providers like OpenAI, Anthropic, and AWS. | Pending |
| 3 | **Security Tester** | Input fuzzer | Tests AI endpoints for exploits and prompt injections to prevent unauthorized access. | Pending |
| 4 | **Data History Log** | Audit trail logger | Maintains a record of which data versions were used to train specific models for legal compliance. | Pending |
| 5 | **Model Comparator** | Response evaluator | Compares outputs from different models side-by-side to determine the best fit for specific tasks. | Pending |
| 6 | **Privacy Scanner** | PII detector | Automatically finds and removes personal information (names, emails) from training datasets. | Pending |
| 7 | **Label Quality Scorer** | Agreement calculator | Measures the consistency of data labeling teams to ensure high-quality training inputs. | Pending |
| 8 | **Inference Estimator** | Token/Price calculator | Predicts monthly operational costs based on expected usage before a project is deployed. | Pending |
| 9 | **Data Integrity Audit** | Data cleaning app | Identifies and fixes errors in databases to prevent data loss and improve model performance. | Pending |
| 10 | **Content Performance** | Retention model | Visualizes audience drop-off points to identify which content segments drive engagement. | Pending |
| 11 | **Safety/Bias Checks** | Bias scanner checklist | Audits recommendation engines to ensure they follow privacy laws and treat users fairly. | Pending |
| 12 | **Profitability Analysis** | Cost-vs-revenue view | Correlates AI costs with business revenue to identify specific areas for monthly savings. | Pending |
| 13 | **Emergency Control** | Manual override template | Provides a reliable mechanism to immediately suspend automated processes if they fail. | Pending |
| 14 | **Result Interpretation** | Automated report generator | Converts technical metrics into a standardized list of actions for business decision-makers. | Pending |

---

## Product Roadmap

### Phase 1: Foundation (MVP)

**Goal:** Core infrastructure and 3 essential tools

| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **1.1 Project Setup** | SvelteKit frontend scaffold, FastAPI backend scaffold, Docker Compose config, CI/CD pipeline | None |
| **1.2 Shared Infrastructure** | Authentication system, Database models, API client library, Shared UI components (sidebar, charts, tables) | 1.1 |
| **1.3 Inference Estimator** | Token counting, Multi-provider pricing, Cost projection UI, Export to CSV | 1.2 |
| **1.4 Data Integrity Audit** | File upload, Missing value detection, Duplicate finder, Interactive cleaning UI | 1.2 |
| **1.5 Privacy Scanner** | PII detection engine, Redaction modes, Scan results UI, Batch processing | 1.2 |

### Phase 2: Monitoring & Costs

**Goal:** Production monitoring and cost management tools

| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **2.1 Model Drift Monitor** | Baseline upload, KS/PSI tests, Drift visualization, Alert configuration | Phase 1 |
| **2.2 Vendor Cost Tracker** | API key integration (OpenAI, Anthropic, AWS), Cost aggregation, Budget alerts, Usage forecasting | Phase 1 |
| **2.3 Profitability Analysis** | Revenue data import, Cost-revenue correlation, ROI calculator, Savings recommendations | 2.2 |

### Phase 3: Security & Compliance

**Goal:** Security testing and compliance tools

| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **3.1 Security Tester** | Prompt injection test suite, Jailbreak detection, Vulnerability report generation | Phase 1 |
| **3.2 Data History Log** | Data versioning (SHA-256), Model-dataset linking, Audit trail UI, GDPR/CCPA reports | Phase 1 |
| **3.3 Safety/Bias Checks** | Fairness metrics (demographic parity, equalized odds), Bias detection, Compliance checklist | Phase 1 |

### Phase 4: Quality & Comparison

**Goal:** Data quality and model evaluation tools

| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **4.1 Label Quality Scorer** | Multi-rater agreement (Kappa, Alpha), Inconsistency flagging, Quality reports | Phase 1 |
| **4.2 Model Comparator** | Side-by-side comparison UI, Quality scoring, Latency benchmarks, Cost-per-query analysis | Phase 1 |
### Phase 5: Analytics & Control

**Goal:** Advanced analytics and operational control

| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **5.1 Content Performance** | Engagement tracking, Drop-off visualization, Retention curves, A/B analysis | Phase 1 |
| **5.2 Emergency Control** | Kill switch API, Graceful degradation, Rollback triggers, Incident logging | Phase 1 |
| **5.3 Result Interpretation** | Metric-to-insight engine, Executive summary generator, PDF/Markdown export | Phase 1 |

### Roadmap Visualization

```
Phase 1: Foundation         Phase 2: Monitoring         Phase 3: Security
─────────────────────       ─────────────────────       ─────────────────────
┌─────────────────────┐     ┌─────────────────────┐     ┌─────────────────────┐
│ Project Setup       │     │ Model Drift Monitor │     │ Security Tester     │
│ Shared Infra        │ ──► │ Vendor Cost Tracker │     │ Data History Log    │
│ Inference Estimator │     │ Profitability       │     │ Safety/Bias Checks  │
│ Data Integrity      │     └─────────────────────┘     └─────────────────────┘
│ Privacy Scanner     │               │                           │
└─────────────────────┘               │                           │
                                      ▼                           ▼
                            Phase 4: Quality            Phase 5: Analytics
                            ─────────────────────       ─────────────────────
                            ┌─────────────────────┐     ┌─────────────────────┐
                            │ Label Quality       │     │ Content Performance │
                            │ Model Comparator    │     │ Emergency Control   │
                            └─────────────────────┘     │ Result Interpret    │
                                                        └─────────────────────┘
```

### Success Metrics

| Phase | Key Metrics |
|-------|-------------|
| Phase 1 | Frontend/backend running, 3 tools functional, <2s page load |
| Phase 2 | Real-time drift alerts, Cost tracking across 3+ providers |
| Phase 3 | 90%+ PII detection rate, Compliance reports generated |
| Phase 4 | Inter-rater agreement calculated, Model comparison functional |
| Phase 5 | Emergency shutoff <1s response, Automated reports generated |

---

## Installation

### Prerequisites

```bash
# Required
Node.js 18+
Python 3.11+
Docker & Docker Compose (recommended)

# Optional
PostgreSQL 15+ (for production)
Redis 7+ (for caching/queues)
```

### Quick Setup with Docker

```bash
# Clone and navigate
cd ai_tools_suite

# Start all services
docker-compose up -d

# Access the application
# Frontend:    http://localhost:3000
# Backend API: http://localhost:8000
# API Docs:    http://localhost:8000/docs
```

### Manual Setup

#### Backend (FastAPI)

```bash
cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run development server
uvicorn main:app --reload --port 8000
```

#### Frontend (SvelteKit)

```bash
cd frontend

# Install dependencies
npm install

# Run development server
npm run dev

# Build for production
npm run build
```

### Environment Variables

Create `.env` files in both `frontend/` and `backend/` directories:

**backend/.env**
```env
DATABASE_URL=sqlite:///./ai_tools.db
SECRET_KEY=your-secret-key-here
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```

**frontend/.env**
```env
PUBLIC_API_URL=http://localhost:8000
```

---

## Quick Start

### Accessing the Dashboard

1. Start the application (Docker or manual setup)
2. Open http://localhost:3000 in your browser
3. Use the sidebar to navigate between tools

### API Usage

All tools are accessible via REST API:

```bash
# Check API health
curl http://localhost:8000/api/v1/health

# Estimate inference costs
curl -X POST http://localhost:8000/api/v1/estimate/calculate \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "tokens": 1000000, "requests_per_day": 1000}'

# Scan for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan \
  -F "file=@data.csv"

# Check model drift
curl -X POST http://localhost:8000/api/v1/drift/analyze \
  -F "baseline=@baseline.csv" \
  -F "production=@production.csv"
```
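The same calls can be scripted from Python. A minimal sketch using the third-party `requests` library (not part of this suite) against the endpoints above:

```python
# Python equivalents of the curl examples above; endpoint paths are
# taken from the Quick Start section. Requires `pip install requests`.
import requests

BASE = "http://localhost:8000/api/v1"

# Health check
print(requests.get(f"{BASE}/health").json())

# Estimate inference costs
estimate = requests.post(
    f"{BASE}/estimate/calculate",
    json={"model": "gpt-4", "tokens": 1_000_000, "requests_per_day": 1000},
)
print(estimate.json())

# Scan a file for PII (multipart upload, like curl's -F "file=@data.csv")
with open("data.csv", "rb") as f:
    scan = requests.post(f"{BASE}/privacy/scan", files={"file": f})
print(scan.json())
```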
---

## User Guide

This section provides step-by-step instructions for using the tools available in Phase 1.

### Inference Estimator

**Purpose:** Calculate AI API costs before deploying your application.

#### How to Use

1. **Navigate to the Tool**
   - Click "Inference Estimator" in the sidebar (or go to `/inference-estimator`)
2. **Configure Your Model**
   - Select your AI provider (OpenAI, Anthropic, Google, or Custom)
   - Choose the specific model (e.g., GPT-4, Claude 3, Gemini Pro)
   - For custom models, enter your own input/output prices per 1M tokens
3. **Enter Usage Parameters**
   - **Input Tokens per Request:** Average tokens you send to the model
   - **Output Tokens per Request:** Average tokens the model returns
   - **Requests per Day:** Expected daily API calls
   - **Peak Multiplier:** Account for traffic spikes (1x = normal, 2x = double traffic)
4. **View Cost Breakdown**
   - **Daily Cost:** Input cost + Output cost per day
   - **Monthly Cost:** 30-day projection
   - **Annual Cost:** 365-day projection
5. **Override Pricing (Optional)**
   - Click the edit icon next to any model to set custom pricing
   - Useful for negotiated enterprise rates or new models

#### Example Calculation

```
Model: GPT-4 Turbo
Input:  500 tokens/request × $10.00/1M = $0.005/request
Output: 200 tokens/request × $30.00/1M = $0.006/request
Requests: 10,000/day

Daily Cost:   (0.005 + 0.006) × 10,000 = $110.00
Monthly Cost: $110 × 30 = $3,300.00
```

#### Expected Outcome

After entering your parameters, you will see:

```
┌─────────────────────────────────────────────────────────────┐
│                       COST BREAKDOWN                        │
├─────────────────────────────────────────────────────────────┤
│ Provider: OpenAI                                            │
│ Model: GPT-4 Turbo                                          │
│                                                             │
│ ┌─────────────┬────────────┬────────────┬───────────────┐  │
│ │ Period      │ Input Cost │ Output Cost│ Total         │  │
│ ├─────────────┼────────────┼────────────┼───────────────┤  │
│ │ Daily       │ $50.00     │ $60.00     │ $110.00       │  │
│ │ Monthly     │ $1,500.00  │ $1,800.00  │ $3,300.00     │  │
│ │ Yearly      │ $18,250.00 │ $21,900.00 │ $40,150.00    │  │
│ └─────────────┴────────────┴────────────┴───────────────┘  │
│                                                             │
│ Tokens per Day: 7,000,000 (5M input + 2M output)            │
│ Cost per Request: $0.011                                    │
│ Cost per 1K Requests: $11.00                                │
└─────────────────────────────────────────────────────────────┘
```
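For intuition, the arithmetic behind this breakdown fits in a few lines. A hypothetical back-of-envelope version (`daily_cost` is illustrative, not the estimator's actual code), reproducing the GPT-4 Turbo example:

```python
# Reproduces the Example Calculation above. Prices are per 1M tokens;
# real provider pricing varies and should come from the tool's price table.
def daily_cost(in_tokens: int, out_tokens: int, requests_per_day: int,
               in_price_per_m: float, out_price_per_m: float) -> float:
    per_request = (in_tokens * in_price_per_m
                   + out_tokens * out_price_per_m) / 1_000_000
    return per_request * requests_per_day

day = daily_cost(500, 200, 10_000, in_price_per_m=10.00, out_price_per_m=30.00)
print(f"daily ${day:,.2f} / monthly ${day * 30:,.2f} / yearly ${day * 365:,.2f}")
# -> daily $110.00 / monthly $3,300.00 / yearly $40,150.00
```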
---

### Data Integrity Audit

**Purpose:** Analyze datasets for quality issues, missing values, duplicates, and outliers.

#### How to Use

1. **Navigate to the Tool**
   - Click "Data Integrity Audit" in the sidebar (or go to `/data-audit`)
2. **Upload Your Dataset**
   - **Drag and drop** a file onto the upload area, OR
   - **Click to browse** and select a file
   - Supported formats: CSV, Excel (.xlsx, .xls), JSON
3. **Click "Analyze Data"**
   - The tool will process your file and display results
4. **Review the Results**

   **Quick Stats Panel:**
   - **Rows:** Total number of records
   - **Columns:** Number of fields
   - **Duplicates:** Count of duplicate rows
   - **Issues:** Number of problems detected

   **Overview Tab:**
   - Missing values summary with counts and percentages
   - Duplicate row detection results

   **Columns Tab:**
   - Detailed statistics for each column:
     - Data type (int64, float64, object, etc.)
     - Missing value count and percentage
     - Unique value count
     - Sample values

   **Issues & Recommendations Tab:**
   - List of detected problems with icons:
     - `!` = Missing values
     - `2x` = Duplicates
     - `~` = Outliers
     - `#` = High cardinality
     - `=` = Constant column
     - `OK` = No issues
   - Actionable recommendations for fixing each issue

#### Understanding the Results

| Issue Type | What It Means | Recommended Action |
|------------|---------------|--------------------|
| Missing Values | Empty cells in the data | Fill with mean/median or remove rows |
| Duplicate Rows | Identical records | Remove duplicates to avoid bias |
| Outliers | Extreme values | Investigate if valid or remove |
| High Cardinality | Too many unique values | Check if column is an ID field |
| Constant Column | Only one value | Consider removing from analysis |

#### Expected Outcome

After uploading a dataset (e.g., `customers.csv`), you will see:

```
┌─────────────────────────────────────────────────────────────────────┐
│ QUICK STATS                                                         │
│ ┌──────────┬──────────┬──────────────┬─────────────┐                │
│ │ Rows     │ Columns  │ Duplicates   │ Issues      │                │
│ │ 10,542   │ 12       │ 47           │ 5           │                │
│ └──────────┴──────────┴──────────────┴─────────────┘                │
├─────────────────────────────────────────────────────────────────────┤
│ OVERVIEW TAB                                                        │
│ ────────────                                                        │
│ Missing Values:                                                     │
│ ┌─────────────────┬─────────┬─────────┐                             │
│ │ Column          │ Count   │ Percent │                             │
│ ├─────────────────┼─────────┼─────────┤                             │
│ │ email           │ 23      │ 0.22%   │                             │
│ │ phone           │ 156     │ 1.48%   │                             │
│ │ address         │ 89      │ 0.84%   │                             │
│ └─────────────────┴─────────┴─────────┘                             │
│                                                                     │
│ ⚠ 47 duplicate rows found (0.45%)                                   │
├─────────────────────────────────────────────────────────────────────┤
│ ISSUES & RECOMMENDATIONS TAB                                        │
│ ────────────────────────────                                        │
│ Issues Found:                                                       │
│ [!]  Dataset has 268 missing values across 3 columns                │
│ [2x] Found 47 duplicate rows (0.45%)                                │
│ [~]  Column 'age' has 12 potential outliers                         │
│ [#]  Column 'user_id' has very high cardinality (10,495 unique)     │
│ [=]  Column 'status' has only one unique value                      │
│                                                                     │
│ Recommendations:                                                    │
│ 💡 Fill missing values with mean/median for numeric columns         │
│ 💡 Consider removing duplicate rows to improve data quality         │
│ 💡 Review if 'user_id' should be used as an identifier              │
│ 💡 Consider removing constant column 'status'                       │
└─────────────────────────────────────────────────────────────────────┘
```
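If you want to replicate these checks in a notebook first, a minimal pandas sketch covers the same ground (assuming the `customers.csv` example and its `age` column from above; the production service adds schema validation and richer heuristics):

```python
# Notebook-style sketch of the audit's core checks, using pandas.
import pandas as pd

df = pd.read_csv("customers.csv")

# Missing values and duplicates (Quick Stats / Overview tab)
missing = df.isna().sum()
missing_pct = (missing / len(df) * 100).round(2)
duplicates = int(df.duplicated().sum())
print(f"rows={len(df)} cols={df.shape[1]} duplicates={duplicates}")
print(missing_pct[missing_pct > 0])

# Constant and high-cardinality columns (Issues tab)
print(df.nunique().sort_values())

# Simple IQR-based outlier count for a numeric column such as 'age'
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["age"] < q1 - 1.5 * iqr) | (df["age"] > q3 + 1.5 * iqr)]
print(f"'age' outliers: {len(outliers)}")
```

The API endpoints below expose the same analysis as a service.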
#### API Endpoints

```bash
# Analyze a dataset
curl -X POST http://localhost:8000/api/v1/audit/analyze \
  -F "file=@your_data.csv"

# Clean a dataset (remove duplicates and missing rows)
curl -X POST http://localhost:8000/api/v1/audit/clean \
  -F "file=@your_data.csv"

# Validate schema
curl -X POST http://localhost:8000/api/v1/audit/validate-schema \
  -F "file=@your_data.csv"

# Detect outliers
curl -X POST http://localhost:8000/api/v1/audit/detect-outliers \
  -F "file=@your_data.csv"
```

#### Sample API Response

```json
{
  "total_rows": 10542,
  "total_columns": 12,
  "missing_values": {
    "email": {"count": 23, "percent": 0.22},
    "phone": {"count": 156, "percent": 1.48},
    "address": {"count": 89, "percent": 0.84}
  },
  "duplicate_rows": 47,
  "duplicate_percent": 0.45,
  "column_stats": [
    {
      "name": "customer_id",
      "dtype": "int64",
      "missing_count": 0,
      "missing_percent": 0.0,
      "unique_count": 10542,
      "sample_values": [1001, 1002, 1003, 1004, 1005]
    }
  ],
  "issues": [
    "Dataset has 268 missing values across 3 columns",
    "Found 47 duplicate rows (0.45%)",
    "Column 'age' has 12 potential outliers"
  ],
  "recommendations": [
    "Consider filling missing values with mean/median",
    "Consider removing duplicate rows to improve data quality"
  ]
}
```

---

### Privacy Scanner

**Purpose:** Detect and redact personally identifiable information (PII) from text and files.

#### How to Use

1. **Navigate to the Tool**
   - Click "Privacy Scanner" in the sidebar (or go to `/privacy-scanner`)
2. **Choose Input Mode**
   - **Text Mode:** Paste or type text directly
   - **File Mode:** Upload CSV, TXT, or JSON files
3. **Configure Detection Options**
   - Toggle which PII types to detect:
     - Emails
     - Phone numbers
     - SSN (Social Security Numbers)
     - Credit Cards
     - IP Addresses
     - Dates of Birth
4. **Enter or Upload Content**
   - **For text:** Paste content into the text area
   - **For files:** Drag and drop or click to upload
   - **Tip:** Click "Load Sample" to see example PII data
5. **Click "Scan for PII"**
   - The tool will analyze your content and display results
6. **Review the Results**

   **Risk Summary:**
   - **PII Found:** Total number of PII entities detected
   - **Types:** Number of different PII categories
   - **Risk Score:** Calculated severity (0-100)
   - **Risk Level:** CRITICAL, HIGH, MEDIUM, or LOW

   **Overview Tab:**
   - PII counts by type with color-coded severity
   - Risk assessment with explanation

   **Entities Tab:**
   - Detailed list of each detected PII item:
     - Type (EMAIL, PHONE, SSN, etc.)
     - Original value
     - Masked value
     - Confidence score (percentage)

   **Redacted Preview Tab:**
   - Shows your text with all PII masked
   - Safe to share after verification
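Conceptually, the scanner is a detect-then-mask pipeline. A deliberately small sketch, assuming simple regex patterns for just two entity types (the shipped detector's patterns, entity coverage, and confidence scoring are not shown here):

```python
# Toy detect-then-mask pipeline illustrating the scanner's shape.
# Patterns and masking rules are simplified assumptions for the sketch.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask(pii_type: str, value: str) -> str:
    # Mirrors the masking style shown in the PII Detection Patterns table.
    if pii_type == "EMAIL":
        local, domain = value.split("@", 1)
        return local[:2] + "***@" + domain
    if pii_type == "SSN":
        return "***-**-" + value[-4:]
    return "[REDACTED]"


def scan(text: str) -> list[dict]:
    return [
        {"type": t, "value": m.group(), "masked_value": mask(t, m.group())}
        for t, rx in PATTERNS.items()
        for m in rx.finditer(text)
    ]


print(scan("Contact john.doe@example.com, SSN 123-45-6789"))
# -> jo***@example.com and ***-**-6789, matching the table below
```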
#### PII Detection Patterns

| Type | Example | Masked As |
|------|---------|-----------|
| EMAIL | john.doe@example.com | jo***@example.com |
| PHONE | (555) 123-4567 | ***-***-4567 |
| SSN | 123-45-6789 | ***-**-6789 |
| CREDIT_CARD | 4532015112830366 | ****-****-****-0366 |
| IP_ADDRESS | 192.168.1.100 | 192.***.***.* |
| DATE_OF_BIRTH | 03/15/1985 | **/**/1985 |

#### Risk Levels Explained

| Level | Score | Description |
|-------|-------|-------------|
| CRITICAL | 70-100 | Highly sensitive PII (SSN, Credit Cards). Immediate action required. |
| HIGH | 50-69 | Multiple sensitive PII elements. Consider redaction before sharing. |
| MEDIUM | 30-49 | Some PII detected that may require attention. |
| LOW | 0-29 | Minimal or no PII detected. |

#### API Endpoints

```bash
# Scan text for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-text \
  -F "text=Contact john@example.com or call 555-123-4567" \
  -F "detect_emails=true" \
  -F "detect_phones=true"

# Scan a file for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-file \
  -F "file=@customer_data.csv"

# Scan CSV/Excel with column-by-column analysis
curl -X POST http://localhost:8000/api/v1/privacy/scan-dataframe \
  -F "file=@customer_data.csv"

# Redact PII from text
curl -X POST http://localhost:8000/api/v1/privacy/redact \
  -F "text=Call 555-123-4567 for support" \
  -F "mode=mask"

# List supported PII types
curl http://localhost:8000/api/v1/privacy/entity-types
```

#### Redaction Modes

| Mode | Description | Example Output |
|------|-------------|----------------|
| `mask` | Shows partial value | jo***@example.com |
| `remove` | Replaces with [REDACTED] | [REDACTED] |
| `type` | Shows PII type | [EMAIL] |

#### Expected Outcome

After scanning text or a file, you will see results like:

```
RISK SUMMARY
┌────────────┬──────────┬────────────┬─────────────────┐
│ PII Found  │ Types    │ Risk Score │ Risk Level      │
│ 7          │ 5        │ 72         │ CRITICAL        │
└────────────┴──────────┴────────────┴─────────────────┘

ENTITIES TAB
┌────────────────┬───────────────────────┬────────────────┬───────┐
│ Type           │ Original              │ Masked         │ Conf  │
├────────────────┼───────────────────────┼────────────────┼───────┤
│ EMAIL          │ john.smith@example.com│ jo***@example..│ 95%   │
│ PHONE          │ (555) 123-4567        │ ***-***-4567   │ 85%   │
│ SSN            │ 123-45-6789           │ ***-**-6789    │ 95%   │
│ CREDIT_CARD    │ 4532015112830366      │ ****-****-0366 │ 95%   │
└────────────────┴───────────────────────┴────────────────┴───────┘

REDACTED PREVIEW
Customer Record:
Email: jo***@example.com
Phone: ***-***-4567
SSN: ***-**-6789
Credit Card: ****-****-****-0366
```

#### Sample API Response

```json
{
  "total_entities": 7,
  "entities_by_type": {
    "EMAIL": 2,
    "PHONE": 2,
    "SSN": 1,
    "CREDIT_CARD": 1,
    "IP_ADDRESS": 1
  },
  "risk_level": "CRITICAL",
  "risk_score": 72,
  "entities": [
    {
      "type": "SSN",
      "value": "123-45-6789",
      "confidence": 0.95,
      "masked_value": "***-**-6789"
    }
  ],
  "redacted_preview": "Email: jo***@example.com\nSSN: ***-**-6789..."
}
```

---

## Detailed Tool Documentation

### 1. Model Drift Monitor

**Purpose:** Detect when model performance degrades over time.

**Features:**
- Real-time confidence score tracking
- Statistical drift detection (KS test, PSI)
- Alert threshold configuration
- Historical trend visualization

**API Endpoints:**

```
POST /api/v1/drift/baseline     # Upload baseline distribution
POST /api/v1/drift/analyze      # Analyze production data for drift
GET  /api/v1/drift/history      # Get drift score history
PUT  /api/v1/drift/thresholds   # Configure alert thresholds
```
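For reference, both statistics are easy to compute outside the tool. A sketch using `scipy` for the KS test and a hand-rolled PSI (the bin count and the 0.2 threshold are common conventions, not necessarily this product's defaults):

```python
# KS and PSI on synthetic confidence-score distributions.
# Requires numpy and scipy; the data here is fabricated for illustration.
import numpy as np
from scipy import stats


def psi(baseline: np.ndarray, production: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index over shared bins derived from the baseline."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    p_pct = np.histogram(production, bins=edges)[0] / len(production)
    # Clip to avoid log(0) when a bin is empty on one side.
    b_pct, p_pct = np.clip(b_pct, 1e-6, None), np.clip(p_pct, 1e-6, None)
    return float(np.sum((p_pct - b_pct) * np.log(p_pct / b_pct)))


rng = np.random.default_rng(0)
baseline = rng.normal(0.80, 0.05, 5_000)    # baseline confidence scores
production = rng.normal(0.72, 0.08, 5_000)  # drifted production scores

ks_stat, p_value = stats.ks_2samp(baseline, production)
print(f"KS={ks_stat:.3f} (p={p_value:.2g}), PSI={psi(baseline, production):.3f}")
# PSI > 0.2 is a widely used rule of thumb for significant drift.
```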
---

### 2. Vendor Cost Tracker

**Purpose:** Aggregate and visualize API spending across providers.

**Supported Providers:**
- OpenAI
- Anthropic
- AWS Bedrock
- Google Vertex AI
- Azure OpenAI

**Features:**
- Daily/weekly/monthly cost breakdowns
- Per-project cost allocation
- Budget alerts
- Usage forecasting

---

### 3. Security Tester

**Purpose:** Identify vulnerabilities in AI endpoints.

**Test Categories:**
- Prompt injection attacks
- Jailbreak attempts
- Data exfiltration probes
- Rate limit testing
- Input validation bypass

**Output:** Security report with severity ratings and remediation steps.

---

### 4. Data History Log

**Purpose:** Maintain audit trail for ML training data.

**Features:**
- Data version hashing (SHA-256)
- Model-to-dataset mapping
- Timestamp logging
- Compliance report generation (GDPR, CCPA)

---

### 5. Model Comparator

**Purpose:** Evaluate and compare model outputs.

**Features:**
- Side-by-side response comparison
- Quality scoring (coherence, accuracy, relevance)
- Latency benchmarking
- Cost-per-query analysis

---

### 6. Privacy Scanner

**Purpose:** Detect and remove PII from datasets.

**Detected Entities:**
- Names
- Email addresses
- Phone numbers
- SSN/National IDs
- Credit card numbers
- Addresses
- IP addresses

**Modes:**
- Detection only
- Automatic redaction
- Pseudonymization

---

### 7. Label Quality Scorer

**Purpose:** Measure inter-annotator agreement.

**Metrics:**
- Cohen's Kappa
- Fleiss' Kappa (multi-rater)
- Krippendorff's Alpha
- Percent agreement

**Output:** Quality report with flagged inconsistent samples.
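As a quick illustration of the first metric, Cohen's kappa for two raters can be computed by hand. This sketch is for intuition only (with scikit-learn installed, `sklearn.metrics.cohen_kappa_score` gives the same number):

```python
# Hand-rolled Cohen's kappa: observed agreement corrected for the
# agreement two raters would reach by chance. Labels below are made up.
from collections import Counter


def cohens_kappa(rater_a: list, rater_b: list) -> float:
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(freq_a[lbl] * freq_b[lbl] for lbl in freq_a | freq_b) / n**2
    return (observed - expected) / (1 - expected)


a = ["spam", "spam", "ham", "ham", "spam", "ham"]
b = ["spam", "ham",  "ham", "ham", "spam", "ham"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # 0.67; 1.0 = perfect, 0 = chance
```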
---

### 8. Inference Estimator

**Purpose:** Predict operational costs before deployment.

**Inputs:**
- Expected request volume
- Average tokens per request
- Model selection
- Peak usage patterns

**Output:** Monthly cost projection with confidence intervals.

---

### 9. Data Integrity Audit

**Purpose:** Clean and validate datasets.

**Checks:**
- Missing values
- Duplicate records
- Data type mismatches
- Outlier detection
- Schema validation
- Referential integrity

**Interface:** Interactive data cleaning with preview and undo.

---

### 10. Content Performance

**Purpose:** Analyze user engagement patterns.

**Features:**
- Drop-off point visualization
- Engagement heatmaps
- A/B test analysis
- Retention curve modeling

---

### 11. Safety/Bias Checks

**Purpose:** Audit AI systems for fairness.

**Metrics:**
- Demographic parity
- Equalized odds
- Calibration across groups
- Disparate impact ratio

**Output:** Compliance checklist with recommendations.

---

### 12. Profitability Analysis

**Purpose:** Connect AI costs to business outcomes.

**Features:**
- Cost attribution by feature/product
- Revenue correlation analysis
- ROI calculation
- Optimization recommendations

---

### 13. Emergency Control

**Purpose:** Safely halt AI systems when needed.

**Features:**
- One-click system suspension
- Graceful degradation modes
- Rollback capabilities
- Incident logging

**Implementation:** API endpoints + admin dashboard.

---

### 14. Result Interpretation

**Purpose:** Translate metrics into business actions.

**Features:**
- Automated insight generation
- Executive summary creation
- Action item extraction
- Trend interpretation

**Output:** Markdown/PDF reports for stakeholders.

---

## Directory Structure

```
ai_tools_suite/
├── PRODUCT_MANUAL.md
├── docker-compose.yml
├── .env.example
│
├── frontend/                      # SvelteKit Application
│   ├── src/
│   │   ├── routes/
│   │   │   ├── +layout.svelte     # Shared layout with sidebar
│   │   │   ├── +page.svelte       # Dashboard home
│   │   │   ├── drift-monitor/
│   │   │   │   └── +page.svelte
│   │   │   ├── cost-tracker/
│   │   │   │   └── +page.svelte
│   │   │   ├── security-tester/
│   │   │   ├── data-history/
│   │   │   ├── model-comparator/
│   │   │   ├── privacy-scanner/
│   │   │   ├── label-quality/
│   │   │   ├── inference-estimator/
│   │   │   ├── data-audit/
│   │   │   ├── content-performance/
│   │   │   ├── bias-checks/
│   │   │   ├── profitability/
│   │   │   ├── emergency-control/
│   │   │   └── reports/
│   │   ├── lib/
│   │   │   ├── components/        # Shared UI components
│   │   │   │   ├── Sidebar.svelte
│   │   │   │   ├── Chart.svelte
│   │   │   │   ├── DataTable.svelte
│   │   │   │   └── FileUpload.svelte
│   │   │   ├── stores/            # Svelte stores
│   │   │   └── api/               # API client
│   │   └── app.html
│   ├── static/
│   ├── package.json
│   ├── svelte.config.js
│   ├── tailwind.config.js
│   └── tsconfig.json
│
├── backend/                       # FastAPI Application
│   ├── main.py                    # Application entry point
│   ├── requirements.txt
│   ├── routers/
│   │   ├── drift.py
│   │   ├── costs.py
│   │   ├── security.py
│   │   ├── history.py
│   │   ├── compare.py
│   │   ├── privacy.py
│   │   ├── labels.py
│   │   ├── estimate.py
│   │   ├── audit.py
│   │   ├── content.py
│   │   ├── bias.py
│   │   ├── profitability.py
│   │   ├── emergency.py
│   │   └── reports.py
│   ├── services/                  # Business logic
│   │   ├── drift_detector.py
│   │   ├── cost_aggregator.py
│   │   ├── pii_scanner.py
│   │   ├── bias_analyzer.py
│   │   └── ...
│   ├── models/                    # Pydantic schemas
│   │   ├── drift.py
│   │   ├── costs.py
│   │   └── ...
│   ├── database/
│   │   ├── connection.py
│   │   └── models.py              # SQLAlchemy models
│   └── tests/
│
├── shared/                        # Shared utilities (deprecated)
├── tests/                         # Integration tests
└── examples/                      # Example data and usage
    ├── sample_baseline.csv
    ├── sample_production.csv
    └── sample_pii_data.csv
```

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 0.1.0 | TBD | Phase 1 - Foundation (3 tools) |
| 0.2.0 | TBD | Phase 2 - Monitoring & Costs |
| 0.3.0 | TBD | Phase 3 - Security & Compliance |
| 0.4.0 | TBD | Phase 4 - Quality & Comparison |
| 1.0.0 | TBD | Phase 5 - Full Release (14 tools) |

---

## Support

For issues or feature requests, refer to the project documentation or contact the development team.

---

*Last Updated: December 2024*