ai-tools-suite/PRODUCT_MANUAL.md
2025-12-27 15:33:06 +00:00

42 KiB
Raw Permalink Blame History

AI Tools Suite - Product Manual

A comprehensive collection of AI/ML operational tools for monitoring, security, compliance, and cost management.


Table of Contents

  1. Overview
  2. Architecture
  3. Tool Catalog
  4. Product Roadmap
  5. Installation
  6. Quick Start
  7. User Guide
  8. Detailed Tool Documentation
  9. Directory Structure
  10. Version History

Overview

This suite provides 14 essential tools for managing AI/ML systems in production environments. Each tool addresses a specific operational need, from cost tracking to security testing.

Target Users

  • ML Engineers
  • Data Scientists
  • DevOps/MLOps Teams
  • Product Managers
  • Compliance Officers

Architecture

System Overview

The AI Tools Suite uses a modern web architecture with a unified SvelteKit frontend and FastAPI backend.

┌─────────────────────────────────────────────────────────────────────────┐
│                         SVELTEKIT FRONTEND                              │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                        UNIFIED DASHBOARD                           │ │
│  │  ┌──────────────────┐  ┌─────────────────────────────────────────┐ │ │
│  │  │  Sidebar Nav     │  │  Main Content Area                      │ │ │
│  │  │  ────────────    │  │  ────────────────                       │ │ │
│  │  │  Dashboard       │  │                                         │ │ │
│  │  │  Drift Monitor   │  │    [Selected Tool View]                 │ │ │
│  │  │  Cost Tracker    │  │                                         │ │ │
│  │  │  Security Test   │  │    - Interactive Charts                 │ │ │
│  │  │  Data History    │  │    - Data Tables                        │ │ │
│  │  │  Model Compare   │  │    - Configuration Forms                │ │ │
│  │  │  Privacy Scan    │  │    - Real-time Updates                  │ │ │
│  │  │  Label Quality   │  │    - Export Options                     │ │ │
│  │  │  Cost Estimate   │  │                                         │ │ │
│  │  │  Data Audit      │  │                                         │ │ │
│  │  │  Content Perf    │  │                                         │ │ │
│  │  │  Bias Checks     │  │                                         │ │ │
│  │  │  Profitability   │  │                                         │ │ │
│  │  │  Emergency Ctrl  │  │                                         │ │ │
│  │  │  Reports         │  │                                         │ │ │
│  │  └──────────────────┘  └─────────────────────────────────────────┘ │ │
│  └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    │ REST API / WebSocket
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          FASTAPI BACKEND                                │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                         API ROUTERS                                │ │
│  │  /api/drift  /api/costs  /api/security  /api/privacy  /api/...    │ │
│  └────────────────────────────────────────────────────────────────────┘ │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                      SERVICE LAYER                                 │ │
│  │  DriftDetector │ CostAggregator │ PIIScanner │ BiasAnalyzer │ ... │ │
│  └────────────────────────────────────────────────────────────────────┘ │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                    SHARED SERVICES                                 │ │
│  │  Authentication │ Database ORM │ File Storage │ Background Jobs   │ │
│  └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          DATA LAYER                                     │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────────┐  │
│  │  PostgreSQL/     │  │  Redis           │  │  File Storage        │  │
│  │  SQLite          │  │  (Cache/Queue)   │  │  (Uploads/Reports)   │  │
│  │  - Users         │  │  - Session cache │  │  - CSV/JSON uploads  │  │
│  │  - Audit logs    │  │  - Job queue     │  │  - Generated reports │  │
│  │  - Metrics       │  │  - Real-time     │  │  - Model artifacts   │  │
│  └──────────────────┘  └──────────────────┘  └──────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘

Tech Stack

Layer Technology Purpose
Frontend SvelteKit + TypeScript Unified single-page application
UI Components Tailwind CSS + shadcn-svelte Modern, accessible components
Charts Apache ECharts Interactive data visualizations
Backend FastAPI (Python 3.11+) REST API + WebSocket support
ORM SQLAlchemy 2.0 Database abstraction
Database SQLite (dev) / PostgreSQL (prod) Persistent storage
Cache Redis (optional) Session cache, job queue
Deployment Docker Compose Container orchestration

API Design

All tools are accessible via a RESTful API:

Base URL: http://localhost:8000/api/v1

Endpoints:
├── /drift/           # Model Drift Monitor
├── /costs/           # Vendor Cost Tracker
├── /security/        # Security Tester
├── /history/         # Data History Log
├── /compare/         # Model Comparator
├── /privacy/         # Privacy Scanner
├── /labels/          # Label Quality Scorer
├── /estimate/        # Inference Estimator
├── /audit/           # Data Integrity Audit
├── /content/         # Content Performance
├── /bias/            # Safety/Bias Checks
├── /profitability/   # Profitability Analysis
├── /emergency/       # Emergency Control
└── /reports/         # Result Interpretation

Tool Catalog

# Tool Name Deliverable Description Status
1 Model Drift Monitor Dashboard Tracks prediction confidence over time to detect when AI accuracy begins to decline. Pending
2 Vendor Cost Tracker API spend aggregator Provides a single view of all API expenses across providers like OpenAI, Anthropic, and AWS. Pending
3 Security Tester Input fuzzer Tests AI endpoints for exploits and prompt injections to prevent unauthorized access. Pending
4 Data History Log Audit trail logger Maintains a record of which data versions were used to train specific models for legal compliance. Pending
5 Model Comparator Response evaluator Compares outputs from different models side-by-side to determine the best fit for specific tasks. Pending
6 Privacy Scanner PII detector Automatically finds and removes personal information (names, emails) from training datasets. Pending
7 Label Quality Scorer Agreement calculator Measures the consistency of data labeling teams to ensure high-quality training inputs. Pending
8 Inference Estimator Token/Price calculator Predicts monthly operational costs based on expected usage before a project is deployed. Pending
9 Data Integrity Audit Data cleaning app Identifies and fixes errors in databases to prevent data loss and improve model performance. Pending
10 Content Performance Retention model Visualizes audience drop-off points to identify which content segments drive engagement. Pending
11 Safety/Bias Checks Bias scanner checklist Audits recommendation engines to ensure they follow privacy laws and treat users fairly. Pending
12 Profitability Analysis Cost-vs-revenue view Correlates AI costs with business revenue to identify specific areas for monthly savings. Pending
13 Emergency Control Manual override template Provides a reliable mechanism to immediately suspend automated processes if they fail. Pending
14 Result Interpretation Automated report generator Converts technical metrics into a standardized list of actions for business decision-makers. Pending

Product Roadmap

Phase 1: Foundation (MVP)

Goal: Core infrastructure and 3 essential tools

Milestone Deliverables Dependencies
1.1 Project Setup SvelteKit frontend scaffold, FastAPI backend scaffold, Docker Compose config, CI/CD pipeline None
1.2 Shared Infrastructure Authentication system, Database models, API client library, Shared UI components (sidebar, charts, tables) 1.1
1.3 Inference Estimator Token counting, Multi-provider pricing, Cost projection UI, Export to CSV 1.2
1.4 Data Integrity Audit File upload, Missing value detection, Duplicate finder, Interactive cleaning UI 1.2
1.5 Privacy Scanner PII detection engine, Redaction modes, Scan results UI, Batch processing 1.2

Phase 2: Monitoring & Costs

Goal: Production monitoring and cost management tools

Milestone Deliverables Dependencies
2.1 Model Drift Monitor Baseline upload, KS/PSI tests, Drift visualization, Alert configuration Phase 1
2.2 Vendor Cost Tracker API key integration (OpenAI, Anthropic, AWS), Cost aggregation, Budget alerts, Usage forecasting Phase 1
2.3 Profitability Analysis Revenue data import, Cost-revenue correlation, ROI calculator, Savings recommendations 2.2

Phase 3: Security & Compliance

Goal: Security testing and compliance tools

Milestone Deliverables Dependencies
3.1 Security Tester Prompt injection test suite, Jailbreak detection, Vulnerability report generation Phase 1
3.2 Data History Log Data versioning (SHA-256), Model-dataset linking, Audit trail UI, GDPR/CCPA reports Phase 1
3.3 Safety/Bias Checks Fairness metrics (demographic parity, equalized odds), Bias detection, Compliance checklist Phase 1

Phase 4: Quality & Comparison

Goal: Data quality and model evaluation tools

Milestone Deliverables Dependencies
4.1 Label Quality Scorer Multi-rater agreement (Kappa, Alpha), Inconsistency flagging, Quality reports Phase 1
4.2 Model Comparator Side-by-side comparison UI, Quality scoring, Latency benchmarks, Cost-per-query analysis Phase 1

Phase 5: Analytics & Control

Goal: Advanced analytics and operational control

Milestone Deliverables Dependencies
5.1 Content Performance Engagement tracking, Drop-off visualization, Retention curves, A/B analysis Phase 1
5.2 Emergency Control Kill switch API, Graceful degradation, Rollback triggers, Incident logging Phase 1
5.3 Result Interpretation Metric-to-insight engine, Executive summary generator, PDF/Markdown export Phase 1

Roadmap Visualization

Phase 1: Foundation          Phase 2: Monitoring       Phase 3: Security
─────────────────────        ─────────────────────     ─────────────────────
┌─────────────────────┐      ┌─────────────────────┐   ┌─────────────────────┐
│ Project Setup       │      │ Model Drift Monitor │   │ Security Tester     │
│ Shared Infra        │ ───► │ Vendor Cost Tracker │   │ Data History Log    │
│ Inference Estimator │      │ Profitability       │   │ Safety/Bias Checks  │
│ Data Integrity      │      └─────────────────────┘   └─────────────────────┘
│ Privacy Scanner     │                │                         │
└─────────────────────┘                │                         │
                                       ▼                         ▼
                              Phase 4: Quality          Phase 5: Analytics
                              ─────────────────────     ─────────────────────
                              ┌─────────────────────┐   ┌─────────────────────┐
                              │ Label Quality       │   │ Content Performance │
                              │ Model Comparator    │   │ Emergency Control   │
                              └─────────────────────┘   │ Result Interpret    │
                                                        └─────────────────────┘

Success Metrics

Phase Key Metrics
Phase 1 Frontend/backend running, 3 tools functional, <2s page load
Phase 2 Real-time drift alerts, Cost tracking across 3+ providers
Phase 3 90%+ PII detection rate, Compliance reports generated
Phase 4 Inter-rater agreement calculated, Model comparison functional
Phase 5 Emergency shutoff <1s response, Automated reports generated

Installation

Prerequisites

# Required
Node.js 18+
Python 3.11+
Docker & Docker Compose (recommended)

# Optional
PostgreSQL 15+ (for production)
Redis 7+ (for caching/queues)

Quick Setup with Docker

# Clone and navigate
cd ai_tools_suite

# Start all services
docker-compose up -d

# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs

Manual Setup

Backend (FastAPI)

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run development server
uvicorn main:app --reload --port 8000

Frontend (SvelteKit)

cd frontend

# Install dependencies
npm install

# Run development server
npm run dev

# Build for production
npm run build

Environment Variables

Create .env files in both frontend/ and backend/ directories:

backend/.env

DATABASE_URL=sqlite:///./ai_tools.db
SECRET_KEY=your-secret-key-here
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

frontend/.env

PUBLIC_API_URL=http://localhost:8000

Quick Start

Accessing the Dashboard

  1. Start the application (Docker or manual setup)
  2. Open http://localhost:3000 in your browser
  3. Use the sidebar to navigate between tools

API Usage

All tools are accessible via REST API:

# Check API health
curl http://localhost:8000/api/v1/health

# Estimate inference costs
curl -X POST http://localhost:8000/api/v1/estimate/calculate \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "tokens": 1000000, "requests_per_day": 1000}'

# Scan for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan \
  -F "file=@data.csv"

# Check model drift
curl -X POST http://localhost:8000/api/v1/drift/analyze \
  -F "baseline=@baseline.csv" \
  -F "production=@production.csv"

User Guide

This section provides step-by-step instructions for using the tools available in Phase 1.

Inference Estimator

Purpose: Calculate AI API costs before deploying your application.

How to Use

  1. Navigate to the Tool

    • Click "Inference Estimator" in the sidebar (or go to /inference-estimator)
  2. Configure Your Model

    • Select your AI provider (OpenAI, Anthropic, Google, or Custom)
    • Choose the specific model (e.g., GPT-4, Claude 3, Gemini Pro)
    • For custom models, enter your own input/output prices per 1M tokens
  3. Enter Usage Parameters

    • Input Tokens per Request: Average tokens you send to the model
    • Output Tokens per Request: Average tokens the model returns
    • Requests per Day: Expected daily API calls
    • Peak Multiplier: Account for traffic spikes (1x = normal, 2x = double traffic)
  4. View Cost Breakdown

    • Daily Cost: Input cost + Output cost per day
    • Monthly Cost: 30-day projection
    • Annual Cost: 365-day projection
  5. Override Pricing (Optional)

    • Click the edit icon next to any model to set custom pricing
    • Useful for negotiated enterprise rates or new models

Example Calculation

Model: GPT-4 Turbo
Input: 500 tokens/request × $10.00/1M = $0.005/request
Output: 200 tokens/request × $30.00/1M = $0.006/request
Requests: 10,000/day

Daily Cost: (0.005 + 0.006) × 10,000 = $110.00
Monthly Cost: $110 × 30 = $3,300.00

Expected Outcome

After entering your parameters, you will see:

┌─────────────────────────────────────────────────────────────┐
│  COST BREAKDOWN                                             │
├─────────────────────────────────────────────────────────────┤
│  Provider: OpenAI                                           │
│  Model: GPT-4 Turbo                                         │
│                                                             │
│  ┌─────────────┬────────────┬────────────┬───────────────┐  │
│  │   Period    │ Input Cost │ Output Cost│    Total      │  │
│  ├─────────────┼────────────┼────────────┼───────────────┤  │
│  │   Daily     │   $50.00   │   $60.00   │    $110.00    │  │
│  │   Monthly   │ $1,500.00  │ $1,800.00  │  $3,300.00    │  │
│  │   Yearly    │$18,250.00  │$21,900.00  │ $40,150.00    │  │
│  └─────────────┴────────────┴────────────┴───────────────┘  │
│                                                             │
│  Tokens per Day: 7,000,000 (5M input + 2M output)           │
│  Cost per Request: $0.011                                   │
│  Cost per 1K Requests: $11.00                               │
└─────────────────────────────────────────────────────────────┘

Data Integrity Audit

Purpose: Analyze datasets for quality issues, missing values, duplicates, and outliers.

How to Use

  1. Navigate to the Tool

    • Click "Data Integrity Audit" in the sidebar (or go to /data-audit)
  2. Upload Your Dataset

    • Drag and drop a file onto the upload area, OR
    • Click to browse and select a file
    • Supported formats: CSV, Excel (.xlsx, .xls), JSON
  3. Click "Analyze Data"

    • The tool will process your file and display results
  4. Review the Results

    Quick Stats Panel:

    • Rows: Total number of records
    • Columns: Number of fields
    • Duplicates: Count of duplicate rows
    • Issues: Number of problems detected

    Overview Tab:

    • Missing values summary with counts and percentages
    • Duplicate row detection results

    Columns Tab:

    • Detailed statistics for each column:
      • Data type (int64, float64, object, etc.)
      • Missing value count and percentage
      • Unique value count
      • Sample values

    Issues & Recommendations Tab:

    • List of detected problems with icons:
      • ! = Missing values
      • 2x = Duplicates
      • ~ = Outliers
      • # = High cardinality
      • = = Constant column
      • OK = No issues
    • Actionable recommendations for fixing each issue

Understanding the Results

Issue Type What It Means Recommended Action
Missing Values Empty cells in the data Fill with mean/median or remove rows
Duplicate Rows Identical records Remove duplicates to avoid bias
Outliers Extreme values Investigate if valid or remove
High Cardinality Too many unique values Check if column is an ID field
Constant Column Only one value Consider removing from analysis

Expected Outcome

After uploading a dataset (e.g., customers.csv), you will see:

┌─────────────────────────────────────────────────────────────────────┐
│  QUICK STATS                                                        │
│  ┌──────────┬──────────┬──────────────┬─────────────┐               │
│  │  Rows    │ Columns  │  Duplicates  │   Issues    │               │
│  │  10,542  │    12    │      47      │      5      │               │
│  └──────────┴──────────┴──────────────┴─────────────┘               │
├─────────────────────────────────────────────────────────────────────┤
│  OVERVIEW TAB                                                       │
│  ────────────                                                       │
│  Missing Values:                                                    │
│  ┌─────────────────┬─────────┬─────────┐                            │
│  │ Column          │ Count   │ Percent │                            │
│  ├─────────────────┼─────────┼─────────┤                            │
│  │ email           │   23    │  0.22%  │                            │
│  │ phone           │  156    │  1.48%  │                            │
│  │ address         │   89    │  0.84%  │                            │
│  └─────────────────┴─────────┴─────────┘                            │
│                                                                     │
│  ⚠ 47 duplicate rows found (0.45%)                                  │
├─────────────────────────────────────────────────────────────────────┤
│  ISSUES & RECOMMENDATIONS TAB                                       │
│  ────────────────────────────                                       │
│  Issues Found:                                                      │
│  [!] Dataset has 268 missing values across 3 columns                │
│  [2x] Found 47 duplicate rows (0.45%)                               │
│  [~] Column 'age' has 12 potential outliers                         │
│  [#] Column 'user_id' has very high cardinality (10,495 unique)     │
│  [=] Column 'status' has only one unique value                      │
│                                                                     │
│  Recommendations:                                                   │
│  💡 Fill missing values with mean/median for numeric columns        │
│  💡 Consider removing duplicate rows to improve data quality        │
│  💡 Review if 'user_id' should be used as an identifier             │
│  💡 Consider removing constant column 'status'                      │
└─────────────────────────────────────────────────────────────────────┘

API Endpoints

# Analyze a dataset
curl -X POST http://localhost:8000/api/v1/audit/analyze \
  -F "file=@your_data.csv"

# Clean a dataset (remove duplicates and missing rows)
curl -X POST http://localhost:8000/api/v1/audit/clean \
  -F "file=@your_data.csv"

# Validate schema
curl -X POST http://localhost:8000/api/v1/audit/validate-schema \
  -F "file=@your_data.csv"

# Detect outliers
curl -X POST http://localhost:8000/api/v1/audit/detect-outliers \
  -F "file=@your_data.csv"

Sample API Response

{
  "total_rows": 10542,
  "total_columns": 12,
  "missing_values": {
    "email": {"count": 23, "percent": 0.22},
    "phone": {"count": 156, "percent": 1.48},
    "address": {"count": 89, "percent": 0.84}
  },
  "duplicate_rows": 47,
  "duplicate_percent": 0.45,
  "column_stats": [
    {
      "name": "customer_id",
      "dtype": "int64",
      "missing_count": 0,
      "missing_percent": 0.0,
      "unique_count": 10542,
      "sample_values": [1001, 1002, 1003, 1004, 1005]
    }
  ],
  "issues": [
    "Dataset has 268 missing values across 3 columns",
    "Found 47 duplicate rows (0.45%)",
    "Column 'age' has 12 potential outliers"
  ],
  "recommendations": [
    "Consider filling missing values with mean/median",
    "Consider removing duplicate rows to improve data quality"
  ]
}

Privacy Scanner

Purpose: Detect and redact personally identifiable information (PII) from text and files.

How to Use

  1. Navigate to the Tool

    • Click "Privacy Scanner" in the sidebar (or go to /privacy-scanner)
  2. Choose Input Mode

    • Text Mode: Paste or type text directly
    • File Mode: Upload CSV, TXT, or JSON files
  3. Configure Detection Options

    • Toggle which PII types to detect:
      • Emails
      • Phone numbers
      • SSN (Social Security Numbers)
      • Credit Cards
      • IP Addresses
      • Dates of Birth
  4. Enter or Upload Content

    • For text: Paste content into the text area
    • For files: Drag and drop or click to upload
    • Tip: Click "Load Sample" to see example PII data
  5. Click "Scan for PII"

    • The tool will analyze your content and display results
  6. Review the Results

    Risk Summary:

    • PII Found: Total number of PII entities detected
    • Types: Number of different PII categories
    • Risk Score: Calculated severity (0-100)
    • Risk Level: CRITICAL, HIGH, MEDIUM, or LOW

    Overview Tab:

    • PII counts by type with color-coded severity
    • Risk assessment with explanation

    Entities Tab:

    • Detailed list of each detected PII item:
      • Type (EMAIL, PHONE, SSN, etc.)
      • Original value
      • Masked value
      • Confidence score (percentage)

    Redacted Preview Tab:

    • Shows your text with all PII masked
    • Safe to share after verification

PII Detection Patterns

Type Example Masked As
EMAIL john.doe@example.com jo***@example.com
PHONE (555) 123-4567 --4567
SSN 123-45-6789 *--6789
CREDIT_CARD 4532015112830366 --****-0366
IP_ADDRESS 192.168.1.100 192...*
DATE_OF_BIRTH 03/15/1985 //1985

Risk Levels Explained

Level Score Description
CRITICAL 70-100 Highly sensitive PII (SSN, Credit Cards). Immediate action required.
HIGH 50-69 Multiple sensitive PII elements. Consider redaction before sharing.
MEDIUM 30-49 Some PII detected that may require attention.
LOW 0-29 Minimal or no PII detected.

API Endpoints

# Scan text for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-text \
  -F "text=Contact john@example.com or call 555-123-4567" \
  -F "detect_emails=true" \
  -F "detect_phones=true"

# Scan a file for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-file \
  -F "file=@customer_data.csv"

# Scan CSV/Excel with column-by-column analysis
curl -X POST http://localhost:8000/api/v1/privacy/scan-dataframe \
  -F "file=@customer_data.csv"

# Redact PII from text
curl -X POST http://localhost:8000/api/v1/privacy/redact \
  -F "text=Call 555-123-4567 for support" \
  -F "mode=mask"

# List supported PII types
curl http://localhost:8000/api/v1/privacy/entity-types

Redaction Modes

Mode Description Example Output
mask Shows partial value jo***@example.com
remove Replaces with [REDACTED] [REDACTED]
type Shows PII type [EMAIL]

Expected Outcome

After scanning text or a file, you will see results like:

RISK SUMMARY
┌────────────┬──────────┬────────────┬─────────────────┐
│ PII Found  │  Types   │ Risk Score │   Risk Level    │
│     7      │    5     │     72     │    CRITICAL     │
└────────────┴──────────┴────────────┴─────────────────┘

ENTITIES TAB
┌────────────────┬───────────────────────┬────────────────┬───────┐
│ Type           │ Original              │ Masked         │ Conf  │
├────────────────┼───────────────────────┼────────────────┼───────┤
│ EMAIL          │ john.smith@example.com│ jo***@example..│  95%  │
│ PHONE          │ (555) 123-4567        │ ***-***-4567   │  85%  │
│ SSN            │ 123-45-6789           │ ***-**-6789    │  95%  │
│ CREDIT_CARD    │ 4532015112830366      │ ****-****-0366 │  95%  │
└────────────────┴───────────────────────┴────────────────┴───────┘

REDACTED PREVIEW
Customer Record:
Email: jo***@example.com
Phone: ***-***-4567
SSN: ***-**-6789
Credit Card: ****-****-****-0366

Sample API Response

{
  "total_entities": 7,
  "entities_by_type": {
    "EMAIL": 2, "PHONE": 2, "SSN": 1, "CREDIT_CARD": 1, "IP_ADDRESS": 1
  },
  "risk_level": "CRITICAL",
  "risk_score": 72,
  "entities": [
    {
      "type": "SSN",
      "value": "123-45-6789",
      "confidence": 0.95,
      "masked_value": "***-**-6789"
    }
  ],
  "redacted_preview": "Email: jo***@example.com\nSSN: ***-**-6789..."
}

Detailed Tool Documentation

1. Model Drift Monitor

Purpose: Detect when model performance degrades over time.

Features:

  • Real-time confidence score tracking
  • Statistical drift detection (KS test, PSI)
  • Alert thresholds configuration
  • Historical trend visualization

API Endpoints:

POST /api/v1/drift/baseline     # Upload baseline distribution
POST /api/v1/drift/analyze      # Analyze production data for drift
GET  /api/v1/drift/history      # Get drift score history
PUT  /api/v1/drift/thresholds   # Configure alert thresholds

2. Vendor Cost Tracker

Purpose: Aggregate and visualize API spending across providers.

Supported Providers:

  • OpenAI
  • Anthropic
  • AWS Bedrock
  • Google Vertex AI
  • Azure OpenAI

Features:

  • Daily/weekly/monthly cost breakdowns
  • Per-project cost allocation
  • Budget alerts
  • Usage forecasting

3. Security Tester

Purpose: Identify vulnerabilities in AI endpoints.

Test Categories:

  • Prompt injection attacks
  • Jailbreak attempts
  • Data exfiltration probes
  • Rate limit testing
  • Input validation bypass

Output: Security report with severity ratings and remediation steps.


4. Data History Log

Purpose: Maintain audit trail for ML training data.

Features:

  • Data version hashing (SHA-256)
  • Model-to-dataset mapping
  • Timestamp logging
  • Compliance report generation (GDPR, CCPA)

5. Model Comparator

Purpose: Evaluate and compare model outputs.

Features:

  • Side-by-side response comparison
  • Quality scoring (coherence, accuracy, relevance)
  • Latency benchmarking
  • Cost-per-query analysis

6. Privacy Scanner

Purpose: Detect and remove PII from datasets.

Detected Entities:

  • Names
  • Email addresses
  • Phone numbers
  • SSN/National IDs
  • Credit card numbers
  • Addresses
  • IP addresses

Modes:

  • Detection only
  • Automatic redaction
  • Pseudonymization

7. Label Quality Scorer

Purpose: Measure inter-annotator agreement.

Metrics:

  • Cohen's Kappa
  • Fleiss' Kappa (multi-rater)
  • Krippendorff's Alpha
  • Percent agreement

Output: Quality report with flagged inconsistent samples.


8. Inference Estimator

Purpose: Predict operational costs before deployment.

Inputs:

  • Expected request volume
  • Average tokens per request
  • Model selection
  • Peak usage patterns

Output: Monthly cost projection with confidence intervals.


9. Data Integrity Audit

Purpose: Clean and validate datasets.

Checks:

  • Missing values
  • Duplicate records
  • Data type mismatches
  • Outlier detection
  • Schema validation
  • Referential integrity

Interface: Interactive data cleaning with preview and undo.


10. Content Performance

Purpose: Analyze user engagement patterns.

Features:

  • Drop-off point visualization
  • Engagement heatmaps
  • A/B test analysis
  • Retention curve modeling

11. Safety/Bias Checks

Purpose: Audit AI systems for fairness.

Metrics:

  • Demographic parity
  • Equalized odds
  • Calibration across groups
  • Disparate impact ratio

Output: Compliance checklist with recommendations.


12. Profitability Analysis

Purpose: Connect AI costs to business outcomes.

Features:

  • Cost attribution by feature/product
  • Revenue correlation analysis
  • ROI calculation
  • Optimization recommendations

13. Emergency Control

Purpose: Safely halt AI systems when needed.

Features:

  • One-click system suspension
  • Graceful degradation modes
  • Rollback capabilities
  • Incident logging

Implementation: API endpoints + admin dashboard.


14. Result Interpretation

Purpose: Translate metrics into business actions.

Features:

  • Automated insight generation
  • Executive summary creation
  • Action item extraction
  • Trend interpretation

Output: Markdown/PDF reports for stakeholders.


Directory Structure

ai_tools_suite/
├── PRODUCT_MANUAL.md
├── docker-compose.yml
├── .env.example
│
├── frontend/                      # SvelteKit Application
│   ├── src/
│   │   ├── routes/
│   │   │   ├── +layout.svelte     # Shared layout with sidebar
│   │   │   ├── +page.svelte       # Dashboard home
│   │   │   ├── drift-monitor/
│   │   │   │   └── +page.svelte
│   │   │   ├── cost-tracker/
│   │   │   │   └── +page.svelte
│   │   │   ├── security-tester/
│   │   │   ├── data-history/
│   │   │   ├── model-comparator/
│   │   │   ├── privacy-scanner/
│   │   │   ├── label-quality/
│   │   │   ├── inference-estimator/
│   │   │   ├── data-audit/
│   │   │   ├── content-performance/
│   │   │   ├── bias-checks/
│   │   │   ├── profitability/
│   │   │   ├── emergency-control/
│   │   │   └── reports/
│   │   ├── lib/
│   │   │   ├── components/        # Shared UI components
│   │   │   │   ├── Sidebar.svelte
│   │   │   │   ├── Chart.svelte
│   │   │   │   ├── DataTable.svelte
│   │   │   │   └── FileUpload.svelte
│   │   │   ├── stores/            # Svelte stores
│   │   │   └── api/               # API client
│   │   └── app.html
│   ├── static/
│   ├── package.json
│   ├── svelte.config.js
│   ├── tailwind.config.js
│   └── tsconfig.json
│
├── backend/                       # FastAPI Application
│   ├── main.py                    # Application entry point
│   ├── requirements.txt
│   ├── routers/
│   │   ├── drift.py
│   │   ├── costs.py
│   │   ├── security.py
│   │   ├── history.py
│   │   ├── compare.py
│   │   ├── privacy.py
│   │   ├── labels.py
│   │   ├── estimate.py
│   │   ├── audit.py
│   │   ├── content.py
│   │   ├── bias.py
│   │   ├── profitability.py
│   │   ├── emergency.py
│   │   └── reports.py
│   ├── services/                  # Business logic
│   │   ├── drift_detector.py
│   │   ├── cost_aggregator.py
│   │   ├── pii_scanner.py
│   │   ├── bias_analyzer.py
│   │   └── ...
│   ├── models/                    # Pydantic schemas
│   │   ├── drift.py
│   │   ├── costs.py
│   │   └── ...
│   ├── database/
│   │   ├── connection.py
│   │   └── models.py              # SQLAlchemy models
│   └── tests/
│
├── shared/                        # Shared utilities (deprecated)
├── tests/                         # Integration tests
└── examples/                      # Example data and usage
    ├── sample_baseline.csv
    ├── sample_production.csv
    └── sample_pii_data.csv

Version History

Version Date Changes
0.1.0 TBD Phase 1 - Foundation (3 tools)
0.2.0 TBD Phase 2 - Monitoring & Costs
0.3.0 TBD Phase 3 - Security & Compliance
0.4.0 TBD Phase 4 - Quality & Comparison
1.0.0 TBD Phase 5 - Full Release (14 tools)

Support

For issues or feature requests, refer to the project documentation or contact the development team.


Last Updated: December 2024