
# AI Tools Suite - Product Manual
> A comprehensive collection of AI/ML operational tools for monitoring, security, compliance, and cost management.
---
## Table of Contents
1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Tool Catalog](#tool-catalog)
4. [Product Roadmap](#product-roadmap)
5. [Installation](#installation)
6. [Quick Start](#quick-start)
7. [User Guide](#user-guide)
8. [Detailed Tool Documentation](#detailed-tool-documentation)
9. [Directory Structure](#directory-structure)
10. [Version History](#version-history)
---
## Overview
This suite provides 14 essential tools for managing AI/ML systems in production environments. Each tool addresses a specific operational need, from cost tracking to security testing.
### Target Users
- ML Engineers
- Data Scientists
- DevOps/MLOps Teams
- Product Managers
- Compliance Officers
---
## Architecture
### System Overview
The AI Tools Suite uses a modern web architecture with a unified SvelteKit frontend and FastAPI backend.
```
┌─────────────────────────────────────────────────────────────────────────┐
│ SVELTEKIT FRONTEND │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ UNIFIED DASHBOARD │ │
│ │ ┌──────────────────┐ ┌─────────────────────────────────────────┐ │ │
│ │ │ Sidebar Nav │ │ Main Content Area │ │ │
│ │ │ ──────────── │ │ ──────────────── │ │ │
│ │ │ Dashboard │ │ │ │ │
│ │ │ Drift Monitor │ │ [Selected Tool View] │ │ │
│ │ │ Cost Tracker │ │ │ │ │
│ │ │ Security Test │ │ - Interactive Charts │ │ │
│ │ │ Data History │ │ - Data Tables │ │ │
│ │ │ Model Compare │ │ - Configuration Forms │ │ │
│ │ │ Privacy Scan │ │ - Real-time Updates │ │ │
│ │ │ Label Quality │ │ - Export Options │ │ │
│ │ │ Cost Estimate │ │ │ │ │
│ │ │ Data Audit │ │ │ │ │
│ │ │ Content Perf │ │ │ │ │
│ │ │ Bias Checks │ │ │ │ │
│ │ │ Profitability │ │ │ │ │
│ │ │ Emergency Ctrl │ │ │ │ │
│ │ │ Reports │ │ │ │ │
│ │ └──────────────────┘ └─────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
│ REST API / WebSocket
┌─────────────────────────────────────────────────────────────────────────┐
│ FASTAPI BACKEND │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ API ROUTERS │ │
│ │ /api/drift /api/costs /api/security /api/privacy /api/... │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ SERVICE LAYER │ │
│ │ DriftDetector │ CostAggregator │ PIIScanner │ BiasAnalyzer │ ... │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ SHARED SERVICES │ │
│ │ Authentication │ Database ORM │ File Storage │ Background Jobs │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────┐ │
│ │ PostgreSQL/ │ │ Redis │ │ File Storage │ │
│ │ SQLite │ │ (Cache/Queue) │ │ (Uploads/Reports) │ │
│ │ - Users │ │ - Session cache │ │ - CSV/JSON uploads │ │
│ │ - Audit logs │ │ - Job queue │ │ - Generated reports │ │
│ │ - Metrics │ │ - Real-time │ │ - Model artifacts │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
```
### Tech Stack
| Layer | Technology | Purpose |
|-------|------------|---------|
| **Frontend** | SvelteKit + TypeScript | Unified single-page application |
| **UI Components** | Tailwind CSS + shadcn-svelte | Modern, accessible components |
| **Charts** | Apache ECharts | Interactive data visualizations |
| **Backend** | FastAPI (Python 3.11+) | REST API + WebSocket support |
| **ORM** | SQLAlchemy 2.0 | Database abstraction |
| **Database** | SQLite (dev) / PostgreSQL (prod) | Persistent storage |
| **Cache** | Redis (optional) | Session cache, job queue |
| **Deployment** | Docker Compose | Container orchestration |
### API Design
All tools are accessible via a RESTful API:
```
Base URL: http://localhost:8000/api/v1
Endpoints:
├── /drift/ # Model Drift Monitor
├── /costs/ # Vendor Cost Tracker
├── /security/ # Security Tester
├── /history/ # Data History Log
├── /compare/ # Model Comparator
├── /privacy/ # Privacy Scanner
├── /labels/ # Label Quality Scorer
├── /estimate/ # Inference Estimator
├── /audit/ # Data Integrity Audit
├── /content/ # Content Performance
├── /bias/ # Safety/Bias Checks
├── /profitability/ # Profitability Analysis
├── /emergency/ # Emergency Control
└── /reports/ # Result Interpretation
```
---
## Tool Catalog
| # | Tool Name | Deliverable | Description | Status |
|---|-----------|-------------|-------------|--------|
| 1 | **Model Drift Monitor** | Dashboard | Tracks prediction confidence over time to detect when AI accuracy begins to decline. | Pending |
| 2 | **Vendor Cost Tracker** | API spend aggregator | Provides a single view of all API expenses across providers like OpenAI, Anthropic, and AWS. | Pending |
| 3 | **Security Tester** | Input fuzzer | Tests AI endpoints for exploits and prompt injections to prevent unauthorized access. | Pending |
| 4 | **Data History Log** | Audit trail logger | Maintains a record of which data versions were used to train specific models for legal compliance. | Pending |
| 5 | **Model Comparator** | Response evaluator | Compares outputs from different models side-by-side to determine the best fit for specific tasks. | Pending |
| 6 | **Privacy Scanner** | PII detector | Automatically finds and removes personal information (names, emails) from training datasets. | Pending |
| 7 | **Label Quality Scorer** | Agreement calculator | Measures the consistency of data labeling teams to ensure high-quality training inputs. | Pending |
| 8 | **Inference Estimator** | Token/Price calculator | Predicts monthly operational costs based on expected usage before a project is deployed. | Pending |
| 9 | **Data Integrity Audit** | Data cleaning app | Identifies and fixes errors in databases to prevent data loss and improve model performance. | Pending |
| 10 | **Content Performance** | Retention model | Visualizes audience drop-off points to identify which content segments drive engagement. | Pending |
| 11 | **Safety/Bias Checks** | Bias scanner checklist | Audits recommendation engines to ensure they follow privacy laws and treat users fairly. | Pending |
| 12 | **Profitability Analysis** | Cost-vs-revenue view | Correlates AI costs with business revenue to identify specific areas for monthly savings. | Pending |
| 13 | **Emergency Control** | Manual override template | Provides a reliable mechanism to immediately suspend automated processes if they fail. | Pending |
| 14 | **Result Interpretation** | Automated report generator | Converts technical metrics into a standardized list of actions for business decision-makers. | Pending |
---
## Product Roadmap
### Phase 1: Foundation (MVP)
**Goal:** Core infrastructure and 3 essential tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **1.1 Project Setup** | SvelteKit frontend scaffold, FastAPI backend scaffold, Docker Compose config, CI/CD pipeline | None |
| **1.2 Shared Infrastructure** | Authentication system, Database models, API client library, Shared UI components (sidebar, charts, tables) | 1.1 |
| **1.3 Inference Estimator** | Token counting, Multi-provider pricing, Cost projection UI, Export to CSV | 1.2 |
| **1.4 Data Integrity Audit** | File upload, Missing value detection, Duplicate finder, Interactive cleaning UI | 1.2 |
| **1.5 Privacy Scanner** | PII detection engine, Redaction modes, Scan results UI, Batch processing | 1.2 |
### Phase 2: Monitoring & Costs
**Goal:** Production monitoring and cost management tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **2.1 Model Drift Monitor** | Baseline upload, KS/PSI tests, Drift visualization, Alert configuration | Phase 1 |
| **2.2 Vendor Cost Tracker** | API key integration (OpenAI, Anthropic, AWS), Cost aggregation, Budget alerts, Usage forecasting | Phase 1 |
| **2.3 Profitability Analysis** | Revenue data import, Cost-revenue correlation, ROI calculator, Savings recommendations | 2.2 |
### Phase 3: Security & Compliance
**Goal:** Security testing and compliance tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **3.1 Security Tester** | Prompt injection test suite, Jailbreak detection, Vulnerability report generation | Phase 1 |
| **3.2 Data History Log** | Data versioning (SHA-256), Model-dataset linking, Audit trail UI, GDPR/CCPA reports | Phase 1 |
| **3.3 Safety/Bias Checks** | Fairness metrics (demographic parity, equalized odds), Bias detection, Compliance checklist | Phase 1 |
### Phase 4: Quality & Comparison
**Goal:** Data quality and model evaluation tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **4.1 Label Quality Scorer** | Multi-rater agreement (Kappa, Alpha), Inconsistency flagging, Quality reports | Phase 1 |
| **4.2 Model Comparator** | Side-by-side comparison UI, Quality scoring, Latency benchmarks, Cost-per-query analysis | Phase 1 |
### Phase 5: Analytics & Control
**Goal:** Advanced analytics and operational control
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **5.1 Content Performance** | Engagement tracking, Drop-off visualization, Retention curves, A/B analysis | Phase 1 |
| **5.2 Emergency Control** | Kill switch API, Graceful degradation, Rollback triggers, Incident logging | Phase 1 |
| **5.3 Result Interpretation** | Metric-to-insight engine, Executive summary generator, PDF/Markdown export | Phase 1 |
### Roadmap Visualization
```
Phase 1: Foundation Phase 2: Monitoring Phase 3: Security
───────────────────── ───────────────────── ─────────────────────
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ Project Setup │ │ Model Drift Monitor │ │ Security Tester │
│ Shared Infra │ ───► │ Vendor Cost Tracker │ │ Data History Log │
│ Inference Estimator │ │ Profitability │ │ Safety/Bias Checks │
│ Data Integrity │ └─────────────────────┘ └─────────────────────┘
│ Privacy Scanner │ │ │
└─────────────────────┘ │ │
▼ ▼
Phase 4: Quality Phase 5: Analytics
───────────────────── ─────────────────────
┌─────────────────────┐ ┌─────────────────────┐
│ Label Quality │ │ Content Performance │
│ Model Comparator │ │ Emergency Control │
└─────────────────────┘ │ Result Interpret │
└─────────────────────┘
```
### Success Metrics
| Phase | Key Metrics |
|-------|-------------|
| Phase 1 | Frontend/backend running, 3 tools functional, <2s page load |
| Phase 2 | Real-time drift alerts, Cost tracking across 3+ providers |
| Phase 3 | 90%+ PII detection rate, Compliance reports generated |
| Phase 4 | Inter-rater agreement calculated, Model comparison functional |
| Phase 5 | Emergency shutoff <1s response, Automated reports generated |
---
## Installation
### Prerequisites
```bash
# Required
Node.js 18+
Python 3.11+
Docker & Docker Compose (recommended)
# Optional
PostgreSQL 15+ (for production)
Redis 7+ (for caching/queues)
```
### Quick Setup with Docker
```bash
# Clone and navigate
cd ai_tools_suite
# Start all services
docker-compose up -d
# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```
### Manual Setup
#### Backend (FastAPI)
```bash
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run development server
uvicorn main:app --reload --port 8000
```
#### Frontend (SvelteKit)
```bash
cd frontend
# Install dependencies
npm install
# Run development server
npm run dev
# Build for production
npm run build
```
### Environment Variables
Create `.env` files in both `frontend/` and `backend/` directories:
**backend/.env**
```env
DATABASE_URL=sqlite:///./ai_tools.db
SECRET_KEY=your-secret-key-here
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```
**frontend/.env**
```env
PUBLIC_API_URL=http://localhost:8000
```
---
## Quick Start
### Accessing the Dashboard
1. Start the application (Docker or manual setup)
2. Open http://localhost:3000 in your browser
3. Use the sidebar to navigate between tools
### API Usage
All tools are accessible via REST API:
```bash
# Check API health
curl http://localhost:8000/api/v1/health
# Estimate inference costs
curl -X POST http://localhost:8000/api/v1/estimate/calculate \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4", "tokens": 1000000, "requests_per_day": 1000}'
# Scan for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan \
-F "file=@data.csv"
# Check model drift
curl -X POST http://localhost:8000/api/v1/drift/analyze \
-F "baseline=@baseline.csv" \
-F "production=@production.csv"
```
---
## User Guide
This section provides step-by-step instructions for using the tools available in Phase 1.
### Inference Estimator
**Purpose:** Calculate AI API costs before deploying your application.
#### How to Use
1. **Navigate to the Tool**
- Click "Inference Estimator" in the sidebar (or go to `/inference-estimator`)
2. **Configure Your Model**
- Select your AI provider (OpenAI, Anthropic, Google, or Custom)
- Choose the specific model (e.g., GPT-4, Claude 3, Gemini Pro)
- For custom models, enter your own input/output prices per 1M tokens
3. **Enter Usage Parameters**
- **Input Tokens per Request:** Average tokens you send to the model
- **Output Tokens per Request:** Average tokens the model returns
- **Requests per Day:** Expected daily API calls
- **Peak Multiplier:** Account for traffic spikes (1x = normal, 2x = double traffic)
4. **View Cost Breakdown**
- **Daily Cost:** Input cost + Output cost per day
- **Monthly Cost:** 30-day projection
- **Annual Cost:** 365-day projection
5. **Override Pricing (Optional)**
- Click the edit icon next to any model to set custom pricing
- Useful for negotiated enterprise rates or new models
#### Example Calculation
```
Model: GPT-4 Turbo
Input: 500 tokens/request × $10.00/1M = $0.005/request
Output: 200 tokens/request × $30.00/1M = $0.006/request
Requests: 10,000/day
Daily Cost: (0.005 + 0.006) × 10,000 = $110.00
Monthly Cost: $110 × 30 = $3,300.00
```
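The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration of the formula, not the estimator's actual code; the function name and rounding choices are assumptions.

```python
def estimate_costs(input_tokens: int, output_tokens: int,
                   requests_per_day: int,
                   input_price_per_1m: float,
                   output_price_per_1m: float,
                   peak_multiplier: float = 1.0) -> dict:
    """Project daily/monthly/annual API spend from average usage."""
    # blended cost of one request, prices quoted per 1M tokens
    per_request = (input_tokens * input_price_per_1m
                   + output_tokens * output_price_per_1m) / 1_000_000
    daily = per_request * requests_per_day * peak_multiplier
    return {
        "cost_per_request": round(per_request, 6),
        "daily": round(daily, 2),
        "monthly": round(daily * 30, 2),   # 30-day projection
        "annual": round(daily * 365, 2),   # 365-day projection
    }

# Reproduces the worked example above:
print(estimate_costs(500, 200, 10_000, 10.00, 30.00))
# daily = 110.0, monthly = 3300.0, annual = 40150.0
```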
#### Expected Outcome
After entering your parameters, you will see:
```
┌─────────────────────────────────────────────────────────────┐
│ COST BREAKDOWN │
├─────────────────────────────────────────────────────────────┤
│ Provider: OpenAI │
│ Model: GPT-4 Turbo │
│ │
│ ┌─────────────┬────────────┬────────────┬───────────────┐ │
│ │ Period │ Input Cost │ Output Cost│ Total │ │
│ ├─────────────┼────────────┼────────────┼───────────────┤ │
│ │ Daily │ $50.00 │ $60.00 │ $110.00 │ │
│ │ Monthly │ $1,500.00 │ $1,800.00 │ $3,300.00 │ │
│ │ Yearly │$18,250.00 │$21,900.00 │ $40,150.00 │ │
│ └─────────────┴────────────┴────────────┴───────────────┘ │
│ │
│ Tokens per Day: 7,000,000 (5M input + 2M output) │
│ Cost per Request: $0.011 │
│ Cost per 1K Requests: $11.00 │
└─────────────────────────────────────────────────────────────┘
```
---
### Data Integrity Audit
**Purpose:** Analyze datasets for quality issues, missing values, duplicates, and outliers.
#### How to Use
1. **Navigate to the Tool**
- Click "Data Integrity Audit" in the sidebar (or go to `/data-audit`)
2. **Upload Your Dataset**
- **Drag and drop** a file onto the upload area, OR
- **Click to browse** and select a file
- Supported formats: CSV, Excel (.xlsx, .xls), JSON
3. **Click "Analyze Data"**
- The tool will process your file and display results
4. **Review the Results**
**Quick Stats Panel:**
- **Rows:** Total number of records
- **Columns:** Number of fields
- **Duplicates:** Count of duplicate rows
- **Issues:** Number of problems detected
**Overview Tab:**
- Missing values summary with counts and percentages
- Duplicate row detection results
**Columns Tab:**
- Detailed statistics for each column:
- Data type (int64, float64, object, etc.)
- Missing value count and percentage
- Unique value count
- Sample values
**Issues & Recommendations Tab:**
- List of detected problems with icons:
- `!` = Missing values
- `2x` = Duplicates
- `~` = Outliers
- `#` = High cardinality
- `=` = Constant column
- `OK` = No issues
- Actionable recommendations for fixing each issue
#### Understanding the Results
| Issue Type | What It Means | Recommended Action |
|------------|---------------|-------------------|
| Missing Values | Empty cells in the data | Fill with mean/median or remove rows |
| Duplicate Rows | Identical records | Remove duplicates to avoid bias |
| Outliers | Extreme values | Investigate if valid or remove |
| High Cardinality | Too many unique values | Check if column is an ID field |
| Constant Column | Only one value | Consider removing from analysis |
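The checks in the table above can be sketched with pandas. This is an illustrative approximation (the 1.5×IQR outlier rule and the report shape are assumptions, not the tool's exact logic):

```python
import pandas as pd

def audit(df: pd.DataFrame) -> dict:
    """Run basic integrity checks: missing values, duplicates,
    constant columns, and IQR-based outliers on numeric columns."""
    report = {
        "missing": df.isna().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        "constant_columns": [c for c in df.columns
                             if df[c].nunique(dropna=True) <= 1],
    }
    outliers = {}
    for col in df.select_dtypes("number"):
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask = (df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)
        outliers[col] = int(mask.sum())
    report["outliers"] = outliers
    return report

df = pd.DataFrame({
    "age": [25, 26, 27, 25, 120],    # 120 is an outlier
    "status": ["active"] * 5,        # constant column
})
print(audit(df))
```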
#### Expected Outcome
After uploading a dataset (e.g., `customers.csv`), you will see:
```
┌─────────────────────────────────────────────────────────────────────┐
│ QUICK STATS │
│ ┌──────────┬──────────┬──────────────┬─────────────┐ │
│ │ Rows │ Columns │ Duplicates │ Issues │ │
│ │ 10,542 │ 12 │ 47 │ 5 │ │
│ └──────────┴──────────┴──────────────┴─────────────┘ │
├─────────────────────────────────────────────────────────────────────┤
│ OVERVIEW TAB │
│ ──────────── │
│ Missing Values: │
│ ┌─────────────────┬─────────┬─────────┐ │
│ │ Column │ Count │ Percent │ │
│ ├─────────────────┼─────────┼─────────┤ │
│ │ email │ 23 │ 0.22% │ │
│ │ phone │ 156 │ 1.48% │ │
│ │ address │ 89 │ 0.84% │ │
│ └─────────────────┴─────────┴─────────┘ │
│ │
│ ⚠ 47 duplicate rows found (0.45%) │
├─────────────────────────────────────────────────────────────────────┤
│ ISSUES & RECOMMENDATIONS TAB │
│ ──────────────────────────── │
│ Issues Found: │
│ [!] Dataset has 268 missing values across 3 columns │
│ [2x] Found 47 duplicate rows (0.45%) │
│ [~] Column 'age' has 12 potential outliers │
│ [#] Column 'user_id' has very high cardinality (10,495 unique) │
│ [=] Column 'status' has only one unique value │
│ │
│ Recommendations: │
│ 💡 Fill missing values with mean/median for numeric columns │
│ 💡 Consider removing duplicate rows to improve data quality │
│ 💡 Review if 'user_id' should be used as an identifier │
│ 💡 Consider removing constant column 'status' │
└─────────────────────────────────────────────────────────────────────┘
```
#### API Endpoints
```bash
# Analyze a dataset
curl -X POST http://localhost:8000/api/v1/audit/analyze \
-F "file=@your_data.csv"
# Clean a dataset (remove duplicates and missing rows)
curl -X POST http://localhost:8000/api/v1/audit/clean \
-F "file=@your_data.csv"
# Validate schema
curl -X POST http://localhost:8000/api/v1/audit/validate-schema \
-F "file=@your_data.csv"
# Detect outliers
curl -X POST http://localhost:8000/api/v1/audit/detect-outliers \
-F "file=@your_data.csv"
```
#### Sample API Response
```json
{
"total_rows": 10542,
"total_columns": 12,
"missing_values": {
"email": {"count": 23, "percent": 0.22},
"phone": {"count": 156, "percent": 1.48},
"address": {"count": 89, "percent": 0.84}
},
"duplicate_rows": 47,
"duplicate_percent": 0.45,
"column_stats": [
{
"name": "customer_id",
"dtype": "int64",
"missing_count": 0,
"missing_percent": 0.0,
"unique_count": 10542,
"sample_values": [1001, 1002, 1003, 1004, 1005]
}
],
"issues": [
"Dataset has 268 missing values across 3 columns",
"Found 47 duplicate rows (0.45%)",
"Column 'age' has 12 potential outliers"
],
"recommendations": [
"Consider filling missing values with mean/median",
"Consider removing duplicate rows to improve data quality"
]
}
```
---
### Privacy Scanner
**Purpose:** Detect and redact personally identifiable information (PII) from text and files.
#### How to Use
1. **Navigate to the Tool**
- Click "Privacy Scanner" in the sidebar (or go to `/privacy-scanner`)
2. **Choose Input Mode**
- **Text Mode:** Paste or type text directly
- **File Mode:** Upload CSV, TXT, or JSON files
3. **Configure Detection Options**
- Toggle which PII types to detect:
- Emails
- Phone numbers
- SSN (Social Security Numbers)
- Credit Cards
- IP Addresses
- Dates of Birth
4. **Enter or Upload Content**
- **For text:** Paste content into the text area
- **For files:** Drag and drop or click to upload
- **Tip:** Click "Load Sample" to see example PII data
5. **Click "Scan for PII"**
- The tool will analyze your content and display results
6. **Review the Results**
**Risk Summary:**
- **PII Found:** Total number of PII entities detected
- **Types:** Number of different PII categories
- **Risk Score:** Calculated severity (0-100)
- **Risk Level:** CRITICAL, HIGH, MEDIUM, or LOW
**Overview Tab:**
- PII counts by type with color-coded severity
- Risk assessment with explanation
**Entities Tab:**
- Detailed list of each detected PII item:
- Type (EMAIL, PHONE, SSN, etc.)
- Original value
- Masked value
- Confidence score (percentage)
**Redacted Preview Tab:**
- Shows your text with all PII masked
- Safe to share after verification
#### PII Detection Patterns
| Type | Example | Masked As |
|------|---------|-----------|
| EMAIL | john.doe@example.com | jo***@example.com |
| PHONE | (555) 123-4567 | ***-***-4567 |
| SSN | 123-45-6789 | ***-**-6789 |
| CREDIT_CARD | 4532015112830366 | ****-****-****-0366 |
| IP_ADDRESS | 192.168.1.100 | 192.***.***.*** |
| DATE_OF_BIRTH | 03/15/1985 | **/**/1985 |
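A simplified regex-and-mask sketch of the patterns tabulated above (the expressions here are illustrative, not the product's exact detection rules, and cover only two entity types):

```python
import re

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask(pii_type: str, value: str) -> str:
    if pii_type == "EMAIL":
        local, domain = value.split("@", 1)
        return local[:2] + "***@" + domain
    # star out every digit that still has 4 digits after it,
    # keeping only the trailing four (e.g. ***-**-6789)
    return re.sub(r"\d(?=(?:\D*\d){4})", "*", value)

def scan(text: str):
    """Return (type, original, masked) for each detected entity."""
    findings = []
    for pii_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            findings.append((pii_type, m.group(), mask(pii_type, m.group())))
    return findings

print(scan("Contact john.smith@example.com, SSN 123-45-6789"))
```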
#### Risk Levels Explained
| Level | Score | Description |
|-------|-------|-------------|
| CRITICAL | 70-100 | Highly sensitive PII (SSN, Credit Cards). Immediate action required. |
| HIGH | 50-69 | Multiple sensitive PII elements. Consider redaction before sharing. |
| MEDIUM | 30-49 | Some PII detected that may require attention. |
| LOW | 0-29 | Minimal or no PII detected. |
#### API Endpoints
```bash
# Scan text for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-text \
-F "text=Contact john@example.com or call 555-123-4567" \
-F "detect_emails=true" \
-F "detect_phones=true"
# Scan a file for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-file \
-F "file=@customer_data.csv"
# Scan CSV/Excel with column-by-column analysis
curl -X POST http://localhost:8000/api/v1/privacy/scan-dataframe \
-F "file=@customer_data.csv"
# Redact PII from text
curl -X POST http://localhost:8000/api/v1/privacy/redact \
-F "text=Call 555-123-4567 for support" \
-F "mode=mask"
# List supported PII types
curl http://localhost:8000/api/v1/privacy/entity-types
```
#### Redaction Modes
| Mode | Description | Example Output |
|------|-------------|----------------|
| `mask` | Shows partial value | jo***@example.com |
| `remove` | Replaces with [REDACTED] | [REDACTED] |
| `type` | Shows PII type | [EMAIL] |
#### Expected Outcome
After scanning text or a file, you will see results like:
```
RISK SUMMARY
┌────────────┬──────────┬────────────┬─────────────────┐
│ PII Found │ Types │ Risk Score │ Risk Level │
│ 7 │ 5 │ 72 │ CRITICAL │
└────────────┴──────────┴────────────┴─────────────────┘
ENTITIES TAB
┌────────────────┬───────────────────────┬────────────────┬───────┐
│ Type │ Original │ Masked │ Conf │
├────────────────┼───────────────────────┼────────────────┼───────┤
│ EMAIL │ john.smith@example.com│ jo***@example..│ 95% │
│ PHONE │ (555) 123-4567 │ ***-***-4567 │ 85% │
│ SSN │ 123-45-6789 │ ***-**-6789 │ 95% │
│ CREDIT_CARD │ 4532015112830366 │ ****-****-0366 │ 95% │
└────────────────┴───────────────────────┴────────────────┴───────┘
REDACTED PREVIEW
Customer Record:
Email: jo***@example.com
Phone: ***-***-4567
SSN: ***-**-6789
Credit Card: ****-****-****-0366
```
#### Sample API Response
```json
{
"total_entities": 7,
"entities_by_type": {
"EMAIL": 2, "PHONE": 2, "SSN": 1, "CREDIT_CARD": 1, "IP_ADDRESS": 1
},
"risk_level": "CRITICAL",
"risk_score": 72,
"entities": [
{
"type": "SSN",
"value": "123-45-6789",
"confidence": 0.95,
"masked_value": "***-**-6789"
}
],
"redacted_preview": "Email: jo***@example.com\nSSN: ***-**-6789..."
}
```
---
## Detailed Tool Documentation
### 1. Model Drift Monitor
**Purpose:** Detect when model performance degrades over time.
**Features:**
- Real-time confidence score tracking
- Statistical drift detection (KS test, PSI)
- Alert thresholds configuration
- Historical trend visualization
**API Endpoints:**
```
POST /api/v1/drift/baseline # Upload baseline distribution
POST /api/v1/drift/analyze # Analyze production data for drift
GET /api/v1/drift/history # Get drift score history
PUT /api/v1/drift/thresholds # Configure alert thresholds
```
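The two drift statistics named above (the KS test and PSI) can be sketched as follows. This is an assumption-laden illustration, not the service's implementation; the quantile binning and the synthetic data are arbitrary choices.

```python
import numpy as np
from scipy import stats

def psi(baseline: np.ndarray, production: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index over quantile bins of the baseline."""
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf       # catch out-of-range values
    b = np.histogram(baseline, edges)[0] / len(baseline)
    p = np.histogram(production, edges)[0] / len(production)
    b, p = np.clip(b, 1e-6, None), np.clip(p, 1e-6, None)
    return float(np.sum((p - b) * np.log(p / b)))

rng = np.random.default_rng(0)
base = rng.normal(0, 1, 5000)
drifted = rng.normal(0.5, 1, 5000)   # shifted mean simulates drift

ks_stat, p_value = stats.ks_2samp(base, drifted)
print(f"KS={ks_stat:.3f} p={p_value:.3g} PSI={psi(base, drifted):.3f}")
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift.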
---
### 2. Vendor Cost Tracker
**Purpose:** Aggregate and visualize API spending across providers.
**Supported Providers:**
- OpenAI
- Anthropic
- AWS Bedrock
- Google Vertex AI
- Azure OpenAI
**Features:**
- Daily/weekly/monthly cost breakdowns
- Per-project cost allocation
- Budget alerts
- Usage forecasting
---
### 3. Security Tester
**Purpose:** Identify vulnerabilities in AI endpoints.
**Test Categories:**
- Prompt injection attacks
- Jailbreak attempts
- Data exfiltration probes
- Rate limit testing
- Input validation bypass
**Output:** Security report with severity ratings and remediation steps.
---
### 4. Data History Log
**Purpose:** Maintain audit trail for ML training data.
**Features:**
- Data version hashing (SHA-256)
- Model-to-dataset mapping
- Timestamp logging
- Compliance report generation (GDPR, CCPA)
---
### 5. Model Comparator
**Purpose:** Evaluate and compare model outputs.
**Features:**
- Side-by-side response comparison
- Quality scoring (coherence, accuracy, relevance)
- Latency benchmarking
- Cost-per-query analysis
---
### 6. Privacy Scanner
**Purpose:** Detect and remove PII from datasets.
**Detected Entities:**
- Names
- Email addresses
- Phone numbers
- SSN/National IDs
- Credit card numbers
- Addresses
- IP addresses
**Modes:**
- Detection only
- Automatic redaction
- Pseudonymization
---
### 7. Label Quality Scorer
**Purpose:** Measure inter-annotator agreement.
**Metrics:**
- Cohen's Kappa
- Fleiss' Kappa (multi-rater)
- Krippendorff's Alpha
- Percent agreement
**Output:** Quality report with flagged inconsistent samples.
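For two raters, the first metric listed reduces to a short calculation. A from-scratch sketch of Cohen's kappa (not the tool's implementation; labels are illustrative):

```python
from collections import Counter

def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # chance agreement from each rater's marginal label frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[l] * freq_b[l] for l in freq_a) / n**2
    return (observed - expected) / (1 - expected)

a = ["cat", "cat", "dog", "dog", "cat", "dog"]
b = ["cat", "dog", "dog", "dog", "cat", "cat"]
print(round(cohens_kappa(a, b), 3))   # 4/6 observed vs 0.5 by chance
```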
---
### 8. Inference Estimator
**Purpose:** Predict operational costs before deployment.
**Inputs:**
- Expected request volume
- Average tokens per request
- Model selection
- Peak usage patterns
**Output:** Monthly cost projection with confidence intervals.
---
### 9. Data Integrity Audit
**Purpose:** Clean and validate datasets.
**Checks:**
- Missing values
- Duplicate records
- Data type mismatches
- Outlier detection
- Schema validation
- Referential integrity
**Interface:** Interactive data cleaning with preview and undo.
---
### 10. Content Performance
**Purpose:** Analyze user engagement patterns.
**Features:**
- Drop-off point visualization
- Engagement heatmaps
- A/B test analysis
- Retention curve modeling
---
### 11. Safety/Bias Checks
**Purpose:** Audit AI systems for fairness.
**Metrics:**
- Demographic parity
- Equalized odds
- Calibration across groups
- Disparate impact ratio
**Output:** Compliance checklist with recommendations.
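Two of the metrics above can be sketched directly. The data, group labels, and the four-fifths threshold below are illustrative, not the tool's code:

```python
def selection_rates(outcomes, groups):
    """Per-group rate of positive outcomes (demographic parity compares these)."""
    rates = {}
    for g in set(groups):
        picked = [o for o, grp in zip(outcomes, groups) if grp == g]
        rates[g] = sum(picked) / len(picked)
    return rates

def disparate_impact(outcomes, groups, protected, reference):
    """Ratio of selection rates; below 0.8 fails the common four-fifths rule."""
    r = selection_rates(outcomes, groups)
    return r[protected] / r[reference]

outcomes = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
groups   = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
print(disparate_impact(outcomes, groups, protected="B", reference="A"))
```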
---
### 12. Profitability Analysis
**Purpose:** Connect AI costs to business outcomes.
**Features:**
- Cost attribution by feature/product
- Revenue correlation analysis
- ROI calculation
- Optimization recommendations
---
### 13. Emergency Control
**Purpose:** Safely halt AI systems when needed.
**Features:**
- One-click system suspension
- Graceful degradation modes
- Rollback capabilities
- Incident logging
**Implementation:** API endpoints + admin dashboard.
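A toy sketch of the kill-switch pattern behind those features (the real tool exposes this via API endpoints and a dashboard; class and function names here are illustrative):

```python
import threading

class KillSwitch:
    """In-process gate that automated actions must pass through."""
    def __init__(self):
        self._halted = threading.Event()

    def halt(self, reason: str):
        self._halted.set()
        print(f"INCIDENT: automation suspended ({reason})")  # incident-log stub

    def resume(self):
        self._halted.clear()

    def guard(self, fn):
        """Decorator: wrapped actions refuse to run while halted."""
        def wrapper(*args, **kwargs):
            if self._halted.is_set():
                raise RuntimeError("automation halted by kill switch")
            return fn(*args, **kwargs)
        return wrapper

switch = KillSwitch()

@switch.guard
def run_pipeline():
    return "ok"

print(run_pipeline())          # runs normally
switch.halt("manual override")  # subsequent calls now raise
```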
---
### 14. Result Interpretation
**Purpose:** Translate metrics into business actions.
**Features:**
- Automated insight generation
- Executive summary creation
- Action item extraction
- Trend interpretation
**Output:** Markdown/PDF reports for stakeholders.
---
## Directory Structure
```
ai_tools_suite/
├── PRODUCT_MANUAL.md
├── docker-compose.yml
├── .env.example
├── frontend/ # SvelteKit Application
│ ├── src/
│ │ ├── routes/
│ │ │ ├── +layout.svelte # Shared layout with sidebar
│ │ │ ├── +page.svelte # Dashboard home
│ │ │ ├── drift-monitor/
│ │ │ │ └── +page.svelte
│ │ │ ├── cost-tracker/
│ │ │ │ └── +page.svelte
│ │ │ ├── security-tester/
│ │ │ ├── data-history/
│ │ │ ├── model-comparator/
│ │ │ ├── privacy-scanner/
│ │ │ ├── label-quality/
│ │ │ ├── inference-estimator/
│ │ │ ├── data-audit/
│ │ │ ├── content-performance/
│ │ │ ├── bias-checks/
│ │ │ ├── profitability/
│ │ │ ├── emergency-control/
│ │ │ └── reports/
│ │ ├── lib/
│ │ │ ├── components/ # Shared UI components
│ │ │ │ ├── Sidebar.svelte
│ │ │ │ ├── Chart.svelte
│ │ │ │ ├── DataTable.svelte
│ │ │ │ └── FileUpload.svelte
│ │ │ ├── stores/ # Svelte stores
│ │ │ └── api/ # API client
│ │ └── app.html
│ ├── static/
│ ├── package.json
│ ├── svelte.config.js
│ ├── tailwind.config.js
│ └── tsconfig.json
├── backend/ # FastAPI Application
│ ├── main.py # Application entry point
│ ├── requirements.txt
│ ├── routers/
│ │ ├── drift.py
│ │ ├── costs.py
│ │ ├── security.py
│ │ ├── history.py
│ │ ├── compare.py
│ │ ├── privacy.py
│ │ ├── labels.py
│ │ ├── estimate.py
│ │ ├── audit.py
│ │ ├── content.py
│ │ ├── bias.py
│ │ ├── profitability.py
│ │ ├── emergency.py
│ │ └── reports.py
│ ├── services/ # Business logic
│ │ ├── drift_detector.py
│ │ ├── cost_aggregator.py
│ │ ├── pii_scanner.py
│ │ ├── bias_analyzer.py
│ │ └── ...
│ ├── models/ # Pydantic schemas
│ │ ├── drift.py
│ │ ├── costs.py
│ │ └── ...
│ ├── database/
│ │ ├── connection.py
│ │ └── models.py # SQLAlchemy models
│ └── tests/
├── shared/ # Shared utilities (deprecated)
├── tests/ # Integration tests
└── examples/ # Example data and usage
├── sample_baseline.csv
├── sample_production.csv
└── sample_pii_data.csv
```
---
## Version History
| Version | Date | Changes |
|---------|------|---------|
| 0.1.0 | TBD | Phase 1 - Foundation (3 tools) |
| 0.2.0 | TBD | Phase 2 - Monitoring & Costs |
| 0.3.0 | TBD | Phase 3 - Security & Compliance |
| 0.4.0 | TBD | Phase 4 - Quality & Comparison |
| 1.0.0 | TBD | Phase 5 - Full Release (14 tools) |
---
## Support
For issues or feature requests, refer to the project documentation or contact the development team.
---
*Last Updated: December 2024*