# AI Tools Suite - Product Manual
> A comprehensive collection of AI/ML operational tools for monitoring, security, compliance, and cost management.
---
## Table of Contents
1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Tool Catalog](#tool-catalog)
4. [Product Roadmap](#product-roadmap)
5. [Installation](#installation)
6. [Quick Start](#quick-start)
7. [User Guide](#user-guide)
8. [Detailed Tool Documentation](#detailed-tool-documentation)
9. [Directory Structure](#directory-structure)
10. [Version History](#version-history)
---
## Overview
This suite provides 14 essential tools for managing AI/ML systems in production environments. Each tool addresses a specific operational need, from cost tracking to security testing.
### Target Users
- ML Engineers
- Data Scientists
- DevOps/MLOps Teams
- Product Managers
- Compliance Officers
---
## Architecture
### System Overview
The AI Tools Suite uses a modern web architecture with a unified SvelteKit frontend and FastAPI backend.
```
┌─────────────────────────────────────────────────────────────────────────┐
│ SVELTEKIT FRONTEND │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ UNIFIED DASHBOARD │ │
│ │ ┌──────────────────┐ ┌─────────────────────────────────────────┐ │ │
│ │ │ Sidebar Nav │ │ Main Content Area │ │ │
│ │ │ ──────────── │ │ ──────────────── │ │ │
│ │ │ Dashboard │ │ │ │ │
│ │ │ Drift Monitor │ │ [Selected Tool View] │ │ │
│ │ │ Cost Tracker │ │ │ │ │
│ │ │ Security Test │ │ - Interactive Charts │ │ │
│ │ │ Data History │ │ - Data Tables │ │ │
│ │ │ Model Compare │ │ - Configuration Forms │ │ │
│ │ │ Privacy Scan │ │ - Real-time Updates │ │ │
│ │ │ Label Quality │ │ - Export Options │ │ │
│ │ │ Cost Estimate │ │ │ │ │
│ │ │ Data Audit │ │ │ │ │
│ │ │ Content Perf │ │ │ │ │
│ │ │ Bias Checks │ │ │ │ │
│ │ │ Profitability │ │ │ │ │
│ │ │ Emergency Ctrl │ │ │ │ │
│ │ │ Reports │ │ │ │ │
│ │ └──────────────────┘ └─────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
│ REST API / WebSocket
┌─────────────────────────────────────────────────────────────────────────┐
│ FASTAPI BACKEND │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ API ROUTERS │ │
│ │ /api/drift /api/costs /api/security /api/privacy /api/... │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ SERVICE LAYER │ │
│ │ DriftDetector │ CostAggregator │ PIIScanner │ BiasAnalyzer │ ... │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ SHARED SERVICES │ │
│ │ Authentication │ Database ORM │ File Storage │ Background Jobs │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────┐ │
│ │ PostgreSQL/ │ │ Redis │ │ File Storage │ │
│ │ SQLite │ │ (Cache/Queue) │ │ (Uploads/Reports) │ │
│ │ - Users │ │ - Session cache │ │ - CSV/JSON uploads │ │
│ │ - Audit logs │ │ - Job queue │ │ - Generated reports │ │
│ │ - Metrics │ │ - Real-time │ │ - Model artifacts │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
```
### Tech Stack
| Layer | Technology | Purpose |
|-------|------------|---------|
| **Frontend** | SvelteKit + TypeScript | Unified single-page application |
| **UI Components** | Tailwind CSS + shadcn-svelte | Modern, accessible components |
| **Charts** | Apache ECharts | Interactive data visualizations |
| **Backend** | FastAPI (Python 3.11+) | REST API + WebSocket support |
| **ORM** | SQLAlchemy 2.0 | Database abstraction |
| **Database** | SQLite (dev) / PostgreSQL (prod) | Persistent storage |
| **Cache** | Redis (optional) | Session cache, job queue |
| **Deployment** | Docker Compose | Container orchestration |
### API Design
All tools are accessible via a RESTful API:
```
Base URL: http://localhost:8000/api/v1
Endpoints:
├── /drift/ # Model Drift Monitor
├── /costs/ # Vendor Cost Tracker
├── /security/ # Security Tester
├── /history/ # Data History Log
├── /compare/ # Model Comparator
├── /privacy/ # Privacy Scanner
├── /labels/ # Label Quality Scorer
├── /estimate/ # Inference Estimator
├── /audit/ # Data Integrity Audit
├── /content/ # Content Performance
├── /bias/ # Safety/Bias Checks
├── /profitability/ # Profitability Analysis
├── /emergency/ # Emergency Control
└── /reports/ # Result Interpretation
```
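For quick experiments, the API can also be exercised from Python. The sketch below assumes the default local setup from the Installation section and the third-party `requests` package; the endpoint paths follow the layout above, but treat this as a sketch rather than an official client.
```python
# Minimal API client sketch (assumes the backend is running locally on port 8000
# and that `requests` is installed; paths follow the endpoint tree above).
import requests

BASE_URL = "http://localhost:8000/api/v1"

def check_health() -> dict:
    """Hit the health endpoint and return the JSON body."""
    response = requests.get(f"{BASE_URL}/health", timeout=10)
    response.raise_for_status()
    return response.json()

def estimate_costs(model: str, tokens: int, requests_per_day: int) -> dict:
    """Call the Inference Estimator with a JSON payload (fields as in Quick Start)."""
    payload = {"model": model, "tokens": tokens, "requests_per_day": requests_per_day}
    response = requests.post(f"{BASE_URL}/estimate/calculate", json=payload, timeout=30)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(check_health())
    print(estimate_costs("gpt-4", 1_000_000, 1_000))
```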
---
## Tool Catalog
| # | Tool Name | Deliverable | Description | Status |
|---|-----------|-------------|-------------|--------|
| 1 | **Model Drift Monitor** | Dashboard | Tracks prediction confidence over time to detect when AI accuracy begins to decline. | Pending |
| 2 | **Vendor Cost Tracker** | API spend aggregator | Provides a single view of all API expenses across providers like OpenAI, Anthropic, and AWS. | Pending |
| 3 | **Security Tester** | Input fuzzer | Tests AI endpoints for exploits and prompt injections to prevent unauthorized access. | Pending |
| 4 | **Data History Log** | Audit trail logger | Maintains a record of which data versions were used to train specific models for legal compliance. | Pending |
| 5 | **Model Comparator** | Response evaluator | Compares outputs from different models side-by-side to determine the best fit for specific tasks. | Pending |
| 6 | **Privacy Scanner** | PII detector | Automatically finds and removes personal information (names, emails) from training datasets. | Pending |
| 7 | **Label Quality Scorer** | Agreement calculator | Measures the consistency of data labeling teams to ensure high-quality training inputs. | Pending |
| 8 | **Inference Estimator** | Token/Price calculator | Predicts monthly operational costs based on expected usage before a project is deployed. | Pending |
| 9 | **Data Integrity Audit** | Data cleaning app | Identifies and fixes errors in databases to prevent data loss and improve model performance. | Pending |
| 10 | **Content Performance** | Retention model | Visualizes audience drop-off points to identify which content segments drive engagement. | Pending |
| 11 | **Safety/Bias Checks** | Bias scanner checklist | Audits recommendation engines to ensure they follow privacy laws and treat users fairly. | Pending |
| 12 | **Profitability Analysis** | Cost-vs-revenue view | Correlates AI costs with business revenue to identify specific areas for monthly savings. | Pending |
| 13 | **Emergency Control** | Manual override template | Provides a reliable mechanism to immediately suspend automated processes if they fail. | Pending |
| 14 | **Result Interpretation** | Automated report generator | Converts technical metrics into a standardized list of actions for business decision-makers. | Pending |
---
## Product Roadmap
### Phase 1: Foundation (MVP)
**Goal:** Core infrastructure and 3 essential tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **1.1 Project Setup** | SvelteKit frontend scaffold, FastAPI backend scaffold, Docker Compose config, CI/CD pipeline | None |
| **1.2 Shared Infrastructure** | Authentication system, Database models, API client library, Shared UI components (sidebar, charts, tables) | 1.1 |
| **1.3 Inference Estimator** | Token counting, Multi-provider pricing, Cost projection UI, Export to CSV | 1.2 |
| **1.4 Data Integrity Audit** | File upload, Missing value detection, Duplicate finder, Interactive cleaning UI | 1.2 |
| **1.5 Privacy Scanner** | PII detection engine, Redaction modes, Scan results UI, Batch processing | 1.2 |
### Phase 2: Monitoring & Costs
**Goal:** Production monitoring and cost management tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **2.1 Model Drift Monitor** | Baseline upload, KS/PSI tests, Drift visualization, Alert configuration | Phase 1 |
| **2.2 Vendor Cost Tracker** | API key integration (OpenAI, Anthropic, AWS), Cost aggregation, Budget alerts, Usage forecasting | Phase 1 |
| **2.3 Profitability Analysis** | Revenue data import, Cost-revenue correlation, ROI calculator, Savings recommendations | 2.2 |
### Phase 3: Security & Compliance
**Goal:** Security testing and compliance tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **3.1 Security Tester** | Prompt injection test suite, Jailbreak detection, Vulnerability report generation | Phase 1 |
| **3.2 Data History Log** | Data versioning (SHA-256), Model-dataset linking, Audit trail UI, GDPR/CCPA reports | Phase 1 |
| **3.3 Safety/Bias Checks** | Fairness metrics (demographic parity, equalized odds), Bias detection, Compliance checklist | Phase 1 |
### Phase 4: Quality & Comparison
**Goal:** Data quality and model evaluation tools
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **4.1 Label Quality Scorer** | Multi-rater agreement (Kappa, Alpha), Inconsistency flagging, Quality reports | Phase 1 |
| **4.2 Model Comparator** | Side-by-side comparison UI, Quality scoring, Latency benchmarks, Cost-per-query analysis | Phase 1 |
### Phase 5: Analytics & Control
**Goal:** Advanced analytics and operational control
| Milestone | Deliverables | Dependencies |
|-----------|--------------|--------------|
| **5.1 Content Performance** | Engagement tracking, Drop-off visualization, Retention curves, A/B analysis | Phase 1 |
| **5.2 Emergency Control** | Kill switch API, Graceful degradation, Rollback triggers, Incident logging | Phase 1 |
| **5.3 Result Interpretation** | Metric-to-insight engine, Executive summary generator, PDF/Markdown export | Phase 1 |
### Roadmap Visualization
```
Phase 1: Foundation Phase 2: Monitoring Phase 3: Security
───────────────────── ───────────────────── ─────────────────────
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ Project Setup │ │ Model Drift Monitor │ │ Security Tester │
│ Shared Infra │ ───► │ Vendor Cost Tracker │ │ Data History Log │
│ Inference Estimator │ │ Profitability │ │ Safety/Bias Checks │
│ Data Integrity │ └─────────────────────┘ └─────────────────────┘
│ Privacy Scanner │ │ │
└─────────────────────┘ │ │
▼ ▼
Phase 4: Quality Phase 5: Analytics
───────────────────── ─────────────────────
┌─────────────────────┐ ┌─────────────────────┐
│ Label Quality │ │ Content Performance │
│ Model Comparator │ │ Emergency Control │
└─────────────────────┘ │ Result Interpret │
└─────────────────────┘
```
### Success Metrics
| Phase | Key Metrics |
|-------|-------------|
| Phase 1 | Frontend/backend running, 3 tools functional, <2s page load |
| Phase 2 | Real-time drift alerts, Cost tracking across 3+ providers |
| Phase 3 | 90%+ PII detection rate, Compliance reports generated |
| Phase 4 | Inter-rater agreement calculated, Model comparison functional |
| Phase 5 | Emergency shutoff <1s response, Automated reports generated |
---
## Installation
### Prerequisites
```bash
# Required
Node.js 18+
Python 3.11+
Docker & Docker Compose (recommended)
# Optional
PostgreSQL 15+ (for production)
Redis 7+ (for caching/queues)
```
### Quick Setup with Docker
```bash
# Clone and navigate
cd ai_tools_suite
# Start all services
docker-compose up -d
# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```
### Manual Setup
#### Backend (FastAPI)
```bash
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run development server
uvicorn main:app --reload --port 8000
```
#### Frontend (SvelteKit)
```bash
cd frontend
# Install dependencies
npm install
# Run development server
npm run dev
# Build for production
npm run build
```
### Environment Variables
Create `.env` files in both `frontend/` and `backend/` directories:
**backend/.env**
```env
DATABASE_URL=sqlite:///./ai_tools.db
SECRET_KEY=your-secret-key-here
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```
**frontend/.env**
```env
PUBLIC_API_URL=http://localhost:8000
```
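As a rough sketch of how the backend might consume these variables (assuming the `python-dotenv` package; the actual settings loader in `backend/main.py` may differ):
```python
# Sketch: load backend/.env into the process environment (assumes python-dotenv).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory by default

DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///./ai_tools.db")
SECRET_KEY = os.environ["SECRET_KEY"]          # fail fast if the secret is missing
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")   # optional until the cost tools are used
```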
---
## Quick Start
### Accessing the Dashboard
1. Start the application (Docker or manual setup)
2. Open http://localhost:3000 in your browser
3. Use the sidebar to navigate between tools
### API Usage
All tools are accessible via REST API:
```bash
# Check API health
curl http://localhost:8000/api/v1/health
# Estimate inference costs
curl -X POST http://localhost:8000/api/v1/estimate/calculate \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4", "tokens": 1000000, "requests_per_day": 1000}'
# Scan for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan \
-F "file=@data.csv"
# Check model drift
curl -X POST http://localhost:8000/api/v1/drift/analyze \
-F "baseline=@baseline.csv" \
-F "production=@production.csv"
```
---
## User Guide
This section provides step-by-step instructions for using the tools available in Phase 1.
### Inference Estimator
**Purpose:** Calculate AI API costs before deploying your application.
#### How to Use
1. **Navigate to the Tool**
   - Click "Inference Estimator" in the sidebar (or go to `/inference-estimator`)
2. **Configure Your Model**
   - Select your AI provider (OpenAI, Anthropic, Google, or Custom)
   - Choose the specific model (e.g., GPT-4, Claude 3, Gemini Pro)
   - For custom models, enter your own input/output prices per 1M tokens
3. **Enter Usage Parameters**
   - **Input Tokens per Request:** Average tokens you send to the model
   - **Output Tokens per Request:** Average tokens the model returns
   - **Requests per Day:** Expected daily API calls
   - **Peak Multiplier:** Account for traffic spikes (1x = normal, 2x = double traffic)
4. **View Cost Breakdown**
   - **Daily Cost:** Input cost + Output cost per day
   - **Monthly Cost:** 30-day projection
   - **Annual Cost:** 365-day projection
5. **Override Pricing (Optional)**
   - Click the edit icon next to any model to set custom pricing
   - Useful for negotiated enterprise rates or new models
#### Example Calculation
```
Model: GPT-4 Turbo
Input: 500 tokens/request × $10.00/1M = $0.005/request
Output: 200 tokens/request × $30.00/1M = $0.006/request
Requests: 10,000/day
Daily Cost: (0.005 + 0.006) × 10,000 = $110.00
Monthly Cost: $110 × 30 = $3,300.00
```
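The same arithmetic in a small Python sketch (illustrative only; the in-app calculator may apply different rounding or pricing data):
```python
# Sketch of the cost arithmetic shown above (illustrative only).
def estimate_costs(input_tokens: int, output_tokens: int, requests_per_day: int,
                   input_price_per_1m: float, output_price_per_1m: float) -> dict:
    cost_per_request = (input_tokens * input_price_per_1m
                        + output_tokens * output_price_per_1m) / 1_000_000
    daily = cost_per_request * requests_per_day
    return {"per_request": cost_per_request, "daily": daily,
            "monthly": daily * 30, "yearly": daily * 365}

# GPT-4 Turbo example from above: 500 in / 200 out tokens, 10,000 requests/day
print(estimate_costs(500, 200, 10_000, 10.00, 30.00))
# per_request ≈ $0.011, daily ≈ $110, monthly ≈ $3,300, yearly ≈ $40,150
```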
#### Expected Outcome
After entering your parameters, you will see:
```
┌─────────────────────────────────────────────────────────────┐
│ COST BREAKDOWN │
├─────────────────────────────────────────────────────────────┤
│ Provider: OpenAI │
│ Model: GPT-4 Turbo │
│ │
│ ┌─────────────┬────────────┬────────────┬───────────────┐ │
│ │ Period │ Input Cost │ Output Cost│ Total │ │
│ ├─────────────┼────────────┼────────────┼───────────────┤ │
│ │ Daily │ $50.00 │ $60.00 │ $110.00 │ │
│ │ Monthly │ $1,500.00 │ $1,800.00 │ $3,300.00 │ │
│ │ Yearly │$18,250.00 │$21,900.00 │ $40,150.00 │ │
│ └─────────────┴────────────┴────────────┴───────────────┘ │
│ │
│ Tokens per Day: 7,000,000 (5M input + 2M output) │
│ Cost per Request: $0.011 │
│ Cost per 1K Requests: $11.00 │
└─────────────────────────────────────────────────────────────┘
```
---
### Data Integrity Audit
**Purpose:** Analyze datasets for quality issues, missing values, duplicates, and outliers.
#### How to Use
1. **Navigate to the Tool**
   - Click "Data Integrity Audit" in the sidebar (or go to `/data-audit`)
2. **Upload Your Dataset**
   - **Drag and drop** a file onto the upload area, OR
   - **Click to browse** and select a file
   - Supported formats: CSV, Excel (.xlsx, .xls), JSON
3. **Click "Analyze Data"**
   - The tool will process your file and display results
4. **Review the Results**

   **Quick Stats Panel:**
   - **Rows:** Total number of records
   - **Columns:** Number of fields
   - **Duplicates:** Count of duplicate rows
   - **Issues:** Number of problems detected

   **Overview Tab:**
   - Missing values summary with counts and percentages
   - Duplicate row detection results

   **Columns Tab:**
   - Detailed statistics for each column:
     - Data type (int64, float64, object, etc.)
     - Missing value count and percentage
     - Unique value count
     - Sample values

   **Issues & Recommendations Tab:**
   - List of detected problems with icons:
     - `!` = Missing values
     - `2x` = Duplicates
     - `~` = Outliers
     - `#` = High cardinality
     - `=` = Constant column
     - `OK` = No issues
   - Actionable recommendations for fixing each issue
#### Understanding the Results
| Issue Type | What It Means | Recommended Action |
|------------|---------------|-------------------|
| Missing Values | Empty cells in the data | Fill with mean/median or remove rows |
| Duplicate Rows | Identical records | Remove duplicates to avoid bias |
| Outliers | Extreme values | Investigate if valid or remove |
| High Cardinality | Too many unique values | Check if column is an ID field |
| Constant Column | Only one value | Consider removing from analysis |
#### Expected Outcome
After uploading a dataset (e.g., `customers.csv`), you will see:
```
┌─────────────────────────────────────────────────────────────────────┐
│ QUICK STATS │
│ ┌──────────┬──────────┬──────────────┬─────────────┐ │
│ │ Rows │ Columns │ Duplicates │ Issues │ │
│ │ 10,542 │ 12 │ 47 │ 5 │ │
│ └──────────┴──────────┴──────────────┴─────────────┘ │
├─────────────────────────────────────────────────────────────────────┤
│ OVERVIEW TAB │
│ ──────────── │
│ Missing Values: │
│ ┌─────────────────┬─────────┬─────────┐ │
│ │ Column │ Count │ Percent │ │
│ ├─────────────────┼─────────┼─────────┤ │
│ │ email │ 23 │ 0.22% │ │
│ │ phone │ 156 │ 1.48% │ │
│ │ address │ 89 │ 0.84% │ │
│ └─────────────────┴─────────┴─────────┘ │
│ │
│ ⚠ 47 duplicate rows found (0.45%) │
├─────────────────────────────────────────────────────────────────────┤
│ ISSUES & RECOMMENDATIONS TAB │
│ ──────────────────────────── │
│ Issues Found: │
│ [!] Dataset has 268 missing values across 3 columns │
│ [2x] Found 47 duplicate rows (0.45%) │
│ [~] Column 'age' has 12 potential outliers │
│ [#] Column 'user_id' has very high cardinality (10,495 unique) │
│ [=] Column 'status' has only one unique value │
│ │
│ Recommendations: │
│ 💡 Fill missing values with mean/median for numeric columns │
│ 💡 Consider removing duplicate rows to improve data quality │
│ 💡 Review if 'user_id' should be used as an identifier │
│ 💡 Consider removing constant column 'status' │
└─────────────────────────────────────────────────────────────────────┘
```
#### API Endpoints
```bash
# Analyze a dataset
curl -X POST http://localhost:8000/api/v1/audit/analyze \
-F "file=@your_data.csv"
# Clean a dataset (remove duplicates and missing rows)
curl -X POST http://localhost:8000/api/v1/audit/clean \
-F "file=@your_data.csv"
# Validate schema
curl -X POST http://localhost:8000/api/v1/audit/validate-schema \
-F "file=@your_data.csv"
# Detect outliers
curl -X POST http://localhost:8000/api/v1/audit/detect-outliers \
-F "file=@your_data.csv"
```
#### Sample API Response
```json
{
"total_rows": 10542,
"total_columns": 12,
"missing_values": {
"email": {"count": 23, "percent": 0.22},
"phone": {"count": 156, "percent": 1.48},
"address": {"count": 89, "percent": 0.84}
},
"duplicate_rows": 47,
"duplicate_percent": 0.45,
"column_stats": [
{
"name": "customer_id",
"dtype": "int64",
"missing_count": 0,
"missing_percent": 0.0,
"unique_count": 10542,
"sample_values": [1001, 1002, 1003, 1004, 1005]
}
],
"issues": [
"Dataset has 268 missing values across 3 columns",
"Found 47 duplicate rows (0.45%)",
"Column 'age' has 12 potential outliers"
],
"recommendations": [
"Consider filling missing values with mean/median",
"Consider removing duplicate rows to improve data quality"
]
}
```
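For reference, here is a rough pandas sketch of the kinds of checks behind a response like this (pandas assumed installed; the service's actual thresholds and outlier rules may differ):
```python
# Sketch of a basic integrity audit with pandas (illustrative; thresholds may differ).
import pandas as pd

def audit(df: pd.DataFrame) -> dict:
    missing = df.isna().sum()
    report = {
        "total_rows": len(df),
        "total_columns": df.shape[1],
        "missing_values": {
            col: {"count": int(n), "percent": round(100 * n / len(df), 2)}
            for col, n in missing.items() if n > 0
        },
        "duplicate_rows": int(df.duplicated().sum()),
        "constant_columns": [c for c in df.columns if df[c].nunique(dropna=True) <= 1],
    }
    # Flag numeric outliers with a simple 1.5×IQR rule
    outliers = {}
    for col in df.select_dtypes("number").columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask = (df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)
        if mask.sum() > 0:
            outliers[col] = int(mask.sum())
    report["outliers"] = outliers
    return report

print(audit(pd.read_csv("your_data.csv")))
```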
---
### Privacy Scanner
**Purpose:** Detect and redact personally identifiable information (PII) from text and files.
#### How to Use
1. **Navigate to the Tool**
   - Click "Privacy Scanner" in the sidebar (or go to `/privacy-scanner`)
2. **Choose Input Mode**
   - **Text Mode:** Paste or type text directly
   - **File Mode:** Upload CSV, TXT, or JSON files
3. **Configure Detection Options**
   - Toggle which PII types to detect:
     - Emails
     - Phone numbers
     - SSN (Social Security Numbers)
     - Credit Cards
     - IP Addresses
     - Dates of Birth
4. **Enter or Upload Content**
   - **For text:** Paste content into the text area
   - **For files:** Drag and drop or click to upload
   - **Tip:** Click "Load Sample" to see example PII data
5. **Click "Scan for PII"**
   - The tool will analyze your content and display results
6. **Review the Results**

   **Risk Summary:**
   - **PII Found:** Total number of PII entities detected
   - **Types:** Number of different PII categories
   - **Risk Score:** Calculated severity (0-100)
   - **Risk Level:** CRITICAL, HIGH, MEDIUM, or LOW

   **Overview Tab:**
   - PII counts by type with color-coded severity
   - Risk assessment with explanation

   **Entities Tab:**
   - Detailed list of each detected PII item:
     - Type (EMAIL, PHONE, SSN, etc.)
     - Original value
     - Masked value
     - Confidence score (percentage)

   **Redacted Preview Tab:**
   - Shows your text with all PII masked
   - Safe to share after verification
#### PII Detection Patterns
| Type | Example | Masked As |
|------|---------|-----------|
| EMAIL | john.doe@example.com | jo***@example.com |
| PHONE | (555) 123-4567 | ***-***-4567 |
| SSN | 123-45-6789 | ***-**-6789 |
| CREDIT_CARD | 4532015112830366 | ****-****-****-0366 |
| IP_ADDRESS | 192.168.1.100 | 192.***.***.* |
| DATE_OF_BIRTH | 03/15/1985 | **/**/1985 |
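A simplified sketch of this style of pattern matching and masking is shown below. The regexes and masking rules are illustrative, not the scanner's exact implementation:
```python
# Illustrative regex-based PII detection and masking (not the scanner's exact patterns).
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}"),
}

def mask(pii_type: str, value: str) -> str:
    if pii_type == "EMAIL":
        local, domain = value.split("@", 1)
        return f"{local[:2]}***@{domain}"
    if pii_type == "SSN":
        return f"***-**-{value[-4:]}"
    if pii_type == "PHONE":
        return f"***-***-{value[-4:]}"
    return "[REDACTED]"

def scan(text: str) -> list[dict]:
    findings = []
    for pii_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({"type": pii_type, "value": match.group(),
                             "masked_value": mask(pii_type, match.group())})
    return findings

print(scan("Contact john.doe@example.com or call (555) 123-4567"))
```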
#### Risk Levels Explained
| Level | Score | Description |
|-------|-------|-------------|
| CRITICAL | 70-100 | Highly sensitive PII (SSN, Credit Cards). Immediate action required. |
| HIGH | 50-69 | Multiple sensitive PII elements. Consider redaction before sharing. |
| MEDIUM | 30-49 | Some PII detected that may require attention. |
| LOW | 0-29 | Minimal or no PII detected. |
#### API Endpoints
```bash
# Scan text for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-text \
-F "text=Contact john@example.com or call 555-123-4567" \
-F "detect_emails=true" \
-F "detect_phones=true"
# Scan a file for PII
curl -X POST http://localhost:8000/api/v1/privacy/scan-file \
-F "file=@customer_data.csv"
# Scan CSV/Excel with column-by-column analysis
curl -X POST http://localhost:8000/api/v1/privacy/scan-dataframe \
-F "file=@customer_data.csv"
# Redact PII from text
curl -X POST http://localhost:8000/api/v1/privacy/redact \
-F "text=Call 555-123-4567 for support" \
-F "mode=mask"
# List supported PII types
curl http://localhost:8000/api/v1/privacy/entity-types
```
#### Redaction Modes
| Mode | Description | Example Output |
|------|-------------|----------------|
| `mask` | Shows partial value | jo***@example.com |
| `remove` | Replaces with [REDACTED] | [REDACTED] |
| `type` | Shows PII type | [EMAIL] |
#### Expected Outcome
After scanning text or a file, you will see results like:
```
RISK SUMMARY
┌────────────┬──────────┬────────────┬─────────────────┐
│ PII Found │ Types │ Risk Score │ Risk Level │
│ 7 │ 5 │ 72 │ CRITICAL │
└────────────┴──────────┴────────────┴─────────────────┘
ENTITIES TAB
┌────────────────┬───────────────────────┬────────────────┬───────┐
│ Type │ Original │ Masked │ Conf │
├────────────────┼───────────────────────┼────────────────┼───────┤
│ EMAIL │ john.smith@example.com│ jo***@example..│ 95% │
│ PHONE │ (555) 123-4567 │ ***-***-4567 │ 85% │
│ SSN │ 123-45-6789 │ ***-**-6789 │ 95% │
│ CREDIT_CARD │ 4532015112830366 │ ****-****-0366 │ 95% │
└────────────────┴───────────────────────┴────────────────┴───────┘
REDACTED PREVIEW
Customer Record:
Email: jo***@example.com
Phone: ***-***-4567
SSN: ***-**-6789
Credit Card: ****-****-****-0366
```
#### Sample API Response
```json
{
"total_entities": 7,
"entities_by_type": {
"EMAIL": 2, "PHONE": 2, "SSN": 1, "CREDIT_CARD": 1, "IP_ADDRESS": 1
},
"risk_level": "CRITICAL",
"risk_score": 72,
"entities": [
{
"type": "SSN",
"value": "123-45-6789",
"confidence": 0.95,
"masked_value": "***-**-6789"
}
],
"redacted_preview": "Email: jo***@example.com\nSSN: ***-**-6789..."
}
```
---
## Detailed Tool Documentation
### 1. Model Drift Monitor
**Purpose:** Detect when model performance degrades over time.
**Features:**
- Real-time confidence score tracking
- Statistical drift detection (KS test, PSI)
- Alert thresholds configuration
- Historical trend visualization
**API Endpoints:**
```
POST /api/v1/drift/baseline # Upload baseline distribution
POST /api/v1/drift/analyze # Analyze production data for drift
GET /api/v1/drift/history # Get drift score history
PUT /api/v1/drift/thresholds # Configure alert thresholds
```
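As a rough sketch of the statistics behind `/drift/analyze`, the snippet below computes a two-sample KS test with SciPy and a binned PSI. The binning, clipping, and the PSI > 0.2 rule of thumb are assumptions, not necessarily what the service uses:
```python
# Sketch: KS test and PSI between a baseline and a production sample
# (assumes numpy/scipy; thresholds are common rules of thumb, not the tool's own).
import numpy as np
from scipy.stats import ks_2samp

def population_stability_index(baseline: np.ndarray, production: np.ndarray, bins: int = 10) -> float:
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf            # catch values outside the baseline range
    expected, _ = np.histogram(baseline, bins=edges)
    actual, _ = np.histogram(production, bins=edges)
    expected_pct = np.clip(expected / expected.sum(), 1e-6, None)
    actual_pct = np.clip(actual / actual.sum(), 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

baseline = np.random.normal(0.80, 0.05, 5_000)       # e.g. baseline confidence scores
production = np.random.normal(0.72, 0.08, 5_000)     # e.g. this week's confidence scores

ks_stat, p_value = ks_2samp(baseline, production)
psi = population_stability_index(baseline, production)
print(f"KS={ks_stat:.3f} (p={p_value:.4f}), PSI={psi:.3f}")  # PSI > 0.2 is often treated as drift
```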
---
### 2. Vendor Cost Tracker
**Purpose:** Aggregate and visualize API spending across providers.
**Supported Providers:**
- OpenAI
- Anthropic
- AWS Bedrock
- Google Vertex AI
- Azure OpenAI
**Features:**
- Daily/weekly/monthly cost breakdowns
- Per-project cost allocation
- Budget alerts
- Usage forecasting
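A toy aggregation sketch in the same spirit (the record format and budget value below are hypothetical; the real tracker pulls usage from provider billing APIs):
```python
# Toy aggregation sketch: roll up usage records by provider and month
# (record fields are hypothetical; real data comes from provider billing APIs).
import pandas as pd

records = pd.DataFrame([
    {"date": "2025-01-03", "provider": "openai", "project": "chatbot", "cost_usd": 42.10},
    {"date": "2025-01-05", "provider": "anthropic", "project": "chatbot", "cost_usd": 18.75},
    {"date": "2025-02-01", "provider": "openai", "project": "search", "cost_usd": 67.30},
])
records["month"] = pd.to_datetime(records["date"]).dt.to_period("M")

monthly = records.groupby(["month", "provider"])["cost_usd"].sum()
print(monthly)

BUDGET = 100.00
over_budget = monthly[monthly > BUDGET]  # candidates for a budget alert
```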
---
### 3. Security Tester
**Purpose:** Identify vulnerabilities in AI endpoints.
**Test Categories:**
- Prompt injection attacks
- Jailbreak attempts
- Data exfiltration probes
- Rate limit testing
- Input validation bypass
**Output:** Security report with severity ratings and remediation steps.
---
### 4. Data History Log
**Purpose:** Maintain audit trail for ML training data.
**Features:**
- Data version hashing (SHA-256)
- Model-to-dataset mapping
- Timestamp logging
- Compliance report generation (GDPR, CCPA)
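A minimal sketch of the hashing step using the standard-library `hashlib` (the record fields below are illustrative, not the tool's actual schema):
```python
# Sketch: fingerprint a dataset file with SHA-256 and record which model used it
# (the record fields below are illustrative, not the tool's actual schema).
import hashlib
import json
from datetime import datetime, timezone

def hash_file(path: str, chunk_size: int = 1 << 20) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

record = {
    "dataset": "training_v3.csv",
    "sha256": hash_file("training_v3.csv"),
    "model_id": "churn-classifier-2025-01",
    "logged_at": datetime.now(timezone.utc).isoformat(),
}
print(json.dumps(record, indent=2))
```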
---
### 5. Model Comparator
**Purpose:** Evaluate and compare model outputs.
**Features:**
- Side-by-side response comparison
- Quality scoring (coherence, accuracy, relevance)
- Latency benchmarking
- Cost-per-query analysis
---
### 6. Privacy Scanner
**Purpose:** Detect and remove PII from datasets.
**Detected Entities:**
- Names
- Email addresses
- Phone numbers
- SSN/National IDs
- Credit card numbers
- Addresses
- IP addresses
**Modes:**
- Detection only
- Automatic redaction
- Pseudonymization
---
### 7. Label Quality Scorer
**Purpose:** Measure inter-annotator agreement.
**Metrics:**
- Cohen's Kappa
- Fleiss' Kappa (multi-rater)
- Krippendorff's Alpha
- Percent agreement
**Output:** Quality report with flagged inconsistent samples.
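For reference, a quick sketch of percent agreement and Cohen's Kappa with scikit-learn (assumed installed; the multi-rater metrics require additional libraries):
```python
# Sketch: percent agreement and Cohen's Kappa for two annotators
# (assumes scikit-learn; Fleiss' Kappa and Krippendorff's Alpha need other packages).
from sklearn.metrics import cohen_kappa_score

rater_a = ["spam", "ham", "spam", "spam", "ham", "ham", "spam", "ham"]
rater_b = ["spam", "ham", "ham",  "spam", "ham", "spam", "spam", "ham"]

percent_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
kappa = cohen_kappa_score(rater_a, rater_b)

print(f"Percent agreement: {percent_agreement:.2f}")   # 0.75
print(f"Cohen's kappa:     {kappa:.2f}")               # 0.50
```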
---
### 8. Inference Estimator
**Purpose:** Predict operational costs before deployment.
**Inputs:**
- Expected request volume
- Average tokens per request
- Model selection
- Peak usage patterns
**Output:** Monthly cost projection with confidence intervals.
---
### 9. Data Integrity Audit
**Purpose:** Clean and validate datasets.
**Checks:**
- Missing values
- Duplicate records
- Data type mismatches
- Outlier detection
- Schema validation
- Referential integrity
**Interface:** Interactive data cleaning with preview and undo.
---
### 10. Content Performance
**Purpose:** Analyze user engagement patterns.
**Features:**
- Drop-off point visualization
- Engagement heatmaps
- A/B test analysis
- Retention curve modeling
---
### 11. Safety/Bias Checks
**Purpose:** Audit AI systems for fairness.
**Metrics:**
- Demographic parity
- Equalized odds
- Calibration across groups
- Disparate impact ratio
**Output:** Compliance checklist with recommendations.
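A small sketch of two of these metrics computed from raw decisions (the group labels, data, and the 0.8 "four-fifths" threshold are illustrative):
```python
# Sketch: demographic parity difference and disparate impact ratio from raw decisions
# (numpy only; groups, data, and the 0.8 threshold below are illustrative).
import numpy as np

group = np.array(["A", "A", "A", "B", "B", "B", "B", "A"])
approved = np.array([1, 0, 1, 1, 0, 0, 1, 1])  # 1 = positive outcome

rate_a = approved[group == "A"].mean()
rate_b = approved[group == "B"].mean()

demographic_parity_diff = abs(rate_a - rate_b)
disparate_impact = min(rate_a, rate_b) / max(rate_a, rate_b)

print(f"Selection rate A: {rate_a:.2f}, B: {rate_b:.2f}")
print(f"Demographic parity difference: {demographic_parity_diff:.2f}")
print(f"Disparate impact ratio: {disparate_impact:.2f}")  # < 0.80 is a common red flag
```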
---
### 12. Profitability Analysis
**Purpose:** Connect AI costs to business outcomes.
**Features:**
- Cost attribution by feature/product
- Revenue correlation analysis
- ROI calculation
- Optimization recommendations
---
### 13. Emergency Control
**Purpose:** Safely halt AI systems when needed.
**Features:**
- One-click system suspension
- Graceful degradation modes
- Rollback capabilities
- Incident logging
**Implementation:** API endpoints + admin dashboard.
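A bare-bones sketch of what kill-switch endpoints could look like (the endpoint names and in-memory flag are assumptions; a real implementation would persist state, require authentication, and propagate the signal to the controlled systems):
```python
# Bare-bones kill-switch sketch (illustrative endpoints; a real version needs auth,
# persistence, and propagation to the systems being suspended).
from datetime import datetime, timezone
from fastapi import FastAPI

app = FastAPI()
state = {"suspended": False, "reason": None, "changed_at": None}

@app.post("/api/v1/emergency/suspend")
def suspend(reason: str = "manual override"):
    state.update(suspended=True, reason=reason,
                 changed_at=datetime.now(timezone.utc).isoformat())
    return state

@app.post("/api/v1/emergency/resume")
def resume():
    state.update(suspended=False, reason=None,
                 changed_at=datetime.now(timezone.utc).isoformat())
    return state

@app.get("/api/v1/emergency/status")
def status():
    return state
```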
---
### 14. Result Interpretation
**Purpose:** Translate metrics into business actions.
**Features:**
- Automated insight generation
- Executive summary creation
- Action item extraction
- Trend interpretation
**Output:** Markdown/PDF reports for stakeholders.
---
## Directory Structure
```
ai_tools_suite/
├── PRODUCT_MANUAL.md
├── docker-compose.yml
├── .env.example
├── frontend/ # SvelteKit Application
│ ├── src/
│ │ ├── routes/
│ │ │ ├── +layout.svelte # Shared layout with sidebar
│ │ │ ├── +page.svelte # Dashboard home
│ │ │ ├── drift-monitor/
│ │ │ │ └── +page.svelte
│ │ │ ├── cost-tracker/
│ │ │ │ └── +page.svelte
│ │ │ ├── security-tester/
│ │ │ ├── data-history/
│ │ │ ├── model-comparator/
│ │ │ ├── privacy-scanner/
│ │ │ ├── label-quality/
│ │ │ ├── inference-estimator/
│ │ │ ├── data-audit/
│ │ │ ├── content-performance/
│ │ │ ├── bias-checks/
│ │ │ ├── profitability/
│ │ │ ├── emergency-control/
│ │ │ └── reports/
│ │ ├── lib/
│ │ │ ├── components/ # Shared UI components
│ │ │ │ ├── Sidebar.svelte
│ │ │ │ ├── Chart.svelte
│ │ │ │ ├── DataTable.svelte
│ │ │ │ └── FileUpload.svelte
│ │ │ ├── stores/ # Svelte stores
│ │ │ └── api/ # API client
│ │ └── app.html
│ ├── static/
│ ├── package.json
│ ├── svelte.config.js
│ ├── tailwind.config.js
│ └── tsconfig.json
├── backend/ # FastAPI Application
│ ├── main.py # Application entry point
│ ├── requirements.txt
│ ├── routers/
│ │ ├── drift.py
│ │ ├── costs.py
│ │ ├── security.py
│ │ ├── history.py
│ │ ├── compare.py
│ │ ├── privacy.py
│ │ ├── labels.py
│ │ ├── estimate.py
│ │ ├── audit.py
│ │ ├── content.py
│ │ ├── bias.py
│ │ ├── profitability.py
│ │ ├── emergency.py
│ │ └── reports.py
│ ├── services/ # Business logic
│ │ ├── drift_detector.py
│ │ ├── cost_aggregator.py
│ │ ├── pii_scanner.py
│ │ ├── bias_analyzer.py
│ │ └── ...
│ ├── models/ # Pydantic schemas
│ │ ├── drift.py
│ │ ├── costs.py
│ │ └── ...
│ ├── database/
│ │ ├── connection.py
│ │ └── models.py # SQLAlchemy models
│ └── tests/
├── shared/ # Shared utilities (deprecated)
├── tests/ # Integration tests
└── examples/ # Example data and usage
├── sample_baseline.csv
├── sample_production.csv
└── sample_pii_data.csv
```
---
## Version History
| Version | Date | Changes |
|---------|------|---------|
| 0.1.0 | TBD | Phase 1 - Foundation (3 tools) |
| 0.2.0 | TBD | Phase 2 - Monitoring & Costs |
| 0.3.0 | TBD | Phase 3 - Security & Compliance |
| 0.4.0 | TBD | Phase 4 - Quality & Comparison |
| 1.0.0 | TBD | Phase 5 - Full Release (14 tools) |
---
## Support
For issues or feature requests, refer to the project documentation or contact the development team.
---
*Last Updated: December 2024*