# Data Validation Guide

## Problem Identified

The database contains fields with formatting issues:
- **Federation numbers** have newline characters (`\n`) at the end
- **Email addresses** may have leading/trailing whitespace
- **Names** may have multiple spaces or newlines

This causes lookup failures when trying to match exact values.

## Solution Implemented

### 1. Data Validation Helper (`data-validation-helper.php`)

Provides utility functions for cleaning data:
- `cleanField()` - Removes newlines, carriage returns, and trims whitespace
- `cleanFederation()` - Specifically for federation/license numbers
- `cleanEmail()` - Cleans and normalizes email addresses
- `cleanName()` - Cleans names (removes extra spaces)
- SQL helper functions for building WHERE clauses with proper cleaning

### 2. Data Cleanup Script (`data-cleanup-script.php`)

A utility script to clean existing data in the database:
- **Dry Run Mode**: Shows what would be changed without modifying data
- **Actual Cleanup**: Cleans all affected fields
- Cleans: `Users.federation`, `Users.email`, `Users.name`, `Guests.federation`, `Guests.email`, `Guests.name`

### 3. Updated Import Logic

The import process now:
- Uses cleaned federation lookups: `TRIM(REPLACE(REPLACE(federation, '\n', ''), '\r', '')) = TRIM('value')`
- Falls back to LIKE with wildcards if exact match fails
- Uses cleaned email addresses for lookups

## Usage

### Running the Data Cleanup Script

1. **First, run a DRY RUN** (recommended):
   ```
   Access: data-cleanup-script.php
   Click: "Run Dry Run (Safe)"
   Review the output to see what would be changed
   ```

2. **Backup your database** before running actual cleanup

3. **Run Actual Cleanup**:
   ```
   Click: "Run Actual Cleanup"
   Confirm the action
   ```

### Using Validation Functions in Code

```php
require 'data-validation-helper.php';

// Clean federation before lookup
$federation = cleanFederation($rawFederation);

// Build WHERE clause with proper cleaning
$whereClause = buildFederationWhereClause($conn, 'federation', $searchValue);
$sql = "SELECT * FROM Users WHERE $whereClause";

// Validate before processing
$validation = validateFederation($federation);
if ($validation['valid']) {
    // Use $validation['cleaned']
} else {
    // Handle error: $validation['error']
}
```

## Fields That Need Validation

### Critical Lookup Fields:
1. **Federation/License Numbers** - Used for user lookups
2. **Email Addresses** - Used for user identification
3. **Names** - Used for display and matching

### Other Fields to Consider:
- Phone numbers
- Member IDs
- Any other unique identifiers

## Best Practices

1. **Always clean data** before database lookups
2. **Validate before inserting** new data
3. **Use the helper functions** instead of manual cleaning
4. **Run cleanup script periodically** to maintain data quality
5. **Check for newlines** when importing from Excel/CSV files

## Testing

Use `test-lamont-lookup.php` to test federation lookups:
- Shows which queries work
- Identifies formatting issues
- Helps debug lookup problems

## Future Improvements

Consider:
- Adding database triggers to auto-clean on insert/update
- Adding validation rules to frontend forms
- Implementing data quality monitoring
- Regular automated cleanup jobs

