Quick Start Tutorial

This tutorial will guide you through your first data sync with Hubio Sync in under 5 minutes.

What You’ll Learn

  • Install Hubio Sync
  • Configure a data source and destination
  • Run your first sync
  • Verify the results

Step 1: Install Hubio Sync

Choose your platform and run the installation command:

macOS/Linux:

curl -fsSL https://install.hubio.team/install.sh | sh

Windows (PowerShell):

Invoke-RestMethod https://install.hubio.team/install.ps1 | Invoke-Expression

Verify the installation:

hubio-sync --version

You should see output like:

hubio-sync version 1.0.0

Step 2: Create Configuration File

Create the configuration directory:

# macOS/Linux
mkdir -p ~/.config/hubio-sync

# Windows (PowerShell)
New-Item -ItemType Directory -Path "$env:APPDATA\hubio-sync" -Force

Create a configuration file at ~/.config/hubio-sync/config.toml (or %APPDATA%\hubio-sync\config.toml on Windows):

# config.toml - Quick start example

[source]
type = "mysql"
host = "localhost"
port = 3306
database = "myapp"
username = "readonly_user"
password = "your_password"

[destination]
type = "filesystem"
path = "./data-exports"
format = "json"

[sync]
tables = ["users", "orders"]
mode = "full"
batch_size = 1000

Note: Replace the host, database, username, and password values with those of your own database.


Step 3: Validate Configuration

Before running a sync, validate your configuration:

hubio-sync validate

This will:

  • ✅ Check configuration syntax
  • ✅ Test source database connection
  • ✅ Test destination write permissions
  • ✅ Verify table access

Expected output:

✓ Configuration syntax valid
✓ Source connection successful
✓ Destination accessible
✓ Tables found: users, orders

Configuration is valid and ready to sync!

Step 4: Run Your First Sync

Run a test sync with the --dry-run flag to preview what will happen:

hubio-sync run --dry-run

This shows what would be synced without actually writing data.

When ready, run the actual sync:

hubio-sync run

You’ll see progress output:

Starting sync...
[1/2] Syncing table: users
  → Extracted 1,245 rows
  → Transformed 1,245 rows
  → Loaded 1,245 rows
  ✓ users completed in 2.3s

[2/2] Syncing table: orders
  → Extracted 5,892 rows
  → Transformed 5,892 rows
  → Loaded 5,892 rows
  ✓ orders completed in 8.1s

Sync completed successfully!
Total rows synced: 7,137
Duration: 10.4s

Step 5: Verify Results

Check the synced data in your destination:

Filesystem destination:

ls -la ./data-exports

The export directory should be laid out like this:

data-exports/
├── users/
│   └── users_2025-11-25.json
└── orders/
    └── orders_2025-11-25.json

View the data:

head -n 5 data-exports/users/users_2025-11-25.json
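
To inspect an export programmatically, a minimal Python sketch follows. The tutorial does not say whether each file holds a single JSON array or one JSON object per line, so the hypothetical load_records helper accepts either layout; adjust it once you have looked at a real export.

```python
import json


def load_records(text: str) -> list:
    """Parse an export that is either one JSON array or one JSON object per line."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        # Whole file is a single JSON array.
        return json.loads(stripped)
    # Otherwise treat every non-empty line as its own JSON document.
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


# Inline samples stand in for real export files (both layouts are assumptions).
array_style = '[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]'
line_style = '{"id": 1, "name": "Alice"}\n{"id": 2, "name": "Bob"}\n'

print(len(load_records(array_style)), len(load_records(line_style)))  # 2 2
```

Comparing the record count with the "Loaded" figure in the sync output is a simple way to confirm nothing was dropped.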

S3 destination (if configured):

aws s3 ls s3://your-bucket/hubio-sync/

Step 6: Check Sync Status

View the status of your syncs:

hubio-sync status

Output:

Last Sync Summary
─────────────────────────────────────────────
Status:        Success
Started:       2025-11-25 14:30:00 UTC
Completed:     2025-11-25 14:30:10 UTC
Duration:      10.4s
Tables Synced: 2
Total Rows:    7,137
─────────────────────────────────────────────

View detailed history:

hubio-sync status --history

Next Steps

Configure Incremental Syncs

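Instead of syncing all data every time, sync only new or updated records. Change these keys in the existing [sync] section rather than adding a second one, because TOML does not allow the same table to be defined twice: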

[sync]
mode = "incremental"
incremental_column = "updated_at"
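
Conceptually, an incremental sync remembers the highest incremental_column value it has already copied and asks the source only for rows beyond it. The sketch below shows that idea with Python's sqlite3 module and an in-memory table. The function name, the sample rows, and the way the watermark is passed in are illustrative assumptions, not a description of Hubio Sync internals.

```python
import sqlite3


def fetch_changed_rows(conn, table, column, last_synced):
    """Return rows whose `column` value is newer than the previous sync's watermark."""
    # Table and column names cannot be bound as parameters, so they are
    # interpolated here; only pass trusted identifiers.
    query = f"SELECT * FROM {table} WHERE {column} > ? ORDER BY {column}"
    return conn.execute(query, (last_synced,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO users (id, name, updated_at) VALUES (?, ?, ?)",
    [
        (1, "Alice", "2025-11-24 09:00:00"),
        (2, "Bob", "2025-11-25 10:00:00"),
        (3, "Charlie", "2025-11-25 15:00:00"),
    ],
)

# Pretend the previous sync finished at 14:30 on 2025-11-25.
changed = fetch_changed_rows(conn, "users", "updated_at", "2025-11-25 14:30:00")
print(changed)  # [(3, 'Charlie', '2025-11-25 15:00:00')]
```

The approach relies on the column being updated on every write; rows changed without touching updated_at would be missed.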

Schedule Automatic Syncs

Set up recurring syncs by adding a cron expression to the existing [sync] section:

[sync]
schedule = "0 2 * * *"  # Daily at 2 AM
timezone = "America/New_York"
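
A standard cron expression has five space-separated fields: minute, hour, day of month, month, and day of week. The small Python sketch below labels the fields of the schedule above, which can help when reading or writing other schedules. The helper split_cron is hypothetical and does no validation beyond counting fields.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict[str, str]:
    """Label the five fields of a standard cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


print(split_cron("0 2 * * *"))
# {'minute': '0', 'hour': '2', 'day_of_month': '*', 'month': '*', 'day_of_week': '*'}
```

Read this way, "0 2 * * *" means minute 0 of hour 2 on every day, which matches the "Daily at 2 AM" comment.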

Start the scheduler:

hubio-sync scheduler start

Add Data Transformations

Apply transformations during sync:

[[transformations]]
table = "users"
transform = "anonymize"
columns = ["email", "phone"]

[[transformations]]
table = "orders"
transform = "filter"
condition = "status = 'completed'"
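
To make the two transformations concrete, here is a rough Python sketch of what they could do to individual rows. The tutorial does not say how Hubio Sync anonymizes values, so replacing them with a SHA-256 digest is purely an assumption, and both function names are hypothetical.

```python
import hashlib


def anonymize(row: dict, columns: list[str]) -> dict:
    """Return a copy of row with the listed columns replaced by a SHA-256 digest."""
    masked = dict(row)  # copy so the caller's row is left untouched
    for column in columns:
        value = masked.get(column)
        if value is not None:
            masked[column] = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return masked


def keep_completed(rows: list[dict]) -> list[dict]:
    """Mimic the condition status = 'completed' as a Python filter."""
    return [row for row in rows if row.get("status") == "completed"]


user = {"id": 1, "email": "alice@example.com", "phone": "555-0100"}
masked = anonymize(user, ["email", "phone"])
orders = [{"id": 10, "status": "completed"}, {"id": 11, "status": "pending"}]

print(masked["id"], len(masked["email"]))  # 1 64
print(keep_completed(orders))  # [{'id': 10, 'status': 'completed'}]
```

A plain hash of an email address can still be matched against a list of known addresses, so a real deployment may want salting or tokenization instead.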

Monitor Performance

View sync metrics:

hubio-sync metrics

Enable Prometheus metrics:

[metrics]
enabled = true
port = 9090

Connect to Cloud Destinations

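To write to a cloud service, replace the [destination] section of config.toml with one of the following.

Amazon S3: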

[destination]
type = "s3"
bucket = "my-data-lake"
region = "us-east-1"
format = "parquet"
compression = "snappy"

Snowflake:

[destination]
type = "snowflake"
account = "xy12345.us-east-1"
warehouse = "COMPUTE_WH"
database = "ANALYTICS"
schema = "HUBIO_SYNC"
username = "sync_user"
password = "secure_password"

Common Use Cases

Use Case 1: MySQL to S3 Data Lake

Scenario: Export a production MySQL database to S3 for analytics.

[source]
type = "mysql"
host = "prod-mysql.example.com"
database = "production"
username = "readonly"
password = "${MYSQL_PASSWORD}"

[destination]
type = "s3"
bucket = "company-data-lake"
region = "us-east-1"
prefix = "mysql-exports/"
format = "parquet"
partition_by = ["year", "month", "day"]

[sync]
tables = ["users", "orders", "products"]
mode = "incremental"
incremental_column = "updated_at"
schedule = "0 */4 * * *"  # Every 4 hours
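
The ${MYSQL_PASSWORD} placeholder keeps the secret out of the file. The tutorial does not describe how Hubio Sync resolves such placeholders; the Python sketch below shows one plausible reading, in which ${NAME} is replaced by the environment variable of that name and an unset variable is treated as an error. The helper expand_placeholders is hypothetical.

```python
import os
import re

# Matches ${NAME}, where NAME is a typical environment variable identifier.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value: str, env=None) -> str:
    """Replace every ${NAME} in value with env[NAME]; fail loudly if NAME is unset."""
    env = os.environ if env is None else env

    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return PLACEHOLDER.sub(lookup, value)


# A dict stands in for the real environment so the example is reproducible.
print(expand_placeholders("${MYSQL_PASSWORD}", {"MYSQL_PASSWORD": "s3cret"}))  # s3cret
```

Under this reading, you would export MYSQL_PASSWORD in the shell or service environment before starting the sync.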

Use Case 2: API to Database

Scenario: Pull data from a REST API into PostgreSQL.

[source]
type = "rest_api"
base_url = "https://api.example.com/v1"
auth_type = "bearer"
auth_token = "${API_TOKEN}"
rate_limit = 100  # requests per minute

[destination]
type = "postgres"
host = "localhost"
database = "analytics"
username = "sync_user"
password = "${POSTGRES_PASSWORD}"

[sync]
endpoints = ["/users", "/orders"]
mode = "append"
schedule = "*/15 * * * *"  # Every 15 minutes
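
A limit of 100 requests per minute works out to one request every 0.6 seconds on average. The Python sketch below shows one simple way a client could pace itself to stay under such a limit. It is an assumption about what the setting implies, not Hubio Sync's implementation, and the class name is hypothetical.

```python
import time


class MinuteRateLimiter:
    """Space calls so that no more than `requests_per_minute` happen per minute."""

    def __init__(self, requests_per_minute: int):
        self.interval = 60.0 / requests_per_minute  # seconds between requests
        self._next_allowed = time.monotonic()

    def wait(self) -> float:
        """Sleep until the next request is allowed; return the seconds slept."""
        delay = max(self._next_allowed - time.monotonic(), 0.0)
        if delay:
            time.sleep(delay)
        # Schedule the following request one interval after this one.
        self._next_allowed = max(self._next_allowed, time.monotonic()) + self.interval
        return delay


limiter = MinuteRateLimiter(100)
print(limiter.interval)  # 0.6
```

Calling limiter.wait() before each request spreads calls evenly; a real client might instead allow short bursts, for example with a token bucket.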

Use Case 3: Database Replication

Scenario: Replicate data from a production database to a staging database.

[source]
type = "postgres"
host = "prod-db.example.com"
database = "production"
username = "replicator"
password = "${PROD_PASSWORD}"

[destination]
type = "postgres"
host = "staging-db.example.com"
database = "staging"
username = "writer"
password = "${STAGING_PASSWORD}"

[sync]
tables = ["*"]  # All tables
mode = "full"
schedule = "0 0 * * *"  # Daily at midnight

[[transformations]]
table = "users"
transform = "anonymize"
columns = ["email", "phone", "ssn"]

Troubleshooting Quick Start

“Command not found: hubio-sync”

Cause: The directory containing the hubio-sync binary is not in your PATH.

# Check if binary exists
which hubio-sync

# Add to PATH (Linux/macOS; use ~/.zshrc instead if your shell is zsh)
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Or restart the shell instead of sourcing the file
exec "$SHELL"

“Connection refused” Error

Cause: The database is not reachable at the configured host and port, or the credentials are wrong.

# Test database connection manually
mysql -h localhost -u readonly_user -p myapp

# Check that the host responds and the port accepts connections
ping localhost
telnet localhost 3306

# Verify credentials in config.toml
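
If telnet is not installed, the same port check can be done with a few lines of Python using only the standard library. This is a generic TCP reachability sketch, not a Hubio Sync feature; the function name and the 2-second timeout are arbitrary choices.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 3306 is the MySQL port used in the quick start config.
    print(port_is_open("localhost", 3306))
```

A True result only shows that something is listening on the port; wrong credentials would still fail later, at login.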

“Permission denied” Error

Cause: The database user lacks read permission on the tables being synced.

-- Grant SELECT permission (MySQL)
GRANT SELECT ON myapp.* TO 'readonly_user'@'%';
FLUSH PRIVILEGES;

-- Grant SELECT permission (PostgreSQL)
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;

“Destination not writable” Error

Solution: Check that the destination exists and that you have write access to it.

# Filesystem: Check write permissions
mkdir -p ./data-exports
chmod 755 ./data-exports

# S3: Verify IAM permissions
aws s3 ls s3://your-bucket/
aws sts get-caller-identity  # shows which IAM identity the CLI is using
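
For a filesystem destination, the most direct test is to try creating a file in the directory. The Python sketch below does that with a temporary file that is removed automatically. It is a generic check under the assumption that the sync runs as the same user; the function name is hypothetical.

```python
import tempfile


def directory_is_writable(path: str) -> bool:
    """Return True if a temporary file can be created (and removed) inside path."""
    try:
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        # Raised for missing directories as well as permission problems.
        return False


if __name__ == "__main__":
    print(directory_is_writable("./data-exports"))
```

Run it as the same user that runs hubio-sync, since permissions differ between accounts.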

Example Data Sources

Don’t have a database handy? Try these examples:

SQLite Sample Database

# Download sample database
curl -O https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.db

# Configure Hubio Sync (this overwrites any existing config.toml)
cat > ~/.config/hubio-sync/config.toml <<'EOF'
[source]
type = "sqlite"
path = "./chinook.db"
read_only = true

[destination]
type = "filesystem"
path = "./exports"
format = "json"

[sync]
tables = ["customers", "invoices"]
mode = "full"
EOF

# Run sync
hubio-sync run

PostgreSQL with Docker

# Start PostgreSQL with sample data
docker run --name demo-postgres \
  -e POSTGRES_PASSWORD=demo123 \
  -e POSTGRES_DB=demo \
  -p 5432:5432 \
  -d postgres:15

# Wait for startup
sleep 5

# Create sample table
docker exec demo-postgres psql -U postgres -d demo -c "
  CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100),
    created_at TIMESTAMP DEFAULT NOW()
  );
  INSERT INTO users (name, email) VALUES
    ('Alice', 'alice@example.com'),
    ('Bob', 'bob@example.com'),
    ('Charlie', 'charlie@example.com');
"

# Configure Hubio Sync (this overwrites any existing config.toml)
cat > ~/.config/hubio-sync/config.toml <<'EOF'
[source]
type = "postgres"
host = "localhost"
port = 5432
database = "demo"
username = "postgres"
password = "demo123"

[destination]
type = "filesystem"
path = "./exports"
format = "json"

[sync]
tables = ["users"]
mode = "full"
EOF

# Run sync
hubio-sync run

Happy syncing! 🚀