Connecting Data Sources - Infactory Documentation

The first step in using Infactory is connecting to your data sources. This establishes a secure link between Infactory and your data, enabling intelligent query generation without copying or storing your data.

Supported Data Source Types

Infactory supports the following data source types:

SQL Databases

PostgreSQL

Powerful, open-source object-relational database system

MySQL

Popular open-source relational database management system

Microsoft SQL Server

Enterprise-grade relational database system

Azure SQL Database

Microsoft’s cloud-based SQL database service

Oracle Database

Enterprise database with advanced analytics capabilities

SQLite

Lightweight, file-based relational database

Cloud Data Warehouses

Amazon Redshift

Fully-managed, petabyte-scale data warehouse service

Google BigQuery

Fully-managed, serverless data warehouse

Amazon Athena

Serverless interactive query service using S3 data

Trino

Distributed SQL query engine for big data

ClickHouse

Column-oriented OLAP database for analytics

NoSQL & Document Databases

MongoDB

Leading NoSQL document database

Azure Cosmos DB

Microsoft’s globally distributed, multi-model database

DuckDB

In-process SQL OLAP database for analytics

Time Series & Specialized Databases

InfluxDB

Time series database for metrics and IoT data

Cloud Platforms & Storage

Amazon S3

Object storage service for data files

HTTP REST APIs

Connect to external APIs and web services

Airtable

Cloud-based database and spreadsheet hybrid

File Upload

Data Files

Upload CSV, JSON, JSONL, and Parquet files for analysis

Image Files

Upload photos and images with rich metadata extraction

Video Files

Upload videos with metadata extraction and analysis

Coming Soon

We’re continuously adding support for new data sources. If you need support for a specific data source not listed here, please contact our support team.

Connection Process Overview

Create or select a project

Start by creating a new project or selecting an existing one in the Infactory workshop.

Navigate to the Connect tab

Go to the Connect tab in your project to begin setting up a data source connection.

Select data source type

Choose which type of data source you want to connect from the available options.

Enter connection details

Provide the necessary credentials and parameters to establish the connection.

Select specific tables/containers

Choose which tables, collections, or containers you want to include in your project.

Data Source-Specific Connection Instructions

Data Files
Image Files
Video Files
PostgreSQL
MySQL
Microsoft SQL Server
MongoDB
BigQuery
Azure Cosmos DB
Amazon S3
HTTP REST APIs
Airtable

Uploading Data Files

To upload and analyze structured data files directly:

Step-by-Step Instructions

In the Connect tab, select File Upload from the options
Click Choose File or drag and drop your file(s)
Configure parsing options based on file type:
- CSV: Verify headers, adjust delimiters, set data types
- JSON: Confirm structure and nested object handling
- JSONL: Validate line-delimited format
- Parquet: Review column types and compression
Preview the data to ensure it’s parsed correctly
Click Upload to import the data
Your structured data is now available for query generation

Supported Data Formats

CSV: Comma- or tab-separated values
JSON: JavaScript Object Notation files
JSONL: JSON Lines format - one JSON object per line
Parquet: Apache Parquet columnar format
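
The JSONL format described above (one JSON object per line) can be checked locally before upload. The following is a minimal sketch using only the Python standard library; `find_invalid_jsonl_lines` is a hypothetical helper name, and skipping blank lines is an assumption rather than documented Infactory behavior.

```python
import json
from typing import Iterable, List, Tuple


def find_invalid_jsonl_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for each line that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored rather than flagged
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(value, dict):
            problems.append((number, "not a JSON object"))
    return problems


# Made-up sample lines: one valid object, one array, one syntax error.
sample = ['{"id": 1, "name": "a"}', "[1, 2, 3]", '{"id": 2, "name": }']
for number, reason in find_invalid_jsonl_lines(sample):
    print(f"line {number}: {reason}")
```

For a real file, pass an open text file handle (opened with `encoding="utf-8"`) instead of the sample list.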

Requirements

UTF-8 encoding recommended for all text files
CSV files should have header rows with column names
JSON files should contain arrays of objects or single objects
Parquet files are processed with full schema preservation
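
The first two requirements can be pre-checked on your own machine. This is a rough sketch, not Infactory's parser: `check_csv_bytes` is a hypothetical helper, and it can only confirm that the first row looks usable as a header (no empty or duplicate names), not that it actually is one.

```python
import csv
import io
from typing import List


def check_csv_bytes(raw: bytes) -> List[str]:
    """Return a list of warnings for a CSV payload; empty means none found."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return ["file is empty"]

    warnings = []
    header = rows[0]
    if any(not name.strip() for name in header):
        warnings.append("header row has an empty column name")
    if len(set(header)) != len(header):
        warnings.append("header row has duplicate column names")
    # Every data row should have the same number of fields as the header.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            warnings.append(
                f"row {number} has {len(row)} fields, expected {len(header)}"
            )
    return warnings
```

A typical call would be `check_csv_bytes(open(path, "rb").read())`; reading the file as bytes lets the function test the encoding itself. For tab-separated files, `csv.reader` accepts `delimiter="\t"`.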

The Data Model

After connecting to your data source, Infactory analyzes your schema and creates a data model. This data model is the foundation for intelligent query generation.

How Data Modeling Works

Infactory’s data modeling process:

Takes a small sample of your data (about 50 rows)
Analyzes the schema to understand data types and relationships
Generates appropriate display names and descriptions
Creates a data model that powers intelligent query generation
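
To make the steps above concrete, here is a toy illustration of sampling rows and deriving types and display names. It is not Infactory's implementation: the function names, the type labels, and the display-name rule are all assumptions chosen for the example.

```python
from typing import Any, Dict, List

SAMPLE_SIZE = 50  # the "about 50 rows" figure from the steps above


def infer_type(values: List[Any]) -> str:
    """Pick a coarse type label from the non-null sampled values."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    if all(isinstance(v, bool) for v in present):
        return "boolean"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        return "integer"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "number"
    if all(isinstance(v, str) for v in present):
        return "string"
    return "mixed"


def build_data_model(rows: List[Dict[str, Any]]) -> Dict[str, Dict[str, str]]:
    """Build a toy field -> {type, display_name} mapping from a row sample."""
    sample = rows[:SAMPLE_SIZE]
    fields: List[str] = []
    for row in sample:
        for name in row:
            if name not in fields:
                fields.append(name)  # keep first-seen column order
    return {
        name: {
            "type": infer_type([row.get(name) for row in sample]),
            "display_name": name.replace("_", " ").title(),
        }
        for name in fields
    }
```

Because only a small sample is inspected, a column whose first rows are all integers would be labeled `integer` even if later rows hold decimals. This is one reason reviewing the inferred data types after connecting is worthwhile.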

Viewing and Editing the Data Model

You can view and edit the generated data model:

Table View
Code View

The Table View provides a user-friendly interface to see and edit:

Field names
Data types
Descriptions

This is the recommended view for most users.

Customizing the Data Model

Customizing the data model can significantly improve the quality of generated queries. Consider these customizations:

Improve Field Descriptions: Add more context about what each field represents
Correct Data Types: Ensure field types are correctly identified
Add Display Names: Make field names more human-readable
Define Relationships: Clarify how different tables or collections relate to each other

Advanced Connection Features

Custom Queries as Data Sources

For complex data models or specific data needs, you can provide a custom query for your data source:

Select 'Custom Query' as your data source

When connecting to your data source, choose the Custom Query option.

Write your query

Enter the SQL, MongoDB query, or other appropriate query language for your data source.

-- Example custom query for PostgreSQL
SELECT 
  p.player_id, 
  p.first_name, 
  p.last_name, 
  t.team_name, 
  s.points_per_game
FROM 
  players p
JOIN 
  teams t ON p.team_id = t.team_id
JOIN 
  statistics s ON p.player_id = s.player_id
WHERE 
  s.season = '2023'

Test your query

Click the “Test Query” button to verify that your query returns the expected results.

Use as data source

Click “Connect” to use this query result as your data source.
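
Before pasting a custom query into Infactory, it can help to confirm its shape against a scratch database. The sketch below runs the example query with Python's built-in `sqlite3` module. The table definitions, column types, and rows are made up for illustration, and SQLite is only a stand-in here: PostgreSQL-specific syntax would need a real PostgreSQL instance.

```python
import sqlite3

CUSTOM_QUERY = """
SELECT p.player_id, p.first_name, p.last_name, t.team_name, s.points_per_game
FROM players p
JOIN teams t ON p.team_id = t.team_id
JOIN statistics s ON p.player_id = s.player_id
WHERE s.season = '2023'
"""


def run_against_scratch_db(query: str):
    """Run the query against a throwaway in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema: column types are guesses based on the query.
        conn.executescript(
            """
            CREATE TABLE teams (team_id INTEGER PRIMARY KEY, team_name TEXT);
            CREATE TABLE players (player_id INTEGER PRIMARY KEY, first_name TEXT,
                                  last_name TEXT, team_id INTEGER);
            CREATE TABLE statistics (player_id INTEGER, season TEXT,
                                     points_per_game REAL);
            INSERT INTO teams VALUES (1, 'Team A');
            INSERT INTO players VALUES (10, 'Test', 'Player', 1);
            INSERT INTO statistics VALUES (10, '2023', 12.5), (10, '2022', 9.0);
            """
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


rows = run_against_scratch_db(CUSTOM_QUERY)
print(rows)
```

Only the 2023 statistics row should survive the `WHERE` clause, which makes it easy to see whether the joins and the season filter behave as intended.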

Multiple Table/Container Connections

You can connect to multiple tables or containers within the same project:

Follow the standard connection process for your first table/container
Return to the Connect tab and click “Add Table/Container”
Select additional tables/containers from your connected data source
Configure each connection as needed

This enables queries that can join data across multiple tables or containers.

Connection Security

Infactory prioritizes the security of your data connections:

Read-Only Access: We recommend using credentials with read-only access
No Data Storage: Infactory does not store copies of your data
Encrypted Connections: All data source connections use encrypted channels
Secure Credential Storage: Your connection credentials are securely stored

Connection Troubleshooting

Connection Timeouts

Problem: The connection attempt times out before completing.

Solutions:

Check if your data source is accessible from external networks
Verify that the correct port is open in your firewall settings
Ensure your data source server isn’t overloaded
Check for network issues between Infactory and your data source
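
A quick way to test the port and firewall checks above is to attempt a plain TCP connection. This is a minimal sketch with the standard `socket` module; `can_reach` is a hypothetical helper, and a success from your own machine only shows the port is open to your network, not necessarily to Infactory's.

```python
import socket


def can_reach(host: str, port: int, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example (hypothetical host; 5432 is the default PostgreSQL port):
# can_reach("db.example.com", 5432)
```

For a meaningful result, run the check from a machine outside your private network, since that is closer to how an external service reaches the database.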

Authentication Errors

Problem: You receive an “Authentication failed” error.

Solutions:

Double-check your username and password
Verify that the user has permission to access the specified data source
Ensure the user has permission to read from the tables/collections you’re trying to access
Check if your data source uses IP whitelisting and ensure Infactory’s IPs are allowed

Schema Analysis Issues

Problem: Infactory has trouble analyzing your schema or reports errors during schema analysis.

Solutions:

Ensure your tables/collections have data (at least a few rows)
Check for unusual data types or schema structures
Try connecting to a subset of tables/collections first
Consider using a custom query to simplify the data model

Video Upload Issues

Problem: Video files fail to upload or process correctly.

Solutions:

Ensure your MP4 file is under 500MB in size
Check that the file is a valid MP4 format
For large files, ensure a stable internet connection during chunked upload
Verify sufficient storage space is available
If metadata extraction fails, the video will still be uploaded with basic information
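
The size and format checks above can be run locally before uploading. The sketch below makes two assumptions: that "under 500MB" means fewer than 500,000,000 bytes (the decimal reading, which is the stricter one), and that a valid MP4 begins with an `ftyp` box. The second holds for typical MP4 files but is only a quick signature check, not full validation. `check_mp4` is a hypothetical helper name.

```python
import os
from typing import List

MAX_VIDEO_BYTES = 500 * 1000 * 1000  # assumption: decimal megabytes


def check_mp4(path: str) -> List[str]:
    """Return a list of problems found with the file; empty means none found."""
    problems = []
    size = os.path.getsize(path)
    if size >= MAX_VIDEO_BYTES:
        problems.append(f"file is {size} bytes, limit is under {MAX_VIDEO_BYTES}")
    with open(path, "rb") as handle:
        head = handle.read(8)
    # Bytes 4-7 of a typical MP4 file hold the box type "ftyp".
    if len(head) < 8 or head[4:8] != b"ftyp":
        problems.append("file does not start with an MP4 'ftyp' box")
    return problems
```

A file that passes both checks can still fail processing for other reasons (for example, an unsupported codec), so treat an empty result as "no obvious problem" rather than a guarantee.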

Best Practices for Data Source Connections

Schema Design

Use Clear Column Names: Choose descriptive names that reflect content
Consistent Naming: Use consistent naming patterns across your schema
Add Comments: Where supported, add comments to explain fields
Appropriate Data Types: Use proper data types for each field

Data Quality

Provide Representative Data: Ensure your data source contains quality sample data
Avoid Empty Tables: Tables without data won’t generate meaningful query templates
Clean Data: Remove inconsistent or duplicate data when possible
Normalize When Possible: Well-normalized data often leads to better query generation

Connection Management

Use Read-Only Connections: Connect with read-only credentials when possible
Regular Updates: If your schema changes significantly, reconnect to update the data model
Start Simple: Begin with core tables/collections and add more as needed
Document Your Connections: Keep track of which data sources are connected and why

Media File Best Practices

Organize by Collections: Group related images/videos together for better analysis
Consistent Naming: Use descriptive filenames for better metadata organization
Quality Metadata: Ensure cameras/devices embed proper metadata when possible
File Size Management: Keep video files under 200MB when possible for optimal performance

Next Steps

After successfully connecting your data source, Infactory will analyze your schema and automatically generate queries. Continue to Building Queries to learn how to work with these generated queries and create custom ones.
