The first step in using Infactory is connecting to your data sources. This establishes a secure link between Infactory and your data, enabling intelligent query generation without copying or storing your data.

Supported Data Source Types

Infactory supports a comprehensive range of data sources to meet diverse organizational needs:

SQL Databases

  • PostgreSQL: Powerful, open-source object-relational database system
  • MySQL: Popular open-source relational database management system
  • Microsoft SQL Server: Enterprise-grade relational database system
  • Azure SQL Database: Microsoft’s cloud-based SQL database service
  • Oracle Database: Enterprise database with advanced analytics capabilities
  • SQLite: Lightweight, file-based relational database

Cloud Data Warehouses

  • Amazon Redshift: Fully managed, petabyte-scale data warehouse service
  • Google BigQuery: Fully managed, serverless data warehouse
  • Amazon Athena: Serverless interactive query service using S3 data
  • Trino: Distributed SQL query engine for big data
  • ClickHouse: Column-oriented OLAP database for analytics

NoSQL & Document Databases

  • MongoDB: Leading NoSQL document database
  • Azure Cosmos DB: Microsoft’s globally distributed, multi-model database
  • DuckDB: In-process SQL OLAP database for analytics

Time Series & Specialized Databases

  • InfluxDB: Time series database for metrics and IoT data

Cloud Platforms & Storage

  • Amazon S3: Object storage service for data files
  • HTTP REST APIs: Connect to external APIs and web services
  • Airtable: Cloud-based database and spreadsheet hybrid

File Upload

  • Data Files: Upload CSV, JSON, JSONL, and Parquet files for analysis
  • Image Files: Upload photos and images with rich metadata extraction

Coming Soon

We’re continuously adding support for new data sources. If you need support for a specific data source not listed here, please contact our support team.

Connection Process Overview

  1. Create or select a project: Start by creating a new project or selecting an existing one in the Infactory workshop.
  2. Navigate to the Connect tab: Go to the Connect tab in your project to begin setting up a data source connection.
  3. Select data source type: Choose which type of data source you want to connect from the available options.
  4. Enter connection details: Provide the necessary credentials and parameters to establish the connection.
  5. Select specific tables/containers: Choose which tables, collections, or containers you want to include in your project.

Data Source-Specific Connection Instructions

Uploading Data Files

To upload and analyze structured data files directly:

Step-by-Step Instructions

  1. In the Connect tab, select File Upload from the options
  2. Click Choose File or drag and drop your file(s)
  3. Configure parsing options based on file type:
    • CSV: Verify headers, adjust delimiters, set data types
    • JSON: Confirm structure and nested object handling
    • JSONL: Validate line-delimited format
    • Parquet: Review column types and compression
  4. Preview the data to ensure it’s parsed correctly
  5. Click Upload to import the data
  6. Your structured data is now available for query generation

Supported Data Formats

  • CSV: Comma- or tab-separated values
  • JSON: JavaScript Object Notation files
  • JSONL: JSON Lines format, one JSON object per line
  • Parquet: Apache Parquet columnar format
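
The JSONL format described above can be checked locally before uploading. The following is a minimal sketch in Python, not part of Infactory; the function name `validate_jsonl` is hypothetical, and skipping blank lines is an assumption of this sketch rather than documented Infactory behavior.

```python
import json


def validate_jsonl(text):
    """Check that every non-blank line is a JSON object.

    Returns (objects, errors), where errors is a list of
    (line_number, message) tuples using 1-based line numbers.
    """
    objects, errors = [], []
    for line_number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((line_number, exc.msg))
            continue
        if isinstance(value, dict):
            objects.append(value)
        else:
            errors.append((line_number, "line is valid JSON but not an object"))
    return objects, errors
```

An empty `errors` list means every line parsed as a single JSON object; otherwise the line numbers point at the rows to fix before uploading.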

Requirements

  • UTF-8 encoding recommended for all text files
  • CSV files should have header rows with column names
  • JSON files should contain arrays of objects or single objects
  • Parquet files are processed with full schema preservation
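
The CSV requirements above (UTF-8 encoding, a header row with column names) can be sanity-checked before uploading. This is a rough sketch under stated assumptions, not an Infactory tool: `check_csv_for_upload` is a hypothetical name, and the rule that a header cell which parses as a number signals a missing header row is a heuristic chosen for illustration.

```python
import csv
import io


def _looks_numeric(value):
    """Return True if the cell parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def check_csv_for_upload(raw, delimiter=","):
    """Return a list of problems found in raw CSV bytes; empty means OK."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    if not rows:
        return ["file is empty"]

    problems = []
    header = [name.strip() for name in rows[0]]
    if any(name == "" for name in header):
        problems.append("header row has an empty column name")
    if len(set(header)) != len(header):
        problems.append("header row has duplicate column names")
    if any(_looks_numeric(name) for name in header):
        # Heuristic: real column names are rarely bare numbers.
        problems.append("first row looks like data, not column names")
    return problems
```

For a tab-separated file, pass `delimiter="\t"`. The check reads the whole file into memory, so it suits small and medium files only.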

The Data Model

After connecting to your data source, Infactory analyzes your schema and creates a data model. This data model is the foundation for intelligent query generation.

How Data Modeling Works

Infactory’s data modeling process:
  1. Takes a small sample of your data (about 50 rows)
  2. Analyzes the schema to understand data types and relationships
  3. Generates appropriate display names and descriptions
  4. Creates a data model that powers intelligent query generation
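
To make the steps above concrete, here is a toy illustration of sample-based modeling in Python. It is not Infactory's implementation: the function names, the type labels, and the inference rules are all assumptions made for this sketch. Only the sample size of about 50 rows comes from the description above.

```python
from itertools import islice

SAMPLE_SIZE = 50  # "about 50 rows", per the steps above


def infer_type(values):
    """Pick the narrowest type that fits every non-empty sampled value."""
    non_empty = [v for v in values if v not in ("", None)]
    if not non_empty:
        return "unknown"
    for type_name, caster in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                caster(str(v))  # cast via str so 27.4 is not truncated to 27
            return type_name
        except ValueError:
            continue
    return "string"


def display_name(field):
    """Turn a column name such as 'points_per_game' into 'Points Per Game'."""
    return field.replace("_", " ").title()


def build_data_model(rows):
    """Build a toy data model from an iterable of dict rows."""
    sample = list(islice(rows, SAMPLE_SIZE))  # only the first rows are read
    fields = list(sample[0]) if sample else []
    return {
        field: {
            "display_name": display_name(field),
            "type": infer_type([row.get(field) for row in sample]),
        }
        for field in fields
    }
```

Because only the sample is inspected, a value that first appears after the sampled rows cannot influence the inferred type. That is one reason to review and correct the generated data model.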

Viewing and Editing the Data Model

You can view and edit the generated data model in the Table View, a user-friendly interface for seeing and editing:
  • Field names
  • Data types
  • Descriptions
This is the recommended view for most users.

Customizing the Data Model

Customizing the data model can significantly improve the quality of generated queries. Consider these customizations:
  1. Improve Field Descriptions: Add more context about what each field represents
  2. Correct Data Types: Ensure field types are correctly identified
  3. Add Display Names: Make field names more human-readable
  4. Define Relationships: Clarify how different tables or collections relate to each other

Advanced Connection Features

Custom Queries as Data Sources

For complex data models or specific data needs, you can provide a custom query for your data source:
  1. Select 'Custom Query' as your data source: When connecting to your data source, choose the Custom Query option.
  2. Write your query: Enter a query in SQL, the MongoDB query language, or whichever query language your data source uses.

     -- Example custom query for PostgreSQL:
     -- per-player scoring averages for the 2023 season, with team names
     SELECT
       p.player_id,
       p.first_name,
       p.last_name,
       t.team_name,
       s.points_per_game
     FROM
       players p
     JOIN
       teams t ON p.team_id = t.team_id
     JOIN
       statistics s ON p.player_id = s.player_id
     WHERE
       s.season = '2023';

  3. Test your query: Click the “Test Query” button to verify that your query returns the expected results.
  4. Use as data source: Click “Connect” to use this query result as your data source.
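
Before pasting a custom query into Infactory, it can help to dry-run it against a few known rows. The sketch below does this in Python with an in-memory SQLite database standing in for the real source. The table definitions and sample rows are invented for illustration, and SQLite is not PostgreSQL, so this only checks the join logic, not dialect-specific syntax.

```python
import sqlite3

CUSTOM_QUERY = """
SELECT p.player_id, p.first_name, p.last_name, t.team_name, s.points_per_game
FROM players p
JOIN teams t ON p.team_id = t.team_id
JOIN statistics s ON p.player_id = s.player_id
WHERE s.season = '2023'
"""


def dry_run(query):
    """Run the query against a tiny, hypothetical fixture database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE teams (team_id INTEGER PRIMARY KEY, team_name TEXT);
        CREATE TABLE players (
            player_id INTEGER PRIMARY KEY,
            first_name TEXT, last_name TEXT, team_id INTEGER
        );
        CREATE TABLE statistics (
            player_id INTEGER, season TEXT, points_per_game REAL
        );
        INSERT INTO teams VALUES (1, 'Hawks');
        INSERT INTO players VALUES (10, 'Ada', 'Lee', 1);
        INSERT INTO statistics VALUES (10, '2022', 11.5), (10, '2023', 14.2);
    """)
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    conn.close()
    return columns, rows
```

Calling `dry_run(CUSTOM_QUERY)` should return one row for the fixture player, because the 2022 statistics row is filtered out by the `WHERE` clause.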

Multiple Table/Container Connections

You can connect to multiple tables or containers within the same project:
  1. Follow the standard connection process for your first table/container
  2. Return to the Connect tab and click “Add Table/Container”
  3. Select additional tables/containers from your connected data source
  4. Configure each connection as needed
This enables queries that can join data across multiple tables or containers.

Connection Security

Infactory prioritizes the security of your data connections:
  • Read-Only Access: We recommend using credentials with read-only access
  • No Data Storage: Infactory does not store copies of your data
  • Encrypted Connections: All data source connections use encrypted channels
  • Secure Credential Storage: Your connection credentials are securely stored
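
As an illustration of the read-only recommendation, the sketch below opens a SQLite file in read-only mode and confirms that a write is rejected. It uses Python's standard `sqlite3` module with SQLite's `mode=ro` URI parameter; the helper names and the demo table are hypothetical. For server databases, the equivalent is a database user that has been granted only read privileges.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(path):
    """Open a SQLite file so that any write attempt fails."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def demo():
    """Create a small database, reopen it read-only, and try to write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE players (player_id INTEGER, last_name TEXT)")
        setup.execute("INSERT INTO players VALUES (1, 'Lee')")
        setup.commit()
        setup.close()

        conn = open_read_only(path)
        rows = conn.execute("SELECT * FROM players").fetchall()
        try:
            conn.execute("DELETE FROM players")
            write_blocked = False
        except sqlite3.OperationalError:
            write_blocked = True  # SQLite refuses writes on a mode=ro handle
        conn.close()
        return rows, write_blocked
```

Reads succeed and the `DELETE` fails, which is the behavior to expect from any credential used to connect a data source.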

Connection Troubleshooting

Best Practices for Data Source Connections

Schema Design

  • Use Clear Column Names: Choose descriptive names that reflect content
  • Consistent Naming: Use consistent naming patterns across your schema
  • Add Comments: Where supported, add comments to explain fields
  • Appropriate Data Types: Use proper data types for each field

Data Quality

  • Provide Representative Data: Ensure your data source contains representative, good-quality rows, because the data model is built from a small sample
  • Avoid Empty Tables: Tables without data won’t generate meaningful query templates
  • Clean Data: Remove inconsistent or duplicate data when possible
  • Normalize When Possible: Well-normalized data often leads to better query generation

Connection Management

  • Use Read-Only Connections: Connect with read-only credentials when possible
  • Regular Updates: If your schema changes significantly, reconnect to update the data model
  • Start Simple: Begin with core tables/collections and add more as needed
  • Document Your Connections: Keep track of which data sources are connected and why

Next Steps

After successfully connecting your data source, Infactory will analyze your schema and automatically generate queries. Continue to Building Queries to learn how to work with these generated queries and create custom ones.