Creating HTTP Connections Programmatically

This guide demonstrates how to programmatically create and execute HTTP connections using the Infactory API.

Overview

When creating an HTTP connection, the process involves:

Testing the connection to verify the API endpoint works
Creating a datasource to store connection information
Setting up credentials for authentication
Executing the request to establish the connection and load data

Behind the scenes, this process creates:

A datasource record in the database
A data object for the HTTP request configuration
A data object for the response data (in parquet format)
Data lineage records connecting these objects
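
As a quick orientation, the four steps map onto four POST endpoints. The sketch below only collects the paths that appear in the curl examples in this guide; the `STEPS` list and the `endpoint_for` helper are names made up for this illustration and are not part of the Infactory API.

```python
# Path prefix shared by every curl example in this guide.
API_PREFIX = "/api/infactory/v1"

# Ordered (label, path) pairs; the labels are chosen for this sketch only.
STEPS = [
    ("test connection", "/http/test-connection"),
    ("create datasource", "/datasources"),
    ("create credentials", "/credentials"),
    ("execute request", "/http/execute-request"),
]


def endpoint_for(instance_url: str, path: str) -> str:
    """Join an instance URL and a step path into a full endpoint URL."""
    return instance_url.rstrip("/") + API_PREFIX + path


for number, (label, path) in enumerate(STEPS, start=1):
    url = endpoint_for("https://your-instance.infactory.ai", path)
    print(f"Step {number}: {label} -> POST {url}")
```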

Prerequisites

Authentication token or session cookie
Project ID where you want to create the connection
API endpoint URL you want to connect to
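
One way to keep these values out of your scripts is to read them from environment variables. The sketch below is a suggestion only: the variable names (`INFACTORY_BASE_URL`, `INFACTORY_AUTH_TOKEN`, `INFACTORY_PROJECT_ID`) and the `load_config` helper are hypothetical and are not defined by Infactory.

```python
import os


def load_config(env=os.environ):
    """Read connection settings from environment variables (hypothetical names)."""
    required = ["INFACTORY_BASE_URL", "INFACTORY_AUTH_TOKEN", "INFACTORY_PROJECT_ID"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a readable message instead of sending half-configured requests.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "base_url": env["INFACTORY_BASE_URL"].rstrip("/"),
        "auth_token": env["INFACTORY_AUTH_TOKEN"],
        "project_id": env["INFACTORY_PROJECT_ID"],
    }
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.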

Step 1: Test the HTTP Connection

First, test that your API endpoint is accessible and returns the expected data:

curl 'https://your-instance.infactory.ai/api/infactory/v1/http/test-connection' \
  -H 'content-type: application/json' \
  -H 'authorization: YOUR_AUTH_TOKEN' \
  --data-raw '{
    "url": "https://api-endpoint.example.com/data",
    "method": "GET",
    "headers": {},
    "parameters": {
      "key": {
        "value": "your-api-key-value",
        "required": true
      }
    },
    "parameterGroups": [],
    "authType": "None",
    "auth": {},
    "responsePathExtractor": "value"
  }'

Request Parameters:

url: The API endpoint URL
method: HTTP method (GET, POST, PUT, etc.)
headers: HTTP headers to include with the request
parameters: Query parameters with values and required flags
parameterGroups: Groups of related parameters (optional)
authType: Authentication type (None, API Key, Bearer Token, Basic Auth)
auth: Authentication details based on the auth type
responsePathExtractor: JSON path to extract a specific key from the response (e.g., “value”)
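
To make `responsePathExtractor` more concrete, the sketch below shows what pulling a single key out of a JSON response could look like. It illustrates the idea only: this guide does not describe how Infactory evaluates the extractor internally, and the `extract_key` helper and the sample response are assumptions.

```python
def extract_key(response_body: dict, key: str):
    """Return the value stored under `key`, or the whole body if no key is given."""
    if not key:
        return response_body
    if key not in response_body:
        raise KeyError(f"Response has no top-level key {key!r}")
    return response_body[key]


# Hypothetical API response that wraps its records in a "value" key.
sample_response = {"count": 2, "value": [{"id": 1}, {"id": 2}]}

records = extract_key(sample_response, "value")
print(records)
```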

Response:

{
  "success": true,
  "status": 200,
  "response_time": 123,
  "content_type": "application/json",
  "size": 1024,
  "data": {
    "example": "response data"
  }
}
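
Before moving on, it can help to check the `success` flag explicitly rather than relying on the HTTP status of the call alone. The following is a minimal sketch that assumes the response has the shape shown above; `check_test_result` is a hypothetical helper name.

```python
def check_test_result(result: dict):
    """Return the data preview from a test-connection response, or raise if the test failed."""
    if not result.get("success"):
        raise RuntimeError(
            f"Connection test failed (upstream status: {result.get('status')})"
        )
    return result.get("data")


# Sample response copied from the example above.
sample_result = {
    "success": True,
    "status": 200,
    "response_time": 123,
    "content_type": "application/json",
    "size": 1024,
    "data": {"example": "response data"},
}

preview = check_test_result(sample_result)
print(preview)
```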

Step 2: Create a Datasource

Create a datasource to store your HTTP connection data:

curl 'https://your-instance.infactory.ai/api/infactory/v1/datasources' \
  -H 'content-type: application/json' \
  -H 'authorization: YOUR_AUTH_TOKEN' \
  --data-raw '{
    "name": "My HTTP Connection",
    "project_id": "your-project-id",
    "type": "http-requests",
    "status": "transformation_started"
  }'

Request Parameters:

name: Name of the datasource
project_id: ID of the project to associate with
type: Must be “http-requests” for HTTP connections
status: Initial status (typically “transformation_started”)

Response:

{
  "id": "datasource-id",
  "name": "My HTTP Connection",
  "project_id": "your-project-id",
  "type": "http-requests",
  "status": "transformation_started",
  "created_at": "2023-09-21T15:30:00Z",
  "updated_at": "2023-09-21T15:30:00Z"
}

Save the returned id as your datasource_id for the next steps.
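
The request body and the id extraction can be wrapped in two small helpers, sketched below. The helper names are made up for illustration; the field names and values are taken from the request and response shown above.

```python
def build_datasource_payload(name: str, project_id: str) -> dict:
    """Build the request body for creating an HTTP datasource."""
    return {
        "name": name,
        "project_id": project_id,
        "type": "http-requests",             # required for HTTP connections
        "status": "transformation_started",  # typical initial status per this guide
    }


def datasource_id_from(response_body: dict) -> str:
    """Pull the datasource id out of the creation response."""
    datasource_id = response_body.get("id")
    if not datasource_id:
        raise ValueError("Datasource response did not include an 'id'")
    return datasource_id


payload = build_datasource_payload("My HTTP Connection", "your-project-id")
print(payload)
```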

Step 3: Create Credentials

Create credentials to store connection details and authentication information:

curl 'https://your-instance.infactory.ai/api/infactory/v1/credentials' \
  -H 'content-type: application/json' \
  -H 'authorization: YOUR_AUTH_TOKEN' \
  --data-raw '{
    "name": "API Credentials",
    "type": "api",
    "description": "Credentials for API connection",
    "metadata": {
      "url": "https://api-endpoint.example.com/data",
      "method": "GET",
      "headers": {},
      "auth": {}
    },
    "datasource_id": "your-datasource-id",
    "team_id": "your-team-id",
    "organization_id": "your-org-id",
    "config": {
      "url": "https://api-endpoint.example.com/data",
      "method": "GET",
      "headers": {},
      "auth": {}
    }
  }'

Request Parameters:

name: Name for the credentials
type: “api” for API credentials
description: Description of what the credentials are for
metadata: Additional information about the API
datasource_id: ID of the datasource from step 2
team_id: Team that can access these credentials
organization_id: Organization that owns these credentials
config: Configuration details for the connection
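
In the example request, `metadata` and `config` carry the same connection details. A small builder can keep the two in sync, as in this sketch; `build_credentials_payload` is a hypothetical helper, and whether the API requires both blocks to match is an assumption based only on the example above.

```python
import copy


def build_credentials_payload(connection: dict, datasource_id: str,
                              team_id: str, organization_id: str) -> dict:
    """Build the credentials request body from a single set of connection details."""
    details = {
        "url": connection["url"],
        "method": connection.get("method", "GET"),
        "headers": connection.get("headers", {}),
        "auth": connection.get("auth", {}),
    }
    return {
        "name": "API Credentials",
        "type": "api",
        "description": "Credentials for API connection",
        # Deep copies so later edits to one block do not silently change the other.
        "metadata": copy.deepcopy(details),
        "datasource_id": datasource_id,
        "team_id": team_id,
        "organization_id": organization_id,
        "config": copy.deepcopy(details),
    }


credentials = build_credentials_payload(
    {"url": "https://api-endpoint.example.com/data"},
    "your-datasource-id", "your-team-id", "your-org-id",
)
print(credentials["config"])
```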

Step 4: Execute the HTTP Request

Finally, execute the HTTP request to establish the connection and load data:

curl 'https://your-instance.infactory.ai/api/infactory/v1/http/execute-request' \
  -H 'content-type: application/json' \
  -H 'authorization: YOUR_AUTH_TOKEN' \
  --data-raw '{
    "url": "https://api-endpoint.example.com/data",
    "method": "GET",
    "headers": {},
    "parameters": {
      "key": {
        "value": "your-api-key-value",
        "required": true
      }
    },
    "parameterGroups": [],
    "authType": "None",
    "auth": {},
    "responsePathExtractor": "value",
    "project_id": "your-project-id",
    "datasource_id": "your-datasource-id",
    "connect_spec": {
      "name": "My HTTP Connection",
      "id": "http-requests",
      "config": {
        "url": "https://api-endpoint.example.com/data",
        "method": "GET",
        "headers": {},
        "parameters": {
          "key": {
            "value": "your-api-key-value",
            "required": true
          }
        },
        "parameterGroups": [],
        "authType": "None",
        "auth": {},
        "responsePathExtractor": "value"
      }
    }
  }'

Request Parameters:

HTTP connection details (same as test-connection)
project_id: ID of the project
datasource_id: ID of the datasource created in step 2
connect_spec: Connection specification object with:
- name: Name of the connection
- id: Type identifier (“http-requests”)
- config: Full configuration matching the test-connection parameters
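
Because the execute payload repeats the connection details at the top level and again inside `connect_spec.config`, it is easy for the two copies to drift apart when edited by hand. The sketch below derives both from one dictionary; `build_execute_payload` is an illustrative helper name, not part of the API.

```python
import copy


def build_execute_payload(connection: dict, project_id: str,
                          datasource_id: str, name: str) -> dict:
    """Combine tested connection details with project and datasource ids."""
    payload = copy.deepcopy(connection)
    payload.update({
        "project_id": project_id,
        "datasource_id": datasource_id,
        "connect_spec": {
            "name": name,
            "id": "http-requests",
            # Independent copy of the same settings used for test-connection.
            "config": copy.deepcopy(connection),
        },
    })
    return payload


connection = {
    "url": "https://api-endpoint.example.com/data",
    "method": "GET",
    "headers": {},
    "parameters": {"key": {"value": "your-api-key-value", "required": True}},
    "parameterGroups": [],
    "authType": "None",
    "auth": {},
    "responsePathExtractor": "value",
}

execute_payload = build_execute_payload(
    connection, "your-project-id", "your-datasource-id", "My HTTP Connection"
)
print(sorted(execute_payload))
```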

Response:

{
  "jobs": [
    {
      "id": "job-id",
      "job_type": "build_dataline_from_connected_resource",
      "status": "queued",
      "created_at": "2023-09-21T15:35:00Z"
    }
  ],
  "data_object_id": "data-object-id"
}
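
The response can contain several jobs, so it is worth grouping them by status before reporting success. This sketch assumes the response shape shown above; `summarize_execute_response` is a hypothetical helper, and no job-polling endpoint is assumed here because this guide does not document one.

```python
from collections import defaultdict


def summarize_execute_response(result: dict):
    """Return the data object id and a mapping of job status to job ids."""
    jobs_by_status = defaultdict(list)
    for job in result.get("jobs", []):
        jobs_by_status[job["status"]].append(job["id"])
    return result["data_object_id"], dict(jobs_by_status)


# Sample response copied from the example above.
sample_execute_result = {
    "jobs": [
        {
            "id": "job-id",
            "job_type": "build_dataline_from_connected_resource",
            "status": "queued",
            "created_at": "2023-09-21T15:35:00Z",
        }
    ],
    "data_object_id": "data-object-id",
}

data_object_id, jobs = summarize_execute_response(sample_execute_result)
print(data_object_id, jobs)
```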

What Happens Behind the Scenes

When you execute this flow:

The system creates a data object to store your HTTP request configuration
It executes the HTTP request and fetches data from the API
The response is converted to a Parquet file and stored as another data object
Data lineage is established between request and response data objects
Background jobs analyze the data structure and prepare it for querying
The system automatically generates query programs based on the data structure
These query programs are ready to use immediately without manual coding
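
The relationship between the datasource, its two data objects, and the lineage record can be pictured with a small in-memory model. This is a conceptual sketch only: the class names, fields, and `kind` labels are invented for illustration and do not reflect Infactory's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    id: str
    kind: str  # label invented for this sketch, e.g. "request-config" or "parquet-response"


@dataclass
class Datasource:
    id: str
    objects: list = field(default_factory=list)
    lineage: list = field(default_factory=list)  # (source_id, target_id) pairs

    def record_fetch(self, request_obj: DataObject, response_obj: DataObject) -> None:
        """Store both objects and link the response back to the request that produced it."""
        self.objects += [request_obj, response_obj]
        self.lineage.append((request_obj.id, response_obj.id))

    def upstream_of(self, object_id: str) -> list:
        """Return the ids of objects that the given object was derived from."""
        return [source for source, target in self.lineage if target == object_id]


datasource = Datasource(id="datasource-id")
datasource.record_fetch(
    DataObject("request-1", "request-config"),
    DataObject("response-1", "parquet-response"),
)
print(datasource.upstream_of("response-1"))
```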

Automatic Query Generation

The system automatically creates query programs (ready-to-use queries) based on the API response data. This means:

You don’t need to manually write queries to explore the API data
The system examines the structure and content of the API response
It generates relevant queries tailored to the specific data received
These queries are immediately available in your project for use
You can execute or modify these generated queries as needed

Automatic query generation shortens the time from connection to insight: you can start working with the API data as soon as the connection is established.

Code Example (Python)

Here’s a complete Python example showing all steps:

import requests
import json

# Configuration
BASE_URL = "https://your-instance.infactory.ai/api/infactory"
AUTH_TOKEN = "your-auth-token"
PROJECT_ID = "your-project-id"
TEAM_ID = "your-team-id"
ORG_ID = "your-org-id"

headers = {
    "Content-Type": "application/json",
    "Authorization": AUTH_TOKEN
}

# Step 1: Test the connection
test_payload = {
    "url": "https://api-endpoint.example.com/data",
    "method": "GET",
    "headers": {},
    "parameters": {
        "key": {
            "value": "your-api-key-value",
            "required": True
        }
    },
    "parameterGroups": [],
    "authType": "None",
    "auth": {},
    "responsePathExtractor": "value"
}

test_response = requests.post(
    f"{BASE_URL}/v1/http/test-connection",
    headers=headers,
    data=json.dumps(test_payload)
)

if test_response.status_code != 200 or not test_response.json()["success"]:
    print("Connection test failed:", test_response.text)
    exit(1)

print("Connection test successful!")

# Step 2: Create a datasource
datasource_payload = {
    "name": "My HTTP Connection",
    "project_id": PROJECT_ID,
    "type": "http-requests",
    "status": "transformation_started"
}

datasource_response = requests.post(
    f"{BASE_URL}/v1/datasources",
    headers=headers,
    data=json.dumps(datasource_payload)
)

if datasource_response.status_code != 200:
    print("Failed to create datasource:", datasource_response.text)
    exit(1)

datasource_id = datasource_response.json()["id"]
print(f"Created datasource with ID: {datasource_id}")

# Step 3: Create credentials
credentials_payload = {
    "name": "API Credentials",
    "type": "api",
    "description": "Credentials for API connection",
    "metadata": {
        "url": "https://api-endpoint.example.com/data",
        "method": "GET",
        "headers": {},
        "auth": {}
    },
    "datasource_id": datasource_id,
    "team_id": TEAM_ID,
    "organization_id": ORG_ID,
    "config": {
        "url": "https://api-endpoint.example.com/data",
        "method": "GET",
        "headers": {},
        "auth": {}
    }
}

credentials_response = requests.post(
    f"{BASE_URL}/v1/credentials",
    headers=headers,
    data=json.dumps(credentials_payload)
)

if credentials_response.status_code != 200:
    print("Failed to create credentials:", credentials_response.text)
    exit(1)
else:
    print("Created credentials successfully")

# Step 4: Execute the HTTP request
execute_payload = {
    **test_payload,
    "project_id": PROJECT_ID,
    "datasource_id": datasource_id,
    "connect_spec": {
        "name": "My HTTP Connection",
        "id": "http-requests",
        # The spec's config repeats the connection settings tested in step 1
        "config": dict(test_payload)
    }
}

execute_response = requests.post(
    f"{BASE_URL}/v1/http/execute-request",
    headers=headers,
    data=json.dumps(execute_payload)
)

if execute_response.status_code != 200:
    print("Failed to execute HTTP request:", execute_response.text)
    exit(1)

result = execute_response.json()
print("Successfully executed HTTP request!")
print(f"Data object ID: {result['data_object_id']}")
print(f"Jobs: {len(result['jobs'])} jobs created")

The HTTP connection is now established, and the data becomes available in your project once the queued background jobs complete.
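
The script above repeats the same post-and-check pattern four times. One possible refactoring is a small helper that takes the posting function as an argument, sketched below. The `post_json` name is hypothetical, and treating any 2xx status as success is an assumption, since this guide only shows checks against 200.

```python
def post_json(post, url, payload, headers, timeout=30):
    """POST `payload` as JSON and return the decoded body, raising on a non-2xx status.

    `post` is any callable with the same keyword arguments as `requests.post`.
    """
    response = post(url, headers=headers, json=payload, timeout=timeout)
    if not 200 <= response.status_code < 300:
        raise RuntimeError(f"POST {url} failed ({response.status_code}): {response.text}")
    return response.json()


# Usage with the names from the script above (requires the `requests` package):
#   datasource = post_json(requests.post, f"{BASE_URL}/v1/datasources",
#                          datasource_payload, headers)
#   datasource_id = datasource["id"]
```

Passing `requests.post` in as an argument keeps the helper free of network dependencies, so it can be exercised with a stand-in function in tests.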
