Data Sources
Data storage options supported by ByteNite
ByteNite provides versatile storage integrations to facilitate seamless data access and management for your processing needs. In this set of guides, you'll find the currently supported data source connections, each configured through a `dataSourceDescriptor` and a set of `params`. To configure a data source, you provide information about the data location, such as a file path or a URL. You may also need to include access or location-related fields, which are detailed in each integration guide.
Supported data sources
| Name | `dataSourceDescriptor` | Data Origin Support | Data Destination Support |
|---|---|---|---|
| AWS S3 | `s3` | ✅ | ✅ |
| Google Cloud Storage | `gcp` | ✅ | ✅ |
| Storj | `storj` | ✅ | ✅ |
| Dropbox | `dropbox` | ✅ | ✅ |
| HTTP | `url` | ✅ | |
| FTP | `ftp` | ✅ | ✅ |
| Local Upload | `file` | ✅ | |
| Temporary Bucket | `bucket` | ✅ | |
Configuring origins and destinations
Every job executed on ByteNite requires two essential components: a data origin (`dataSource`) and a data destination (`dataDestination`). The data origin specifies where to retrieve input data, while the data destination indicates where to store the processed output. To accommodate various storage scenarios, a single API endpoint breaks down into these two distinct data fields, letting you compose different storage combinations and seamlessly transfer data across diverse storage platforms during processing.
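As a sketch, a request body pairing an AWS S3 origin with a Google Cloud Storage destination could be composed as follows. The contents of `params` are illustrative placeholders, not the exact field names; consult each integration guide for the fields required by a given storage provider.

```python
# Hypothetical request body: an S3 data origin and a GCS data destination.
# The comments inside "params" are placeholders, not the real field names.
payload = {
    "dataSource": {
        "dataSourceDescriptor": "s3",
        "params": {
            # e.g. bucket name, object key, and access credentials
        },
    },
    "dataDestination": {
        "dataSourceDescriptor": "gcp",
        "params": {
            # e.g. bucket name and access credentials
        },
    },
}
```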
To set a data origin and destination, you'll need to make API requests to the Set Datasource endpoint. The structure of these requests is as follows:
```shell
curl --request POST \
     --url https://api.bytenite.com/v1/customer/jobs/datasource/{jobId} \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data '
{
  "dataSource": {...},
  "dataDestination": {...}
}
'
```
```python
import requests

access_token = "YOUR_ACCESS_TOKEN"
job_id = "YOUR_JOB_ID"

response = requests.post(
    f"https://api.bytenite.com/v1/customer/jobs/datasource/{job_id}",
    json={
        "dataSource": ...,       # your data origin configuration
        "dataDestination": ...,  # your data destination configuration
    },
    headers={"Authorization": access_token},
)
```
Alternatively, you can configure your data origin and destination while creating a new job using the Create Job endpoint.
In the upcoming guides, we'll delve into the details of the `dataSource` and `dataDestination` fields and explain how to configure them for your preferred storage solutions. You'll learn how to use our API to connect to your chosen storage systems, embedding API credentials within your requests for added convenience.
Testing data sources
We highly recommend testing your data sources before initiating repetitive jobs or launching them into production. Misconfigurations in either the data origin or the destination can lead to job failures. Therefore, thorough testing ensures that your workflow operates smoothly and reliably, minimizing disruptions during production runs.
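One lightweight way to catch misconfigurations early is to validate a request body locally before submitting it. The sketch below is not part of the ByteNite API; it simply checks a payload against the descriptor values listed in the table above and verifies that a `params` object is present.

```python
# Illustrative pre-flight validation; not part of the ByteNite API.
# Descriptor values are taken from the supported data sources table above.
ORIGIN_DESCRIPTORS = {"s3", "gcp", "storj", "dropbox", "url", "ftp", "file", "bucket"}
DESTINATION_DESCRIPTORS = {"s3", "gcp", "storj", "dropbox", "ftp"}


def validate_field(field: dict, allowed: set) -> list:
    """Return a list of problems found in a dataSource/dataDestination field."""
    problems = []
    descriptor = field.get("dataSourceDescriptor")
    if descriptor not in allowed:
        problems.append(f"unsupported descriptor: {descriptor!r}")
    if not isinstance(field.get("params"), dict):
        problems.append("missing 'params' object")
    return problems


def validate_payload(payload: dict) -> list:
    """Check both fields of a Set Datasource request body; empty list means OK."""
    problems = validate_field(payload.get("dataSource", {}), ORIGIN_DESCRIPTORS)
    problems += validate_field(payload.get("dataDestination", {}), DESTINATION_DESCRIPTORS)
    return problems
```

For example, `validate_payload({"dataSource": {"dataSourceDescriptor": "s3", "params": {}}, "dataDestination": {"dataSourceDescriptor": "gcp", "params": {}}})` returns an empty list, while a typo in a descriptor or a missing `params` object produces a readable problem report before any job is launched.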