Data Sources

Data storage options supported by ByteNite

ByteNite integrates with several storage services so that jobs can read and write data where it already lives. The guides in this section cover the currently supported data source connections, each of which is configured with a dataSourceDescriptor and a set of params. To configure a data source, you provide the data location, such as a file path or a URL, along with any access or location-related fields described in the corresponding integration guide.

Supported data sources

Name                  dataSourceDescriptor  Data Origin Support  Data Destination Support
AWS S3                s3
Google Cloud Storage  gcp
Local Upload          file
Temporary Bucket      bucket
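
As a rough illustration of the descriptor-plus-params shape described above, the Python sketch below builds a data source entry and rejects descriptors that are not listed in the table. The helper name make_data_source is hypothetical, and the key names simply follow the terms used in this guide; the actual params fields depend on the integration and are documented in each integration guide.

```python
# Descriptors taken from the table of supported data sources.
SUPPORTED_DESCRIPTORS = {"s3", "gcp", "file", "bucket"}


def make_data_source(descriptor, params=None):
    """Pair a dataSourceDescriptor with its params.

    Hypothetical helper: the real params fields vary per integration.
    """
    if descriptor not in SUPPORTED_DESCRIPTORS:
        raise ValueError(f"Unsupported dataSourceDescriptor: {descriptor!r}")
    # Copy params so later changes by the caller do not leak into the entry.
    return {"dataSourceDescriptor": descriptor, "params": dict(params or {})}
```

Calling make_data_source("s3", {...}) returns a dictionary that can be used as either field of a request, while an unknown descriptor such as "ftp" raises a ValueError before any request is sent.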

Configuring origins and destinations

Every job on ByteNite requires two components: a data origin (dataSource) and a data destination (dataDestination). The data origin specifies where to retrieve the input data, and the data destination specifies where to store the processed output. Both are set through a single API endpoint that takes two separate fields, so you can combine different storage services and move data between platforms during processing.
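
To make the two-field structure concrete, here is a minimal Python sketch that combines an origin and a destination into one request body. The function name build_datasource_body is hypothetical, the empty params are placeholders, and the s3-to-bucket pairing is only an example; check each integration guide to confirm whether a given service is supported as an origin, a destination, or both.

```python
import json


def build_datasource_body(origin, destination):
    """Combine an origin and a destination into one request body."""
    for label, entry in (("origin", origin), ("destination", destination)):
        if "dataSourceDescriptor" not in entry:
            raise ValueError(f"{label} is missing 'dataSourceDescriptor'")
    return {"dataSource": origin, "dataDestination": destination}


# Hypothetical combination: read from one service, write to another.
body = build_datasource_body(
    {"dataSourceDescriptor": "s3", "params": {}},
    {"dataSourceDescriptor": "bucket", "params": {}},
)
print(json.dumps(body, indent=2))
```

Keeping the origin and destination as independent dictionaries means either one can be swapped for a different storage service without touching the other.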

To set a data origin and destination, send a request to the Set Datasource endpoint. The request is structured as follows:

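# cURL
curl --request POST \
     --url '<set-datasource-endpoint-url>/{jobId}' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data '{
  "dataSource": {...},
  "dataDestination": {...}
}'

# Python
import requests

job_id = "YOUR_JOB_ID"

# access_token must hold a valid API access token before this call.
response = requests.post(f'<set-datasource-endpoint-url>/{job_id}',
                         json={
                             'dataSource': ...,
                             'dataDestination': ...
                         },
                         headers={'Authorization': access_token})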

Alternatively, you can configure your data origin and destination while creating a new job using the Create Job endpoint.

The following guides describe the dataSource and dataDestination fields in detail and show how to configure them for each storage service. They also explain how to pass the credentials for your storage system within your API requests.

Testing data sources

We highly recommend testing your data sources before running recurring jobs or moving to production. A misconfigured data origin or destination can cause jobs to fail, so testing first helps avoid disruptions in production runs.
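
As one possible first step before a test job, a local check can catch structural mistakes early. The Python sketch below is an assumption-laden example, not part of the ByteNite API: the function name find_config_problems is hypothetical, and it only verifies that both fields are present and use a descriptor from the table of supported data sources. It does not check credentials or whether the storage location is reachable, so running a small test job remains the real verification.

```python
# Descriptors taken from the table of supported data sources.
SUPPORTED_DESCRIPTORS = {"s3", "gcp", "file", "bucket"}


def find_config_problems(body):
    """Return a list of problems found in a request body; empty if none."""
    problems = []
    for field in ("dataSource", "dataDestination"):
        entry = body.get(field)
        if not isinstance(entry, dict):
            problems.append(f"{field} is missing or not an object")
            continue
        descriptor = entry.get("dataSourceDescriptor")
        if descriptor not in SUPPORTED_DESCRIPTORS:
            problems.append(f"{field} has unsupported descriptor {descriptor!r}")
    return problems
```

An empty list means no structural problem was detected; any returned message points at the field to fix before the request is sent.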
