Learn how to use Orchesto as a multi-point Data Management Solution, compatible with the Amazon S3 API.
Orchesto is a data management solution that, in one aspect of its capability, can act as a forward proxy against one or more cloud storage providers.
To deploy storage resources in Orchesto, you begin by making a set of storage regions available through one or more storage backends. This will typically require you to provide Orchesto with security credentials when adding the storage backends.
Once those regions are available to Orchesto, they will also be indirectly available to Amazon S3-compatible clients that connect to Orchesto. Specifically, for each underlying storage region, or backing region, we say that Orchesto provides a corresponding virtual region to clients.
You can then start to work with the available storage resources by adding virtual buckets and uploading objects to, or downloading objects from, them.
We refer to a bucket available in a virtual region, and its corresponding backing region, as a virtual bucket and backing bucket respectively.
Virtual regions do not automatically grant access to buckets in the underlying storage region. When adding a bucket to Orchesto, you will automatically create a new bucket in the backing region. This backing bucket will have the name of the virtual bucket as a portion of its name. The complete name of the backing bucket will however also include a unique identifier to ensure that there will be no naming conflicts amongst the backing buckets.
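The naming scheme described above can be sketched as follows. This is a minimal illustration, not Orchesto's actual implementation: the separator, the length of the identifier, and the `backing_bucket_name` helper are all assumptions, and a random hex string merely stands in for whatever unique identifier Orchesto generates.

```python
import uuid


def backing_bucket_name(virtual_bucket: str) -> str:
    """Derive a hypothetical backing bucket name from a virtual bucket name."""
    # Assumption: a 12-character random hex suffix stands in for
    # Orchesto's unique identifier; the real format is not documented here.
    suffix = uuid.uuid4().hex[:12]
    return f"{virtual_bucket}-{suffix}"


# Two virtual buckets with the same name map to distinct backing buckets.
first = backing_bucket_name("reports")
second = backing_bucket_name("reports")
print(first)
print(second)
```

Because the identifier is generated per bucket, the same virtual bucket name can be used repeatedly without colliding in the backing region.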
In versions after 2.0 of Orchesto it is possible to also reuse existing backing buckets when creating a virtual bucket. For these situations, Orchesto features an import function that reads upstream data from the backing bucket in order to build Orchesto's internal object indexing database.
All data uploaded to Orchesto is stored in the native format of the storage provider, ensuring the data lifecycle is not obstructed across storage environments. This allows direct access to the data outside Orchesto, and access to existing data through Orchesto, at the same time.
A security credential is provided by way of access keys, consisting of an access key ID and a secret access key. The first issued security credential is the administrative security credential. This is unique and can be used to manage Orchesto with admin privileges, that is, without restriction.
Make sure you store the initially provided admin access keys in a secure location.
With admin privileges, you can create new security credentials with user privileges, also known as user security credentials, to secure your storage environment. One of the many tools at the admin's disposal is IAM policies, which can be tied to users and/or buckets. These allow management of user authorisation at a granular level. For more information on IAM Policies, please see Appendix Postgres installation for Orchesto.
The difference across privilege levels is outlined below:
|Capability|Admin|User|
|---|---|---|
|Manage security credentials|✓|✗|
|Manage storage backends|✓|✗|
|Manage virtual regions|✓|✗|
|Use the S3 API|✓|✓|
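The table above can be read as a simple permission matrix. The sketch below is illustrative only: it assumes the first column of marks refers to admin privileges and the second to user privileges, and the capability identifiers and the `is_allowed` helper are hypothetical names, not part of Orchesto.

```python
# Capabilities per privilege level, transcribed from the table above.
# Assumption: identifiers are made up for this example.
PRIVILEGES = {
    "admin": {
        "manage_security_credentials",
        "manage_storage_backends",
        "manage_virtual_regions",
        "use_s3_api",
    },
    "user": {"use_s3_api"},
}


def is_allowed(level: str, capability: str) -> bool:
    """Return True if the given privilege level includes the capability."""
    return capability in PRIVILEGES.get(level, set())


print(is_allowed("admin", "manage_storage_backends"))
print(is_allowed("user", "manage_storage_backends"))
print(is_allowed("user", "use_s3_api"))
```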
Using admin security credentials, you can limit and revoke access to Orchesto as needed. As a security best practice, do not use your administrative security credential to access the S3 API.
To enable Orchesto to simultaneously manage any combination of storage providers, Orchesto features namespace federation with both virtual regions and buckets.
Virtual Region Namespace
With admin privileges, you have the option to choose virtual region names before, or after, a new storage backend is added to Orchesto. This can be useful to address deployment requirements of your organisation.
By default, Orchesto will choose the same name as the backing region, unless this conflicts with the existing virtual region namespace.
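The default naming rule can be sketched as below. The function name is hypothetical, and the behaviour on conflict is an assumption: the text only states that the backing region name is not reused when it conflicts, so this sketch returns `None` to signal that a name has to be chosen explicitly.

```python
from typing import Optional


def default_virtual_region_name(backing_region: str, existing: set) -> Optional[str]:
    """Pick the default virtual region name for a newly added backing region."""
    # Reuse the backing region name when it is still free in the namespace.
    if backing_region not in existing:
        return backing_region
    # Assumption: on conflict no default is offered and the admin must
    # supply a name; Orchesto's actual fallback is not described here.
    return None


print(default_virtual_region_name("eu-west-1", {"us-east-1"}))
print(default_virtual_region_name("eu-west-1", {"eu-west-1"}))
```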
ZIDA Region Namespace
A ZIDA region is a special type of virtual region. This region is part of the definition required by ZIDA, Orchesto's proprietary algorithm for efficient information dispersal. Applying ZIDA to an object splits the object into a configurable number of shards that are uploaded to multiple backends.
A ZIDA region is defined by selecting ZIDA in the sidebar and defining a ZIDA region name, dispersal method (replication / erasure coding) as well as a cluster of different backing regions belonging to different Cloud Service Providers. Erasure coding will also require defining the redundancy level of the dispersed object.
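To make the idea of dispersal with redundancy concrete, here is a minimal sketch of a textbook single-parity scheme. It is not ZIDA, which is proprietary and not described in this document: the `disperse` and `recover` functions are hypothetical, and XOR parity is used only because it is the simplest form of erasure coding, tolerating the loss of exactly one shard.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, k: int) -> list:
    """Split data into k equal-size shards plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    # In a real deployment each shard would go to a different backend.
    return shards + [parity]


def recover(shards: list, length: int) -> bytes:
    """Rebuild the object when at most one shard (marked None) is missing."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards reproduces the missing one.
        shards[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity shard and strip the padding.
    return b"".join(shards[:-1])[:length]


obj = b"hello orchesto"
shards = disperse(obj, k=3)
shards[1] = None  # simulate one unavailable backend
restored = recover(shards, len(obj))
print(restored == obj)
```

Replication corresponds to the degenerate case of storing full copies on every backend, while erasure coding trades extra computation for lower storage overhead at a chosen redundancy level.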
Virtual Bucket Namespace
When using the S3 API to create a new bucket, Orchesto will use the name of the virtual bucket as a portion of the name of the corresponding backing bucket(s). A unique identifier will be appended to the name of the virtual bucket(s). This mechanism will ensure that no naming conflicts arise.
With sufficient privileges, and provided that you run a version of Orchesto later than 2.0, you have the option of adding a virtual bucket against an existing backing bucket and, moreover, of choosing a different virtual bucket name. This may help to address, for example, a conflict with the existing virtual (or backing) bucket namespace.
Orchesto supports three different types of storage providers:
- Preferred Cloud Storage Providers
- User-Defined Amazon S3 API Compatible Solution
- Host Filesystem
This enables a number of deployment configurations in order to accommodate different types of use cases.
Special support is given to select cloud storage providers to ensure a consistent unified object model for storage clients. These include:
- Amazon Web Services (Amazon S3)
- Exoscale (Object Storage)
- DigitalOcean (Spaces)
- DreamHost (DreamObjects)
- Alibaba Cloud (OSS: Object Storage Service)
- Google Cloud (GCS: Google Cloud Storage)
- Microsoft Azure (Blob Storage)
Other solutions compatible with the industry-standard Amazon S3 API can be connected as well. For these, you will need to provide additional information to describe the object storage environment.
To connect the filesystem to Orchesto, you simply select a path and define a new region name for it. Directories in the target path become backing buckets, with their corresponding files treated as objects.
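The directory-to-bucket mapping can be sketched as follows. This is an illustration under stated assumptions rather than Orchesto's implementation: `map_filesystem` is a hypothetical helper, and treating files in nested subdirectories as object keys with `/` separators follows the common S3 convention, which the text above does not confirm.

```python
import tempfile
from pathlib import Path


def map_filesystem(root: str) -> dict:
    """Map each top-level directory under root to a bucket and its files to object keys."""
    buckets = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            # Assumption: nested files become keys such as "2020/a.jpg".
            buckets[entry.name] = sorted(
                p.relative_to(entry).as_posix()
                for p in entry.rglob("*")
                if p.is_file()
            )
    return buckets


# Build a throwaway directory tree and show how it would be exposed.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "photos" / "2020").mkdir(parents=True)
    (Path(root) / "photos" / "2020" / "a.jpg").write_bytes(b"")
    (Path(root) / "logs").mkdir()
    (Path(root) / "logs" / "app.log").write_text("ok")
    layout = map_filesystem(root)

print(layout)
```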
Providing limited access to the host filesystem is a security-sensitive operation. To reduce the attack surface in your storage environment, lock down access to new filesystem backends via the management console, or by using the --fs-lockdown option when starting Orchesto:
$ orchesto --fs-lockdown
To disable the lock down on new filesystem backends, use the inverse