New service announcement: IBM Analytics Engine Beta
  • Analytics Engine
  • US South
  • Description
    We are excited to announce the Beta release of the IBM Analytics Engine, which provides a single Hadoop and Spark service under the Watson Data Platform. It makes it easier for data engineers, data scientists, and developers to develop and deploy analytics applications. With integration through Jupyter notebooks in Data Science Experience, IBM Analytics Engine provides the foundation for executing data science and machine learning workloads. The IBM Analytics Engine uses the Hortonworks Data Platform as its underlying Hadoop distribution, giving you access to a market-leading open source Hadoop distribution.

    With the IBM Analytics Engine, you can spin up clusters within minutes and scale them up easily, and the service supports external Hive metastores. To create clusters and manage their lifecycles, administrators can use the Bluemix user interface, REST APIs, and the Cloud Foundry CLI. The latter two options give external applications programmatic access to Hadoop and Spark, so that cluster use can be operationalized as part of deploying data pipelines. Jobs can also be submitted through a Cloud Foundry CLI extension, which provides a scriptable way to execute jobs remotely. Also key is the capability to pass scripts that customize clusters at creation time, which gives a predictable configuration across cluster creation and deletion cycles.

    The architecture separates compute and storage for better scalability and reliability. Because the data does not live on the cluster, users can spin up a cluster for the duration of a single job and delete it on completion. Users can execute jobs directly against data in the IBM Cloud Object Storage service and can make the analytic data even more resilient by using the cross-region option. When running Spark, the Analytics Engine uses Stocator to improve data read and write speeds, thereby delivering better performance on I/O-intensive workloads.
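
    The single-job cluster pattern described above can be sketched as follows. This is a minimal illustration only: InMemoryClusterClient, ephemeral_cluster, and run_job are hypothetical names invented for this example and are not the IBM Analytics Engine REST API or CLI. The point is simply that the cluster is deleted on completion whether the job succeeds or fails.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List


class InMemoryClusterClient:
    """Hypothetical stand-in for a cluster-management API (not the real service API)."""

    def __init__(self) -> None:
        self.active: Dict[str, int] = {}  # cluster id -> compute node count
        self.log: List[str] = []          # ordered record of lifecycle events
        self._next_id = 0

    def create_cluster(self, nodes: int) -> str:
        self._next_id += 1
        cluster_id = f"cluster-{self._next_id}"
        self.active[cluster_id] = nodes
        self.log.append(f"created {cluster_id}")
        return cluster_id

    def delete_cluster(self, cluster_id: str) -> None:
        del self.active[cluster_id]
        self.log.append(f"deleted {cluster_id}")


@contextmanager
def ephemeral_cluster(client: InMemoryClusterClient, nodes: int) -> Iterator[str]:
    """Create a cluster, hand it to the caller, and always delete it afterwards."""
    cluster_id = client.create_cluster(nodes)
    try:
        yield cluster_id
    finally:
        # Runs on success and on failure, so no cluster is left behind.
        client.delete_cluster(cluster_id)


def run_job(client: InMemoryClusterClient, nodes: int, job: Callable[[str], str]) -> str:
    """Run one job on a cluster that exists only for the duration of that job."""
    with ephemeral_cluster(client, nodes) as cluster_id:
        return job(cluster_id)


if __name__ == "__main__":
    client = InMemoryClusterClient()
    print(run_job(client, 3, lambda cid: f"ran on {cid}"))
    print(client.log)
```

    Because storage is separate from compute, deleting the cluster in such a pattern would not discard the job's input or output data, which stay in object storage.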

    We believe that users should focus on analyzing data instead of managing clusters and the intricacies of Hadoop or Spark platform configurations. We welcome your participation and feedback in the Beta as we work to simplify the process so that you can focus on gaining insights and taking action.

    This information is based on the article "IBM Analytics Engine beta goes live" on the Bluemix blog.