Amazon preps Data Pipeline service to automate big data workflows

Amazon’s newly announced Data Pipeline service will help Amazon Web Services customers get a better grip on handling data scattered across the various AWS data repositories, as well as third-party databases, Amazon CTO Werner Vogels said Thursday.

This tool will make it easy for AWS customers to create automated, scheduled workflows that move data from DynamoDB to S3 storage, or wherever else it is needed. “It’s pre-integrated with AWS data sources and easily connected to third-party and on-premise data sources,” Vogels said.

The proliferation of data, whether machine logs, sensor data or plain old database data, is driving the need to automate the flow of that data from databases to storage to applications and back. “You have to put everything in logs, which creates even more data…in AWS,” Vogels said.

Users build their workflows with a drag-and-drop interface and schedule them to run periodically. By making it easy to consolidate data in one place, the service should let customers run big batch analytics jobs on their logs and other information.
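For readers who prefer an API to the console, here is a minimal sketch of what defining such a workflow programmatically might look like, assuming the Python SDK’s (boto3) datapipeline client and AWS’s documented pipeline-definition object model. The table name, bucket path and the omitted export activity are illustrative placeholders, not part of the announcement.

```python
# Minimal sketch: drive AWS Data Pipeline from Python via boto3.
# The pipeline objects below are illustrative placeholders for a daily
# DynamoDB-to-S3 export, not a complete, runnable pipeline definition.
import boto3

dp = boto3.client("datapipeline", region_name="us-east-1")

# Register an empty pipeline; uniqueId guards against duplicates on retry.
pipeline_id = dp.create_pipeline(
    name="daily-dynamodb-export",
    uniqueId="daily-dynamodb-export-v1",
)["pipelineId"]

# A pipeline definition is a list of objects, each a bag of key/value "fields".
# Plain values use stringValue; references to other objects use refValue.
definition = [
    {"id": "Default", "name": "Default", "fields": [
        {"key": "scheduleType", "stringValue": "cron"},
        {"key": "schedule", "refValue": "DailySchedule"},
    ]},
    {"id": "DailySchedule", "name": "DailySchedule", "fields": [
        {"key": "type", "stringValue": "Schedule"},
        {"key": "period", "stringValue": "1 Day"},
        {"key": "startAt", "stringValue": "FIRST_ACTIVATION_DATE_TIME"},
    ]},
    {"id": "SourceTable", "name": "SourceTable", "fields": [
        {"key": "type", "stringValue": "DynamoDBDataNode"},
        {"key": "tableName", "stringValue": "orders"},  # hypothetical table
    ]},
    {"id": "ExportBucket", "name": "ExportBucket", "fields": [
        {"key": "type", "stringValue": "S3DataNode"},
        {"key": "directoryPath", "stringValue": "s3://my-logs/exports/"},  # hypothetical bucket
    ]},
    # The activity that actually moves the rows (an EMR-backed export in
    # practice) is omitted to keep this sketch short.
]

dp.put_pipeline_definition(pipelineId=pipeline_id, pipelineObjects=definition)
dp.activate_pipeline(pipelineId=pipeline_id)  # the schedule now fires once a day
```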

There weren’t many more details than that, but given AWS’s track record, the service should be available soon. Stay tuned for updates.


GigaOM