Today, the Hadoop ecosystem has become a must-have enterprise technology stack for organizations seeking to process and understand large-scale data in real time. Hadoop has multiple enterprise applications, such as data lakes, analytics, ELT, and ad hoc processing, and more are being discovered at an increasingly fast pace.
The first step in any Hadoop data processing pipeline is to ingest data into Hadoop, which makes data ingestion the first hurdle in harnessing the power of Hadoop.
At present, there are many tools available for ingesting data into Hadoop. Some are suited to specific use cases: Apache Sqoop is a great tool for importing and exporting data from RDBMS systems, Apache Falcon is a good option for a data set registry, and Apache Flume is preferred for ingesting real-time event streams. There are many commercial alternatives as well. A few of the available tools, such as Spring XD (now Spring Cloud Data Flow) and Gobblin, are general-purpose. The selection of options can be overwhelming, and you certainly need the right tool for your job.
But none of these tools can solve all the challenges on its own, so enterprises have to use multiple tools for data ingestion. Over time, they also create custom tools or wrappers on top of existing tools to meet their needs. Furthermore, all these tools rely on text-based configuration files (mostly XML), which are neither convenient nor user-friendly to work with. All of this adds a lot of complexity and overhead to maintaining data ingestion applications.
To address these gaps and to help our clients streamline Hadoop adoption, Bitwise has developed a GUI-based tool for data ingestion and transformation on Hadoop. With a convenient drag-and-drop GUI, it enables developers to quickly build end-to-end data pipelines from a single tool. Apart from multiple source and target options, it also has many pre-built transformations that range from the usual data warehousing operations to machine learning and sentiment analysis. The tool is loaded with the following data ingestion features:
Bitwise’s Hadoop data ingestion and transformation tool can save enormous effort in developing and maintaining data pipelines. Stay tuned for subsequent features that explore the other phases of the data value chain.
Pushpender has been a leading member of the Bitwise Design and Architecture Research Team (DART), which is responsible for driving innovations in big data architecture, cloud computing, blockchains and microservices, and he played a key role in developing Hydrograph as the project’s chief architect.