EBCDIC COBOL CopyBooks to ASCII Conversion

Organizations want to push their mainframe data into their data lake on Hadoop to perform analytics on their complete data and join it with data from other sources. Mainframes store data in the EBCDIC encoding, which presents a challenge since Hadoop tools expect ASCII (or UTF-8) encoded data.

Bitwise Hadoop Adaptor for Mainframe Data is a stand-alone EBCDIC to Hadoop format conversion utility that understands all the different ways data is stored in mainframes (including COBOL CopyBooks) and can convert these complex data structures to the right structure for Hadoop (ASCII, Avro, Parquet, etc.).
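At the character level, EBCDIC to ASCII conversion is a code-page translation. The sketch below is purely illustrative (it does not show the Bitwise adaptor's internals) and assumes code page cp037 (EBCDIC US/Canada); real mainframe files may use other code pages such as cp500 or cp1047:

```python
# Minimal EBCDIC -> ASCII character conversion sketch.
# Assumes code page cp037; adjust for the actual mainframe code page.

def ebcdic_to_ascii(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes into a Python string, which can then be
    written out as ASCII/UTF-8."""
    return raw.decode(codepage)

# The EBCDIC bytes for "HELLO" differ from their ASCII bytes.
ebcdic_hello = "HELLO".encode("cp037")   # b'\xc8\xc5\xd3\xd3\xd6'
print(ebcdic_to_ascii(ebcdic_hello))     # HELLO
```

Character translation alone is not sufficient, however: numeric fields (packed, binary, zoned) are not text and must be decoded structurally, which is why a copybook-aware converter is needed.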


Mainframe Data Conversion Solution

Data structures defined in COBOL CopyBooks can get very complex due to COMP fields (compressed/packed numerics), nested fields, arrays (OCCURS), and REDEFINES clauses within flat files, making accurate code translation difficult to achieve.

Bitwise offers a solution for converting even the most complex COBOL CopyBooks to ASCII, Avro or Parquet format with optimal conversion performance using our Hadoop Adaptor for Mainframe Data and proven conversion methodology.

  • The solution utilizes key components at run-time to acquire data from mainframe filesystems, transform it into Hadoop formats, and load it into HDFS.
  • Efficiently uses the distributed compute power of big data platforms, so there is no need to maintain separate infrastructure for conversion. Data reading, conversion, and loading happen in a single pass, minimizing I/O overhead.
  • Our Hadoop Adaptor for Mainframe Data handles files with multiple record types and variable-length records, and converts signed, packed, binary, and special fields.
  • Bitwise installs and configures the Hadoop Adaptor for Mainframe Data on the edge node.
  • Bitwise can manage the Hadoop Adaptor for Mainframe Data and handle all customized and operational requirements, as needed.
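To illustrate what converting packed and binary fields involves, here is a hedged sketch of decoding two common COBOL numeric encodings. This is not the adaptor's implementation, just a minimal demonstration of the technique:

```python
# Sketch of decoding two common COBOL numeric encodings.
# Illustrative only; the Bitwise adaptor's internals are not shown here.

def unpack_comp3(data: bytes, scale: int = 0):
    """Decode a packed-decimal (COMP-3) field: two BCD digits per byte,
    with the sign in the final low nibble (0xD = negative)."""
    nibbles = [n for b in data for n in ((b >> 4) & 0x0F, b & 0x0F)]
    sign = nibbles.pop()                 # last nibble is the sign
    value = int("".join(map(str, nibbles)))
    if sign == 0x0D:
        value = -value
    return value / 10 ** scale if scale else value

def unpack_comp(data: bytes) -> int:
    """Decode a binary (COMP) field: big-endian two's complement."""
    return int.from_bytes(data, byteorder="big", signed=True)

# A PIC S9(5) COMP-3 field holding +12345 is stored as 0x12 0x34 0x5C.
print(unpack_comp3(b"\x12\x34\x5c"))     # 12345
# A PIC S9(4) COMP field holding -2 is stored as 0xFF 0xFE.
print(unpack_comp(b"\xff\xfe"))          # -2
```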

Mainframe Data Conversion Process

[Figure: Hydrograph Architecture]


Bitwise Mainframe Data Conversion Solution consists of three main components:

  • Acquire: Mainframe data is acquired using FTP and copy commands.
  • Conversion: The conversion process executes on Hadoop, applying SerDe transformations (mainframe to Hadoop formats, data types, fixed and variable length records, etc.) to convert the data.
  • Loading: Once converted, the data is loaded into HDFS (Hive) for further processing.
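The three steps above can be sketched end to end for a simple fixed-length layout. Everything here is an assumption made for illustration: the record layout (10-byte EBCDIC name plus a 3-byte COMP-3 amount), the "|" delimiter, and the in-memory streams standing in for the FTP'd input file and the HDFS target. A production pipeline runs the conversion distributed on the cluster, not on a single node:

```python
# Illustrative single-node sketch of the Acquire -> Conversion -> Loading
# flow for a fixed-length EBCDIC file. Layout and delimiter are assumptions.
import io

RECORD_LENGTH = 13   # hypothetical layout: 10-byte name + 3-byte COMP-3 amount

def unpack_comp3(data: bytes, scale: int = 0):
    """Decode a packed-decimal field (sign in the final low nibble)."""
    nibbles = [n for b in data for n in ((b >> 4) & 0x0F, b & 0x0F)]
    sign = nibbles.pop()
    value = int("".join(map(str, nibbles)))
    if sign == 0x0D:
        value = -value
    return value / 10 ** scale if scale else value

def convert_stream(ebcdic_stream, ascii_out):
    """Read fixed-length records in a single pass and emit delimited ASCII."""
    while True:
        record = ebcdic_stream.read(RECORD_LENGTH)
        if len(record) < RECORD_LENGTH:
            break
        name = record[:10].decode("cp037").strip()
        amount = unpack_comp3(record[10:13], scale=2)
        ascii_out.write(f"{name}|{amount:.2f}\n")

# The Acquire step would FTP the file from the mainframe; here we fake
# one record in memory, and an in-memory buffer stands in for HDFS.
raw = "SMITH".ljust(10).encode("cp037") + b"\x01\x23\x4c"   # amount +12.34
out = io.StringIO()
convert_stream(io.BytesIO(raw), out)
print(out.getvalue())   # SMITH|12.34
```

Note that the conversion streams records and writes output in the same pass, mirroring the single-pass read/convert/load behavior described above.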


Key Benefits

  • Bring mainframe data into the data lake, making it available for Advanced Analytics.
  • Combine mainframe data with any other data sources in the data lake.
  • Faster analytics on mainframe data in Hadoop.