Andrew Smith
287 Custer Street, Hopewell, PA 00000
andrew_smith@example.com
(000) 000-0000
Professional Summary
Implemented multiple bug fixes to improve the application.
Delivered enhancements that added new functionality, simplifying users' daily routines to a few button clicks.
Implemented service improvement projects.
Took ownership of organizing team-building activities at the account level and participated actively in town halls.
Employment history
Big Data Developer, Koelpin, Little and Howe. East Vicente, Michigan
Jun. 2018 – Present
Involved in extracting data from source servers for processing
Extract valid data and insert it into Hive tables
Transform the extracted data
Export the streamlined data to Teradata
Create support for new banking products
Enhance existing banking product standards
Technologies:
Big Data: Hadoop, Hive, MapReduce, Spark
Scripting:
Shell Scripting
Tools:
ETL-Based Framework
IDEs: IntelliJ, Eclipse
Issue and Project Tracking Tools: JIRA / ServiceNow
Version Control System: GIT / SVN
Databases:
RDBMS: MySQL
Front End Developer, Orn, Berge and Fisher. West Beliahaven, Michigan
Oct. 2016 – Jan. 2017
Maintenance and Support for Insurance Application
Enhancing the features of the insurance product by providing a user-friendly interface
Debugging issues raised by clients
Maintaining a zero incident count on a weekly basis
Maintaining a zero transaction balance during monthly and year-end closures
Technologies: IBM Domino, LotusScript, Formula Language, Java, Web Services, JavaScript, HTML, CSS
Tools: Lotus Notes Client 8.5, Domino Designer 8.5, Eclipse Helios (SVN), Apache Tomcat Server, iSeries Emulator
Achievements:
Awarded TCS Gems for outstanding contribution to the organization.
Received appreciation from the client-side network (Allianz – AGCS).
Contributed to several account initiative activities for client visits.
Completed internal training and earned internal certifications.
Developer, Reinger, Jenkins and Okuneva. Hesselton, New Mexico
Sep. 2015 – Oct. 2015
Education
Herzog Institute, Dickensfurt, Wisconsin
Bachelor of Engineering, Electronics and Communication Engineering, Oct. 2014
Eastern Nebraska Academy, West Lynell, Virginia
Class 12, PCMB, Sep. 2010
Sawayn Academy, Elliotttown, North Dakota
Class 10, Nov. 2008
North Alaska University, Lake Santiago, Idaho
Class 10, Sep. 2008
Skills
Hive
MySQL
Spark
Python
Andrew Smith
287 Custer Street, Hopewell, PA 00000
andrew_smith@example.com
(000) 000-0000
Professional Summary
Passionate Hadoop Developer.
Self-motivated, results-driven individual.
Enjoys coding on big data frameworks; has used both Scala and Java for development at the current organisation.
Most free time is dedicated to doing things related to Hadoop in one way or another.
Future goal is to contribute to Apache projects related to big data (something considered for a while).
Employment history
Big Data Developer, Kiehn Inc. Lenoreside, New Hampshire
Apr. 2020 – Present
- Data migration from SQL Server to HDP
- Played a vital role in building the enterprise data lake.
- Hive and Sqoop were used for report generation on the Hadoop platform.
- Responsible for both development of new technology in the Hadoop stack & administration of the existing clusters.
- Worked on Referral programs (Digidhan & Bhim Referral schemes to name a few).
- Worked on designing an interactive analytics architecture on HDP, based on the lambda architecture.
- Spark was used for building the pipeline and as a processing tool.
- HBase as the storage layer and Apache Phoenix as a SQL layer for interactive analysis on the most granular data
- Apache Druid for interactive analysis on semi- and fully aggregated data
- Tableau is used for interactive analysis and reporting
- Currently working on building a real-time processing application (a sketch follows the cluster details below):
- Data is read from Apache Kafka
- Spark Streaming is used as the streaming framework
- HBase is the data storage layer.
- Cluster details:
- Clusters: 4 (all currently running HDP 2.6.5)
- Nodes: 36 production nodes
- Data size: 55 TB
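A minimal sketch of the real-time path described above (Kafka to Spark Streaming to HBase), assuming the spark-streaming-kafka-0-10 integration; the broker address, topic, table, and column family are placeholder names:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object KafkaToHBase {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("kafka-to-hbase"), Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",          // placeholder broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "txn-consumers",
      "auto.offset.reset"  -> "latest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("transactions"), kafkaParams))

    // One HBase connection per partition; each Kafka record becomes a Put.
    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = connection.getTable(TableName.valueOf("txn_events"))
        records.foreach { record =>
          val rowKey = Option(record.key).getOrElse(record.value) // Kafka keys may be null
          val put = new Put(Bytes.toBytes(rowKey))
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes(record.value))
          table.put(put)
        }
        table.close()
        connection.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```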
Integration & Implementation Engineer, Medhurst, Moore and Barrows. Dickinsonchester, Hawaii
Feb. 2015 – Apr. 2016
- Modify existing configuration to correct errors, to adapt it to new hardware, or to upgrade interfaces and improve performance.
- Implement or perform preventive maintenance, backup, or recovery procedures.
- Install, or coordinate installation of, new or modified software, or programming modules of telecommunications systems.
- Monitor and analyze system performance, such as network traffic, security, and capacity.
- Generated daily reports for the client and higher management.
- Client: Mobilly
Education
Waelchi University, East Aikoshire, South Carolina
B.Tech, Electronics & Telecommunications, Apr. 2013
Skills
Apache Sqoop
Experienced
Apache Hive
Experienced
Apache Spark
Experienced
Apache Hbase
Experienced
Apache Phoenix
Experienced
Apache Nifi
Skillful
Apache Oozie
Skillful
Apache Kafka
Skillful
Java
Skillful
Scala
Skillful
big data developer
- Extracted metadata of Salesforce objects through MS Excel and created Hive tables dynamically
- Implemented SCD Type 2 using HQL and shell scripts (see the sketch after this entry)
- Tool set: Hive, shell script, HDFS, Informatica Developer (BDM), Informatica Cloud Services, CA Workload Automation, HP ALM, FileZilla, MS Excel
- Worked with RDDs.
- Contributed to creating applications to gain knowledge of the data.
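A simplified sketch of the SCD Type 2 load mentioned above, written as a single HiveQL insert-overwrite and run here through Spark's Hive support in Scala for consistency with the other sketches. The `dim_customer` and `stg_customer` tables and their columns are hypothetical; a production version would also compare attribute values or hashes so that unchanged rows are not re-versioned:

```scala
import org.apache.spark.sql.SparkSession

object ScdType2Load {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("scd2-load").enableHiveSupport().getOrCreate()

    // Insert-overwrite pattern for pre-ACID Hive: keep untouched rows,
    // close out current versions of changed keys, open new versions.
    spark.sql("""
      INSERT OVERWRITE TABLE dim_customer
      SELECT d.id, d.name, d.city, d.eff_date, d.end_date, d.is_current
      FROM dim_customer d
      LEFT JOIN stg_customer s ON d.id = s.id
      WHERE s.id IS NULL OR d.is_current = 'N'              -- history and unaffected keys
      UNION ALL
      SELECT d.id, d.name, d.city, d.eff_date,
             CURRENT_DATE, 'N'                               -- close out changed rows
      FROM dim_customer d
      JOIN stg_customer s ON d.id = s.id
      WHERE d.is_current = 'Y'
      UNION ALL
      SELECT s.id, s.name, s.city,
             CURRENT_DATE, CAST('9999-12-31' AS DATE), 'Y'   -- open new versions
      FROM stg_customer s
    """)
  }
}
```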
big data developer
- Worked as a Big Data Developer at FORMCEPT.
- Worked mostly in Python, Java, Hadoop, Kafka.
- Collaborated with front-end engineers to see projects through, from conception to completion.
- Contributed creative ideas and insights for improvement in the product.
- Worked on API Design.
- Worked with Apache Kafka, consuming over one channel and publishing the results to another; used an Avro schema for encoding and decoding the input/output (see the sketch after this entry).
- Optimised the applications.
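A minimal sketch of the consume-transform-publish loop described above, using kafka-clients with plain binary Avro (no schema registry). The topic names, broker address, schema, and the doubling transformation are all placeholders:

```scala
import java.io.ByteArrayOutputStream
import java.time.Duration
import java.util.{Collections, Properties}
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericDatumReader, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object AvroRelay {
  // Hypothetical Avro schema for the events on the input channel.
  private val schema = new Schema.Parser().parse(
    """{"type":"record","name":"Event","fields":[
      |{"name":"id","type":"string"},{"name":"score","type":"double"}]}""".stripMargin)

  private def decode(bytes: Array[Byte]): GenericRecord =
    new GenericDatumReader[GenericRecord](schema)
      .read(null, DecoderFactory.get().binaryDecoder(bytes, null))

  private def encode(record: GenericRecord): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val enc = EncoderFactory.get().binaryEncoder(out, null)
    new GenericDatumWriter[GenericRecord](schema).write(record, enc)
    enc.flush()
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val cProps = new Properties()
    cProps.put("bootstrap.servers", "broker1:9092")
    cProps.put("group.id", "avro-relay")
    cProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    cProps.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")

    val pProps = new Properties()
    pProps.put("bootstrap.servers", "broker1:9092")
    pProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    pProps.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")

    val consumer = new KafkaConsumer[String, Array[Byte]](cProps)
    val producer = new KafkaProducer[String, Array[Byte]](pProps)
    consumer.subscribe(Collections.singletonList("events-in"))

    while (true) {
      consumer.poll(Duration.ofMillis(500)).forEach { rec =>
        val event = decode(rec.value())
        // Placeholder transformation before republishing.
        event.put("score", event.get("score").asInstanceOf[Double] * 2)
        producer.send(new ProducerRecord("events-out", rec.key(), encode(event)))
      }
    }
  }
}
```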
big data developer
- Develop architectural strategies at the modeling, design, and implementation stages to migrate the project to a big data environment.
- Collaborate with big data specialist, design analysts, and others to understand business requirements.
- Implement database scripts in Hive and test them for identical results and optimized, seamless performance.
- Data stored in the Salesforce-based CRM application (Veeva CRM) is extracted and brought into the Medical Integration Hub in the Hadoop ecosystem, where it is used for business analytics and reporting purposes
big data developer
- Currently working as a Scala developer on big data technology (Spark) at Johnson Controls.
- Responsible for developing multiple use cases from requirement analysis to production deployment.
- Direct the analysis, development, and operation of complete feature/systems.
- Along with Spark and Hadoop, worked on HBase, Oozie, Hive, Python, and Unix shell scripting (moderately).
- Also worked in data science for retail data analysis.
big data developer
- Project Name: Japan Digital Data
- Data is extracted from an Oracle-based data warehouse containing all the information generated in interactions between HCPs and sales representatives, particularly in Japan.
- Developing maps, workflows and applications in Informatica BDM to implement the SCD logic in the data lake
- Creation of scripts to automatically generate the DDLs from the file sources and create the tables in the data lake
- Designing Job flow in CA Workload automation tool for monitoring the jobs
- Tool set: Hive, HDFS, Informatica Developer, Shell, CA Workload Automation, HP ALM, TOAD
- Project Name: Medical Integration Hub
big data developer
- Worked on Ab Initio (ETL tool) for data migration from
- Used GIT as a version control system to keep tracking code changes.
- Managed import of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from SQL into HDFS using Sqoop
- Worked on the documentation.
big data developer
- Requirements gathering and interactions with business teams to create data engineering solutions for network related problems.
- Use ETL tools to gather data from third party sources and ingest them post processing into the data lake.
- Building frameworks over data to create meaningful use cases that help alert the teams concerned about network hygiene.
- Hands on experience of working with Spark SQL and Streaming technologies.
senior big data developer
- Understand, design, and implement requirements to meet client expectations.
- Understand the raw data; design and develop Spark applications to process, store, and visualise the data.
- Troubleshoot and optimise time- and memory-consuming Spark applications.
- Used Agile (SCRUM) methodologies for application development.
big data developer
- Analysis, Design and Development of complex business applications in Apache Spark with Scala.
- Orchestration of a vast set of applications involving Spark jobs, web services, Python, and shell scripts using Apache Oozie and NiFi.
- Design the end-to-end flow using Apache NiFi, which handles data routing, transformation, and system mediation logic.
- Responsible for developing a scripting-based framework used as the base for orchestration in the complete project.
- Work with the Architect directly to manage large, complex design projects for corporate clients.
- Mentoring the next generation of developers to understand the architecture and functioning of the project.
big data developer
- Gathered requirements from the Onsite lead
- Involved in all the team meetings & meeting with the clients.
- Worked on creating Hive table definitions.
- Worked on getting the data from the files and loading it into Hive external tables.
- Enriched the data in Hive tables by joining different tables and creating a final view for the enriched data (see the sketch after this entry).
- Involved in the development, testing, and validation of the entire codebase related to Hive, NiFi, and shell scripting
- Worked with the Hadoop admin for the deployment of the project.
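A short sketch of the external-table definition and enrichment view described above, issued through Spark's Hive support in Scala; the table names, columns, and HDFS location are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object EnrichedOrdersView {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("hive-enrichment").enableHiveSupport().getOrCreate()

    // External table over files already landed on HDFS.
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS raw_orders (
        order_id STRING, customer_id STRING, amount DOUBLE, order_ts STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      STORED AS TEXTFILE
      LOCATION '/data/landing/orders'
    """)

    // Final view that enriches the raw feed by joining a reference table.
    spark.sql("""
      CREATE OR REPLACE VIEW enriched_orders AS
      SELECT o.order_id, o.amount, o.order_ts, c.segment, c.region
      FROM raw_orders o
      JOIN customer_ref c ON o.customer_id = c.customer_id
    """)
  }
}
```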
big data developer
- Worked as Offshore lead to gather business requirements and guided the team
- Implemented Spring Boot microservices to process messages into the Kafka cluster setup
- Used Spark SQL on data frames to access MySQL tables in Spark for faster processing of data (see the sketch after this entry).
- Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS.
- Used Spark for interactive queries, processing of streaming data, and integration with a popular NoSQL database for huge volumes of data.
- In the preprocessing phase of data extraction, used Spark to remove missing data and transform the data to create new features.
- Improved Spark performance and optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, PairRDDs, and Spark on YARN
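A minimal sketch of pulling a MySQL table into a Spark DataFrame over JDBC, as in the Spark SQL bullet above. The connection URL, table, and credentials are placeholders, and the MySQL JDBC driver is assumed to be on the classpath:

```scala
import org.apache.spark.sql.SparkSession

object MysqlOverSpark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("mysql-over-spark").getOrCreate()

    // Load a MySQL table into a DataFrame over JDBC.
    val orders = spark.read.format("jdbc")
      .option("url", "jdbc:mysql://db-host:3306/sales")
      .option("driver", "com.mysql.cj.jdbc.Driver")
      .option("dbtable", "orders")
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Query it with Spark SQL and persist the result to HDFS.
    orders.createOrReplaceTempView("orders")
    spark.sql("SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id")
      .write.mode("overwrite").parquet("hdfs:///warehouse/order_totals")
  }
}
```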
big data developer
- Gained hands-on experience with a polyglot development culture.
- Write Unit and Integration tests to ensure end to end code coverage.
- Recommended architectural improvements, design solutions and integration solutions. (Optimise workflows, choosing better platforms, migrating CI Pipelines)
- Designed an RBAC system and developed REST APIs with Node.js and a MySQL database.
big data developer
- Client: Intel
- Environment: Linux
- Development model: Agile
- Implemented automated mail alert system for ETL flow status.
big data developer
- Modified the existing code as per the requirements.
- Decreased the run time of the jobs by performing performance tuning in the hive scripts.
- Monitored jobs in production, ensuring that there are no failures.
- Fixed failed jobs and bugs in the application.
- Automated the logging of every feed received by the application for a better understanding by the downstream team.
- Co-authored design document for the new requirement of the existing application.
big data developer
- Implemented Python and Go scripts and REST APIs to connect different file systems (HDFS, SFTP, and AWS S3), database systems (Oracle, PostgreSQL, MySQL, Cassandra, and MongoDB), and external systems (SAP, Salesforce, and Kinaxis) for the ETL process.
- Implemented uploading and processing of files at large scale within a short time using Scala and Spark RDD/DataFrame technologies.
- Designed, modified, and improved Python and JavaScript through code refactoring.
- Responsible for designing and developing the Scala- and Node-based technologies.
- Worked closely with business and technical design architects to understand the flow.
- Used Docker/Kubernetes to run all services of the product.
- Automated data extraction and loading into different systems under SCM with high security.
big data developer
- Involved in analysis of requirements and business rules based on the given documentation, and worked closely with tech leads and business analysts to understand the current system.
- Analyze the data coming from different sources to learn its schema and functionality.
- Use Sqoop scripts to ingest data from different RDBMS sources into the Hadoop cluster (HDFS); created Hive tables and partitions and loaded data into Hive tables.
- Worked with several functions in the Scala library to build Spark applications: Spark SQL, RDD transformations and actions, and data frames, pushing the results to HDFS (see the sketch after this entry).
- Developed Hive programs to parse the raw data, populate staging tables, and store the refined data in partitioned tables in the EDW.
- Created Hive queries and Scala applications that helped marketing analysts spot emerging trends by comparing fresh data with EDW reference tables and historic metrics.
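A short sketch of the RDD-based flow described above: parse a raw feed with transformations, then use an action to materialize the result into a partitioned staging table. The input path, delimiter, field layout, and table name are all placeholders:

```scala
import org.apache.spark.sql.SparkSession

object RawFeedToStaging {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("raw-to-staging").enableHiveSupport().getOrCreate()
    import spark.implicits._

    // Transformations: parse the delimited feed, drop malformed lines, reshape.
    val parsed = spark.sparkContext
      .textFile("hdfs:///data/raw/clicks")
      .map(_.split("\\|"))
      .filter(_.length == 3)
      .map(f => (f(0), f(1), f(2).toLowerCase))

    // Action: materialize as a DataFrame and write a date-partitioned staging table.
    parsed.toDF("user_id", "event_date", "action")
      .write.mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("clicks_staging")
  }
}
```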