54eb0244-d8df-474f-ad56-afa3b7242c00
Andrew Smith
287 Custer Street, Hopewell, PA 00000
andrew_smith@example.com
(000) 000-0000
Professional Summary
Employment history
- SQL optimization of previously developed ETL jobs to improve performance.
- Integration with the Gigya API to pull customer-related data on a daily basis; the data was stored in a table using SCD Type 6 to keep history at the customer level.
- Worked on a pipeline to segregate vouchers issued to customers daily from normal payments, and later built a dashboard to track campaign success ratios.
- Integration with XE.com to fetch daily exchange rates so payments are processed at the exchange rate of the relevant date.
- Worked on the creation of a consolidated table containing all live information related to a user, for ease of use by higher management.
- Implemented an error-logging module in all pipelines so that they can re-run on any failure and an email is generated.
- Worked on multi-tenancy of ETL jobs.
- Communicated directly with clients, providing them insights into data through customized dashboards tailored to their requirements.
- Created data pipelines using SSIS.
- Wrote stored procedures to create data sources for dashboards developed on in-house tools such as RNA and Arcplan.
- Wrote web services linking the web and desktop versions of the POS.
- Wrote stored procedures and developed invoice reports using Crystal Reports.
- Converted legacy code to VB.NET and implemented the MVVM architecture pattern.
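The SCD Type 6 approach mentioned above combines Type 1, 2, and 3 behaviour: a current flag, effective dates, and a current-value column maintained on every row. A minimal sketch of such an upsert, with illustrative column names (not the actual table schema):

```python
from datetime import date

def scd6_upsert(history, customer_id, new_value, today):
    """Apply an SCD Type 6 change: expire the open row (Type 2),
    append a new current row, and overwrite current_value on all
    rows for the customer (the Type 1/3 part)."""
    for row in history:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["is_current"] = False
            row["end_date"] = today          # Type 2: close the old version
    history.append({
        "customer_id": customer_id,
        "value": new_value,                  # Type 2: value as of this version
        "current_value": new_value,
        "start_date": today,
        "end_date": None,
        "is_current": True,
    })
    for row in history:
        if row["customer_id"] == customer_id:
            row["current_value"] = new_value  # Type 1/3: latest value everywhere
    return history
```

In a real warehouse this would be a MERGE over the dimension table rather than an in-memory list; the sketch only shows the row lifecycle.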
Education
Skills
7848f5a4-11d4-4fd2-b0bc-58fd73401526
Andrew Smith
Professional Summary
Employment history
- Worked on several enterprise projects, developing efficient and effective solutions catering to business needs.
- Worked on the company's data-heavy product, from designing the console to building and optimizing it.
- Optimized project components through code refactoring and resolved critical issues such as cross-browser compatibility, latency, and spam mail.
- Trained in MEAN-stack development and built challenging projects for various stages of the career path using Angular, Node.js, Express.js, MongoDB, and Bootstrap.
- Learned the basics of deployment using AWS EC2, Route 53, and Nginx.
Education
Personal Competencies
Profile
Personal info
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Skills
b6f245cc-905f-4e69-9617-1534995725a3
Andrew Smith
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Professional Summary
- Hands-on, successful Software and Data Engineer with around six years of verifiable success leading teams that deliver appropriate data solutions.
- Excellent understanding of Hadoop architecture and the Hadoop Distributed File System, including components such as NameNode, DataNode, JobTracker, TaskTracker, YARN, and MapReduce concepts, plus Pig and Hive.
- Experience supporting data analysis projects using EMR, EC2, S3, Data Pipeline, RDS, Lambda, Glue, and Athena on the Amazon Web Services (AWS) cloud.
- Good understanding of Spark internals and performance-optimization techniques, with hands-on experience creating optimized Spark jobs using Python, Spark, and SQL.
- Experience working with Hive: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
- Comprehensive knowledge of SDLC, enterprise architecture, agile methodologies, cloud services, and web-based applications.
Skills
Employment history
East Shalon, North Carolina
- Full life cycle development including requirements analysis, high-level design, coding, testing, and deployment.
- Extensive working knowledge of structured query language (SQL), Python, Spark, Hadoop, HDFS, AWS, RDBMS, data warehouses, and document-oriented NoSQL databases.
- Automated the process of downloading raw data into the Data Lake from various source systems such as SFTP/FTP/S3, using shell scripting and Python.
- Developed Hive scripts for parsing raw data on EMR, storing the results in S3, and ingesting them into a data warehouse (Snowflake) utilized by enterprise customers.
- Designed ETL jobs to process the raw data using Spark and Python in Glue, EMR, and Databricks.
- Used Python to pull raw data from various sources such as Google DCM, DBM, AdWords, Facebook, Twitter, Yahoo, and Tubular; this data was then parsed using the Spark framework and ingested into Hive tables.
- Implemented MapReduce programs using PySpark to parse the raw data per business user requirements and store the results in the Data Lake (AWS S3).
- Implemented several data pipeline jobs to pull raw data from different sources into an AWS S3 bucket, process it using PySpark on an EMR cluster, and store the processed data back in AWS S3.
- Created Spark jobs per business requirements; the jobs run on EMR and are triggered by Lambda.
- Worked with ETL tools such as SSIS and reporting tools such as SSRS, Power BI, and Tableau.
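A Lambda-triggered Spark job on EMR, as in the bullets above, usually amounts to submitting a `spark-submit` step to a running cluster. A hedged sketch that only builds the step definition; the boto3 `add_job_flow_steps` call is shown in a comment, and the cluster ID, job name, and script path are placeholders:

```python
def build_spark_step(name, script_s3_path, extra_args=()):
    """Build an EMR step definition that runs spark-submit on a PySpark
    script stored in S3. Names and paths here are illustrative only."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",   # EMR's generic command runner
            "Args": ["spark-submit", "--deploy-mode", "cluster",
                     script_s3_path, *extra_args],
        },
    }

# Inside the Lambda handler one would submit the step, e.g.:
#   import boto3
#   emr = boto3.client("emr")
#   emr.add_job_flow_steps(
#       JobFlowId="j-XXXXXXXX",   # placeholder cluster ID
#       Steps=[build_spark_step("daily-etl", "s3://bucket/jobs/etl.py")])
```

The Lambda itself is then wired to an S3 event or a schedule, so arriving data (or the clock) kicks off the EMR job.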
Jarrettstad, Arkansas
- Wrote automated scripts in Python for system testing; performed GUI-based testing using Selenium with Python.
- Took a leading role in test automation and manual testing; actively involved in creating detailed test plans, test cases, and test scenarios for different application modules according to functional requirements and business specifications.
- Responsible for conducting smoke, functional, UI, regression, and ad-hoc testing.
- Facilitated the resolution of testing roadblocks, ensured execution of QA deliverables, and guided team members on agile standards and best practices.
- Regularly interacted with management and product owners on project status, priority setting, and sprint timeframes.
- Created test plans and test reports for multiple releases of various mobile applications; coordinated off-shore automation testing efforts and test cases through weekly review meetings.
- Established and reviewed QA sign-off criteria and the software build and test process with the scrum team.
- Assisted product development teams in the implementation of work plans and the production of review documentation.
Education
- Eastern Louisiana University – Fayburgh, Oregon
- Dooley Institute – Port Raphaelville, Nebraska
debbb8a1-2fd5-4b09-8d7d-307897a34adc
Andrew Smith
287 Custer Street, Hopewell, PA 00000
andrew_smith@example.com
(000) 000-0000
Professional Summary
Employment history
- Working on ScyllaDB (a fast-processing database) for the message-intact mechanism.
- Contributed ideas and suggestions in team meetings and delivered updates on deadlines, design, and enhancements.
- Working on data science technologies for building machine-learning models.
- Working on DIP (Digital Insights Platform) for data ingestion.
- Worked on Salesforce and Pardot web development projects.
Education
Accomplishments
Awards
Publications
Languages
Skills
e8fad1e4-d5ac-46e9-9e43-7285f5631916
Andrew Smith
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Professional Summary
Employment history
Osinskimouth, Massachusetts
- Manage the Snowflake data warehouse for the Gartner Digital Markets team.
- Developed a high-frequency data integration with BigQuery for ingesting sessions and hits data.
- Provide advice on cost optimizations and process changes.
- Set up Airflow as the ETL tool, migrating off the existing ETL tool in an effort to reduce costs by up to $25,000.
Collierton, South Dakota
Project – AdTech Targeting
- Wrote the script to fetch bid data from MediaMath's Firehose using the Japronto framework, an AWS Load Balancer, and EC2 instances.
- Handled 2 GB/min of data using 1 ELB and 8 EC2 instances.
- Each bid is one request, at ~1 million requests per minute; implemented file-rotation logic to create a new file every 2 minutes and wrote the script to upload the files to an AWS S3 bucket.
- Developed a connector for pushing audiences to ad networks; implemented data-preparation logic using PySpark to optimize performance, reducing upload time from 6 hours to 20 minutes.
- Implemented data preparation, transformation, and update-logic pipelines.
- Integrated the Foursquare API to fetch POI attributes.
- Developed a multithreaded integration to reduce data-ingestion time from 4 hours to 30 minutes.
- Developed a Python Lambda function to trigger a response based on text received from Alexa.
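The two-minute file-rotation logic described above can be sketched as a writer that starts a new file whenever the rotation interval elapses; the closed file is handed off for upload. Directory layout, file naming, and the interval are illustrative, and the S3 upload is left as a callback:

```python
import os
import time

class RotatingFileWriter:
    """Write records to a file, starting a new file every `interval`
    seconds; each closed file is passed to `on_rotate` (e.g. an S3
    upload). The clock is injectable so the logic is testable."""
    def __init__(self, directory, interval=120, on_rotate=None, clock=time.time):
        self.directory = directory
        self.interval = interval
        self.on_rotate = on_rotate
        self.clock = clock
        self._opened_at = None
        self._file = None

    def write(self, record):
        now = self.clock()
        if self._file is None or now - self._opened_at >= self.interval:
            self._rotate(now)
        self._file.write(record + "\n")

    def _rotate(self, now):
        if self._file is not None:
            self._file.close()
            if self.on_rotate:
                self.on_rotate(self._file.name)  # e.g. upload to S3, then delete
        path = os.path.join(self.directory, f"bids-{int(now)}.log")
        self._file = open(path, "w")
        self._opened_at = now
```

At ~1 million requests per minute the real script would also buffer writes and rotate on size as well as time, but the time-based trigger is the core of it.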
Development and Maintenance of Ad-network Plugins (Extract-Transform-Load)
- Developed 4 plugins by integrating APIs and SDKs of multiple ad networks and affiliates to fetch data for marketing analytics.
- Extracted and transformed the marketing data as per business requirements, handling possible edge cases.
- Ensured quality checks multiple times before delivering each plugin.
- Used Phabricator for plugin delivery and fixing issues in existing plugins, and Jira to resolve over 140 tickets related to around 50 plugins.
- Achieved 100% unit-test code coverage, no lint issues, and 0% technical debt; followed best coding practices (DRY, SoC, etc.).
- Created weekly, monthly, and yearly spend/acquisition Looker dashboards for C-level executives in the Growth Marketing team.
- Wrote and evaluated complex SQL, performed data analysis, and extracted presentable insights from scrubbed data.
- Performed data cleanup.
Education
- Northern Cole Institute – Port Melanieport, Rhode Island
Skills
ca04a900-f52e-499a-9943-704a65c6450a
Andrew Smith
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Professional Summary
My main responsibilities:
1. Gather a proper understanding of client requirements.
2. Provide optimized designs for client projects.
3. Develop and troubleshoot applications.
4. Build, deploy, run, and test applications in UAT and Dev environments.
5. Guide and help juniors pick up new technologies.
Employment history
Brandonchester, Washington
Projects
BBO_CUPR
Technologies: Hadoop 2.6, Hive 13, Spark 2.1, Sqoop 1.4, Jenkins, shell, Maven, Scala 2.1
Ingest structured and unstructured data from different sources, such as RDBMS and NAS, into the Hadoop environment; apply filter/parsing logic to the raw data and load it into Hive tables. Business logic is then applied on top of the filtered data, which is moved to the final location and loaded into Hive tables. From the final location, data can be exported to an external location using Sqoop and scp.
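The filter/parsing stage described above, applied to raw delimited records before loading into Hive, can be sketched as follows; the delimiter, field count, and routing of bad rows to a reject location are assumptions for illustration:

```python
def parse_raw(lines, delimiter="|", expected_fields=3):
    """Split raw delimited lines into records, dropping malformed rows,
    the way a filter/parse stage might before loading a Hive table."""
    good, bad = [], []
    for line in lines:
        parts = line.rstrip("\n").split(delimiter)
        if len(parts) == expected_fields and all(p.strip() for p in parts):
            good.append(parts)
        else:
            bad.append(line)   # routed to a reject/error location for review
    return good, bad
```

In the actual pipeline this logic would run as a Spark or Hive transformation over HDFS files rather than a Python loop; the sketch only shows the validation rule.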
New Joane, Missouri
Project: Ingestion Framework(1.0) – Product Development
Environment: Hadoop 2.8.1, Java 1.8, Scala 2.16, Spark 2.10, U, MySql, Google Cloud
This project covers the development of an ingestion framework that can ingest any kind of structured or unstructured data into the Hadoop system from different data sources, such as NAS and RDBMS; the framework is dynamic enough to add new source systems as required. Validator modules were introduced to validate the ingested data.
Environment: Hadoop 2.6, Java 1.8, Scala 2.16, Spark 2.10, Google Pub/Sub, Google Datastore, Unix, Spring Boot
Evelynnside, Illinois
Project: Patient Centered Care Model (PCCM) – Provider search
Environment: MapR 3.0.1 (Hadoop 0.20.2, Hive 0.12.0, Sqoop), JDK 1.6, MySQL, Subversion, Maven, AnthillPro, Artifactory
Environment: Cloudera 5.2, Hive 0.12.0, Tableau, Bedrock 3.2.2, Maven, Spark, Yahoo Query Language, Java 1.7
In this project, companies' stock-related data and revenue data are extracted by ticker name from a finance site and ingested into Hadoop using Bedrock (a custom product). Stock tweets are extracted and processed using machine learning to perform sentiment analysis with the help of the ingested stock data. Final results are visualized using Tableau Desktop.
Environment: Hadoop 2.5.1-mapr-1501, Hive 0.13.0, Java 1.7, Sqoop 1.4.5, SQLite3, shell script
This project is divided into different modules capable of ingesting structured data from different sources into the Hadoop file system. Ingested data is validated and filtered; different business logic is applied to the filtered data, which is then moved to publish locations. A SQLite DB is used to keep track of the process flow, and proper notifications are sent to the client during processing.
Submodules:
- RDBMS data ingestion: structured data from different databases is ingested using Sqoop.
- File movement: structured files (delimited and fixed-length) are ingested from different remote clusters using proper load balancing.
Environment: Hadoop 2.6.0, Hive 0.14.0, Java 1.8, Spark, Sqoop, Unix scripting, Draw2D, Spring, Hibernate, Elasticsearch
This project consists of the development of data lineage functionality in the Bedrock product. Data lineage tracks the transitions of data inside the Hadoop file system that are triggered by Hive, Sqoop, MapReduce, Spark, and shell jobs, and users can see those transitions in a visual UI. It is also capable of connecting to Apache Atlas and Navigator to visualize the lineage at an advanced stage.
Education
- Harris University – West Chuck, Wyoming
Skills
e557999d-59b8-4c3d-863a-41914ae7d1cc
Andrew Smith
Professional Summary
Employment history
- Create a Data Lake in the AWS environment using Spark (PySpark) on AWS Glue for ETL jobs, AWS Lambda for automation and job triggering, Athena as the primary query layer, and S3 for storage.
- Develop data models for applications, metadata tables, views, and related database structures.
- Document and communicate database schemas using accepted notations.
- Document search using Elasticsearch: created a Flask API for processing documents and storing them in Elasticsearch for search queries; containerized this API using Docker for easier development, testing, and deployment.
- Model serialization for H2O.ai models: exploring and implementing serialization techniques for H2O machine-learning models using pickle, joblib, and POJO (Plain Old Java Object).
- Survey application programming, lead: designed and developed the front-end and back-end framework to implement the business logic used for assessing social desirability of employees within an organisation. (https://mindofn.games/faultinme)
- Dashboard design, developer: used Tableau to visualize key performance indicators and presented them as a story; integrated Highcharts (a visualization library) into an existing project.
- Database developer: develop and maintain archived procedures, procedural code, and queries for existing applications.
- Automation, developer: identifying and programming tools to automate data processing and optimize existing processes.
- Platform development, developer: web platform development using PHP (CodeIgniter) as the server-side language, SQL Server as the database, and HTML/CSS (Bootstrap framework) and jQuery for the front end.
- Production testing, tester: test the functionality of new features and services added to the platform.
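The model-serialization bullet above mentions pickle and joblib; the pickle route amounts to a round trip like the following. A stand-in object replaces an actual H2O model here, since H2O models live in a JVM backend and in practice need H2O's own export helpers (or the POJO route) rather than plain pickle:

```python
import pickle

class StubModel:
    """Stand-in for a trained model, used only to illustrate the
    pickle mechanics; not an actual H2O estimator."""
    def __init__(self, threshold):
        self.threshold = threshold
    def predict(self, x):
        return int(x >= self.threshold)

model = StubModel(threshold=0.5)
blob = pickle.dumps(model)       # serialize to bytes (pickle.dump writes to a file)
restored = pickle.loads(blob)    # deserialize back into a usable object
```

joblib follows the same dump/load pattern but is better suited to objects carrying large NumPy arrays.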
Education
Personal info
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Skills
64e3be8f-d76f-43de-bbbc-28bcf9a91d12
Andrew Smith
Professional Summary
Employment history
- Select programming languages, design tools, and applications for developing an upgrade for Kindle as part of Amazon's ACMS program.
- Develop and document style guidelines for web site content.
- Design and build web sites, using scripting languages, content creation tools, management tools, and digital media.
- Develop databases that support web applications and web sites.
- Maintain understanding of current web technologies and programming practices through continuing education and reading.
- Write, update, and maintain computer programs to handle specific jobs such as finding synonyms of a word, recreating text into several forms of identical meaning and finding the best fit for the same sentence.
- Conduct trial runs of programs and software applications to be sure they will produce the desired information and that the instructions are correct.
- Write, analyze, review, and rewrite programs, applying knowledge of computer capabilities, subject matter, and symbolic logic.
- Apply story development, directing, cinematography, and editing to animation to create storyboards that show the flow of the animation and map out key scenes and characters.
- Script, plan, and create animated narrative sequences under tight deadlines, using computer software.
- Make objects or characters appear lifelike by manipulating light, color, texture, shadow, and transparency, or manipulating static images to give the illusion of motion.
- Providing controls and allowing interesting actions so the player can lead the game where they want.
- Analyze user needs and recommend appropriate hardware.
- Test and verify hardware and support peripherals to ensure that they meet specifications and requirements, by recording and analyzing test data.
- Monitor functioning of equipment and make necessary modifications to ensure system operates in conformance with specifications.
- Confer with team members and consult specifications to evaluate interface between hardware and software and operational and performance requirements of overall system.
Education
Personal info
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Skills
data engineer
- Constructed data pipeline services (APIs) for a client application in 3 months. Technology used: Node.js
data engineer
- SQL optimization of previously developed ETL jobs to improve performance.
- Integration with the Gigya API to pull customer-related data on a daily basis; the data was stored in a table using SCD Type 6 to keep history at the customer level.
- Worked on a pipeline to segregate vouchers issued to customers daily from normal payments, and later built a dashboard to track campaign success ratios.
- Integration with XE.com to fetch daily exchange rates so payments are processed at the exchange rate of the relevant date.
- Worked on the creation of a consolidated table containing all live information related to a user, for ease of use by higher management.
- Implemented an error-logging module in all pipelines so that they can re-run on any failure and an email is generated.
- Worked on multi-tenancy of ETL jobs.
data engineer
- Worked directly with clients to gather requirements and provide them with results.
- Worked as project lead and managed the work as per client requirements.
- Provided an optimized way to process the data using U-SQL.
- Working for the client Altria, creating scripts using PySpark and Databricks.
- Managed design of dynamic widgets focused on [Area].
junior data engineer
- Documenting issues encountered, for faster backtracking and resolution.
- Delivering updates based on analytics.
- Solved performance issues in Spark and Hive scripts through an understanding of joins, grouping, and aggregation.
- Performed data loads at users' requests and effectively dealt with business users, catering to their requirements vis-à-vis essential data delivery and quality issues.
data engineer
- Provided data warehousing and business intelligence services to a leading global telecommunications company for running their marketing campaigns and driving data-driven insights, with Ab Initio as the ETL tool.
- Member of the PayGo Run & Operate team responsible for the integration, implementation, and management of the data warehouse for timely delivery of reports and extracts to business users.
- Scheduled and handled workflows in Control-M, along with delivering daily reports to users.
- Worked on the Nucleus R&O database, analyzing clients' business needs, developing effective and efficient solutions, and ensuring client deliverables within committed timelines.
- Responsible for the implementation of batch schedules; performed job-end diagnosis and resolution with the help of UNIX and Teradata.
- Bug fixing and script enhancements using UNIX; solved operational issues of jobs/workflows to ensure data is up to date and meets business requirements.
- Resolved high-priority application dockets/incidents raised by business users.
senior data engineer
- Text-analytics pipeline using Spark, Kafka, and Scala.
- Built a continuous, real-time text-analytics pipeline (sentiment-analysis scores, entity extraction, top asked questions).
- Gathered chat data from the chatbot and created a pipeline to do text analysis in real time.
- Configured Spark for job performance.
data engineer
- Write, update, and maintain computer programs to handle specific jobs such as finding synonyms of a word, recreating text into several forms of identical meaning and finding the best fit for the same sentence.
- Write, analyze, review, and rewrite programs, applying knowledge of computer capabilities, subject matter, and symbolic logic.
- Innovating on the existing framework in Talend; also enhancing YAML, shell scripts, Scala, etc.
- Automating manual processes using Unix.
data engineer
- As lead data engineer, understand the data problems of Fortune 500 companies and resolve them in a mathematically sound manner.
- As an active member of the R&D team, involved in creating different prototypes using big data as well as AI and ML.
- Trained the company's new hires in upcoming big-data technology to prepare them for client-facing work.
- Led a team of 10 people, helping them with breakdowns that happened in the data and analyzing them.
- Handling other data engineering projects as part of the R&D team to help them grow.
- Works in cloud technology, more in depth in AWS and its integration with Spark.
- Developed text pre-processing code to clean, format, and process unstructured data in Spark and Scala.
- Developed Spark SQL/Hive scripts for end-user/analyst requirements to perform analysis.
data engineer
- Solving tickets and maintaining performance.
- Creating POCs for potential needs.
- Analyzing and fixing technical issues, and providing support on dashboards.
- Making sure every Jira ticket is up to date.
data engineer
- Design and implement an end-to-end ERP system to increase operational efficiency and provide increased oversight of production activities.
- Create SQL data schemas for an end-to-end ERP system.
- Design predictive-analytics algorithms in order to optimize KPIs.
- Automate report generation on daily production, financial standings, and merchandising.
data engineer
- Data acquisition
- Identify ways to improve data reliability, efficiency and quality
- Use large data sets to address business issues
- Deploy sophisticated analytics programs, machine learning and statistical methods
- Prepare data for predictive and prescriptive modeling
- Find hidden patterns using data
- Use data to discover tasks that can be automated
data engineer
- Implemented databases as per the designs and data models when a new customer is added.
- Performed User Acceptance Testing (UAT) whenever a new customer is added.
- Modified existing databases to meet unique needs after interacting with users.
- Monitored processing to ensure information is obtained daily within SLA.
- Generated reports using a reporting tool and delivered them to users on a daily basis once processing is done.
- Collected and organized the data in the data warehouse for future analysis and report generation.
- Generated alerts to users to manage on-shelf availability of up to 90%.
data engineer
- ETL developer.
- Developing jobs for ingestion and extraction using Talend Studio.
- Providing an estimation model for every project.
- Preparing test cases and expected results for deployment.
- Preparing the deployment workplan and operation guide for deployment.
- Investigating bugs and defects in ingestion and extraction jobs.
- Providing solutions/hotfixes for errors encountered in the jobs.
data engineer(trainee)
- Result-driven data engineer, good at data analysis, Python programming, and Python libraries such as NumPy and Pandas.
- Collaborated with team members during projects to achieve concrete goals on a strict deadline.
- Proper knowledge of Python programming, Tableau, Microsoft Power BI, and Excel.
- Other proficiencies include C#, .NET MVC, HTML, and RPA.
- Developed new reports where suitable.
- Performed a study on rate of survival versus demand, income, city, and charges for the Titanic dataset.
senior data engineer
- Instrumental in designing and building data pipelines for data-driven application development for internal customers.
- Responsible for creating a solution for the marketing department to analyse Sotheby's market share, leveraging data publicly available from competitors.
- Implemented a dashboard for the Auction Bidding department to monitor the customer-bidding process; implemented Python visualisation techniques to model customer behaviour, which directly led to process changes.
- Understanding business needs and applying it to developed
data engineer
- Designed data pipelines using Python 3.6 and libraries such as boto3 (the AWS SDK for Python) in conformance with the PEP 8 coding standards; worked with AWS cloud services as the infrastructure.
- Enhanced the unit-test suite for the data-pipeline code using the unittest library.
- Maintained and managed the codebase using GitLab.
- Deployed and scheduled data pipelines on the CI tool TeamCity.
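Enhancing a unit-test suite with `unittest`, as described above, looks roughly like this for a pipeline transform. The transform itself (`normalize_record`) is a hypothetical example, not the actual pipeline code:

```python
import unittest

def normalize_record(record):
    """Hypothetical pipeline step: lowercase keys and trim string values."""
    return {k.lower(): v.strip() if isinstance(v, str) else v
            for k, v in record.items()}

class NormalizeRecordTest(unittest.TestCase):
    def test_trims_and_lowercases(self):
        out = normalize_record({"Name": "  Ada ", "AGE": 36})
        self.assertEqual(out, {"name": "Ada", "age": 36})

    def test_leaves_non_strings_alone(self):
        self.assertEqual(normalize_record({"n": 1}), {"n": 1})

# Run the suite programmatically (a CI tool like TeamCity would invoke
# `python -m unittest` instead):
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeRecordTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping each transform a pure function, as here, is what makes this kind of pipeline code straightforward to unit-test.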
senior data engineer
- Built a streaming data platform from the ground up: from initial proof of concept to A/B benchmarking to systems integration, testing, and production.
- Worked closely with the Ops team for rapid systems building.
- Defined fault-tolerant processes with bounded retries and error semantics, with the goal of automated error identification, resolution, and replay from source.
- Created processes around schema evolution and release management for data flows.
- Participated in event modelling and unifying the semantics used by the Dev and Data teams.
- Streamlined the predictive-modelling framework for automated daily training and caching of the in-house ML models.
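A fault-tolerant process with bounded retries, as described above, can be sketched as a wrapper that retries a flaky step a fixed number of times before surfacing the error so the record can be replayed from source. The attempt limit and backoff are illustrative:

```python
import time

def run_with_retries(step, max_attempts=3, backoff=0.0, sleep=time.sleep):
    """Run `step`; on failure retry up to max_attempts times in total,
    then re-raise so the failure is surfaced for replay from source."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise                    # bounded: give up and surface the error
            sleep(backoff * attempt)     # simple linear backoff between attempts
```

In a streaming platform this wrapper would sit around each processing stage, with the re-raised error landing in a dead-letter path for automated identification and replay.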
data engineer
- Monitored drilling parameters during drilling operations.
- Monitored the mud level for increases or decreases.
- Monitored gases appearing, notifying the company man and driller.
- Identified problems such as losses, gains, kicks, and twist-offs.
- Made the daily drilling report for the company man.
- Made the final well report document at the end of each well.
- Active, helpful, and cooperative with rig-up and rig-down.
data engineer
- Wrote optimized Python (PySpark) code along with a good number of unit-test cases.
- Wrote Hive queries as per the requirements.
- Primarily worked in a Hadoop environment.
- Wrote code in Azure Databricks.
- Used monitoring services such as Ambari.
- Made changes in the code as per the requirements.
data engineer
- Extensive working knowledge of structured query language (SQL), Python, Spark, Hadoop, HDFS, AWS, RDBMS, data warehouses, and document-oriented NoSQL databases.
- Automated the process of downloading raw data into the Data Lake from various source systems such as SFTP/FTP/S3 using shell scripting, which helps business users use the data as job-as-a-service and query-as-a-service.
- Developed Hive scripts for parsing raw data on EMR, storing the results in S3, and ingesting them into the data warehouse (Snowflake), which is utilized by enterprise customers.
- Designed ETL jobs to process the raw data using Spark and Python in Glue, EMR, and Databricks.
- Implemented Spark jobs in Python on AWS Glue, which process and transform semi-processed data into processed data utilized by data scientists.
- Implemented connectors in Python to pull raw data from various sources such as Google DCM, DBM, AdWords, Facebook, Twitter, Yahoo, and Tubular; this data is parsed using the Spark framework and ingested into Hive tables.
- Worked with ETL tools such as SSIS and reporting tools such as SSRS, Power BI, and Tableau.
data engineer
- Worked on internal and external projects to design, develop, and deploy QlikView applications.
- The role was an opportunity to work on best-in-class client dashboard functionality with downstream integration systems.
- Created performance-efficient data models and dashboards.
- Performed all levels of application design, including data modelling, data transformation, and dashboard development.
- Worked in partnership with the Data Warehouse team and other stakeholders on the accuracy of data and the efficiency of processes.
- As QlikView system administrator, was responsible for hardware, the operating system, and job schedules; also worked on the ODBC (Open Database Connectivity) and OLE DB (Object Linking and Embedding Database) connectivity between the QlikView server, clients, and database systems.
- Converted reports to QlikView dashboards.
data engineer
- Direct knowledge of Python programming.
- Modified existing data and corrected it to adapt it to new visualizations.
- Advised teammates on performing maintenance of the analysis tool.
- Analyzed information to determine, recommend, and plan the installation of a new tool or modification of an existing system as per requirements.
- Direct Python programming and analysis of data.