54eb0244-d8df-474f-ad56-afa3b7242c00
Andrew Smith
287 Custer Street, Hopewell, PA 00000
andrew_smith@example.com
(000) 000-0000
Professional Summary
Employment history
- SQL optimization of previously developed ETL jobs to improve performance.
- Integration with the Gigya API to pull customer-related data on a daily basis; the data was stored in a table using SCD Type 6 to keep history at the customer level.
- Worked on a pipeline to segregate vouchers issued to customers daily from normal payments, and later built a dashboard to track campaign success ratios.
- Integration with XE.com to fetch daily exchange rates so payments are processed at the exchange rate of the relevant date.
- Worked on the creation of a consolidated table containing all live information related to a user, for ease of use by higher management.
- Implemented an error-logging module in all pipelines so that they can re-run on any failure and an email is generated.
- Worked on multi-tenancy of ETL jobs.
- Communicated directly with clients, providing them insights into data through customized dashboards tailored to their requirements.
- Created data pipelines using SSIS.
- Wrote stored procedures to create data sources for dashboards developed on in-house tools such as RNA and Arcplan.
- Wrote web services linking the web and desktop versions of the POS.
- Wrote stored procedures and developed invoice reports using Crystal Reports.
- Converted legacy code to VB.NET and implemented the MVVM architecture pattern.
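The SCD Type 6 approach mentioned above combines Type 1, 2, and 3 behaviour: a current flag, effective dates, and a current-value column maintained on every row. A minimal sketch of such an upsert, with illustrative column names (not the actual table schema):

```python
from datetime import date

def scd6_upsert(history, customer_id, new_value, today):
    """Apply an SCD Type 6 change: expire the open row (Type 2),
    append a new current row, and overwrite current_value on all
    rows for the customer (the Type 1/3 part)."""
    for row in history:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["is_current"] = False
            row["end_date"] = today          # Type 2: close the old version
    history.append({
        "customer_id": customer_id,
        "value": new_value,                  # Type 2: value as of this version
        "current_value": new_value,
        "start_date": today,
        "end_date": None,
        "is_current": True,
    })
    for row in history:
        if row["customer_id"] == customer_id:
            row["current_value"] = new_value  # Type 1/3: latest value everywhere
    return history
```

In a real warehouse this would be a MERGE over the dimension table rather than an in-memory list; the sketch only shows the row lifecycle.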
Education
Skills
7848f5a4-11d4-4fd2-b0bc-58fd73401526
Andrew Smith
Professional Summary
Employment history
- Worked on several enterprise projects, developing efficient and effective solutions catering to business needs.
- Worked on the company's data-heavy product, from designing the console to building and optimizing it.
- Optimized project components through code refactoring and resolved critical issues such as cross-browser compatibility, latency, and spam mail.
- Trained in MEAN-stack development and built challenging projects for various stages of the career path using Angular, Node.js, Express.js, MongoDB, and Bootstrap.
- Learned the basics of deployment using AWS EC2, Route 53, and Nginx.
Education
Personal Competencies
Profile
Personal info
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Skills
b6f245cc-905f-4e69-9617-1534995725a3
Andrew Smith
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Professional Summary
- Hands-on, successful Software and Data Engineer with around six years of verifiable success leading teams that deliver appropriate data solutions.
- Excellent understanding of Hadoop architecture and the Hadoop Distributed File System, including components such as NameNode, DataNode, JobTracker, TaskTracker, YARN, and MapReduce concepts, plus Pig and Hive.
- Experience supporting data analysis projects using EMR, EC2, S3, Data Pipeline, RDS, Lambda, Glue, and Athena on the Amazon Web Services (AWS) cloud.
- Good understanding of Spark internals and performance-optimization techniques, with hands-on experience creating optimized Spark jobs using Python, Spark, and SQL.
- Experience working with Hive: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
- Comprehensive knowledge of SDLC, enterprise architecture, agile methodologies, cloud services, and web-based applications.
Skills
Employment history
East Shalon, North Carolina
- Full life cycle development including requirements analysis, high-level design, coding, testing, and deployment.
- Extensive working knowledge of structured query language (SQL), Python, Spark, Hadoop, HDFS, AWS, RDBMS, data warehouses, and document-oriented NoSQL databases.
- Automated the process of downloading raw data into the Data Lake from various source systems such as SFTP/FTP/S3, using shell scripting and Python.
- Developed Hive scripts for parsing raw data on EMR, storing the results in S3, and ingesting them into a data warehouse (Snowflake) utilized by enterprise customers.
- Designed ETL jobs to process the raw data using Spark and Python in Glue, EMR, and Databricks.
- Used Python to pull raw data from various sources such as Google DCM, DBM, AdWords, Facebook, Twitter, Yahoo, and Tubular; this data was then parsed using the Spark framework and ingested into Hive tables.
- Implemented MapReduce programs using PySpark to parse the raw data per business user requirements and store the results in the Data Lake (AWS S3).
- Implemented several data pipeline jobs to pull raw data from different sources into an AWS S3 bucket, process it using PySpark on an EMR cluster, and store the processed data back in AWS S3.
- Created Spark jobs per business requirements; the jobs run on EMR and are triggered by Lambda.
- Worked with ETL tools such as SSIS and reporting tools such as SSRS, Power BI, and Tableau.
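A Lambda-triggered Spark job on EMR, as in the bullets above, usually amounts to submitting a `spark-submit` step to a running cluster. A hedged sketch that only builds the step definition; the boto3 `add_job_flow_steps` call is shown in a comment, and the cluster ID, job name, and script path are placeholders:

```python
def build_spark_step(name, script_s3_path, extra_args=()):
    """Build an EMR step definition that runs spark-submit on a PySpark
    script stored in S3. Names and paths here are illustrative only."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",   # EMR's generic command runner
            "Args": ["spark-submit", "--deploy-mode", "cluster",
                     script_s3_path, *extra_args],
        },
    }

# Inside the Lambda handler one would submit the step, e.g.:
#   import boto3
#   emr = boto3.client("emr")
#   emr.add_job_flow_steps(
#       JobFlowId="j-XXXXXXXX",   # placeholder cluster ID
#       Steps=[build_spark_step("daily-etl", "s3://bucket/jobs/etl.py")])
```

The Lambda itself is then wired to an S3 event or a schedule, so arriving data (or the clock) kicks off the EMR job.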
Jarrettstad, Arkansas
- Wrote automated scripts in Python for system testing; performed GUI-based testing using Selenium with Python.
- Took a leading role in test automation and manual testing; actively involved in creating detailed test plans, test cases, and test scenarios for different application modules according to functional requirements and business specifications.
- Responsible for conducting smoke, functional, UI, regression, and ad-hoc testing.
- Facilitated the resolution of testing roadblocks, ensured execution of QA deliverables, and guided team members on agile standards and best practices.
- Regularly interacted with management and product owners on project status, priority setting, and sprint timeframes.
- Created test plans and test reports for multiple releases of various mobile applications; coordinated off-shore automation testing efforts and test cases through weekly review meetings.
- Established and reviewed QA sign-off criteria and the software build and test process with the scrum team.
- Assisted product development teams in the implementation of work plans and the production of review documentation.
Education
- Eastern Louisiana University – Fayburgh, Oregon
- Dooley Institute – Port Raphaelville, Nebraska
debbb8a1-2fd5-4b09-8d7d-307897a34adc
Andrew Smith
287 Custer Street, Hopewell, PA 00000
andrew_smith@example.com
(000) 000-0000
Professional Summary
Employment history
- Working on ScyllaDB (a fast-processing database) for the message-intact mechanism.
- Contributed ideas and suggestions in team meetings and delivered updates on deadlines, design, and enhancements.
- Working on data science technologies for building machine-learning models.
- Working on DIP (Digital Insights Platform) for data ingestion.
- Worked on Salesforce and Pardot web development projects.
Education
Accomplishments
Awards
Publications
Languages
Skills
e8fad1e4-d5ac-46e9-9e43-7285f5631916
Andrew Smith
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Professional Summary
Employment history
Osinskimouth, Massachusetts
- Manage the Snowflake data warehouse for the Gartner Digital Markets team.
- Developed a high-frequency data integration with BigQuery for ingesting sessions and hits data.
- Provide advice on cost optimizations and process changes.
- Set up Airflow as the ETL tool, migrating off the existing ETL tool in an effort to reduce costs by up to $25,000.
Collierton, South Dakota
Project – AdTech Targeting
- Wrote the script to fetch bid data from MediaMath's Firehose using the Japronto framework, an AWS Load Balancer, and EC2 instances.
- Handled 2 GB/min of data using 1 ELB and 8 EC2 instances.
- Each bid is one request, at ~1 million requests per minute; implemented file-rotation logic to create a new file every 2 minutes and wrote the script to upload the files to an AWS S3 bucket.
- Developed a connector for pushing audiences to ad networks; implemented data-preparation logic using PySpark to optimize performance, reducing upload time from 6 hours to 20 minutes.
- Implemented data preparation, transformation, and update-logic pipelines.
- Integrated the Foursquare API to fetch POI attributes.
- Developed a multithreaded integration to reduce data-ingestion time from 4 hours to 30 minutes.
- Developed a Python Lambda function to trigger a response based on text received from Alexa.
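The two-minute file-rotation logic described above can be sketched as a writer that starts a new file whenever the rotation interval elapses; the closed file is handed off for upload. Directory layout, file naming, and the interval are illustrative, and the S3 upload is left as a callback:

```python
import os
import time

class RotatingFileWriter:
    """Write records to a file, starting a new file every `interval`
    seconds; each closed file is passed to `on_rotate` (e.g. an S3
    upload). The clock is injectable so the logic is testable."""
    def __init__(self, directory, interval=120, on_rotate=None, clock=time.time):
        self.directory = directory
        self.interval = interval
        self.on_rotate = on_rotate
        self.clock = clock
        self._opened_at = None
        self._file = None

    def write(self, record):
        now = self.clock()
        if self._file is None or now - self._opened_at >= self.interval:
            self._rotate(now)
        self._file.write(record + "\n")

    def _rotate(self, now):
        if self._file is not None:
            self._file.close()
            if self.on_rotate:
                self.on_rotate(self._file.name)  # e.g. upload to S3, then delete
        path = os.path.join(self.directory, f"bids-{int(now)}.log")
        self._file = open(path, "w")
        self._opened_at = now
```

At ~1 million requests per minute the real script would also buffer writes and rotate on size as well as time, but the time-based trigger is the core of it.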
Development and Maintenance of Ad-network Plugins (Extract-Transform-Load)
- Developed 4 plugins by integrating APIs and SDKs of multiple ad networks and affiliates to fetch data for marketing analytics.
- Extracted and transformed the marketing data as per business requirements, handling possible edge cases.
- Ensured quality checks multiple times before delivering each plugin.
- Used Phabricator for plugin delivery and fixing issues in existing plugins, and Jira to resolve over 140 tickets related to around 50 plugins.
- Achieved 100% unit-test code coverage, no lint issues, and 0% technical debt; followed best coding practices (DRY, SoC, etc.).
- Created weekly, monthly, and yearly spend/acquisition Looker dashboards for C-level executives in the Growth Marketing team.
- Wrote and evaluated complex SQL, performed data analysis, and extracted presentable insights from scrubbed data.
- Performed data cleanup.
Education
- Northern Cole Institute – Port Melanieport, Rhode Island
Skills
ca04a900-f52e-499a-9943-704a65c6450a
Andrew Smith
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Professional Summary
My main responsibilities:
1. Gather a proper understanding of client requirements.
2. Provide optimized designs for client projects.
3. Develop and troubleshoot applications.
4. Build, deploy, run, and test applications in UAT and Dev environments.
5. Guide and help juniors pick up new technologies.
Employment history
Brandonchester, Washington
Projects
BBO_CUPR
Technologies: Hadoop 2.6, Hive 13, Spark 2.1, Sqoop 1.4, Jenkins, shell, Maven, Scala 2.1
Ingest structured and unstructured data from different sources, such as RDBMS and NAS, into the Hadoop environment; apply filter/parsing logic to the raw data and load it into Hive tables. Business logic is then applied on top of the filtered data, which is moved to the final location and loaded into Hive tables. From the final location, data can be exported to an external location using Sqoop and scp.
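The filter/parsing stage described above, applied to raw delimited records before loading into Hive, can be sketched as follows; the delimiter, field count, and routing of bad rows to a reject location are assumptions for illustration:

```python
def parse_raw(lines, delimiter="|", expected_fields=3):
    """Split raw delimited lines into records, dropping malformed rows,
    the way a filter/parse stage might before loading a Hive table."""
    good, bad = [], []
    for line in lines:
        parts = line.rstrip("\n").split(delimiter)
        if len(parts) == expected_fields and all(p.strip() for p in parts):
            good.append(parts)
        else:
            bad.append(line)   # routed to a reject/error location for review
    return good, bad
```

In the actual pipeline this logic would run as a Spark or Hive transformation over HDFS files rather than a Python loop; the sketch only shows the validation rule.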
New Joane, Missouri
Project: Ingestion Framework(1.0) – Product Development
Environment: Hadoop 2.8.1, Java 1.8, Scala 2.16, Spark 2.10, U, MySql, Google Cloud
This project covers the development of an ingestion framework that can ingest any kind of structured or unstructured data into the Hadoop system from different data sources, such as NAS and RDBMS; the framework is dynamic enough to add new source systems as required. Validator modules were introduced to validate the ingested data.
Environment: Hadoop 2.6, Java 1.8, Scala 2.16, Spark 2.10, Google Pub/Sub, Google Datastore, Unix, Spring Boot
Evelynnside, Illinois
Project: Patient Centered Care Model (PCCM) – Provider search
Environment: MapR 3.0.1 (Hadoop 0.20.2, Hive 0.12.0, Sqoop), JDK 1.6, MySQL, Subversion, Maven, AnthillPro, Artifactory
Environment: Cloudera 5.2, Hive 0.12.0, Tableau, Bedrock 3.2.2, Maven, Spark, Yahoo Query Language, Java 1.7
In this project, companies' stock-related data and revenue data are extracted by ticker name from a finance site and ingested into Hadoop using Bedrock (a custom product). Stock tweets are extracted and processed using machine learning to perform sentiment analysis with the help of the ingested stock data. Final results are visualized using Tableau Desktop.
Environment: Hadoop 2.5.1-mapr-1501, Hive 0.13.0, Java 1.7, Sqoop 1.4.5, SQLite3, shell script
This project is divided into different modules capable of ingesting structured data from different sources into the Hadoop file system. Ingested data is validated and filtered; different business logic is applied to the filtered data, which is then moved to publish locations. A SQLite DB is used to keep track of the process flow, and proper notifications are sent to the client during processing.
Submodules:
- RDBMS data ingestion: structured data from different databases is ingested using Sqoop.
- File movement: structured files (delimited and fixed-length) are ingested from different remote clusters using proper load balancing.
Environment: Hadoop 2.6.0, Hive 0.14.0, Java 1.8, Spark, Sqoop, Unix scripting, Draw2D, Spring, Hibernate, Elasticsearch
This project consists of the development of data lineage functionality in the Bedrock product. Data lineage tracks the transitions of data inside the Hadoop file system that are triggered by Hive, Sqoop, MapReduce, Spark, and shell jobs, and users can see those transitions in a visual UI. It is also capable of connecting to Apache Atlas and Navigator to visualize the lineage at an advanced stage.
Education
- Harris University – West Chuck, Wyoming
Skills
e557999d-59b8-4c3d-863a-41914ae7d1cc
Andrew Smith
Professional Summary
Employment history
- Create a Data Lake in the AWS environment using Spark (PySpark) on AWS Glue for ETL jobs, AWS Lambda for automation and job triggering, Athena as the primary query layer, and S3 for storage.
- Develop data models for applications, metadata tables, views, and related database structures.
- Document and communicate database schemas using accepted notations.
- Document search using Elasticsearch: created a Flask API for processing documents and storing them in Elasticsearch for search queries; containerized this API using Docker for easier development, testing, and deployment.
- Model serialization for H2O.ai models: exploring and implementing serialization techniques for H2O machine-learning models using pickle, joblib, and POJO (Plain Old Java Object).
- Survey application programming, lead: designed and developed the front-end and back-end framework to implement the business logic used for assessing social desirability of employees within an organisation. (https://mindofn.games/faultinme)
- Dashboard design, developer: used Tableau to visualize key performance indicators and presented them as a story; integrated Highcharts (a visualization library) into an existing project.
- Database developer: develop and maintain archived procedures, procedural code, and queries for existing applications.
- Automation, developer: identifying and programming tools to automate data processing and optimize existing processes.
- Platform development, developer: web platform development using PHP (CodeIgniter) as the server-side language, SQL Server as the database, and HTML/CSS (Bootstrap framework) and jQuery for the front end.
- Production testing, tester: test the functionality of new features and services added to the platform.
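The model-serialization bullet above mentions pickle and joblib; the pickle route amounts to a round trip like the following. A stand-in object replaces an actual H2O model here, since H2O models live in a JVM backend and in practice need H2O's own export helpers (or the POJO route) rather than plain pickle:

```python
import pickle

class StubModel:
    """Stand-in for a trained model, used only to illustrate the
    pickle mechanics; not an actual H2O estimator."""
    def __init__(self, threshold):
        self.threshold = threshold
    def predict(self, x):
        return int(x >= self.threshold)

model = StubModel(threshold=0.5)
blob = pickle.dumps(model)       # serialize to bytes (pickle.dump writes to a file)
restored = pickle.loads(blob)    # deserialize back into a usable object
```

joblib follows the same dump/load pattern but is better suited to objects carrying large NumPy arrays.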
Education
Personal info
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Skills
64e3be8f-d76f-43de-bbbc-28bcf9a91d12
Andrew Smith
Professional Summary
Employment history
- Select programming languages, design tools, and applications for developing an upgrade for Kindle as part of Amazon's ACMS program.
- Develop and document style guidelines for web site content.
- Design and build web sites, using scripting languages, content creation tools, management tools, and digital media.
- Develop databases that support web applications and web sites.
- Maintain understanding of current web technologies and programming practices through continuing education and reading.
- Write, update, and maintain computer programs to handle specific jobs such as finding synonyms of a word, recreating text into several forms of identical meaning and finding the best fit for the same sentence.
- Conduct trial runs of programs and software applications to be sure they will produce the desired information and that the instructions are correct.
- Write, analyze, review, and rewrite programs, applying knowledge of computer capabilities, subject matter, and symbolic logic.
- Apply story development, directing, cinematography, and editing to animation to create storyboards that show the flow of the animation and map out key scenes and characters.
- Script, plan, and create animated narrative sequences under tight deadlines, using computer software.
- Make objects or characters appear lifelike by manipulating light, color, texture, shadow, and transparency, or manipulating static images to give the illusion of motion.
- Providing controls and allowing interesting actions so the player can lead the game where they want.
- Analyze user needs and recommend appropriate hardware.
- Test and verify hardware and support peripherals to ensure that they meet specifications and requirements, by recording and analyzing test data.
- Monitor functioning of equipment and make necessary modifications to ensure system operates in conformance with specifications.
- Confer with team members and consult specifications to evaluate interface between hardware and software and operational and performance requirements of overall system.
Education
Personal info
Phone:
(000) 000-0000
Email:
andrew_smith@example.com
Address:
287 Custer Street, Hopewell, PA 00000
Skills
data engineer
- Constructed data pipeline services (APIs) for a client application in 3 months. Technology used: Node.js
data engineer
- SQL optimization of previously developed ETL jobs to improve performance.
- Integration with the Gigya API to pull customer-related data on a daily basis; the data was stored in a table using SCD Type 6 to keep history at the customer level.
- Worked on a pipeline to segregate vouchers issued to customers daily from normal payments, and later built a dashboard to track campaign success ratios.
- Integration with XE.com to fetch daily exchange rates so payments are processed at the exchange rate of the relevant date.
- Worked on the creation of a consolidated table containing all live information related to a user, for ease of use by higher management.
- Implemented an error-logging module in all pipelines so that they can re-run on any failure and an email is generated.
- Worked on multi-tenancy of ETL jobs.
data engineer
- Worked directly with clients to gather requirements and provide them with results.
- Worked as project lead and managed the work as per client requirements.
- Provided an optimized way to process the data using U-SQL.
- Working for the client Altria, creating scripts using PySpark and Databricks.
- Managed design of dynamic widgets focused on [Area].
junior data engineer
- Documenting issues encountered, for faster backtracking and resolution.
- Delivering updates based on analytics.
- Solved performance issues in Spark and Hive scripts through an understanding of joins, grouping, and aggregation.
- Performed data loads at users' requests and effectively dealt with business users, catering to their requirements vis-à-vis essential data delivery and quality issues.
data engineer
- Provided data warehousing and business intelligence services to a leading global telecommunications company for running their marketing campaigns and driving data-driven insights, with Ab Initio as the ETL tool.
- Member of the PayGo Run & Operate team responsible for the integration, implementation, and management of the data warehouse for timely delivery of reports and extracts to business users.
- Scheduled and handled workflows in Control-M, along with delivering daily reports to users.
- Worked on the Nucleus R&O database, analyzing clients' business needs, developing effective and efficient solutions, and ensuring client deliverables within committed timelines.
- Responsible for the implementation of batch schedules; performed job-end diagnosis and resolution with the help of UNIX and Teradata.
- Bug fixing and script enhancements using UNIX; solved operational issues of jobs/workflows to ensure data is up to date and meets business requirements.
- Resolved high-priority application dockets/incidents raised by business users.
senior data engineer
- Text-analytics pipeline using Spark, Kafka, and Scala.
- Built a continuous, real-time text-analytics pipeline (sentiment-analysis scores, entity extraction, top asked questions).
- Gathered chat data from the chatbot and created a pipeline to do text analysis in real time.
- Configured Spark for job performance.
data engineer
- Write, update, and maintain computer programs to handle specific jobs such as finding synonyms of a word, recreating text into several forms of identical meaning and finding the best fit for the same sentence.
- Write, analyze, review, and rewrite programs, applying knowledge of computer capabilities, subject matter, and symbolic logic.
- Innovating on the existing framework in Talend; also enhancing YAML, shell scripts, Scala, etc.
- Automating manual processes using Unix.
data engineer
- As lead data engineer, understand the data problems of Fortune 500 companies and resolve them in a mathematically sound manner.
- As an active member of the R&D team, involved in creating different prototypes using big data as well as AI and ML.
- Trained the company's new hires in upcoming big-data technology to prepare them for client-facing work.
- Led a team of 10 people, helping them with breakdowns that happened in the data and analyzing them.
- Handling other data engineering projects as part of the R&D team to help them grow.
- Works in cloud technology, more in depth in AWS and its integration with Spark.
- Developed text pre-processing code to clean, format, and process unstructured data in Spark and Scala.
- Developed Spark SQL/Hive scripts for end-user/analyst requirements to perform analysis.
data engineer
- Solving tickets and maintaining performance.
- Creating POCs for potential needs.
- Analyzing and fixing technical issues, and providing support on dashboards.
- Making sure every Jira ticket is up to date.
data engineer
- Design and implement an end-to-end ERP system to increase operational efficiency and provide increased oversight of production activities.
- Create SQL data schemas for an end-to-end ERP system.
- Design predictive-analytics algorithms in order to optimize KPIs.
- Automate report generation on daily production, financial standings, and merchandising.
data engineer
- Data acquisition
- Identify ways to improve data reliability, efficiency and quality
- Use large data sets to address business issues
- Deploy sophisticated analytics programs, machine learning and statistical methods
- Prepare data for predictive and prescriptive modeling
- Find hidden patterns using data
- Use data to discover tasks that can be automated
data engineer
- Implemented databases as per the designs and data models when a new customer is added.
- Performed User Acceptance Testing (UAT) whenever a new customer is added.
- Modified existing databases to meet unique needs after interacting with users.
- Monitored processing to ensure information is obtained daily within SLA.
- Generated reports using a reporting tool and delivered them to users on a daily basis once processing is done.
- Collected and organized the data in the data warehouse for future analysis and report generation.
- Generated alerts to users to manage on-shelf availability of up to 90%.
data engineer
- ETL developer.
- Developing jobs for ingestion and extraction using Talend Studio.
- Providing an estimation model for every project.
- Preparing test cases and expected results for deployment.
- Preparing the deployment workplan and operation guide for deployment.
- Investigating bugs and defects in ingestion and extraction jobs.
- Providing solutions/hotfixes for errors encountered in the jobs.
data engineer(trainee)
- Result-driven data engineer, good at data analysis, Python programming, and Python libraries such as NumPy and Pandas.
- Collaborated with team members during projects to achieve concrete goals on a strict deadline.
- Proper knowledge of Python programming, Tableau, Microsoft Power BI, and Excel.
- Other proficiencies include C#, .NET MVC, HTML, and RPA.
- Developed new reports where suitable.
- Performed a study on rate of survival versus demand, income, city, and charges for the Titanic dataset.
senior data engineer
- Instrumental in designing and building data pipelines for data-driven application development for internal customers.
- Responsible for creating a solution for the marketing department to analyse Sotheby's market share, leveraging data publicly available from competitors.
- Implemented a dashboard for the Auction Bidding department to monitor the customer-bidding process; implemented Python visualisation techniques to model customer behaviour, which directly led to process changes.
- Understanding business needs and applying it to developed
data engineer
- Designed data pipelines using Python 3.6 and libraries such as boto3 (the AWS SDK for Python) in conformance with the PEP 8 coding standards; worked with AWS cloud services as the infrastructure.
- Enhanced the unit-test suite for the data-pipeline code using the unittest library.
- Maintained and managed the codebase using GitLab.
- Deployed and scheduled data pipelines on the CI tool TeamCity.
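Enhancing a unit-test suite with `unittest`, as described above, looks roughly like this for a pipeline transform. The transform itself (`normalize_record`) is a hypothetical example, not the actual pipeline code:

```python
import unittest

def normalize_record(record):
    """Hypothetical pipeline step: lowercase keys and trim string values."""
    return {k.lower(): v.strip() if isinstance(v, str) else v
            for k, v in record.items()}

class NormalizeRecordTest(unittest.TestCase):
    def test_trims_and_lowercases(self):
        out = normalize_record({"Name": "  Ada ", "AGE": 36})
        self.assertEqual(out, {"name": "Ada", "age": 36})

    def test_leaves_non_strings_alone(self):
        self.assertEqual(normalize_record({"n": 1}), {"n": 1})

# Run the suite programmatically (a CI tool like TeamCity would invoke
# `python -m unittest` instead):
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeRecordTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping each transform a pure function, as here, is what makes this kind of pipeline code straightforward to unit-test.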
senior data engineer
- Built a streaming data platform from the ground up: from initial proof of concept to A/B benchmarking to systems integration, testing, and production.
- Worked closely with the Ops team for rapid systems building.
- Defined fault-tolerant processes with bounded retries and error semantics, with the goal of automated error identification, resolution, and replay from source.
- Created processes around schema evolution and release management for data flows.
- Participated in event modelling and unifying the semantics used by the Dev and Data teams.
- Streamlined the predictive-modelling framework for automated daily training and caching of the in-house ML models.
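A fault-tolerant process with bounded retries, as described above, can be sketched as a wrapper that retries a flaky step a fixed number of times before surfacing the error so the record can be replayed from source. The attempt limit and backoff are illustrative:

```python
import time

def run_with_retries(step, max_attempts=3, backoff=0.0, sleep=time.sleep):
    """Run `step`; on failure retry up to max_attempts times in total,
    then re-raise so the failure is surfaced for replay from source."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise                    # bounded: give up and surface the error
            sleep(backoff * attempt)     # simple linear backoff between attempts
```

In a streaming platform this wrapper would sit around each processing stage, with the re-raised error landing in a dead-letter path for automated identification and replay.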
data engineer
- Monitored drilling parameters during drilling operations.
- Monitored the mud level for increases or decreases.
- Monitored gases appearing, notifying the company man and driller.
- Identified problems such as losses, gains, kicks, and twist-offs.
- Made the daily drilling report for the company man.
- Made the final well report document at the end of each well.
- Active, helpful, and cooperative with rig-up and rig-down.
data engineer
- Wrote optimized Python (PySpark) code along with a good number of unit-test cases.
- Wrote Hive queries as per the requirements.
- Primarily worked in a Hadoop environment.
- Wrote code in Azure Databricks.
- Used monitoring services such as Ambari.
- Made changes in the code as per the requirements.
data engineer
- Extensive working knowledge of structured query language (SQL), Python, Spark, Hadoop, HDFS, AWS, RDBMS, data warehouses, and document-oriented NoSQL databases.
- Automated the process of downloading raw data into the Data Lake from various source systems such as SFTP/FTP/S3 using shell scripting, which helps business users use the data as job-as-a-service and query-as-a-service.
- Developed Hive scripts for parsing raw data on EMR, storing the results in S3, and ingesting them into the data warehouse (Snowflake), which is utilized by enterprise customers.
- Designed ETL jobs to process the raw data using Spark and Python in Glue, EMR, and Databricks.
- Implemented Spark jobs in Python on AWS Glue, which process and transform semi-processed data into processed data utilized by data scientists.
- Implemented connectors in Python to pull raw data from various sources such as Google DCM, DBM, AdWords, Facebook, Twitter, Yahoo, and Tubular; this data is parsed using the Spark framework and ingested into Hive tables.
- Worked with ETL tools such as SSIS and reporting tools such as SSRS, Power BI, and Tableau.
data engineer
- Worked on internal and external projects to design, develop, and deploy QlikView applications.
- The role was an opportunity to work on best-in-class client dashboard functionality with downstream integration systems.
- Created performance-efficient data models and dashboards.
- Performed all levels of application design, including data modelling, data transformation, and dashboard development.
- Worked in partnership with the Data Warehouse team and other stakeholders on the accuracy of data and the efficiency of processes.
- As QlikView system administrator, was responsible for hardware, the operating system, and job schedules; also worked on the ODBC (Open Database Connectivity) and OLE DB (Object Linking and Embedding Database) connectivity between the QlikView server, clients, and database systems.
- Converted reports to QlikView dashboards.
data engineer
- Direct knowledge of Python programming.
- Modified existing data and corrected it to adapt it to new visualizations.
- Advised teammates on performing maintenance of the analysis tool.
- Analyzed information to determine, recommend, and plan the installation of a new tool or modification of an existing system as per requirements.
- Direct Python programming and analysis of data.