data engineer

  • Created a Python program for monitoring jobs running in Hadoop.
  • Created a program to track all Jira tickets and highlight recurring areas of concern for our team.
  • Created data pipelines using Spring Cloud Data Flow (SCDF) on Cloud Foundry.
  • Deployed a CDH cluster to process terabytes of data via Spark.

senior data engineer

  • Design and build data processing pipelines using tools and frameworks in the Hadoop/Spark ecosystem
  • Implement and configure big data technologies as well as tune processes for performance at scale
  • Manage, mentor, and grow a team of big data engineers
  • Collaborate with other teams and lead cross-team solution integrations

senior data engineer

  • Translate business requirements into technical vision and re-imagine data-processing technologies
  • Working on ETL and data transformation of medical claims data
  • Design and support of the medical claims (Big) Data Warehouses
  • Delivering and implementing successful solutions for sustained client impact 
  • Developing complex algorithms and proprietary analytics in Hive/Python based on discussions with the client and the service delivery team
  • Supervised and mentored colleagues in different areas and projects

senior data engineer

  • Worked on data transformation from raw data to Hive (HDFS) tables using Python and Spark. Performed Hadoop cluster management on an HDInsight cluster.
  • Configured Azure Stream Analytics to ingest sensor data into Azure Storage.
  • Implemented several data pipeline jobs that pull raw data from different sources into an AWS S3 bucket, process it with PySpark on an EMR cluster, and store the processed data back in S3.
  • Created Spark jobs per business requirements; the jobs run on EMR and are triggered by Lambda.
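A Lambda-triggered EMR job like the one above is usually wired up by having the Lambda handler submit a spark-submit step to the cluster. A minimal sketch, assuming hypothetical bucket names, script paths, and cluster id (in a real Lambda the step dict would be passed to boto3's `emr.add_job_flow_steps`):

```python
# Sketch of a Lambda handler that would trigger a Spark job on EMR.
# All names (buckets, script path, cluster id) are hypothetical examples.

def build_spark_step(script_s3_path, input_path, output_path):
    """Build an EMR step definition that runs spark-submit on a PySpark script."""
    return {
        "Name": "process-raw-data",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit", "--deploy-mode", "cluster",
                script_s3_path, input_path, output_path,
            ],
        },
    }

def handler(event, context=None):
    """Hypothetical Lambda entry point: fires when a new object lands in S3."""
    record = event["Records"][0]["s3"]
    input_path = f"s3://{record['bucket']['name']}/{record['object']['key']}"
    step = build_spark_step(
        "s3://my-jobs/etl.py",               # hypothetical script location
        input_path,
        "s3://my-processed-bucket/output/",  # hypothetical output bucket
    )
    # In a real Lambda: boto3.client("emr").add_job_flow_steps(
    #     JobFlowId="j-XXXXXXXX", Steps=[step])
    return step
```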

data engineer

  • Full life cycle development including requirements analysis, high-level design, coding, testing, and deployment.
  • Extensive working knowledge of structured query language (SQL), Python, Spark, Hadoop, HDFS, AWS, RDBMS, data warehouses, and document-oriented NoSQL databases.
  • Automated the process of downloading raw data into the data lake from source systems such as SFTP/FTP/S3 using shell scripting and Python.
  • Developed Hive scripts on EMR for parsing raw data, stored the results in S3, and ingested them into a Snowflake data warehouse used by enterprise customers.
  • Designed ETL jobs to process the raw data using Spark and Python in Glue, EMR, and Databricks.
  • Used Python to pull raw data from sources such as Google DCM, DBM, AdWords, Facebook, Twitter, Yahoo, and Tubular; parsed the data with Spark and loaded it into Hive tables.
  • Implemented MapReduce-style programs using PySpark to parse raw data per business user requirements and store the results in the data lake (AWS S3).
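The parse-and-aggregate pattern in these bullets can be sketched as a small map/reduce in plain Python; the pipe-delimited record layout and field names below are invented for illustration, not the actual feed format:

```python
# Minimal map/reduce-style parse of raw delimited records, a plain-Python
# stand-in for the PySpark jobs described above. The record layout is invented.
from collections import defaultdict

def parse_record(line):
    """Map step: split a raw pipe-delimited line into (campaign, spend)."""
    fields = line.strip().split("|")
    campaign, spend = fields[0], float(fields[1])
    return campaign, spend

def aggregate(lines):
    """Reduce step: total spend per campaign."""
    totals = defaultdict(float)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the raw feed
        campaign, spend = parse_record(line)
        totals[campaign] += spend
    return dict(totals)
```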

data engineer

  • Created a data lake on AWS using Spark (PySpark) on AWS Glue for ETL jobs, AWS Lambda for automation and job triggering, Athena as the primary query layer, and S3 for storage.
  • Developed an ETL (Extract, Transform, Load) pipeline for time-based data files that generates access anomalies for all employees across the company
  • Developed an ETL project for transforming and visualizing log data using Elasticsearch and Kibana
  • Designed and developed a web application for employee access management/reporting across the company – Full stack
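The access-anomaly idea can be illustrated with a toy rule: flag access events that fall outside business hours. The 08:00-18:00 window and the event format are assumptions for this sketch, not the original logic:

```python
# Toy access-anomaly check: flag access events outside business hours.
# The 08:00-18:00 window and the event format are illustrative assumptions.
from datetime import datetime

BUSINESS_START, BUSINESS_END = 8, 18  # hours, inclusive-exclusive

def is_anomalous(event):
    """Return True when an access event falls outside business hours."""
    ts = datetime.fromisoformat(event["timestamp"])
    return not (BUSINESS_START <= ts.hour < BUSINESS_END)

def find_anomalies(events):
    """Group anomalous event timestamps by employee id."""
    out = {}
    for e in events:
        if is_anomalous(e):
            out.setdefault(e["employee_id"], []).append(e["timestamp"])
    return out
```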

data engineer

  • Create, maintain, and monitor ETL jobs.
  • Handle client needs and demands, helping them set realistic deadlines.
  • Understand data and business needs and match them to the technology in use.
  • Lead a team of data engineers and developers on the client's premises.
  • Develop dashboards and visualizations for data.
  • Run scripts and queries for data processing.
  • Coordinate work within the company and with competing companies on the same project

senior data engineer

  • Designed, developed, and maintained a highly available Java server that handled and preprocessed data from thousands of connections each second.
  • Designed and developed a concurrent data-migration tool based on RxJava streams
  • Designed a load-test framework that emulated millions of connections
  • Engaged in production deployments and ecosystem management in AWS

data engineer

  • Worked on artificial intelligence and machine learning (machine intelligence) projects.
  • Applied machine learning algorithms and statistical models to perform specific tasks.
  • Projects involved annotating images, videos, and image patterns; the annotated data was converted to JSON and sent back to the client.
  • Performed data analysis, documentation, implementation

data engineer

  • Built ETL pipelines (data extraction, transformation, and loading) for collecting data, decomposing it, and pushing it to models.
  • Built and trained ML models.
  • Used Redis and Cassandra as databases for data storage.
  • Built and deployed Docker images for the tools and models used in this project.
  • Used Git as a version control tool; configured and used Jenkins for continuous integration/continuous deployment.
  • Implemented Elasticsearch/ELK to monitor applications as well as Nginx and system logs, which can be viewed in Kibana for easy troubleshooting.

data engineer

  • Handle junior data engineer responsibilities
  • Support junior data engineers
  • Implement data services solutions (data ingestion, data processing, APIs, computations)
  • Implement data schemas and structures
  • Implement and develop data quality controls
  • Develop business intelligence and report generation
  • Develop data set processes

data engineer

  •  Led efforts to modernize the manufacturing BI team's processes, such as cloud migration and enabling machine learning capabilities beyond traditional analyses and visualizations
  •  Initially developed an ETL architecture based on Apache Spark in an EMR cluster; after experimentation and research, successfully pitched and developed a more powerful, fully serverless architecture
  •  Took initiative to find and develop machine learning projects, resulting in a project which was picked up in the plywood division
  •  Currently working on creating an architecture for machine learning model deployment and retraining

data engineer

  • Designing and developing data models in line with the client’s business needs 
  • Designing, creating, testing and maintaining the complete data management system
  • Taking care of the entire ETL process
  • Ensuring the architecture meets the business requirements
  • Working closely with the stakeholders and solution architect
  • Improving data quality, reliability & efficiency of the individual components & the complete system

data engineer

  • Extracted and cleaned large datasets from SQL server to acquire necessary components for analysis
  • Maintained a company website that reported performance analysis of appliances using Python, HTML, CSS3, JavaScript, and SQLAlchemy
  • Built automated weblog analysis by parsing web server access log files
  • Pushed the resulting datasets to a dashboard so insights could be drawn on top of the data to drive the business better.
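Web server log analysis like this usually starts with parsing the common log format into structured records; a minimal sketch using the standard library (the sample line is illustrative):

```python
# Parse Apache/Nginx "common log format" lines and count requests per status.
import re
from collections import Counter

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]+" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Return a dict of named fields, or None for malformed lines."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

def status_counts(lines):
    """Count requests per HTTP status code, skipping unparseable lines."""
    return Counter(r["status"] for r in map(parse_line, lines) if r)
```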

data engineer

  •  Work closely with the product owner to develop the work plan for data migration by understanding the business requirements of the project.
  •  Work with project management to provide timely estimates, updates, and status.
  • Understand the complete pipeline, dependencies, and flow of data by creating a data model while working closely with the product owner.
  • Fetch data in structured or unstructured form from different sources (e.g., JDBC, SFTP servers) and in different formats (doc, txt, csv, gz, etc.).
  • Create metadata for the datasets pulled in to provide a proper schema and data types per requirements and analysis.
  • Develop scripts using PySpark, Python, and SQL for data cleansing, apply business logic by joining datasets, and produce final datasets as required.
  • Apply proper scheduling and monitoring to the datasets to reduce manual effort.
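The cleanse-and-join step described here can be sketched in plain Python; column names and the "business logic" are invented for illustration. In PySpark the same shape would be a `filter`/`withColumn` pass followed by an inner `join`:

```python
# Plain-Python sketch of the cleanse + join step done in PySpark above.
# Column names and the cleansing rules are invented for illustration.

def cleanse(rows):
    """Drop rows with missing keys and normalize string fields."""
    out = []
    for r in rows:
        if r.get("id") is None:
            continue                      # reject records without a key
        out.append({**r, "name": r.get("name", "").strip().lower()})
    return out

def join(left, right, key="id"):
    """Inner-join two row lists on a key, like DataFrame.join(..., 'inner')."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]
```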

data engineer

  • Worked closely with the business leads to understand and finalize requirements, with consideration for best practices and reporting needs
  • Took complete ownership of assigned tasks and kept the team informed of project progress in a timely manner
  • Applied Unix and shell scripting skills
  • Performed statistical analysis of web server logs. Role: developer | Spark Streaming, Scala

data engineer

  • As a data engineer, daily tasks included working with large volumes of data
  • Implemented a BI solution framework for end-to-end business intelligence projects.
  • Created dashboards for better data visualization.
  • Used data visualization tools such as Google Data Studio and Power BI

data engineer

  • Google Cloud Certified Professional Data Engineer with 7+ months of expertise in Big Data analytics, data integration, data preparation, data visualization, and developing modern data platforms on public cloud platforms.
  • Extensive experience in Python scripting
  • Expertise in data visualization with Google Data Studio.
  • Expertise in RDBMS systems such as SQL Server and MySQL.
  • Data analytics solutions on Google Cloud Platform.
  • Working knowledge of Big Data processing technologies.
  • An accountable team player.

data engineer

  • Developed strategies and algorithms to automate PCI Compliance in Hadoop Environment.
  • Migrated Big Data from local to cloud environments
  • Implemented Benford's law with KL divergence to detect potentially fraudulent datasets using Python
  • Applied decision tree regression to predict monthly debt (plus data wrangling and feature selection) using Python
  • Implemented test driven development by designing unit and integration tests.
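The Benford's-law screen can be sketched as: compare the observed first-digit distribution of a dataset against Benford's expected distribution using KL divergence, and flag datasets whose divergence exceeds a threshold. The threshold value below is an arbitrary illustration, not a tuned parameter:

```python
# Benford's-law fraud screen: KL divergence between the observed first-digit
# distribution and Benford's expected distribution. Threshold is illustrative.
import math

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit_dist(values):
    """Empirical distribution of leading digits (ignores zeros and signs)."""
    counts = {d: 0 for d in range(1, 10)}
    n = 0
    for v in values:
        s = str(abs(v)).lstrip("0.")
        if s and s[0].isdigit() and s[0] != "0":
            counts[int(s[0])] += 1
            n += 1
    return {d: c / n for d, c in counts.items()} if n else counts

def kl_divergence(p, q):
    """KL(p || q); skips digits with zero observed probability."""
    return sum(p[d] * math.log(p[d] / q[d]) for d in p if p[d] > 0)

def looks_fraudulent(values, threshold=0.2):
    """Flag a dataset whose digit distribution diverges from Benford's law."""
    return kl_divergence(first_digit_dist(values), BENFORD) > threshold
```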

data engineer

  • Design, construct, install, test and maintain data management systems.
  • Integrate up-and-coming data management and software engineering technologies into existing data structures.
  • Develop set processes for data mining, data modeling, and data production.
  • Collaborate with members of the team (e.g., data architects, the IT team, data scientists) on the project’s goals.

data engineer

  • Prepare new scripts and modify existing scripts as required using Java.
  • Write queries in PostgreSQL to resolve data issues.
  • Design and create relational and non-relational database schemas.
  • Worked on an automated data model project.
  • Configure and verify SSO on servers using frameworks such as LDAP and Shibboleth.

data engineer

  • Worked for Client DSM in their Data Management Initiative 
  • Participate in the end-to-end life cycle of MDM implementations
  • Utilize data expertise, business knowledge and technical skills to successfully deliver Data Management initiatives like Vendor Master
  • Deliver training and provide knowledge transfer to end user clients
  • Apply performance-tuning enhancements and housekeeping activities
  • Work with clients to identify new areas within the business where the built solution can be utilized to drive business results
  • On-site lead in an onshore/offshore working model

data engineer

  • Used text mining of reviews to determine customers’ main areas of concern. 
  • Delivered result analysis to the support team for hotel and travel recommendations. 
  • Designed Tableau bar graphs, scatter plots, and geographical maps to create detailed summary reports and dashboards. 
  • Developed a hybrid model to improve the accuracy rate. 

junior data engineer

  • Developed web scrapers based on Scrapy and custom stacks (Requests, bs4, MechanicalSoup).
  • Managed APIs (DRF) and databases (PostgreSQL, MongoDB)
  • Some experience with Celery
  • Determined the most accurate prediction model based on accuracy rate. 

data engineer

  • As a data engineer, successfully designed and implemented the data load for multiple revenue sources of the business. 
  • Optimized long-running jobs.
  • Supported database architects, data analysts, and data scientists.
  • Implemented automations for data extraction with shell scripts and Python.
  • Assembled large, complex data sets that met functional and non-functional business requirements.

data engineer

  • Automated business processes by implementing statistical models, reducing manual effort.
  • Implemented data engineering concepts with the Hadoop Big Data ecosystem.
  • Participated in development and worked on multiple applications/modules.
  • Worked with Amazon Redshift tools such as SQL Workbench/J, pgAdmin, DBHawk, and SQuirreL SQL. 

data engineer

  • Built a highly reusable core recommender application used by 5 different projects.
  • Worked with the team leader to introduce data pipelines that reduced storage usage of the target in-memory system by 20x.
  • Authored and concisely presented documentation to 3 project owners.
  • Upgraded the existing news crawl tool, reducing misses by 95%.
  • Coordinated with testers and provided effective solutions for API performance and data-related systems.
  • Involved in multiple business decisions with data analysts, in addition to offering data-driven reports collected from top-traffic websites.
  • Researched and assessed ETL technologies to apply in upcoming products. 

senior data engineer

  • Responsible for creating migration scripts in Oracle to migrate data from one database to another
  • Responsible for creating a validation process to ensure 100% data accuracy during migration
  • Documented all test procedures for systems and processes and coordinated with business analysts and users to resolve requirement issues and maintain quality. 
  • Worked on automating the provisioning of AWS resources using CloudFormation for ticket routing. 

data engineer

  • Worked on several enterprise projects, developing efficient and effective solutions catering to business needs.
  • Worked on the company’s product dealing with massive data volumes, from designing the console to building and optimizing it.
  • Optimized project components through code refactoring and optimization, and solved critical issues such as cross-browser compatibility, latency, and spam mail.
  • Worked with ARIMAX, Holt-Winters, and VARMAX to predict sales at regular and seasonal intervals. 

data engineer /machine learning engineer

  • Involved in Design, Development and Support phases of Software Development Life Cycle (SDLC). 
  • Performed data ETL by collecting, exporting, merging, and massaging data from multiple sources and platforms, including SSRS/SSIS (SQL Server Integration Services) in SQL Server. 
  • Worked with cross-functional teams (including the data engineering team) to extract data from MongoDB through the MongoDB connector. 
  • Performed data cleaning and feature selection using the scikit-learn package in Python. 
  • Performed partitional clustering into 100 clusters using k-means (scikit-learn) in Python, grouping together similar hotels for a search. 
  • Used Python to perform ANOVA tests to analyze the differences among hotel clusters. 
  • Applied various machine learning algorithms and statistical models (decision trees, text analytics, sentiment analysis, Naive Bayes, logistic regression, and linear regression) in Python to determine the accuracy rate of each model.
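The k-means grouping described above follows the standard assign/update loop (Lloyd's algorithm). A compact pure-Python version on toy 2-cluster data, as a stand-in for the scikit-learn 100-cluster hotel run:

```python
# Compact k-means (assign/update loop), a pure-Python stand-in for the
# scikit-learn KMeans call described above. Data and k are toy examples.
import math

def kmeans(points, centers, iters=20):
    """Lloyd's algorithm: alternate point assignment and centroid update."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            # assign each point to its nearest centroid
            i = min(range(len(centers)),
                    key=lambda i: math.dist(p, centers[i]))
            clusters[i].append(p)
        # recompute each centroid as the mean of its assigned points
        centers = [
            tuple(sum(dim) / len(pts) for dim in zip(*pts)) if pts else ctr
            for pts, ctr in zip(clusters, centers)
        ]
    return centers, clusters
```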