Andrew Smith

287 Custer Street, Hopewell, PA 00000
[email protected]
(000) 000-0000

Professional Summary

Innovative, ambitious, forward-thinking Data Scientist with over 2 years of experience in the area of 
  • Good statistical and programming skills with the ability to work effectively with complex data
  • Strong programming skills in Python and SQL, with working knowledge of Linux
  • Good analytical, interpersonal, and communication skills and a proactive attitude
  • Good communication skills, with an ability to make advanced analytics concepts accessible to non-technical team members
  • Strong verbal and non-verbal communication skills, fluency in spoken and written English
  • Strong follow-through skills and acute attention to detail

Employment history

Data Science Intern, Huels Group. Ashleybury, Florida
Apr. 2020 – Present
1. Work on live CCTV footage data using computer vision techniques
2. Integrate CCTV video data from multiple screens
3. Implement and test computer vision and image processing algorithms
4. Work on image processing & computer vision
5. Work on object recognition and tracking, image registration, image calibration, and correction
6. Work on image processing algorithms, pattern recognition methods, and rule-based classifiers

Python Developer, Parker, Lynch and Jacobson. Haroldborough, Washington
May 2019 – Jun. 2019
  • Correct errors by making appropriate changes and rechecking the program to ensure that the desired results are produced.

Organiser, Heathcote LLC. West Freddyfurt, New Jersey
Jan. 2019 – Feb. 2019
Key Objectives

  • To provide participants with an in-depth picture of the state-of-the-art in new and emerging areas of Multimedia Processing and Analysis
  • To provide research and application-oriented learning for a secure and healthy society using intelligent systems
  • To impart hands-on sessions on tools for biometrics, biomedical engineering, and intelligent multimedia
  • To provide the academic body of knowledge on IT-enabled transformation in public services

Education

Eastern D'Amore, New Nedborough, Indiana
B. Tech – M. Tech : 8 GPA, Data Science, Present

Western Bogan, Champlinville, Louisiana
Senior Secondary : 96.5%, Physics, Chemistry, Maths, Information Technology, Mar. 2018

East Schiller University, Skilesstad, South Dakota
Secondary : 10 GPA, Oct. 2016

Skills

Image Processing
Experienced

Python
Experienced

Machine Learning
Skillful

Data Structures
Experienced

Java
Skillful

data science intern

  • Created a novel predictive model by implementing random forest regression followed by a classification model for survival analysis using AWS instances.
  • Performed ETL, feature engineering, and visual analysis of raw sensor data from a MySQL server.

data science intern

  • Applied OCR to medical bills to extract exact data from them.
  • Applied NLP to an ICD/DRG dataset to get the correct DRG code from a disease name.
  • Automated data collection from the state medical council using Selenium.
  • Scraped the state medical council's JSON API.
  • Retrieved correct explanations of pharmaceutical medicine names and their translations into different languages.
  • Built interactive graphs in Tableau.
  • Cleaned unstructured data using regex and generated metadata.

data science intern

  • Report actionable, statistical, and analytical insights to executives for effective strategic positioning in the marketplace
  • Shadow data scientists and assist in developing algorithms for predictive modeling
  • Analyze and process sophisticated data sets using SAS, MySQL, and Excel
  • Write Python scripts to automate everyday tasks

data science intern

  • Collected data related to art galleries from all over the world and recorded them in an Excel sheet
  • Validated incoming data to check the accuracy and integrity of information while independently locating and correcting concerns
  • Performed data analysis using a decision tree algorithm
  • Analyzed information to determine, recommend, and plan a learning path for employees using SQL, Python, and Natural Language Processing.

data science intern

  • Used Python to scrape, clean, and analyze large datasets for the Nudity Detection Project.
  • Assisted data scientists in developing algorithms for predictive modeling.
  • Created a machine learning model with Python to predict fake and spoofed images.
  • Prepared a dataset for object detection using Python and assisted in pre-processing for the object detection model.
  • Wrote ETL for the KPI dashboard tables using BigQuery and populated Superset graphs from those tables to monitor new registrations on the 17-media app worldwide.

data science intern

  • Used statistical tools to interpret datasets, paying particular attention to trends and patterns that could be valuable for diagnostic and predictive analytics efforts
  • Performed data modeling, validation, implementation, and tracking of models
  • Arranged and corrected research data to create representative graphs and charts highlighting results for better presentation and understanding
  • Applied machine learning models such as logistic regression, linear regression, decision tree classifier, random forest classifier, KNN, etc. to find the best-performing approach
  • Worked on a few basic deep learning projects such as Age Detection of Indian Actors Data, Urban Sound Classification, Identify your Digits, and a few more
  • Used TensorFlow and Keras to work on a few datasets

data science intern

  • Developed quick, useful Power BI dashboards to track model performance, marketing analysis, and main platform entries.
  • Worked extensively with databases, creating relationships, schemas, tables, indexes, and complex queries.
  • Developed a complete Business Intelligence application in Django to support the sales team.
  • Performed data analysis, modelling, and visualization with Python.

data science intern

  • Time series data analysis and visualization.
  • Time series machine learning techniques for classification and prediction.
  • Developed an object detection model to automate store performance calculation using deep learning algorithms.
  • Increased model accuracy by 10% to provide consistent and accurate predictions.

data science intern

  • Data Wrangling with SQL on SSMS
  • Data Visualisation and Communication with Power BI.
  • Data Preprocessing (Cleaning) with Python
  • ETL Development (SSIS) and Data Analysis in Visual Studio.
  • Model Building and Evaluation, Time Series Analysis.

data science intern

  • Developed a job recommendation system for the Fikka app, starting by cleaning raw production data and creating over 50 variables for the analytics database
  • Used clustering algorithms like k-means and hierarchical clustering to cluster the data
  • Devised recommendation models using different machine learning algorithms and validated the models
  • Used an ensemble of models to improve the prediction accuracy up to 75%

data science intern

  • Learn the various processes involved in analytics, such as data mining
  • Develop a model for scraping data from websites
  • Use GUI-based tools for data scraping
  • Analyse the collected data and identify trends
  • Aid in building a real estate pricing model.

data science intern

  • Used PyTorch to train a neural network on a large fashion dataset.
  • Extracted features and computed the Euclidean distance between images to find similar images.
  • Built an image search engine using Flask framework and hosted it on a Google Cloud server.
  • Used TensorFlow to create an object detection model to recognize fashion products in a video and show relevant ads in a banner overlay with the help of OpenCV.
  • Made an Android app for Smart TVs that lets users search online for products visually similar to items seen on TV (used NanoHTTPServer and the MediaProjection API).

data science intern

  • Optimizing the accuracy of NLP text classifiers for conducting sentiment analysis on tweets related to the gold stock
  • Performing paper trading tests implementing 
  • Sentiment analysis of user reviews
  • Identification of missing and rare terms in the drug labels.

data science intern

  • Data Wrangling with Google BigQuery
  • Exploratory Data Analysis and Data Pre-Processing with Python
  • Data Visualisation on Data Studio
  • Machine Learning Model (Supervised and Unsupervised) Building and Evaluation with Python
  • Repository interaction with gcloud.

data science intern (remote)

  • Developing a computer vision solution for a face recognition system
  • Working on WebSockets for data communications.
  • Working on Nginx, Django, and Django Channels
  • Crawling and cleaning web data for drug labels and user reviews.

data science intern

  • Document search using Elasticsearch: Created a Flask API for processing documents and storing them in Elasticsearch for search queries. Containerized this API using Docker for easier development, testing, and deployment.
  • Model serialization for H2O.ai models: Explored and implemented serialization techniques for H2O machine learning algorithms using pickle, joblib, and POJO (Plain Old Java Object).
  • Learning advanced computer vision techniques, data analysis algorithms and production code.
  • Developed a pilot inventory management system to count stock and update database in factories and warehouses.

data science intern

  • Studied Google's SLING framework for parsing text using Natural Language Processing.
  • Extracted data from sources like PubMed and GENIA biomedical repositories using tools like Scrapy & BeautifulSoup.
  • Structured the data in proper format for Frame Semantic Representation.
  • Modified the framework and used it to parse medical documents for extracting entities from text data.
  • Optimized and tuned the Neural Network Model used in the framework to obtain better results.

data science intern

  • Successfully created the first product of Quantiphi – AthenasOwl, which received positive reviews at Google Next ’17, San Francisco
  • Extensively used CNNs to build a location detection model and achieved shot-level prediction accuracy of 98%
  • Created a character recognition model for TV shows using face detection, feature generation from ResNet50, and finally mapping the results to the original video
  • Designed a proposal for predicting bus arrival using geolocation data

data science intern

  • Analyze 20 years of historical dictionary data to find patterns of anomalies using Natural Language Processing.
  • Predict which words will be in use 5 years from now.
  • Summarize all key findings of the investigation in a detailed report for the supervisor.
  • Perform data modelling and visualization using Python libraries

data science intern

  • Collected tweets from Twitter through the Twitter API
  • Applied text pre-processing techniques like tokenisation, stemming, and lemmatisation
  • Applied feature engineering techniques like bag-of-words models and TF-IDF vectorisation to get the required features
  • Used cosine similarity to cluster statements with similar sentiments
  • Applied Naive Bayes and linear models
  • Achieved an accuracy of 87% on the test dataset for the model

data science intern

  • Working on projects that span from supervised and unsupervised learning to NLP and deep learning.
  • Implement ML algorithms on Spark via PySpark.
  • Use Elastic Stack for log analysis and Nagios for infrastructure monitoring.
  • Developed REST APIs.
  • Gained exposure to databases such as MongoDB and Postgres.

data science intern

  • Developed and deployed a polynomial regression model to predict the date by which the number of projects uploaded to www.pypi.org would reach a given count.
  • Performed an exploratory data analysis on 30,000 properties in Berlin to determine price trends.
  • Technologies/languages used: Python (BeautifulSoup, Requests, Pandas, Seaborn, Sklearn, Flask).
  • Developed Python code implementing statistical methods and machine learning algorithms to detect outlier-based anomalies in sensor readings.

data science intern

  • Developed a graph-based customer recommendation model for 30+ customers, which created aggregated business value of $20mn+ for our customers.
  • Primary responsibilities included initial research; designing the graph data schema; gathering, cleaning, and ingesting data; and building a graph-based recommendation engine.
  • Technologies used: Neo4j, Python, Keras
  • Performed multiple exploratory data analyses on a given dataset.

data science intern

  • Developing a tool for brand logo detection to find influencers and content creators that are in sync with a brand’s audience for sponsored content
  • Building and deploying state-of-the-art, scalable deep learning models to flag unsafe YouTube channels, easing the deployment of ad campaigns for brands
  • Contributing to the extraction and visualization of channel- and video-level data using the YouTube Data API
  • Sentiment analysis of YouTube
  • Keyword extraction using web scraping and natural language processing

data science intern

  • Creditmate is a fintech startup based in Mumbai, registered under the name Urja Money Private Limited. It provides loans on second-hand two-wheelers in Mumbai and Pune.
  • Worked as a Data Science Intern for two months; my project was to analyze the credit risk involved and produce insights to minimize the losses incurred.
  • Creditmate also extends loans to financially backward sections who are not eligible for bank loans and have no CIBIL score by which to judge the risk of the transaction. Using the company's historical data, the goal was to build a model that predicts the risk involved in a particular loan.
  • This involved assessing the loan applicant on one set of features, assessing the asset (i.e. the two-wheeler) on a separate set of features, and combining the results into a business-oriented score that differentiates high-risk from low-risk customers.
  • Initially researched the domain to come up with a list of parameters for assessing the applicant and the vehicle, which involved a lot of brainstorming, and then used web scraping to get the historical data needed as the training set for the model.
  • Used Random Forests and Decision Trees along with Logistic Regression to come up with a formula to predict the custom score for the company.
  • Was able to evaluate customers with an accuracy of 89%; however, because the internship lasted only two months, the solution could not be implemented during that phase.

data science intern

  • Articulate storytelling to business & product stakeholders
  • Share knowledge and insights with the Data Science team
  • Communication: Demonstrated ability to distill complex technical concepts and share them with a less technical audience
  • Experience with visualization tools such as Tableau as well as with Python (matplotlib and seaborn)

data science intern

  • Research, document, and select alternative approaches for modelling vibration data
  • Explore feasibility of contact tracing via geolocation tagging
  • Data gathering, cleaning, and re-shaping for model fitting
  • Exploring alternate approaches for modeling vibration data such as scaleograms, spectral analysis, etc.

data science intern

  • OOP using Python 3, PDF mining using pdfminer.six, PDF vectorization, and data analysis and DataFrame manipulation using pandas and Matplotlib. Worked with a small team collaborating on Bitbucket.
  • Good programming experience with Angular 6 and JavaScript.
  • Experience integrating third-party services such as YouTube, Firebase, and the Firebase Realtime Database, and hosting websites on Google Firebase.
  • Coding, analysing, and optimising in the Django framework.

data science intern

  • Data Science & ML – Performed data wrangling/munging/slicing, cleaning, and translation; dealt with missing data; and treated outliers to deliver impact analysis.
  • Data Visualization – Delivered insightful dashboards and reports that stakeholders used to make decisions.
  • Feasibility Analysis – Coordinated directly with client partners to understand business requirements, convert them into business use cases, and analyze the feasibility of their technical implementation.
  • Advanced Analytics – Hands-on and well-versed in advanced analytical concepts like Probability, Distributions, Sampling & Estimation, and Hypothesis Testing.