
Professional Summary
I have successfully combined my studies with work and other commitments, showing myself to be self-motivated, organized and able to work under pressure. I have a clear, logical mind with a practical approach to problem solving and a drive to see things through to completion. I enjoy working on my own initiative or in a team. I am reliable, trustworthy, hardworking and eager to learn.
Employment history
Site Reliability Engineer, Carroll, Legros and Bogisich. West Cordellberg, Missouri
Apr. 2018 – Present
- Develop tools and advanced automation features that continuously reduce the need for manual intervention
- Automate the deployment and monitoring of the assigned SAP Hybris cloud services across all the relevant infrastructures (private and public cloud)
- Perform troubleshooting to quickly resolve the issues per documented procedures
- Document root cause analysis reports and develop standard operating procedures
- Provide knowledge management in the team and between operations and development
- Improve system landscape support in the assigned functional areas of responsibility
- Ensure excellent quality and maintain high standards in all processes listed in the relevant process map
- Collaborate with engineering and product management as well as other service groups
Platform Monitoring Engineer, Mills Inc. New Kiethview, Alaska
Jun. 2017 – Jul. 2017
- Own the design, support and maintenance of a large monitoring infrastructure
- Apply knowledge of monitoring industry standards and the use of analytics in an operational setting
- Design, implement and maintain a comprehensive monitoring solution
- Collaborate with Engineers, Operations and other teams to ensure application, network and system monitoring best practices
- Ensure thorough and complete monitoring of all environments and layers
- Innovate techniques for visualizing large amounts of complex, real-time data in a simple, elegant manner for users
Senior Web Developer, Gulgowski Inc. Kennyside, Kentucky
Feb. 2016 – Mar. 2016
- Design, build, or maintain web sites, using authoring or scripting languages, content creation tools, management tools, and digital media.
- Write, design, or edit web page content, or direct others producing content.
- Analyze user needs to determine technical requirements.
- Maintain understanding of current web technologies or programming practices through continuing education, reading, or participation in professional conferences, workshops, or groups.
- Select programming languages, design tools, or applications.
- Build web services to handle cross-platform requests
Software Engineer, Dicki and Sons. Port Gus, Rhode Island
Oct. 2014 – Jan. 2015
- Design an intelligent, portable product that, once positioned, produces pedestrian and vehicle counts anytime, anywhere.
- Apply systematic, dynamic problem-solving to assess the needs of various constituent groups.
- Update knowledge and skills to keep up with rapid advancements in computer technology.
- Monitor functioning of equipment and make necessary modifications to ensure system operates in conformance with specifications.
- Analyze information to determine, recommend, and plan layout, including type of computers and peripheral equipment modifications.
Education
Lockman College, Salvadorshire, Tennessee
Master of Science, Computer Science, Present
Northern Georgia Academy, Port Kera, Vermont
Bachelor of Science, Computer Science and Engineering, Jan. 2015
North Langworth Academy, Schulistburgh, Maryland
Diploma of Engineering, Computer Engineering, Sep. 2012
Skills
Object Oriented Programming, Script Writing
Web Development, Automation
Solution Architect
Linux, Windows, AWS, Azure
Google Cloud Platform
Ansible, Jenkins with Maven
Docker, Kubernetes
Go, Python, Node.js
C, C++, Perl, PHP, Java, HTML, CSS, JS, AJAX
Relational Database
Andrew Smith
Phone:
(000) 000-0000
Email:
[email protected]
Address:
287 Custer Street, Hopewell, PA 00000
Professional Summary
Seeking opportunities as a DevOps Engineer, Senior Service Reliability Engineer, Service Engineer or System Admin.
Extensive experience troubleshooting systems and production service architecture, including DNS, SMTP, SSH, host-side networking, Apache, VIP and Brooklyn. Also design and implement platforms for monitoring, log processing, metrics collection and data visualisation.
Employment history
Nov. 2018 – Present
Millardfurt, Wyoming
Site Reliability Engineer (SRE), Walker and Sons
- Verify stability, interoperability, portability, security, or scalability of system architecture.
- Handled site-up activities for all Walmart e-commerce markets (US, Canada, ASDA, etc.).
- Worked on Grafana, Cassandra, OneOps, Nagios, DNS and Elasticsearch issues while on this team.
- Managed uptime and availability of the site, critical applications and
other internal support structure
- Helped automate tasks, with a focus on routine functions.
- Configured and built tools for monitoring, metrics and alerting.
- Took proactive measures to ensure system stability and maximum
productivity.
- Ensured issues were resolved or escalated before SLA breach.
- Collaborate with engineers or software developers to select appropriate design solutions or ensure the compatibility of system components.
- Provide technical guidance or support for the development or troubleshooting of systems.
Jun. 2016 – Oct. 2016
East Nathaniel, New Jersey
Production Engineer (DevOps), Huel and Sons
- Work on smaller, moderately complex tasks in support of a project that requires a singular area of expertise.
- Work closely with developers to drive live title release,
deployment practices and processes.
- Set up and use monitoring tools to find problems, then resolve them or escalate to development.
- Communicate with developers, product managers and technical
support specialists on product issues.
- Perform script maintenance and updates due to changes in requirements or implementations.
- Deploy new modules, upgrades and fixes to the production environment.
- Linux experience: ssh, monitoring processes, attaching storage, cleaning disk space, tailing logs, etc.
- Knowledge of web servers and load balancers: Apache HTTP Server, Apache Traffic Server, proxies, DNS, DHCP, HAProxy, VPN
- Document and escalate issues to the appropriate subject matter experts as required after reviewing syslogs.
- Responsible for process improvement/documentation. Alert and
noise reduction. Statistical analysis of quality and quantity of
alerts and incidents. Report/graph generation.
- Assist in creating and maintaining the Configuration and Change Management Plan for the project.
- Gauge the effectiveness and efficiency of existing systems;
develop and implement strategies for improving or further
leveraging these systems
Jan. 2012 – Feb. 2012
New Cliffview, Nebraska
IT Engineer, Ernser, Thiel and Lang
- Handling SRs and incidents in line with the SLA.
- Installation, Configuration and Management of Active Directory
Services.
- Creation and deletion of hostnames in Active Directory.
- Configuration and troubleshooting of MS Office 2007 & 2010.
- MS Outlook configuration and troubleshooting.
- Installation of Oracle 10g, 11g and Toad.
- Configuration and troubleshooting of VPN client and AnyConnect VPN.
- McAfee antivirus.
- Windows OS troubleshooting.
- Troubleshooting of all application software.
- Mac OS troubleshooting.
- Troubleshooting of Mac applications.
- Patch updates and installation.
- Installation, Configuration of Local and Network Printers.
- Fundamentals of Computer Networks.
- All laptop and desktop issues.
- Ensuring timely closure of Incident tickets.
Education
Jan. 2010
Bachelor of Engineering: Computer Science
- East Alabama College - Friesenshire, Mississippi
Skills
Splunk: Experienced
AWS: Skillful
Tomcat: Experienced
Apache: Experienced
Problem Management: Expert
Incident Management: Expert
Jenkins: Experienced
Docker and Kubernetes: Experienced
CI/CD: Experienced
Linux: Expert
More Job Descriptions for site reliability engineer Resumes
1
site reliability engineer
- Escalate issues as needed to product development or service engineering team per documented procedures, while at the same time establishing a contingency plan to eliminate any intermittent service disruption
- Document and detail areas of improvement to bolster architecture, design, technical requirements and service specifications.
- Present architecture, design, and technical choices to internal audiences
- Design and deploy metrics, monitoring, and logging systems on AWS / Infra systems to understand the system performance and isolate bottlenecks.
- Help drive efforts to improve triage time and bring down MTTR (Mean Time to Repair), and provide follow-up support to mitigate future incidents
- Proactively monitor availability and performance of the SAP Ariba cloud products using the required toolset
- Respond effectively to monitoring alerts, incident tickets, email requests and requests from other channels coming in to the Site Reliability Engineering team
2
senior site reliability engineer
- Design and DevOps implementation of a multi-tenant Kubernetes cluster running a set of open ecosystem tools (Calico, nginx-ingress, fluentd, Prometheus, kube2iam, LDAP auth, etc.)
- Authoring of configuration management procedures, workflows, and playbooks
- Design and execution of management procedures and configuration standards
- Implementations based on DevOps tools (Ansible, Terraform, AWS API)
- Implementation of e2e tests for infrastructure CI and CD processes based on the pytest framework
3
site reliability engineer
- Ensured production service availability with maximum uptime for Adobe Campaign. Developed tools to facilitate production system uptime and achieve the product SLA.
- Automated production deployment using Ansible.
- Infrastructure automation and orchestration.
- Automation of daily ad hoc manual processes.
- Experience in development, deployment and scaling systems across DC and Cloud infrastructure.
- Production system monitoring, incident management, server capacity management.
- Troubleshoot operational and application issues and fix them within the SLA.
4
site reliability engineer
- Query Analyzer: sniffs packets (using pcap) on the Ethernet interface, decodes each packet using the MySQL client-server protocol, calculates the checksum of the query and sends aggregated data to the centralized server.
- Built a UI on top of the above data to present it meaningfully.
- Wrote control scripts for new processes.
- CVE Tracking and Security related fixes in Infrastructure.
5
site reliability engineer
- Built from scratch a web application for infrastructure inventory management, using the LAMP stack.
- Developed microservices, in Golang, for ETL jobs and data collection.
- Created web application using Django, JavaScript/jQuery, D3.js for graphs and ag-grid for the reports.
- Performed data analyses and reported key insights using Python modules like NumPy, Pandas, Matplotlib, …
- Maintained the project's Service Level Agreements (SLAs) with respect to key Service Level Indicators (SLIs).
- Created CI/CD pipeline setup and followed Test Driven Development (TDD).