# Tyler M. Davis

## Contact

- Cell Phone: 913-207-0701
- Email:
- GitHub: [github.com/tydavis][1]
- LinkedIn: [https://www.linkedin.com/in/davisty/][14]

## Work History

### HP Enterprise (HPE) -- Lead SRE Evangelist, Solutions Architect (IC)

Duration: July 2021 - _Current_

Remote ([HP Enterprise][15])

Position Summary: Proposed and drove programs for collective Engineering Standards across business units (BUs). Identified and delivered solutions for addressing \[micro\]service sprawl. Delivered and ratified standards for unified Reliability language, coding language standards for Go, and a Maturity Model for services. Mentored the SRE team and other engineers.

- Designed and deployed DevOps Entity ID (DED) system for unified systems identification.
- Created and promoted the Maturity Model for service readiness.
- Proposed and delivered Engineering Standards within Greenlake and influenced the organization-wide Standards committee which followed.
- Led the Reliability Standards Working Group (WG).
- Integrated SRE, Security, and Architecture reviews in a three-pronged approach to reduce incident frequency across new and existing systems.
- Mentored Customer Operations, SRE teams, and individuals across engineering roles.

### Nordstrom -- Engineer Sr 2 (SRE, IC Role)

Duration: Feb 2021 - June 2021

Seattle, WA ([www.nordstrom.com][2])

Position Summary: Senior SRE providing direct guidance for architecture, design, reliability, security, and tokenization. Engaged in troubleshooting for infrastructure issues, developed company-wide language toolsets, and delivered critical software fixes for multiple core systems.

- Primary advisor for tokenization in Design Reviews.
- Primary contributor for Go SIG within Nordstrom.
- Developed sample applications and standard-pipeline template for Go.
- Developed a multi-cloud solution that achieved load tests at over 30x the original performance goals.

### Nordstrom -- Manager, Engineering (RET)

Duration: Nov 2019 - Feb 2021

Seattle, WA ([www.nordstrom.com][2])

Position Summary: Created the Reliability Engineering Tools (RET) team, a development-focused team responsible for the NERDS event-readiness platform, legacy Nordstrom Enterprise Tokenization service (NETS), and the modern tokenization platform "Fort Knox" (FK).

_Note_: Nordstrom Engineering Managers are expected to have a 60%/40% split between people-management and direct-contribution time (e.g. contributing code to projects).

- Led FortKnox from inception to MVP in 6 weeks. Production readiness achieved two weeks later, using Java and Memcached.
- Rebuilt FortKnox in Go using an internal cache in 4 weeks.
- Onboarded 40+ clients within six months, reducing PI (CCPA/GDPR/PIPEDA) exposure across the organization.
- Supported TokenSwapper (PCI FortKnox) build in 12 weeks, followed by migration for all stakeholders from Legacy tokenization system and CC processors.
- SME for Go language (golang)
- Contributed core Go CI pipelines for the Standard Pipeline.
- Reduced load times for NERDS pages 100x, improving customer experience and performance.

### Nordstrom -- Manager, Engineering (SRE)

Duration: Oct 2018 - Nov 2019

Seattle, WA ([www.nordstrom.com][2])

Position Summary: Engineering Manager supporting the SRE team, which provides direct support to other teams, especially with regard to service and system reliability. The team owned site-wide load testing and the NERDS event-readiness system, and led the Design Review system for each new service produced in Nordstrom engineering.

- Managed SRE team through deployment and migration of Splunk and NewRelic across the organization.
- Supported SRE team through multiple changes in Engineering Design Review process as both an advocate and escalation point.
- Ongoing development support for load-test engine, Go code in other teams, and other technical advice.

### Nordstrom -- Engineer Sr 2 (SRE)

Duration: Nov 2017 - Oct 2018

Seattle, WA ([www.nordstrom.com][2])

Position Summary: Senior Site Reliability Engineer responsible for improving system reliability, optimization, SME for technologies, mentoring employees (including non-engineers), and Tier-2/3 on-call response for incidents.

- Lead Design Review advocate for engineering-wide review of projects and architectures/implementations.
- SME for Go at Nordstrom (Golang)
- Key technical contributor for the "NERDS" event-readiness platform (Golang, MySQL, AWS)
- Built, project-managed, and delivered site-wide load-testing and acceptability-testing platform for Nordstrom.com (and now Nordstrom.ca). (Golang, GCP, gRPC, Protobuf)

### TUNE -- SRE II

Duration: Sept 2016 - Nov 2017

Seattle, WA ([www.tune.com][3])

Position Summary: Hybrid Ops/Developer role dedicated to TUNE Management Console (TMC) teams, managing needs and conflicting priorities of Operations, IT/DevOps, and TMC teams.

- Reduced Ops ticket initial response time from 2 days to 15 mins, and mean resolution time by half for TMC teams.
- Contributed features and tests to multiple high-performance edge and ETL applications (>30k RPS/host) across the ecosystem. (Golang)
- Migrated multiple systems into Docker and EC2 Container Service (ECS), significantly reducing overhead and resource requirements.
- Automated system recovery operations in AWS via Lambda & CloudWatch.
- Implemented internal functions in production Go services to reduce external dependencies (Golang, ECS).
- Deployed and managed Kafka, Cassandra, and Docker infrastructure in production.
- Multiple bugfixes against company-wide deployment system.
  (Python)

### Porch.com -- DevOps Engineer II

Duration: April 2015 - Sept 2016

Seattle, WA ([www.porch.com][4])

Position Summary: Build, maintain, manage, and secure computing infrastructure while working with Engineering and Business teams to meet all schedules and deadlines.

- Responsible for building unbreakable infrastructure with testing and production requirements.
- Migrated entire AWS environment to Google Cloud Platform (GCP) in four months, assisting with code modifications and infrastructure cleanup, while meeting and accommodating deadlines from all teams and management.
- Reduced hosting costs by >70% through analysis, optimization, and monitoring.
- Delivered presentation on behalf of Porch.com at OSCON 2015 for Kubernetes 1.0 launch: [https://youtu.be/JDUV3fjhFEI][5]
- Presented as part of a Customer Panel on Containers and container-related technologies (February 2016) hosted by Google and Redapt.
- Replaced multiple file and database driven services with programs providing simple RESTful APIs, written in Go (golang), improving all related metrics.
- Migrated application build operations into ephemeral Docker containers, reducing system management requirements and improving Developer productivity and flexibility.
- Presented "Containers in Production" portion of Google Webinar: [https://youtu.be/w-snFo0pPJE][6]

### Recurly -- Systems Engineer

Duration: Nov 2014 - April 2015

San Francisco, CA ([www.recurly.com][7])

Position Summary: Maintain and improve existing systems across IT and DevOps while interfacing with high-profile customers and the development team.

- Responsible for full stack deployment, maintenance, and troubleshooting.
- All activities were required to adhere to SSAE16 and PCIv3 Level 1 compliance due to PII and credit-card data security requirements.
- Maintained and deployed new datacenters while retaining full PCIv3 Level 1 compliance at each point during the process, maintaining "four nines" (99.99%) uptime.
- Streamlined and upgraded Wi-Fi deployment to minimize roaming difficulties and maximize throughput with Apple laptops while simultaneously offloading non-essential wireless networks to isolated access points.
- Proposed cost-saving hardware solution for headquarters' IDS, DNS, and other local services while passing fire-code regulations in the "Networking Closet."
- Successfully extracted, processed, and transferred notification data to customer after the load overwhelmed delivery systems (80,000 records).
- Provided ongoing support to high-profile client while exceeding SLAs for data-delivery.
- Deployed new platform (JVM-based) while updating CI and deployment systems to match (moving from Capistrano+Git-based deploy to Salt+Debian packages).
- Retained a high-profile client by updating and generalizing software for records export and compliance purposes. This software is now also used by other clients.

### KIXEYE -- Security Engineer

Duration: May 2014 - Nov 2014

San Francisco, CA ([www.kixeye.com][8])

Position Summary: Successfully deployed secure infrastructure from scratch. Responsible for ongoing security stance, investigations, and architecture designs company-wide, including development and analysis of all products and integrations.

- Proposed, researched, and gained support for replacement of insecure Skype chat with Atlassian HipChat.
- Ongoing investigation and deployment of network and host monitoring systems (OSSEC) including creating new configurations and data representations.
- Managed client-side configuration and deployment of 802.1X network authentication (wired and wireless) to 100% compliance within three business days (88% compliance within one day) by personally visiting each employee.
- Researched, administered, and personally implemented two-factor authentication (Duo Security) for all employees and contractors within two business days.
- Researched, administered, and deployed corporate password and shared-secret platform (LastPass Enterprise).
- Managed Puppet deployment and configuration of above-mentioned systems.

### KIXEYE -- IT Developer & Systems Admin

Duration: March 2013 - May 2014

San Francisco, CA ([www.kixeye.com][8])

Position Summary: Responsible for general troubleshooting, maintenance, upgrades, and management of the Atlassian software stack, third-party / SaaS integrations with our customized in-house infrastructure, and expert opinions on in-house technologies.

- Proposed and performed migration of remotely-hosted Atlassian software instances to KIXEYE-owned infrastructure while respecting all teams’ development schedules.
- Managed and expanded Active Directory and network services infrastructure (DNS, DHCP, file-sharing) as needed throughout company growth.
- Created and extended multiple systems integrations with corporate LDAP.
- Lead Investigator for planning and execution of subsequent investigations after KIXEYE security breach.

### Atlassian -- Senior Support Engineer

Duration: Jan 2011 - March 2013

San Francisco, CA ([www.atlassian.com][9])

Position Summary: Final-tier investigator prior to sending bug reports to the Development teams. These issues would be “passed up” from Tier-1 support to me (with only one Senior per team), and I was responsible for addressing issues across multiple time zones (EMEA, Pacific/US, and APAC). Also responsible for training new recruits and any prospective “Seniors.” In my spare time, I also assisted other teams and enhanced Support techniques for better/faster analysis and resolution of issues across all products.

- Considered the "JVM tuning expert" across worldwide support team.
- Supported Bitbucket users to determine large-scale support-facility requirements.
- Principal Developer for community-help site (answers.atlassian.com).
  Coordinated with contracted programmer and provided features as requested by other teams and management. (Python)
- Recognized “scaling issues” within the Support Organization and proposed procedural changes which were adopted across international teams.
- Improved Support tools (automatic log analysis programs and Hercules support bot) and provided DevOps resources to the San Francisco team.
- Patched and published JIRA SVN plugin to solve a customer-reported issue.

### Popstar Networks -- Systems Administrator

Duration: Oct 2008 - Jan 2011

Olathe, KS (popstarnetworks.com -- defunct)

Position Summary: “Jack of all trades” as the company went from 16 to 50 people during employment. Responsible for maintaining, updating, and troubleshooting nationwide deployments of 97%-uptime client systems, including on-call and proactive support, while maintaining product infrastructure and developing corporate internal systems as the company grew. Also worked closely with the engineering team, taking on abandoned projects (e.g. device integration via serial interface, service updater written in C++, etc).

- Responsible for debugging production code for client-side systems and working with Engineering to resolve any issues.
- Developed 24/7 monitoring system to proactively identify issues in our product. This system was soon declared “critical infrastructure” and immediately enhanced Sales revenue.
- Created and deployed custom company-wide backup software for employee laptops. (Perl)
- Developed custom software for transferring and processing client runtime reports to on-site office storage for billing.
  (Python)
- Investigated and coded device integration software for flagship software product (JavaSE)
- Coded critical bugfix and additional features for product service platform updater (C++)

### Community America Credit Union -- IT Intern

Duration: May 2005 - July 2005

Olathe, KS ([www.cacu.com][10])

Position Summary: Campaigned to be hired as an intern when the company had no internship program. Was accepted on the merits of my interview and discussions with the team.

- Worked with developers creating new backend processing systems (C#)

### Johnson County Community College -- Computer Lab Technician

Duration: Oct 2004 - May 2005

Overland Park, KS ([jccc.edu][11])

Position Summary: Responsible for maintaining computers, assisting users, and otherwise making sure everyone who used the labs continued to enjoy using the lab services.

- Performed forensic recovery of files stored on floppy disk and then-new flash drives as a free service for students and professors.

## Education

- [University of Kansas][12] College of Liberal Arts & Sciences, B.A.
  Sociology, Lawrence, KS (August 2005 - September 2008)
- Coursera - Strategic Leadership and Management Specialization - [Certificate EAP4D522TGU6A][13] (August 2017 - July 2017)

## Presentations and Publications

- [Kubernetes 1.0 Launch - Customer Showcase: Porch][5] (2015)
- [Webinar: Managing Containers in Production with Google Container Engine & Kubernetes][6] (2016)

## Core Technical Skills

- Languages: Go (Golang), Python, Java, Perl, C, C++, Rust (rustlang)
- Platforms: GCP, Kubernetes, Docker, JVM, MySQL, PostgreSQL, AWS, Datadog, NewRelic, Splunk
- Competencies: Information Security, Privacy, Systems Architecture, Software Design, Site Reliability Engineering (SRE)

[1]: https://github.com/tydavis/
[2]: https://www.nordstrom.com/
[3]: https://www.tune.com/
[4]: https://www.porch.com/
[5]: https://youtu.be/JDUV3fjhFEI
[6]: https://youtu.be/w-snFo0pPJE
[7]: https://recurly.com/
[8]: https://www.kixeye.com/
[9]: https://www.atlassian.com/
[10]: https://www.communityamerica.com/
[11]: https://www.jccc.edu/
[12]: https://sociology.ku.edu/
[13]: https://www.coursera.org/account/accomplishments/specialization/certificate/EAP4D522TGU6
[14]: https://www.linkedin.com/in/davisty/
[15]: https://www.hpe.com/us/en/home.html