# Tyler M. Davis
## Contact
- Cell Phone: 913-207-0701
- Email: <tydavis@gmail.com>
- GitHub: [github.com/tydavis][1]
- LinkedIn: [https://www.linkedin.com/in/davisty/][14]
## Work History
### Apple -- ASE SRE (IC)
Duration: February 2024 - _Current_
Position Summary: Supported major lines of business (LOB) within the
Apple Services (ASE) division.
- Reduced new-hire onboarding time (40 -> 24 hours) through scripting
and automation.
### HP Enterprise (HPE) -- Lead SRE Evangelist, Solutions Architect (IC)
Duration: July 2021 - February 2024
Remote ([HP Enterprise][15])
Position Summary: Proposed and drove programs for collective Engineering
Standards across business units (BUs). Identified and delivered
solutions for addressing \[micro\]service sprawl. Delivered and ratified
standards for unified Reliability language, coding language standards
for Go, and Maturity Model for services. Mentored SRE team and other
engineers.
- Designed and deployed DevOps Entity ID (DED) system for unified
systems identification.
- Created and promoted the Maturity Model for service readiness.
- Proposed and delivered Engineering Standards within Greenlake and
influenced the organization-wide Standards committee which followed.
- Led the Reliability Standards Working Group (WG).
- Integrated SRE, Security, and Architecture reviews in a three-pronged
approach to reduce incident frequency across new and existing systems.
- Mentored Customer Operations, SRE teams, and individuals across engineering roles.
### Nordstrom -- Engineer Sr 2 (SRE, IC Role)
Duration: Feb 2021 - June 2021
Seattle, WA ([www.nordstrom.com][2])
Position Summary: Senior SRE providing direct guidance for architecture,
design, reliability, security, and tokenization. Engaged in
troubleshooting for infrastructure issues, developed company-wide
language toolsets, and delivered critical software fixes for multiple
core systems.
- Primary advisor for tokenization in Design Reviews.
- Primary contributor for Go SIG within Nordstrom.
- Developed sample applications and standard-pipeline template for Go.
- Developed multi-cloud solution for achieving load-test at over 30x
original performance goals.
### Nordstrom -- Manager, Engineering (RET)
Duration: Nov 2019 - Feb 2021
Seattle, WA ([www.nordstrom.com][2])
Position Summary: Created the Reliability Engineering Tools (RET) team,
a development-focused team responsible for the NERDS event-readiness
platform, legacy Nordstrom Enterprise Tokenization service (NETS), and
the modern tokenization platform "Fort Knox" (FK).
_Note_: Nordstrom Engineering Managers are expected to have a 60%/40%
split between people-management and direct-contribution time (e.g.
contributing code to projects).
- Led FortKnox from inception to MVP in 6 weeks. Production readiness
achieved two weeks later, using Java and Memcached.
- Rebuilt FortKnox in Go using an internal cache in 4 weeks.
- Onboarded 40+ clients within six months, reducing PI
(CCPA/GDPR/PIPEDA) exposure across the organization.
- Supported TokenSwapper (PCI FortKnox) build in 12 weeks, followed by
migration for all stakeholders from Legacy tokenization system and CC
processors.
- SME for Go language (golang)
- Contributed core Go CI pipelines for the Standard Pipeline.
- Reduced load times for NERDS pages 100x, improving customer experience
and performance.
### Nordstrom -- Manager, Engineering (SRE)
Duration: Oct 2018 - Nov 2019
Seattle, WA ([www.nordstrom.com][2])
Position Summary: Engineering Manager supporting the SRE team, which
provided direct support to other teams, especially with regard to service
and system reliability. The team owned site-wide load testing and the
NERDS event-readiness system, and led the Design Review system for each
new service produced in Nordstrom engineering.
- Managed SRE team through deployment and migration of Splunk and
NewRelic across the organization.
- Supported SRE team through multiple changes in Engineering Design
Review process as both an advocate and escalation point.
- Ongoing development support for load-test engine, Go code in other
teams, and other technical advice.
### Nordstrom -- Engineer Sr 2 (SRE)
Duration: Nov 2017 - Oct 2018
Seattle, WA ([www.nordstrom.com][2])
Position Summary: Senior Site Reliability Engineer responsible for
improving system reliability, optimization, SME for technologies,
mentoring employees (including non-engineers), and Tier-2/3 on-call
response for incidents.
- Lead Design Review advocate for engineering-wide review of projects
and architectures/implementations.
- SME for Go at Nordstrom (Golang)
- Key technical contributor for the "NERDS" event-readiness platform
(Golang, MySQL, AWS)
- Built, project-managed, and delivered site-wide load-testing and
acceptability-testing platform for Nordstrom.com (and now
Nordstrom.ca). (Golang, GCP, GRPC, Protobuf)
### TUNE -- SRE II
Duration: Sept 2016 - Nov 2017
Seattle, WA ([www.tune.com][3])
Position Summary: Hybrid Ops/Developer role dedicated to TUNE Management
Console (TMC) teams, managing needs and conflicting priorities of
Operations, IT/DevOps, and TMC teams.
- Reduced Ops ticket initial response time from 2 days to 15 mins, and
mean resolution time by half for TMC teams.
- Contributed features and tests to multiple high-performance edge and
ETL applications (>30k RPS/host) across the ecosystem. (Golang)
- Migrated multiple systems into docker and EC2 Container Service (ECS)
significantly reducing overhead and resource requirements.
- Automated system recovery operations in AWS via Lambda & Cloudwatch.
- Implemented internal functions in production Go services to reduce
external dependencies (Golang, ECS).
- Deployed and managed Kafka, Cassandra, and docker infrastructure in
production.
- Multiple bugfixes against company-wide deployment system. (Python)
### Porch.com -- DevOps Engineer II
Duration: April 2015 - Sept 2016
Seattle, WA ([www.porch.com][4])
Position Summary: Build, maintain, manage, and secure computing
infrastructure while working with Engineering and Business teams to meet
all schedules and deadlines.
- Responsible for building unbreakable infrastructure with testing and
production requirements.
- Migrated entire AWS environment to Google Cloud Platform (GCP) in
four months, assisting code modifications and infrastructure cleanup,
while meeting and accommodating deadlines from all teams and management.
- Reduced hosting costs by >70% through analysis, optimization, and
monitoring.
- Delivered presentation on behalf of Porch.com at OSCON 2015 for
Kubernetes 1.0 launch: [https://youtu.be/JDUV3fjhFEI][5]
- Presented as part of a Customer Panel on Containers and
container-related technologies (February 2016) hosted by Google and
Redapt.
- Replaced multiple file and database driven services with programs
providing simple RESTful APIs, written in Go (golang), improving all
related metrics.
- Migrated application build operations into ephemeral docker
containers, reducing system management requirements, improving
Developer productivity and flexibility.
- Presented "Containers in Production" portion of Google Webinar:
[https://youtu.be/w-snFo0pPJE][6]
### Recurly -- Systems Engineer
Duration: Nov 2014 - April 2015
San Francisco, CA ([www.recurly.com][7])
Position Summary: Maintain and improve existing systems across IT and
DevOps while interfacing with high-profile customers and the development
team.
- Responsible for full stack deployment, maintenance, and
troubleshooting.
- All activities are required to adhere to SSAE16 and PCIv3 Level 1
compliance due to PII and credit-card data security requirements.
- Maintained and deployed new datacenters while retaining full PCIv3
Level 1 compliance at each point during the process, maintaining "four
nines" (99.99%) uptime.
- Streamlined and upgraded Wi-Fi deployment to minimize roaming
difficulties and maximize throughput with Apple laptops while
simultaneously offloading non-essential wireless networks to isolated
access points.
- Proposed cost-saving hardware solution for headquarters' IDS, DNS, and
other local services while passing fire-code regulations in the
"Networking Closet."
- Successfully extracted, processed, and transferred notification data
to customer after the load overwhelmed delivery systems (80,000
records).
- Provided ongoing support to high-profile client while exceeding SLAs
for data delivery.
- Deployed new platform (JVM-based) while updating CI and deployment
systems to match (moving from Capistrano+Git-based deploy to
Salt+Debian packages).
- Retained a high-profile client by updating and generalizing software
for records export and compliance purposes. This software is now also
used by other clients.
### KIXEYE -- Security Engineer
Duration: May 2014 - Nov 2014
San Francisco, CA ([www.kixeye.com][8])
Position Summary: Successfully deployed secure infrastructure from
scratch. Responsible for ongoing security stance, investigations, and
architecture designs company-wide, including development and analysis of
all products and integrations.
- Proposed, researched, and gained support for replacement of insecure
Skype chat with Atlassian HipChat.
- Ongoing investigation and deployment of network and host monitoring
systems (OSSEC) including creating new configurations and data
representations.
- Managed client-side configuration and deployment of 802.1X network
authentication (wired and wireless) to 100% compliance within three
business days (88% compliance within one day) by personally visiting
each employee.
- Researched, administered, and personally implemented two-factor
authentication (Duo Security) to all employees and contractors within
two business days.
- Researched, administered, and deployed corporate password and
shared-secret platform (LastPass Enterprise).
- Managed puppet deployment and configuration of above-mentioned
systems.
### KIXEYE -- IT Developer & Systems Admin
Duration: March 2013 - May 2014
San Francisco, CA ([www.kixeye.com][8])
Position Summary: Responsible for general troubleshooting, maintenance,
upgrades, and management of the Atlassian software stack, third-party /
SaaS integrations with our customized in-house infrastructure, and
expert opinions on in-house technologies.
- Proposed and performed migration of remotely-hosted Atlassian software
instances to KIXEYE-owned infrastructure while respecting all teams’
development schedules.
- Managed and expanded Active Directory and network services
infrastructure (DNS, DHCP, file-sharing) as needed throughout company
growth.
- Created and extended multiple systems integrations with corporate
LDAP.
- Lead Investigator for planning and execution of subsequent
investigations after KIXEYE security breach.
### Atlassian -- Senior Support Engineer
Duration: Jan 2011 - March 2013
San Francisco, CA ([www.atlassian.com][9])
Position Summary: Final-tier investigator prior to sending bug reports
to the Development teams. These issues would be “passed up” from Tier-1
support to me (with only one Senior per team) where I was
responsible for addressing issues across multiple time zones (EMEA,
Pacific/US, and APAC). Also responsible for training new recruits and
any prospective “Seniors.” In my spare time, I also assisted other teams
and enhanced Support techniques for better/faster analysis and
resolution of issues across all products.
- Considered the "JVM tuning expert" across worldwide support team.
- Supported Bitbucket users to determine large-scale support-facility
requirements.
- Principal Developer for community-help site (answers.atlassian.com).
Coordinated with contracted programmer and provided features as
requested by other teams and management. (Python)
- Recognized “scaling issues” within the Support Organization and
proposed procedural changes which were adopted across international
teams.
- Improved Support tools (automatic log analysis programs and Hercules
support bot) and provided DevOps resources to the San Francisco team.
- Patched and published JIRA SVN plugin to solve a customer-reported
issue.
### Popstar Networks -- Systems Administrator
Duration: Oct 2008 - Jan 2011
Olathe, KS (popstarnetworks.com -- defunct)
Position Summary: “Jack of all trades” as the company went from 16 to 50
people during employment. Responsible for maintaining, updating, and
troubleshooting nationwide deployments of 97%-uptime client systems,
including on-call and proactive support, while maintaining product
infrastructure and developing corporate internal systems as the company
grew. Also worked closely with the engineering team, taking on abandoned
projects (e.g. device integration via serial interface, service updater
written in C++, etc).
- Responsible for debugging production code for client-side systems and
working with Engineering to resolve any issues.
- Developed 24/7 monitoring system to proactively identify issues in our
product. This system was soon declared “critical infrastructure” and
immediately enhanced Sales revenue.
- Created and deployed custom company-wide backup software for employee
laptops. (Perl)
- Developed custom software for transferring and processing client
runtime reports to on-site office storage for billing. (Python)
- Investigated and coded device integration software for flagship
software product (JavaSE)
- Coded critical bugfix and additional features for product service
platform updater (C++)
### Community America Credit Union -- IT Intern
Duration: May 2005 - July 2005
Olathe, KS ([www.cacu.com][10])
Position Summary: Campaigned to be hired as an intern when the company
had no internship program. Was accepted on the merits of my interview
and discussions with the team.
- Worked with developers creating new backend processing systems (C#)
### Johnson County Community College -- Computer Lab Technician
Duration: Oct 2004 - May 2005
Overland Park, KS ([jccc.edu][11])
Position Summary: Responsible for maintaining computers, assisting
users, and otherwise making sure everyone who used the labs continued to
enjoy using the lab services.
- Performed forensic recovery of files stored on floppy disk and
then-new flash drives as a free service for students and professors.
## Education
- [University of Kansas][12] College of Liberal Arts & Sciences, B.A. Sociology, Lawrence, KS (August 2005 - September 2008)
- Coursera - Strategic Leadership and Management Specialization - [Certificate EAP4D522TGU6A][13] (August 2017 - July 2017)
## Presentations and Publications
- [Kubernetes 1.0 Launch - Customer Showcase: Porch][5] (2015)
- [Webinar: Managing Containers in Production with Google Container Engine & Kubernetes][6] (2016)
## Core Technical Skills
- Languages: Go (Golang), Python, Java, Perl, C, C++, Rust (rustlang)
- Platforms: GCP, Kubernetes, Docker, JVM, MySQL, PostgreSQL, AWS,
Datadog, NewRelic, Splunk
- Competencies: Information Security, Privacy, Systems Architecture,
Software Design, Site Reliability Engineering (SRE)
[1]:https://github.com/tydavis/
[2]:https://www.nordstrom.com/
[3]:https://www.tune.com/
[4]:https://www.porch.com/
[5]:https://youtu.be/JDUV3fjhFEI
[6]:https://youtu.be/w-snFo0pPJE
[7]:https://recurly.com/
[8]:https://www.kixeye.com/
[9]:https://www.atlassian.com/
[10]:https://www.communityamerica.com/
[11]:https://www.jccc.edu/
[12]:https://sociology.ku.edu/
[13]:https://www.coursera.org/account/accomplishments/specialization/certificate/EAP4D522TGU6
[14]:https://www.linkedin.com/in/davisty/
[15]:https://www.hpe.com/us/en/home.html