Meet a zulily developer: Beryl

[Photo: Beryl]

Who are you, and what do you do at zulily?

I am Beryl, a Software Engineer on the EMS (Event Management Systems) Team. I work on the internal tools that help other zulily teams set up and manage sales.

When did you join zulily?

I started at the end of July, 2013 — almost a year and a half ago.

What’s a typical day like for you?

Most days I spend some time working on my assigned projects, reviewing designs or code from other members of my team, and supporting users. What the project work looks like depends on which project I am on and what stage it has reached; it might include drawing up user interface designs, attending meetings with our users, coding and testing.

What is one of your favorite projects you finished and launched to production?

My favorite would have to be a rewrite of a page that provided an audit for what we sell. The page had become extremely slow, its data source had been deprecated and was often inaccurate, and it was not tailored to what our internal users needed to do. When we realized this was taking up a huge amount of time for our users, I was able to talk to them to figure out their intended workflow and propose a new design. Within a fairly short period of time we had a new page up and running that everyone was happier with and that provided accurate data.

What was it like in the early days?  Tell us a crazy story.

When I first joined, I spent some time looking around the application I work on. It was very surprising when I came across a couple of error/info messages that included pictures of my coworkers for humorous effect. Some still exist, but I think they’re slowly starting to disappear…

What gets you excited about working for zulily?

The sense of community I get from other employees. As a developer on internal tools, I get a lot of opportunities to meet employees from many different parts of the company. Additionally, I am enjoying getting to know engineers across all tech teams through our small but growing women-in-tech community. Even though the company has grown so much since I first started, it still manages to feel just as small!

Google Compute Engine Hadoop clusters with zdutil

Here at zulily, we use Google Compute Engine (GCE) for running our Hadoop clusters. Google has a utility called bdutil for setting up and tearing down Hadoop clusters on GCE. We ran into a number of issues with the utility and had been maintaining an internally patched version of it to create our Hadoop clusters. If you look at the source, bdutil is essentially a collection of bash scripts that automate the various steps of creating a GCE instance and provisioning it with all the necessary software needed to run Hadoop. One major issue we found with bdutil was that there is no way to provision a Hadoop cluster whose datanodes do not have external IP addresses. For clusters with many datanodes — the kind we typically run — this means we end up running up against our quota of external IP addresses. Additionally, there is no reason for the datanodes to have external IP addresses, as they should not be accessible to the public.

We decided to stop patching bdutil and write our own utility to provision a Hadoop cluster. The utility is called zdutil and you can find it on our GitHub page. Here’s how it works:

  • First, GCE instances are created for the namenode and all datanodes in your Hadoop cluster.
  • Then, any persistent disks that you requested are created and attached to the instances.
  • If you have any tags that you would like applied to the namenode or datanodes, the tags are added to the instances. This saves you from having to manually tag every single instance in your cluster or write your own script to do so.
  • Next, all of the required setup scripts to provision the namenode and datanodes are copied to a GCS bucket of your choosing. The namenode then provisions itself.
  • Once its own provisioning completes, the namenode copies (via scp) all scripts needed for datanode provisioning to each datanode, and then each datanode provisions itself.
  • Once all datanodes have been provisioned, the namenode will start the Hadoop cluster.

If you deploy the datanodes with either external or ephemeral IP addresses, they will have internet access as determined by the rules of your GCE network. If you deploy the datanodes with “none” for the IP address, they will proxy through the namenode using Squid. You don’t have to configure any of this yourself; zdutil will take care of the details for you, including installing and provisioning Squid on your namenode. One thing to be aware of is that Google’s version of the Google Cloud Storage Connector currently does not support proxying. If you use zdutil, it will install our fork of the GCS Connector, which adds proxy support via two properties in your Hadoop core-site.xml configuration file: fs.gs.proxy.host and fs.gs.proxy.port.
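For reference, here is a minimal sketch of what that core-site.xml fragment looks like. The host and port values below are hypothetical examples (3128 happens to be Squid's default port); zdutil fills in the real values for you.

<!-- Hypothetical example values; zdutil points these at the namenode's Squid proxy -->
<property>
  <name>fs.gs.proxy.host</name>
  <value>my-hadoop-namenode</value>
</property>
<property>
  <name>fs.gs.proxy.port</name>
  <value>3128</value>
</property>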

If you have any need for zdutil, please use it and give us your feedback. At the moment we only support Debian-based images and Hadoop version 1. If you would like to see another OS or YARN supported, please add an issue on the GitHub page.

Meet a zulily developer: Bala

Who are you, and what do you do at zulily?

I am Bala, a Software Engineer on the EMS (Event Management Systems) Team. We create and operate the tools that help other teams launch and manage sales and deals. Our goal is to minimize the time it takes to get deals on zulily by providing efficient and easy-to-use tools.

When did you join zulily?

I started in October of 2013, so it has been a year.

What was it like in the early days?  Tell us a crazy story.

Although I officially joined at the end of October, I did not come in to work until a week after my hire date: I had asked for a week of vacation and I got one 🙂 I had very little work experience prior to zulily and was nervous about the work culture at a start-up (which zulily was at that time). My first impression of zulily: this is a company that cares about its employees, and one I want to work for.

On my actual first day of work at zulily, my manager greeted me and introduced me to the team. Luckily, we had an “All Hands” meeting that very day. Everyone was gathered in a small auditorium, and the host started calling out the people who had joined that week. All the new hires gathered on the dais and there was a grand welcome for us. They asked us a question: “Which Christmas Carol movie do you like the best?” Everyone answered with their favorite, and people were shouting and clapping the whole time. When it was my turn, they asked me what mine was. I wanted to steal others’ ideas, as I had not seen many Christmas Carol movies. So I made up my mind and said, “I have not seen any Christmas Carol movies, but I like Iron Man.” I thought people would laugh, but it was another round of applause and screams. I loved the energy of the people at zulily. It is a small thing, but something I cherish about working here.

One week later zulily went public and we celebrated that day in the auditorium. I had never before had a chance to experience the vibe when the company you work for goes public. It was epic. There was a countdown, people were cheering, and later we heard Darrell (the CEO) give a live speech from the NASDAQ office. What a day it was!

Later I was involved in talking to a lot of people from the business. I had been told about the fast-paced environment at zulily and “zulily time.” I didn’t really understand it until I released a new feature for vendors called “Vendor Inventory Update Automation” in my first few weeks at zulily. After that I never looked back…

What is different now?

zulily has grown a lot, but it still moves very fast and is very aggressive. The tech team has doubled in size, so there are more people to work with, and the development vision has changed from “Get it out now. We can think about the future later.” to “We need to do it right and make it useful for the future.” We also have PMs to help us define priorities, letting us code more and attend fewer meetings.

What’s a typical day like for you?

I get into the office around 10am; by the time I come in, most folks are already here. I check my mail, look at my calendar and plan my day. I can get so engrossed in coding that I sometimes forget to eat; I keep reminders for that! I code, attend meetings, and go home when I feel that I have completed something concrete. I bike and bus to the office, and I use the commute to catch up on what’s going on in the world. Back home I spend some time with my family, and then if I have some time left I read something, or else I go to bed… repeat.

What gets you excited about working at zulily?

I agree with everyone about how much of an impact your changes and work have on the routines of people at zulily. As I work on internal tools for zulily employees, I get to meet my customers directly, talk to them, gather accurate requirements and build tools that cater to their needs. This kind of customer interaction is something I love about working at zulily. Also, I own what I build and I support it, which motivates me to code better. Most importantly: all the people I work with are awesome.

Expect: How a 20-Year-Old Tool Saved My Project

I joined zulily in August of 2010, and at the time the company as a whole consisted of only 35 people. One of my first projects involved integrating one of our systems with a system owned by a much larger, more established partner company. The details of what these systems did aren’t relevant, except that the integration mechanism was for our system to drop XML files on an SFTP server that the partner company owned and operated.

At the time we were a 100% PHP shop (this has since changed), so I implemented our side of the integration in PHP, and used an open source PHP library called phpseclib to handle the actual SFTP data transfer. The partner company didn’t have any clients who used PHP, and didn’t officially support this library, but it worked great throughout development and integration testing. The integration test phase of the project took approximately 3 months, and not once during that time did we ever have even the slightest hiccup when transferring data between systems.

However, once we went to full production, we started seeing file corruption — specifically, sometimes files we transferred to the partner’s SFTP server would be truncated. There was no discernible pattern to the truncation; it happened at different points every time, and often a file that failed once would work fine when it was retried.

Naturally, this caused some consternation, as our code hadn’t changed, and it had been working for months without fail. When I pointed the exact same code at their test server, and sent the exact same file content, everything worked flawlessly. When I pointed it back to their production server, the files were truncated.

Clearly, to my mind at least, the problem was on their end. After a couple of days of frenzied troubleshooting, we discovered that the version of the SFTP server software running on their test machines was newer than what they ran in production. Presumably, we were hitting some bug in that software that was fixed between the two versions.

Since they were a very large company with many other clients using these servers, they were unwilling to upgrade their SFTP software on our behalf. Also, given that we were already well into the go-live phase of the integration, rewriting our system in another language wasn’t an appealing option, especially since there was no guarantee that the new implementation would work any better.

One option that the partner company did officially support, though, was the sftp client included with OpenSSH. I tried manually transferring a few files this way…and the file truncation issue disappeared.

There were problems with this approach, though: for one, the SFTP server required authentication, and the partner company was unwilling to set up SSH keys for us to do non-interactive authentication. OpenSSH’s sftp client doesn’t support setting the authentication credentials via command-line parameters, leaving us stuck with authenticating interactively. This obviously wasn’t an acceptable long-term solution, since these systems needed to communicate with each other without human intervention.

I don’t recall exactly when I stumbled across it, but somewhere in the midst of searching for a solution I came across Expect.

Quoting the introduction on the Expect homepage:

Expect is a tool for automating interactive applications such as telnet, ftp, passwd, fsck, rlogin, tip, etc. Expect really makes this stuff trivial.

This sounded exactly like what I needed! I bought a copy of “Exploring Expect”, read enough of it on the train ride home to get started, and after a couple of hours I had whipped up a PHP script that did the following:

  1. Dump the XML data we wanted to transfer into a local temp file.
  2. Build an Expect script in memory by doing some variable interpolation in a PHP string.
  3. Write the resulting Expect script to another local temp file.
  4. exec() the Expect script, capturing the process return code and anything the script wrote to stdout or stderr.
  5. Write the process output to our application log for later troubleshooting if anything went wrong.
  6. Finally, if the process return code was zero (indicating success), delete the two temp files. Otherwise, send an email to our notification alias so I could investigate.

Here’s the meat of the Expect script generation in all its glory:

$expectContents =
 '#!/usr/bin/expect' . "\n"
 . 'set timeout ' . $this->m_timeout . "\n"
 // Launch sftp against the partner's host; each expect block below falls
 // through to "default" on timeout and exits with a distinct error code.
 . 'spawn sftp -o Port=' . $this->m_port . ' ' . $this->m_user . '@' . $this->m_host . "\n"
 . 'expect {' . "\n"
 . ' default {exit 1}' . "\n"
 . ' "Connecting to ' . $this->m_host . '..."' . "\n"
 . '}' . "\n"
 // Accept the host key if prompted; "ssword" matches both "Password"
 // and "password" prompts.
 . 'expect {' . "\n"
 . ' default {exit 2}' . "\n"
 . ' "continue connecting (yes/no)?" {send "yes\n"; exp_continue}' . "\n"
 . ' "ssword"' . "\n"
 . '}' . "\n"
 . 'send "' . $this->m_pass . '\n"' . "\n"
 // Likewise, "ermission denied" matches either capitalization.
 . 'expect {' . "\n"
 . ' default {exit 3}' . "\n"
 . ' "ermission denied" {exit 4}' . "\n"
 . ' "sftp>"' . "\n"
 . '}' . "\n"
 // Change to the partner's inbound directory, upload the file and quit.
 . 'send "cd /inbound\n"' . "\n"
 . 'expect {' . "\n"
 . ' default {exit 5}' . "\n"
 . ' "not found" {exit 6}' . "\n"
 . ' "No such file" {exit 7}' . "\n"
 . ' "sftp>"' . "\n"
 . '}' . "\n"
 . 'send "put ' . $filename . '\n"' . "\n"
 . 'expect {' . "\n"
 . ' default {exit 8}' . "\n"
 . ' "not found" {exit 9}' . "\n"
 . ' "sftp>"' . "\n"
 . '}' . "\n"
 . 'send "quit\n"' . "\n"
 . 'exit 0' . "\n";
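For concreteness, here is what the generated script looks like once the interpolation has run, using hypothetical values (host sftp.example.com, user zu, port 22, a 30-second timeout, and an example password and filename):

#!/usr/bin/expect
set timeout 30
spawn sftp -o Port=22 zu@sftp.example.com
expect {
 default {exit 1}
 "Connecting to sftp.example.com..."
}
expect {
 default {exit 2}
 "continue connecting (yes/no)?" {send "yes\n"; exp_continue}
 "ssword"
}
send "s3cret\n"
expect {
 default {exit 3}
 "ermission denied" {exit 4}
 "sftp>"
}
send "cd /inbound\n"
expect {
 default {exit 5}
 "not found" {exit 6}
 "No such file" {exit 7}
 "sftp>"
}
send "put /tmp/order-123.xml\n"
expect {
 default {exit 8}
 "not found" {exit 9}
 "sftp>"
}
send "quit\n"
exit 0

Each expect block waits for the quoted pattern and bails out with a distinct exit code on timeout, which is what made troubleshooting from the application log possible.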

Yes, I’m aware that there are more elegant ways to interpolate strings in PHP than this, but at the time I was still fairly new to PHP and under a ton of pressure to get something — anything — working.

Coming from a background in statically typed compiled languages, this just felt wrong somehow. Even my stints in the land of Perl didn’t feel this hacky. It was an ugly, nasty, weird abomination…but it worked.

And it kept working, without fail, for the entire lifespan of this system. Over time, we encountered bugs in many other areas, as is inevitable with any complex system, but this little corner of the codebase never had a single issue.

What this taught me is that sometimes you should go slowly, be methodical, and take the time necessary to create a simple, elegant solution to your problem.

And sometimes you just need to break out the duct tape.

Meet a zulily Developer: John

Each month, we’ll talk with one of our developers and learn about a day-in-the-life of a zulily engineer.

Who are you, and what do you do at zulily?

I’m John, a tech lead on the SHIPS* team.

*The name of my team has changed numerous times during my tenure at zulily, and actually is about to change again. Other names for the team I am on have been: Supply Chain, FMS, PFOAM, SCS, “those folks that deal with shipping stuff to Mom”…

When did you join zulily?

I started in June of 2012, so it has been 2+ years.

What was it like in the early days?  Tell us a crazy story.

  • On my first day, I vividly remember Dan Ward coming up to me and introducing himself. He was wearing a neon orange shirt, white pants, a neon orange belt and neon orange shoe-laces. I remember thinking to myself, “This dude is really friendly, but that is a lot of neon orange!” 🙂
  • Later in the morning of my first day at zulily, I remember hearing “Good morning!!!” <CLAP>, <CLAP>, <CLAP>, <CLAP> over and over again. Of course this was Tatiana leading a conga line of folks who were telling everyone “Good Morning!!!” and giving them a high-five.
  • For lunch on my first day, I went to Pecos BBQ Pit in SODO and ordered a pulled pork sandwich with the “hot” BBQ sauce. I like spicy food, but not ghost chili peppers pureed with the tears of Satan…
  • Later in that first week, zulily announced that it was going to be the first company to integrate with SAP in 90 days (where most companies take 18-24 months to do the same amount of work). My team did a lot of the heavy lifting on this aggressive project, and we pulled it off. We even built a LEGO Galactic Empire Super Star Destroyer during the process. 🙂
  • A year later zulily had another aggressive project where I got to travel to London with Dan Ward and Neil Harris to deploy SAP into the UK portion of the business. Again we managed to pull off this aggressive project in “zulily time.” I also came away with a serious love for Brown Sauce, Bacon Butties, and Neil Harris’ ability to function at a very high level sans sleep.

[Photo: John with the LEGO Super Star Destroyer]

What is different now?

zulily still moves very fast and is very aggressive. What is different now is the number of folks helping with the work, and the impact of that work has been magnified by at least three orders of magnitude. I still cannot wrap my head around the growth.

What’s a typical day like for you?

I get into the office around 7am, before most folks arrive, grab some coffee and look at my calendar to see how many meetings I have. I then pound out some code or documentation till about 9am, when the meetings start happening. Typically I will have 1-2 phone screens or on-site interviews a day and 1-2 meetings with sister and cousin teams regarding system integrations; in between said meetings I try to write a line or two of code, and hopefully remember to have some lunch. I do my best to catch the 5:15pm water taxi to West Seattle, where I live. I have dinner with my kids and wife, put the kids to bed, and then if I have any energy left I write some more code before heading to bed. Rinse, repeat…

What gets you excited about working at zulily?

In a word, impact. It is very rare that one gets to work at a place where the requirement is to scale systems by orders of magnitude in hopes of keeping up with the demands of the business. I would categorize working in zulily tech as “extreme engineering,” with very high highs and very low lows. It is thrilling to be able to triage, debug and resurrect a system that is cratering, or to deploy subtle changes to systems that almost immediately start generating more revenue and see it happen on a pretty Splunk graph.

In another word, trust. There are not many places where an engineer would be allowed to have the impact described above without backbreaking amounts of process and oversight.

Seattle Scalability Meetup @ zulily: Google, Hortonworks, zulily

We are looking forward to meeting everyone attending the scalability meetup at our office. It is going to be a great event, with a good overview of how zulily leverages big data and deep dives into Google BigQuery and Apache Optiq in Hive.

Agenda

Topic: Building zulily’s Data Platform using Hadoop and Google BigQuery

Speakers: Sudhir Hasbe, Director of big data, data services and BI at zulily (https://www.linkedin.com/in/shasbe), and Paul Newson (https://www.linkedin.com/profile/view?id=971812)

Abstract: zulily, with 4.1 million customers and projected 2014 revenues of over 1 billion dollars, is one of the largest e-commerce companies in the U.S. “Data-driven decision making” is part of our DNA. Growth in the business has triggered exponential growth in data, which required us to redesign our data platform. The zulily data platform is the backbone for all analytics and reporting, and also powers the data service APIs consumed by various teams across the organization. This session provides a technical deep dive into our data platform and shares key learnings, including our decision to build a Hadoop cluster in the cloud.

Topic: Delivering personalization and recommendations using Hadoop in the cloud

Speakers: Steve Reed is a principal engineer at zulily, the author of dropship, and former Geek of the Week. Dylan Carney is a senior software engineer at zulily. They both work on personalization, recommendations and improving your shopping experience.

Abstract: Working on personalization and recommendations at zulily, we have come to lean heavily on on-premise Hadoop clusters to get real work done. Hadoop is a robust and fascinating system, with a myriad of knobs to turn and settings to tune. Knowing the ins and outs of obscure Hadoop properties is crucial for the health and performance of your Hadoop cluster. (To wit: How big is your fsimage? Is your secondary namenode daemon running? Did you know it’s not really a secondary namenode at all?)

But what if it didn’t have to be this way? Google Compute Engine (GCE) and other cloud platforms promise faster, easier-to-maintain Hadoop installations. Join us as we share lessons from our years of Hadoop use and give an overview of what we’ve been able to adapt, learn and unlearn while moving to GCE.

Topic: Apache Optiq in Hive

Speaker: Julian Hyde, Principal, Hortonworks

Abstract: Tez is making Hive faster, and now cost-based optimization (CBO) is making it smarter. A new initiative in Hive introduces cost-based optimization for the first time, based on the Optiq framework. Optiq’s lead developer Julian Hyde shows the improvements that CBO is bringing to Hive. For those interested in Hive internals, he gives an overview of the Optiq framework and shows some of the improvements that are coming to future versions of Hive.

Our format is flexible: we usually have two speakers who talk for ~30 minutes each, followed by Q&A and discussion (about 45 minutes per talk), finishing by 8:45.

There will be beer afterwards, of course!

After-beer Location:

Paddy Coyne’s:  http://www.paddycoynes.com/

Doors open 30 minutes ahead of showtime.

Optimizing memory consumption of Radix Trees in Java

On the Relevancy team at zulily, we are often required to load a large number of large strings into memory, which frequently causes memory issues. After looking at multiple ways to reduce memory pressure, we settled on Radix Trees to store these strings. Radix Trees provide very fast prefix searching and are great for auto-complete services and similar uses. This post focuses entirely on memory consumption.

What Is A Radix Tree?

Radix Trees take sequences of data and organize them in a tree structure. Strings with common prefixes end up sharing nodes toward the top of this structure, which is how the memory savings are realized. Consider the following example, where we store “antidisestablishmentarian” and “antidisestablishmentarianism” in a Radix Tree:

+- antidisestablishmentarian (node 1)
                           +- ism (node 2)

Two strings, totaling 53 characters, can be stored as two nodes in a tree. The first node stores the prefix (25 characters) that it shares with its child. The second stores the rest (3 characters). In terms of character data stored, the Radix Tree holds the same information in approximately 53% of the space (not counting the additional overhead introduced by the tree structure itself).

If you add the string “antibacterial” to the tree, you need to break apart node 1 and shuffle things around. You end up with:

+- anti                             (node 3)
      |- disestablishmentarian      (node 4)
      |                      +- ism (node 2)
      +- bacterial                  (node 5)

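To make the split concrete, here is a minimal Java sketch of the insert path. The class and names are hypothetical, and this is not our production implementation (which, as described below, stores UTF-8 byte arrays rather than String edges):

import java.util.HashMap;
import java.util.Map;

// Minimal radix-tree sketch illustrating prefix sharing and node splitting.
final class RadixNode {
    private String edge;                 // label on the edge leading into this node
    private boolean isWord;              // true if a stored string ends here
    private Map<Character, RadixNode> children = new HashMap<>();

    RadixNode(String edge) { this.edge = edge; }

    void insert(String s) {
        // Length of the common prefix between this node's edge and s.
        int i = 0;
        while (i < edge.length() && i < s.length() && edge.charAt(i) == s.charAt(i)) {
            i++;
        }
        if (i < edge.length()) {
            // Split the node: "antidisestablishmentarian" becomes
            // "anti" -> "disestablishmentarian" when "antibacterial" arrives.
            RadixNode tail = new RadixNode(edge.substring(i));
            tail.isWord = isWord;
            tail.children = children;
            edge = edge.substring(0, i);
            isWord = false;
            children = new HashMap<>();
            children.put(tail.edge.charAt(0), tail);
        }
        if (i == s.length()) {
            isWord = true;               // s ends exactly at this node
            return;
        }
        String rest = s.substring(i);    // descend or branch on the remainder
        RadixNode child = children.get(rest.charAt(0));
        if (child != null) {
            child.insert(rest);
        } else {
            RadixNode leaf = new RadixNode(rest);
            leaf.isWord = true;
            children.put(rest.charAt(0), leaf);
        }
    }
}

Starting from a root node with an empty edge, inserting the three example strings in order produces exactly the anti / disestablishmentarian / ism / bacterial shape shown above.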

Real-World Performance

We run a lot of software in the JVM, where memory performance can be tricky to measure. In order to validate our Radix Tree implementation and measure the impact, I pumped a bunch of pseudo-realistic data into various collections and captured memory snapshots with YourKit Java Profiler.

Input Data

It didn’t take long to hack together some realistic-looking data in Ruby with Faker. I created four input files of approximately 1,000,000 strings: random 12-digit numbers, bitcoin addresses, email addresses and ISBNs.

sreed:src/ $ head zulily-oss/radix-tree/12-digit-numbers.txt
141273396879
414492487489
353513537462
511391464467
633249176834
347155664352
632411507158
752672544343
483117282483
211673267195

sreed:src/ $ head zulily-oss/radix-tree/bitcoins.txt
1Mp85mezCtBXZDVHGSTn3NYZuriwRMmW6D
1N8ziuitNLmSnaXy2psYpLcXvugHw1Yc5s
18DnruBzLHmnVHQhDghoa6eDt6sDkfuWKr
1A3sRfAnP89HE4RgNQARa3kCq4xFEF9eev
12WR4DrsR4mM8gDHZCuqXe2h37VUSUPSNu
1PRmYuevwZXZamBEgANzLXe2SjFneGDsXp
1EpjPwt8Ap47XA6HwJhCTxUZRDH11GKWuQ
1P8MAgobhLw4FYcFHbw7a8t2FvQZg8K597
15xhiiLdkin8zi6S5KL9DkDDQyvLb1pjjT
1NPEZeEjgGu5TYdz5d3kxjVfLwxAZ2fK6f

sreed:src/ $ head zulily-oss/radix-tree/emails.txt
jakayla.hoppe@krajcikpollich.info
abbey.goodwin@tromp.org
laney.dach@walkerlubowitz.biz
rosanna_towne@marks.name
sherwood@oberbrunnerauer.name
mohamed_rice@champlin.com
margaret_kirlin@greenfeldercasper.net
vince@funk.net
leora_ohara@hackett.biz
audra.hermann@bauch.org

sreed:src/ $ head zulily-oss/radix-tree/isbns.txt
216962073-7
640524955-7
955360834-5
429656067-0
605437693-4
204030847-4
037410069-1
239193083-6
182539755-4
034988227-4

Measuring Memory with YourKit

YourKit provides a measurement of “retained size” in its memory snapshots which is helpful when trying to understand how your code is impacting the heap. What isn’t necessarily intuitive about it, though, is what objects it excludes from this “retained size” measurement. Their documentation is very helpful here: only object references that are exclusively held by the object you’re measuring will be included. Instead of telling you “this is how much memory usage your object imposes on the VM,” retained size instead tells you “this is how much memory the VM would be able to garbage-collect if it were gone.” This is a subtle, but very real, difference if you wish to optimize memory consumption.

Thus, my memory testing needed to ensure that each collection held complete copies of the objects I wished to measure. In this case, each string key needed to be duplicated (I decided to intern and share every value I stored in order to measure only the memory gains from different key storage techniques).

// Results in shared reference, and inaccurate measurement
map1.put(key, value);
map2.put(key, value);

// Results in shared char[] reference, and better but
// still inaccurate measurement
map1.put(new String(key), value);
map2.put(new String(key), value);

// Results in complete copy of keys, and accurate measurement
map1.put(new String(key.toCharArray()), value);
map2.put(new String(key.toCharArray()), value);

Collections Tested

I tested our own Radix Tree implementation, ConcurrentRadixTree from https://code.google.com/p/concurrent-trees/, a string array, Guava‘s ImmutableMap and Java’s HashMap, TreeMap, Hashtable and LinkedHashMap. Each collection stored the same values for each key.

Both zulily’s Radix Tree and the ConcurrentRadixTree from concurrent-trees were configured to store string data as UTF-8-encoded byte arrays.

ConcurrentRadixTree was included simply to ensure that our own version (to be open-sourced soon) was worth the effort. The others were measured to highlight the benefits of Radix Tree storage for different input types. Each collection has its own merits, and in most ways other than storage footprint they are superior to the Radix Tree (put/get performance, concurrency and other features).

Results

[Chart: retained size of each collection for the four input data sets]

First of all, Guava’s ImmutableMap is pretty good. It stored the same key and value data as java.util.HashMap in 92-95% of the space. The Radix Tree breaks keys into byte array sequences and stores them in a tree structure based on common prefixes. This resulted in a best case of 62% of the size of the ImmutableMap for bitcoin addresses (strings which have many common prefixes) and a worst case of 88% for random 12-digit numbers. We see that the memory used by this data structure is largely dependent on the type of data put into it. Large strings with many large common prefixes are stored very efficiently in a narrow tree structure. Unique strings create a lot of branches in the underlying tree, making it very wide and adding a lot of overhead.

Converting Java Strings to byte arrays accounts for most of the memory savings, but not all: in the tests I ran, plain byte array storage came in anywhere from 90% (bitcoin addresses) to 99% (ISBNs) of the String-based size.

For us, storing byte-encoded representations of string data in a radix tree allowed us to reclaim valuable memory in our services. However, it wasn’t until we validated the implementation accurately, with realistic data and trustworthy tools, that we could rest easy knowing we had accomplished what we set out to do.

welcome to the zulily engineering blog!

It has been just over four and a half years now since Darrell and Mark (our two co-founders) came up with the original idea for zulily.  And from the beginning we’ve focused on building software to power a new way of shopping online.  We call it discovery-based shopping.

Here at zulily our tech team is at the core of the business and involved in the entire life-cycle of both our vendors and our customers.  Whether it’s building internal tools for our merchandising and studio teams, launching new features on our vendor portal or vendor data exchange, or delivering a new personalized experience on our mobile apps or website, we are always focused on challenging ourselves to build world-class solutions which exceed expectations.

We are a build shop and big supporters of the open source community.  We believe in the power of the community and feel we have an obligation to give back to the projects that have helped us get to where we are today.  As we continue our transition from a small, frenetic start-up, expect to see us become ever more active in the community.

At our core we have 10 values we try to live by on a daily basis.  These have served us well over the past 4+ years as we’ve tried new things and experienced major wins… and a number of “well, that was a bad idea” moments.

  1. “No” is not in our vocabulary — we strive to find creative solutions and the path to “yeah, we’ll give it a go”.
  2. We believe in speed of innovation and taking agile development to the extreme.
  3. We embrace a customer-centric view to delivering technology solutions — always start with the customer.
  4. Mistakes are expected and encouraged — we learn from them and move on.
  5. We empower our engineers to solve business problems and tailor our process accordingly.
  6. Engineers write production code and own it from start to finish.
  7. We are defensive in nature: we assume things will break and plan for it.
  8. We believe in “just-in-time” software with an eye towards capacity and scalability.
  9. We value full transparency and continuous communication.
  10. We strive to find the simple solution in anything we do.

In the end we’re all about building an amazing team, passionate about building awesome software and technology solutions.  We love to move fast and take risks.  And we’re big believers in the idea of continuous improvement.

Thanks for taking a few minutes out of your busy day to read our tech blog.   We hope you enjoy it!

Luke

Meet a zulily Developer: Trevor

Each month zulily will talk with a developer and learn about a day in the life of a zulily engineer.

Who are you, and what do you do at zulily?

I’m Trevor, a developer on the Relevancy team. Prior to that, I worked on our fulfillment and warehouse management systems.

When did you join zulily?

I started in August of 2010, so it’s been 4 years now.

What was it like in the early days? Tell us a crazy story.

Oh man, where to start….

  • My first desk was the classic startup cliché: a door blank on top of two filing cabinets. (We have proper desks now.)
  • My second day on the job, the director in charge of the Supply Chain team stopped by my desk and introduced herself like so: “Hi, I’m Lys. I hear you’re traveling with me to our vendor site next week?” At that point my manager leaned over and said, “Oh, uh, heh, I meant to ask you: can you go to our vendor next week?”
  • The following week consisted of Lys and me in a conference room with 10 folks in suits from the supply chain logistics company with whom we were gearing up to integrate. I had never worked on anything remotely related to supply chain logistics before, and couldn’t have told you what “GOH” stood for if my life depended on it (“garment on hanger”, if you’re curious). I spent most of that week furiously scribbling notes and wondering what in the world I’d gotten myself into.
  • About a year later, we needed to build our own fulfillment center in Reno. My understanding at the time was that a typical FC startup project took 6 to 9 months. We had 10 weeks to go from an empty building to shipping packages — and we got it done. To me that was a testament to what a small, tightly focused, extremely motivated group of people can do. It was a lot of work, with not a lot of sleep, but in the end it was worth it.

How is that different from now?

Things are much, much less hectic nowadays. We still move fast and set aggressive goals, but we don’t have to burn ourselves out to achieve them. The team is also bigger now, so there are a lot more hands to help carry the load.

What’s a typical day like for you?

I usually get into the office at around 10 am. First off, I grab a cup of coffee and check email. Then I give our API monitoring charts a look to make sure everything’s healthy.

99% of the code I work on nowadays is in Java, so once I’ve confirmed that everything’s humming along I’ll fire up IntelliJ and get to coding. Somewhere between 11 a.m. and 1 p.m. I’ll take a break for lunch, then it’s back to coding for a few more hours before our daily afternoon standup meeting. After that, more coding till around 7 p.m., when I head home.

We’re definitely fans of the “ship early, ship often” mantra. It’s not at all unusual for me to push 3 or 4 different builds to production over the course of a day. Of course, there are also plenty of days where I’m heads-down working on larger changes, but we try to keep our changes small enough, and the barrier to releasing new code low enough, that we don’t go dark for long stretches of time.

What gets you excited about working at zulily?

There are so many things:

  • The team is absolutely top-notch. I’m surrounded by smart, talented people, both on my immediate team and across the entire organization. I learn something new from my coworkers every day.
  • We move fast and try new things. Sometimes they work, sometimes they don’t, but every time we learn something new.
  • The Relevancy team’s mandate is to figure out how to quickly and accurately surface the most engaging content for our members. We’re continually searching for ways to improve our systems, either by trying new and novel recommendation algorithms, or by increasing our capacity and reducing the time it takes our recommendations to update in response to user behavior. It’s a fascinating space that combines machine learning with hard-core engineering for scale. I love it.
  • I’ve worked at places building packaged software with 9-to-12-month release cycles. It’s disheartening to put that much effort into a project, just to see it languish on a shelf somewhere because the customer can’t (or won’t) deploy it. Our team is the polar opposite of that: we push new code to production several times a day. This creates an incredible virtuous cycle. The barrier to pushing code live is low, which means you do it more often, which means each change is small, which means it’s both easy to verify and easy to roll back if something goes sideways. With such low friction, we’re constantly pushing forward, constantly improving our service, creating a much richer, more engaging experience for our members.

Experience optimization at zulily

Experimentation is the name of the game for most top tech companies, and it’s no different here at zulily. Because new zulily events launch every day, traditional experiments can be cumbersome for some applications. We need to be able to move quickly, so we’ve built a contextual multi-armed bandit system that learns in real time to help us deliver the best experience to each zulily member.

As zulily has grown over the past four and a half years, the number of new events and products launching each day has increased at a tremendous pace. This is great for our members, but it brings with it the challenge of ensuring that each member’s experience is as personalized as possible. My team, Data Science, together with the Relevancy team, with whom we work closely, is tasked with seamlessly optimizing and customizing that experience. In order to do so, we run experiments — a lot of them. Even the most minor changes to the site usually have to prove their mettle by beating a control group in a well-designed, sufficiently powered experiment.
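To give a flavor of the bandit approach (though not of our actual contextual system, which is considerably richer), here is a minimal Thompson-sampling sketch for a plain Bernoulli bandit in Java, using Apache Commons Math for Beta sampling; all names here are illustrative:

import org.apache.commons.math3.distribution.BetaDistribution;

// Each arm keeps a Beta(successes + 1, failures + 1) posterior over its
// conversion rate; we play the arm whose sampled rate is highest, so the
// system explores uncertain arms and exploits good ones automatically.
public final class ThompsonBandit {
    private final int[] successes;
    private final int[] failures;

    public ThompsonBandit(int arms) {
        successes = new int[arms];
        failures = new int[arms];
    }

    // Sample a plausible conversion rate for every arm; play the best draw.
    public int chooseArm() {
        int best = 0;
        double bestDraw = -1.0;
        for (int arm = 0; arm < successes.length; arm++) {
            double draw =
                new BetaDistribution(successes[arm] + 1, failures[arm] + 1).sample();
            if (draw > bestDraw) {
                bestDraw = draw;
                best = arm;
            }
        }
        return best;
    }

    // Fold each observed outcome back into the chosen arm's posterior,
    // which is what lets the system learn in real time.
    public void recordOutcome(int arm, boolean converted) {
        if (converted) successes[arm]++;
        else failures[arm]++;
    }
}

A contextual system additionally conditions each arm's estimate on member features, but the explore/exploit mechanics are the same.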
