Source: Deep Learning on Medium
A company needs to generate business success (and, just to sound more human, although it doesn’t need to, I do hope it also creates “real” value for humanity). Simply put: to generate business success you need a very good product and strong sales/BizDev/marketing efforts. To have a very good product you need two things: first, to make very good product decisions; second, to implement those decisions successfully, both technologically and operationally. And you need to do both very fast, because the only way to make good product decisions is to bring them to market quickly, validate them, refine them, and iterate. Many people think this is only true for startups, but in my opinion it is also true for the largest companies, because large companies have many products, and many of those products are in their startup phase, which means the laws of startups will affect their success.
Over the last few years I’ve had the chance to speak with many data scientists and machine learning researchers at other companies. A frequently recurring motif is exemplified in the following (true, but anonymized) story:
Dan was a data scientist at his company, in charge of an algorithm that detects hot dogs in images. He was given a relatively tight deadline (4 months) to bring the best-performing algorithm he could to the market. He decided to read many academic papers in the field of deep learning and implement them. Read and implement. Read and implement. After speaking with him for a few minutes, I asked if he had considered other alternatives that worked very well for our team. For example: improving the data annotation GUI (graphical user interface) so that it becomes much easier for the annotators to use, letting them provide 10x more data in the same amount of time. Dan agreed that this was actually possible (he had ideas for significantly improving the annotation GUI), and agreed that in his case, increasing the amount of data 10x was probably even more promising under such a deadline than reading and implementing more papers (an activity that takes a lot of time per paper, and improves performance in only a relatively small fraction of cases).
I asked him: if you agree, why aren’t you doing this (improving the GUI to accelerate annotation 10x)? And he told me: “What do you mean? I’m a data SCIENTIST [or in other cases: I’m a RESEARCHER]. I’m not a software engineer. I didn’t come here to work on GUIs.”
I heard similar things in other cases. I didn’t come here to:
- Prepare demos for customers
- Refactor the code to be more readable, modular, or reliable
- Write good code in the first place
- Write tests
- Run tests before merging
- Implement papers from 2 years ago, even if they’re twice as simple and good enough
- Do thorough code reviews
- Develop big-data pipelines
- Operate the data pipelines to get the data I need
From here I’ll paraphrase to shorten. He went on and told me: ‘I completely understand there’s no one else to develop that GUI, and that my role definition and skill set make me the closest to being able to do it in the relevant timeframe. I really want the product to succeed, but I just didn’t come here to do this; I came here to do research. I’ll do everything possible to make the product succeed, as long as it’s research.’ To create a successful product in A.I. you have to do research, and you have to do science. But for the other 50–80% of the time, you have to do many other things, without which you just can’t ship a product that users easily use and love, that has demos that convince potential customers to use it, that passes regulatory approval, and so on.
This is ownership: feeling end-to-end responsibility for the task. Never saying “I came here to do X, so my responsibility starts here and ends here.” Ownership is doing whatever it takes for the task to succeed. If there’s someone better suited for a specific sub-task who is also available, they should probably do it; otherwise, I’ll do it.
(By the way, don’t get me wrong: A.I. people are not the only ones who lack ownership. Many (pure) software engineers also demonstrate it. For example, I have interviewed several software engineers who were excited only by projects that contained the trendiest “technologies” [their analogue of papers]. They weren’t passionate about, and even complained about, other projects that were “just writing code,” no matter how challenging the project was.)
Ownership is a core value in our team, and we recruit by it. We look for people who can’t see themselves thriving without a full sense of ownership. People who wake up smiling in the morning because they’re going to do whatever it takes to win, and it’s really thanks to them that we will win. We’ll reject the brightest and most experienced deep learning experts on this value. And before we do that, we’ll make every effort to put our set of values and expectations on the table as transparently as possible. I send a document that lists our values (together with links to blog posts like this one that explain them) to all our candidates before the first interview. I see it as a success if a candidate, as brilliant as they may be, rejects us because they didn’t connect with our values.
I’m relatively confident going public with these ideas now, because I’m doing it AFTER we’ve proven that it’s possible to build a team of top A.I. experts who all have sheer excitement for ownership.
Over time, the 50% of non-research work grew to 80%, and at that point we all agreed things were unbalanced and decided to create a new role: A.I. Infrastructure Engineers. These are expert software engineers whose primary responsibility is the software engineering of the team’s infrastructure (big data, production, research).
Why did this happen? Is there a limit to ownership? First of all, in the first years of our company we saw, without any doubt, that from a business perspective the infrastructure is at least as important as the algorithmic work. Once the algorithms pass a certain threshold of maturity, customers love your product and you want to focus on scaling things up. We want to excel at that at least as much as we want to excel on the algorithmic side, and the way to excel at something is to bring in people who see it as the main goal of their career, who will do everything to be the best at it. Even the smartest deep learning expert has a limit to how much she can learn: even with the best intentions, it’s very hard to strive to excel at both deep learning algorithms and software engineering (although given enough time I’m pretty sure it’s not impossible, and personally that’s what I’m aiming for in the long run).
The second reason we decided to hire A.I. infrastructure engineers is that it was clear we were heading towards 99% infrastructure work, and that anyone who came to be an algorithm engineer won’t stay happy over time without doing any deep learning. So of course there is a balance. I don’t see that as a lack of ownership. There’s an ocean of difference between a) people who will be SATISFIED doing everything that is important, as long as it makes the company successful and they get a fair amount of algorithmic work and major career advancement over time; b) people who will do what you tell them, but without satisfaction, complaining a lot about the glass being half empty; and c) people who will actively avoid doing anything that’s not research.
Does this mean that every month that an algorithm engineer works 80–100% on software engineering is a failure? Definitely not.
Looking back at the first years of our company, these things really depend on the period in the company’s life. There were periods in which the most important thing was to do research 95% of the time, and writing tests was a waste of time. There were periods when the most important thing was to scale up the existing algorithms without improving accuracy by a single percent. And there were periods when one project was in one phase and a second project was in a different phase. Though it is our role as managers to plan for the future, it is very hard to accurately predict future personnel needs. Even now, when we hire infrastructure engineers, there will be times when the algorithm engineers will be required to do 100% software engineering (and believe me, they’ll love every minute of it and beg me for more, because to tell the truth, software engineering done right is not one bit less interesting, challenging, or satisfying than deep learning algorithms), and there will be times when the infrastructure software engineers will be required to work on the algorithms.
Just recently, one of the algorithm engineers on our team spent a month developing a big-data pipeline that enabled us to perform medical research on hundreds of thousands of CT scans from certain hospitals. This research enabled us to bring very impressive clinical evidence of the medical value our product provides. The job had nothing to do with research on new deep learning algorithms, but its results were so convincing that it had everything to do with surpassing our 2018 sales goals by a huge margin.
Hiring dedicated A.I. Infrastructure Engineers didn’t come from a limit to ownership, but rather from a commitment to excellence.
The A.I. infrastructure engineers are a core and central part of the A.I. team (not a separate team, and not the “code monkeys of the researchers” that I’ve heard a few other companies have). Infrastructure engineers take deep learning courses in their training, and algorithm engineers take thorough courses on writing “clean code” and software tests. The “deep snips” (weekly professional seminars held by the team, which I described in a previous post) cover both worlds. This common language and cooperation creates a very strong team.
On top of that, there is one additional thing I learned from letting our algorithm engineers do pure software engineering (and sometimes even non-software) work. When you let bright people work on anything, they come up with ingenious ideas for any type of tough business problem. When you let the strongest physicist of a Talpiot cohort* and the strongest mathematician of a Psagot cohort* work on these things, and they do it with all their passion, they find ingenious ways to make your processes more efficient and to make the company go ten times faster. These ideas don’t end up running on a GPU, and sometimes not even in Python code, but rather in a Google Doc describing a new process; they are no less ingenious or groundbreaking.
And for all the people who think the success of an A.I. startup is a Cinderella story of a genius PhD who comes up with an algorithm 10x better than everyone else’s: that is only a very small part of the success. Even with the best algorithms, you will rarely succeed, rarely deliver, rarely sell, and will not scale up fast enough if you don’t find a million small, ingenious tricks that make you move faster and smarter than everyone else.
My personal belief is that the companies that do whatever it takes to succeed will also survive long enough to build algorithms that are not 10 but 100 times better.
To summarize: I know, and hope, that there are already many teams like ours out there. But I also know that many aren’t. There are probably cases in which it’s better not to be like us, but if you feel that’s not the case for you, I hope you connected with this post. If you’re a team leader or a (positively leading) team member, my advice is to try nurturing this type of culture and looking for this kind of person: people who do whatever it takes for the task to succeed. People who say about non-algorithmic tasks: “If there’s someone better suited for a specific task who is also available to do it in time, they should probably do it; otherwise, I’ll do it (and if this repeats, I’ll recommend that my managers recruit more software engineers).”
Don’t let the loud minority that tries to set the tone for the entire A.I. community fool you. It’s more than possible to find people who will be the best A.I. researchers and prefer to work in this type of culture. It’s more than possible to find people who will be the best A.I. researchers and will also be very passionate about software engineering, about writing tests, about running tests, and about optimizing processes for regulatory approvals.
(*) Talpiot and Psagot are elite academic training programs.