- Test-Driven Machine Learning
- Justin Bozonier
- 2021-07-30 10:20:00
The TDD cycle
The TDD cycle starts with writing a small test function that attempts to do something we haven't programmed yet. These small test methods have three main sections: the first is where we set up our objects or test data; the second is where we invoke the code that we're testing; and the last is where we validate that what happened is what we thought would happen. You will write all sorts of lazy code to get your tests to pass. If you are doing it right, then someone who is watching you should be appalled at your laziness and tiny steps. After the test goes green, you have an opportunity to refactor your code to your heart's content. In this context, "refactor" refers to changing how your code is written, but not changing how it behaves.
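The three sections of a test method can be sketched as follows. This is only an illustration: the `mean` function and the test name are hypothetical stand-ins, not code from this book.

```python
def mean(values):
    """Hypothetical function under test: arithmetic mean of a list."""
    return sum(values) / len(values)


def test_mean_of_three_numbers():
    # Section 1: set up the test data.
    values = [1.0, 2.0, 3.0]

    # Section 2: invoke the code that we're testing.
    result = mean(values)

    # Section 3: validate that what happened is what we expected.
    assert result == 2.0


# Run the test directly; a test runner would normally discover it for us.
test_mean_of_three_numbers()
```

Keeping each section to a line or two makes it easy to see what a test checks when it fails.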
Let's examine more deeply the three steps of TDD: Red, Green, and Refactor.
Red
First, create a failing test. Of course, this implies that you know what failure looks like in order to write the test. At the highest level in machine learning, this might be a baseline test, where the baseline is a "better than random" test. It might even be "predicts random things" or, even simpler, "always predicts the same thing". Is this terrible? Perhaps it is to those who are enamored with the elegance and artistic beauty of their code. Is it a good place to start, though? Absolutely. A common issue that I have seen in machine learning is spending so much time up front implementing The One True Algorithm that hardly anything ever gets done. Outperforming pure randomness, though, is a useful change that can start making your business money as soon as it's deployed.
Green
After you have established a failing test, you can start working to get it green. If you start with a very high-level test, you may find that it helps to conceptually break that test up into multiple failing tests that are lower-level concerns. I'll dive deeper into this later on in this chapter, but for now, just know that if you want to get your test passing as soon as possible, lie, cheat, and steal to get there. I promise that cheating actually makes your software's test suite that much stronger. Resist the urge to write the software in an ideal fashion. Just slap something together. You will be able to fix the issues in the next step.
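To make the "lie, cheat, and steal" advice concrete, here is a minimal sketch. It assumes a test that only requires a classifier to always predict the same thing. The `ConstantClassifier` name and its `predict` method are hypothetical, chosen for illustration.

```python
class ConstantClassifier:
    """Hypothetical classifier: the laziest thing that could possibly pass."""

    def predict(self, observation):
        # Hard-coded on purpose; the input is ignored entirely.
        return 1


def test_classifier_always_predicts_the_same_thing():
    classifier = ConstantClassifier()
    observations = [[0.1], [5.0], [-3.2]]
    predictions = [classifier.predict(x) for x in observations]
    assert len(set(predictions)) == 1


# The test now passes, even though the implementation is a blatant cheat.
test_classifier_always_predicts_the_same_thing()
```

The hard-coded return value is enough for this test. A later test that demands better-than-random predictions would force a real implementation.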
Refactor
You got your test to pass through all manner of hackery. Now you get to refactor your code. Note that the word "refactor" is not to be interpreted loosely. It specifically means changing your software without affecting its behavior. If you add if clauses, or any other special handling, you are no longer refactoring; you are writing software without tests. One sure sign that you are no longer refactoring is that you've broken previously passing tests. If this happens, back up your changes until the tests pass again. It may not be obvious, but passing tests alone aren't enough to know that you haven't changed behavior. Read Refactoring: Improving the Design of Existing Code by Martin Fowler to understand how much care refactoring really deserves. As he illustrates in that book, refactoring code becomes a set of forms and movements, not unlike karate katas.
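A small sketch of a refactoring step, under the assumption that we have a hypothetical `accuracy` helper (not from this book) that is already covered by tests. The two versions are written differently, and the same checks are run against both.

```python
def accuracy_before(predictions, labels):
    """Hypothetical helper as first hacked together to get to green."""
    correct = 0
    for i in range(len(predictions)):
        if predictions[i] == labels[i]:
            correct = correct + 1
    return correct / len(predictions)


def accuracy_after(predictions, labels):
    """The same helper after refactoring: rewritten, no new special cases."""
    matches = sum(p == y for p, y in zip(predictions, labels))
    return matches / len(predictions)


def check(accuracy):
    # The existing tests, applied unchanged to whichever version we pass in.
    assert accuracy([1, 0, 1, 1], [1, 1, 1, 0]) == 0.5
    assert accuracy([1, 1], [1, 1]) == 1.0


check(accuracy_before)
check(accuracy_after)
```

Both versions pass, but that does not prove they behave identically. If `labels` were shorter than `predictions`, the first version would raise an `IndexError`, while the second would silently compare only the overlapping items. This is the kind of unnoticed behavior change that passing tests can hide.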
This is a lot of general theory, but what does a test actually look like? How does this process flow in a real problem?