- Test-Driven Machine Learning
- Justin Bozonier
- 563 words
- 2021-07-30 10:19:59
Test-driven development
Kent Beck wrote in his seminal book on the topic that TDD consists of only two specific rules, which are as follows:
- Don't write a line of new code unless you first have a failing automated test
- Eliminate duplication
This, as he notes fairly quickly, leads us to a mantra, really the mantra of TDD: "Red, Green, Refactor."
If this is a bit abstract, let me restate that TDD is a software development process that enables a programmer to write code that specifies the intended behavior before writing any software to actually implement the behavior. The key value of TDD is that at each step of the way, you have working software as well as an itemized set of specifications.
TDD is a software development process that requires the following:
- The writing of code to detect the intended behavioral change.
- A rapid iteration cycle that produces working software after each iteration.
- Clear definitions of what a bug is. If a test is not failing but a bug is found, it is not a bug. It is a new feature.
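A minimal sketch of one Red, Green, Refactor cycle may make these requirements concrete. The `mean` function and the `MeanTest` class are hypothetical names chosen for illustration, and the standard-library `unittest` module is an assumption about tooling rather than anything prescribed here.

```python
import unittest


def mean(values):
    """Hypothetical function under test: arithmetic mean of a non-empty list."""
    # Green: the simplest implementation that makes the test below pass.
    # It was written only after the test had been seen to fail.
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    # Red: this test is written first. Before mean() exists, running it
    # fails with a NameError, which is the failing test that justifies
    # writing new code.
    def test_mean_of_known_values(self):
        self.assertEqual(mean([1, 2, 3, 4]), 2.5)


def run_tests():
    """Run the suite and return the result object so callers can inspect it."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` after each small change gives the rapid iteration cycle described above: the test either passes, and you have working software, or it fails and points at the behavior that changed. The Refactor step would then remove any duplication while the test stays green.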
Another point that Kent makes is that ultimately, this technique is meant to reduce fear in the development process. Each test is a checkpoint along the way to your goal. If you stray too far from the path and wind up in trouble, you can simply delete any tests that shouldn't apply, and then work your code back to a state where the rest of your tests pass. There's a lot of trial and error inherent in TDD, but the same applies to machine learning.
As a result, this whole process changes how we think about design. Software designed using TDD also tends to be modular enough that different components can be swapped in and out of your pipeline. We will see more of this in the later chapters of this book.
You might think that simply thinking through test cases is equivalent to TDD. It is not. If you are like most people, what you write differs from what you would say aloud, and differs even more from what you think. Writing down the intent of our code before we write the code itself applies a pressure to the software design that prevents us from writing "just in case" code, by which I mean code that we write only because we aren't sure whether there will be a problem. Using TDD, we think of a test case, prove that it isn't currently supported, and then fix it. If we can't think of a test case, we don't add code.
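The "think of a test case, prove it isn't supported, then fix it" loop can be sketched as follows. Everything here is hypothetical: `mean_before` stands for the code as it exists today, `mean_after` for the code after the fix, and `empty_input_is_rejected` for the new test case, which assumes that an empty list should raise a `ValueError`.

```python
def mean_before(values):
    """Existing code: no special handling for empty input."""
    return sum(values) / len(values)


def mean_after(values):
    """Code after the fix, adding only what the new test case demands."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


def empty_input_is_rejected(mean_fn):
    """New test case: an empty list should raise ValueError.

    Returns True if mean_fn behaves as specified, False otherwise.
    """
    try:
        mean_fn([])
    except ValueError:
        return True
    except ZeroDivisionError:
        # The unfixed code lands here: it fails, but not in the way
        # the specification asks for.
        return False
    return False
```

Running the test case against `mean_before` first shows that the behavior really is missing, so the guard clause in `mean_after` is not "just in case" code; it exists because a failing test asked for it.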
TDD can and does operate at many different levels of the software under development. Tests can be written against functions and methods, entire classes, programs, web services, neural networks, random forests, and whole machine learning pipelines. At each level, the tests are written from the perspective of the prospective client. How does this relate to machine learning? Let's take a step back and reframe what I just said.
In the context of machine learning, tests can be written against functions, methods, classes, mathematical implementations, and entire machine learning algorithms. TDD can even be used to explore techniques and methods in a very directed and focused manner, much like you might use a REPL (an interactive shell where you can try out snippets of code) or an interactive Python (IPython) session.
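As a small illustration of testing a mathematical implementation from the client's perspective, the sketch below checks that a least-squares line fit recovers a line it should know exactly. The `fit_line` helper and the test name are hypothetical, and the choice of simple linear regression on noiseless data is an assumption made to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Hypothetical helper; assumes xs contains at least two distinct values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def test_recovers_known_line():
    # The client's expectation: data generated from y = 2x + 1 with no
    # noise should yield a slope of 2 and an intercept of 1.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [2.0 * x + 1.0 for x in xs]
    slope, intercept = fit_line(xs, ys)
    assert abs(slope - 2.0) < 1e-9
    assert abs(intercept - 1.0) < 1e-9
```

The test says nothing about how the fit is computed, only what a caller can rely on, so the implementation could later be swapped for a library routine without touching the test.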