Test-driven development

Kent Beck wrote in his seminal book on the topic that TDD consists of only two specific rules, which are as follows:

  • Don't write a line of new code unless you first have a failing automated test
  • Eliminate duplication

This, as he notes fairly quickly, leads us to a mantra, really the mantra of TDD: "Red, Green, Refactor."
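As a minimal sketch of one pass through that cycle, consider the following. The `mean` function and its test are hypothetical names chosen for illustration, not taken from Beck's book. The test is written first (red), the simplest passing implementation follows (green), and the duplication is then cleaned up while the test stays green (refactor).

```python
# Red: the test is written first. Calling it at this point would raise
# NameError, because `mean` (a hypothetical function) does not exist yet.
def test_mean():
    assert mean([1, 2, 3]) == 2
    assert mean([10]) == 10


# Green: a first, deliberately plain implementation that makes the test pass.
def mean(values):
    total = 0
    for value in values:
        total += value
    return total / len(values)


test_mean()  # the test now passes


# Refactor: replace the hand-written loop with the built-in sum().
# The behavior is unchanged, so the same test must still pass.
def mean(values):
    return sum(values) / len(values)


test_mean()  # still passes after the refactor
```

The point of the sketch is the ordering: the test exists before the code, and the refactor is only trusted because the same test is run again afterward.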

If this is a bit abstract, let me restate that TDD is a software development process that enables a programmer to write code that specifies the intended behavior before writing any software to actually implement the behavior. The key value of TDD is that at each step of the way, you have working software as well as an itemized set of specifications.

TDD is a software development process that requires the following:

  • The writing of code to detect the intended behavioral change.
  • A rapid iteration cycle that produces working software after each iteration.
  • Clear definitions of what a bug is. If unexpected behavior is found while no test is failing, it is not a bug. It is a new feature, to be specified with a new test.
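The last requirement can be sketched as follows, under stated assumptions: `normalize` is a hypothetical helper, and returning zeros for constant input is a design choice made up for this example. Suppose someone notices that constant input causes a division by zero while every existing test passes. Under the definition above, that is unspecified behavior, so the first step is a new failing test, and only then a code change.

```python
def normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # This branch was added only after test_normalize_constant_input
        # had been written and seen to fail with ZeroDivisionError.
        return [0.0 for _ in values]
    return [(value - low) / span for value in values]


# The original specification: this test existed (and passed) all along.
def test_normalize_range():
    assert normalize([0, 5, 10]) == [0.0, 0.5, 1.0]


# The "new feature": behavior for constant input, specified as a test
# before the branch above was written. The expected result is an assumption.
def test_normalize_constant_input():
    assert normalize([5, 5]) == [0.0, 0.0]


test_normalize_range()
test_normalize_constant_input()
```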

Another point that Kent makes is that ultimately, this technique is meant to reduce fear in the development process. Each test is a checkpoint along the way to your goal. If you stray too far from the path and wind up in trouble, you can simply delete any tests that shouldn't apply, and then work your code back to a state where the rest of your tests pass. There's a lot of trial and error inherent in TDD, but the same applies to machine learning.

As a result, this whole process changes the way we think. The software that you design using TDD will also be modular enough to have different components swapped in and out of your pipeline. We will see more of this in the later chapters of this book.

You might be thinking that just thinking through test cases is equivalent to TDD. It is not. If you are like most people, what you write is different from what you might verbally say, and very different from what you think. Writing down the intent of our code before we write the code itself applies a pressure to the software design that prevents us from writing "just in case" code. By this I mean the code that we write just because we aren't sure if there will be a problem. Using TDD, we think of a test case, prove that it isn't supported currently, and then fix it. If we can't think of a test case, we don't add code.

TDD can and does operate at many different levels of the software under development. Tests can be written against functions and methods, entire classes, programs, web services, neural networks, random forests, and whole machine learning pipelines. At each level, the tests are written from the perspective of the prospective client. How does this relate to machine learning? Let's take a step back and reframe what I just said.

In the context of machine learning, tests can be written against functions, methods, classes, mathematical implementations, and the machine learning algorithms themselves. TDD can even be used to explore techniques and methods in a very directed and focused manner, much like you might use a REPL (an interactive shell where you can try out snippets of code) or an interactive Python (IPython) session.
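To make this concrete, here is a small sketch of a test written against a mathematical implementation from the client's perspective. The `fit_line` function is a hypothetical ordinary least squares fit written for this example, not code from a particular library. The test states what a client would expect: data generated from a known line should give back that line's slope and intercept.

```python
import math


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def test_fit_line_recovers_known_line():
    # Noise-free data from y = 2x + 1; the fit should recover both parameters.
    xs = [0, 1, 2, 3]
    ys = [2 * x + 1 for x in xs]
    slope, intercept = fit_line(xs, ys)
    # Compare with a tolerance, since the arithmetic is floating point.
    assert math.isclose(slope, 2.0)
    assert math.isclose(intercept, 1.0)


test_fit_line_recovers_known_line()
```

The test says nothing about how the fit is computed, only what the caller should observe, so the implementation could later be swapped for a different one without touching the test.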
