Title: Learn Kotlin Programming (Second Edition)
Authors: Stephen Samuel, Stefan Bocutiu
Last updated: 2021-06-24 14:13:34

Polymorphism

After encapsulation and inheritance, polymorphism is seen as the third pillar of object-oriented programming. It decouples the what from the how at the type level. Polymorphism improves code organization and readability, and it lets you extend your programs later, when new features need to be implemented.

The word polymorphism originates from Ancient Greek: polys (πολύς), meaning many or much, and morphē (μορφή), meaning form or shape. There are multiple forms of polymorphism, but in this chapter we are going to talk about the one known as late binding (also called dynamic binding or runtime binding).

The power of polymorphism shows at runtime, when objects of a derived class are treated as objects of the base class. This can happen for a method parameter, or when storing a group of related elements in a collection or array. The peculiar thing here is that the object's declared type need not be identical to its actual runtime type when the code is executed. This sounds like some magic is happening under the hood, but all of it happens through the use of virtual methods. Base classes may define and implement virtual methods, and derived classes can override them, thus providing their own implementation. This way, two distinct types behave differently when the same method is called. When a virtual method is called as your program executes, the JVM looks up the runtime type of the instance and works out which method it should actually invoke. Later in the chapter, we will discuss in a bit more detail how this is implemented under the bonnet.
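As a small illustration of a declared type differing from the runtime type, consider the following sketch. The Instrument, Guitar, and Drum classes and the perform function are hypothetical names made up for this example. Note that Kotlin members are final by default, so a method has to be marked open before it can be overridden.

```kotlin
// Hypothetical types for illustration; they are not part of the chapter's code.
open class Instrument {
    // `open` makes the method overridable, that is, virtual in the sense used above.
    open fun play(): String = "some sound"
}

class Guitar : Instrument() {
    override fun play(): String = "strum"
}

class Drum : Instrument() {
    override fun play(): String = "boom"
}

// The parameter's declared type is Instrument; the runtime type may be any subclass.
fun perform(instrument: Instrument): String = instrument.play()

fun main() {
    val band: List<Instrument> = listOf(Guitar(), Drum(), Instrument())
    for (member in band) {
        // The implementation that runs is chosen from the runtime type of each element.
        println("${member.javaClass.simpleName}: ${perform(member)}")
    }
}
```

The call site in perform never mentions Guitar or Drum, yet each element answers with its own implementation.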

Virtual methods unify how we work with a group of related types. Imagine you are working on the next big drawing application and it must support the rendering of a variety of different shapes on the screen. The program has to keep track of all the shapes the user will create and react to their input—changing the location on the screen, changing the properties (border color, size, or background color, you name it!), and so on.

When you compile the code, you can't know in advance all the types of shape you will support; the last thing you want to do is handle each one individually. This is where polymorphism comes in. You want to treat all your graphical instances as a shape.

Imagine the user clicking on the canvas: your code needs to work out whether the mouse location is within the boundaries of one of the shapes drawn. What you should avoid is walking through all the shapes and calling a different method on each one to do the hit check, such as isWithinCircle for a circle shape, checkIsHit for a rhombus shape, and so on.
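To make the problem concrete, here is a rough sketch of the approach to avoid. The Circle and Rhombus classes, their constructor parameters, and the findHit function are assumptions invented for this illustration; only the method names isWithinCircle and checkIsHit come from the text.

```kotlin
import kotlin.math.abs

// Hypothetical, unrelated shape classes, each with its own hit-test method name.
class Circle(val cx: Int, val cy: Int, val radius: Int) {
    fun isWithinCircle(x: Int, y: Int): Boolean {
        val dx = x - cx
        val dy = y - cy
        return dx * dx + dy * dy <= radius * radius
    }
}

class Rhombus(val cx: Int, val cy: Int, val halfWidth: Int, val halfHeight: Int) {
    fun checkIsHit(x: Int, y: Int): Boolean {
        // The point is inside when |dx| / halfWidth + |dy| / halfHeight <= 1.
        val dx = abs(x - cx).toDouble() / halfWidth
        val dy = abs(y - cy).toDouble() / halfHeight
        return dx + dy <= 1.0
    }
}

// Without a common base type, the caller has to know every concrete class.
fun findHit(shapes: List<Any>, x: Int, y: Int): Any? =
    shapes.firstOrNull { shape ->
        when (shape) {
            is Circle -> shape.isWithinCircle(x, y)
            is Rhombus -> shape.checkIsHit(x, y)
            else -> false // every new shape type needs another branch here
        }
    }

fun main() {
    val shapes = listOf(Circle(0, 0, 5), Rhombus(20, 20, 4, 4))
    println(findHit(shapes, 21, 21)?.javaClass?.simpleName)
}
```

Every new kind of shape forces a change to findHit, which is exactly the coupling a shared base type removes.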

Let's have a look at how you could implement this using a textbook approach. First, we will define a Shape class. This needs to be an abstract class, and you shouldn't be able to create an instance of it. After all, how could it be drawn on the screen when it hasn't been specialized? Let's look at the code:

    abstract class Shape protected constructor() { 
      // Top-left corner of the shape's bounding box, in canvas coordinates.
      // Plain properties with defaults are used here: an accessor written as
      // get() = this.XLocation would call itself and overflow the stack.
      var XLocation: Int = 0 
      var YLocation: Int = 0 
      // Size of the bounding box.
      var Width: Double = 0.0 
      var Height: Double = 0.0 
      // Returns true when the point (x, y) falls inside the shape.
      abstract fun isHit(x: Int, y: Int): Boolean 
    }

With this in place, we are going to implement two shapes: an ellipse (the Ellipsis class in the code) and a rectangle. A question for you: does it make sense to implement a square type? Think about this. For now, let's implement the two shapes we just discussed:

    class Ellipsis : Shape() { 
      override fun isHit(x: Int, y: Int): Boolean { 
        val xRadius = Width / 2 
        val yRadius = Height / 2 
        // A degenerate ellipse has no interior, so nothing can hit it.
        if (xRadius == 0.0 || yRadius == 0.0) 
          return false 
        val centerX = XLocation + xRadius 
        val centerY = YLocation + yRadius 
        // Translate the point so that the ellipse is centered on the origin.
        val normalizedX = x - centerX 
        val normalizedY = y - centerY 
        // Standard ellipse test: (x/a)^2 + (y/b)^2 <= 1.
        return (normalizedX * normalizedX) / (xRadius * xRadius) + 
                    (normalizedY * normalizedY) / (yRadius * yRadius) <= 1.0 
      } 
    } 

    class Rectangle : Shape() { 
      override fun isHit(x: Int, y: Int): Boolean { 
        return x >= XLocation && x <= (XLocation + Width) && y >= YLocation && y <= (YLocation + Height) 
      } 
    }

We consider that the top-left corner of the canvas is at point (0,0). Given these types, we will create a few instances of them and see how polymorphism works. We will create two ellipses and one rectangle. We will then store the instances in a collection, and then for a given point we will work out whether it is within any of the given shapes:

    fun main(args: Array<String>) { 
      // Width and Height are Double properties, so they need Double literals.
      val e1 = Ellipsis() 
      e1.Height = 10.0 
      e1.Width = 12.0 
      val e2 = Ellipsis() 
      e2.XLocation = 100 
      e2.YLocation = 96 
      e2.Height = 21.0 
      e2.Width = 19.0 
      val r1 = Rectangle() 
      r1.XLocation = 49 
      r1.YLocation = 45 
      r1.Width = 10.0 
      r1.Height = 10.0 
      val shapes = listOf<Shape>(e1, e2, r1) 
      // isHit is dispatched on the runtime type of each element.
      val selected: Shape? = shapes.firstOrNull { shape -> shape.isHit(50, 52)} 
      if (selected == null) { 
        println("There is no shape at point(50,52)") 
      } else{ 
        println("A shape of type ${selected.javaClass.simpleName} has been selected.") 
      } 
    }

Running the code prints a message saying that a shape of type Rectangle was found at the given coordinates. Use javap to look at the generated bytecode; it should contain a line similar to the following (most of the output is left out for simplicity):

    169: invokevirtual #69           // Method com/programming/kotlin/chapter03/Shape.isHit:(II)Z

So, at the bytecode level, there is an instruction named invokevirtual for calling a virtual function. Because of it, the code in Rectangle or Ellipsis gets invoked. But how does the runtime know which implementation to invoke, when the call was made through a reference of type Shape?

Dynamic method resolution is handled through the vtable (virtual table) mechanism. The details depend on the JVM implementation, but implementations share the same logical design.

When an object instance is created, its memory is allocated on the heap. The block is slightly bigger than the sum of all its fields, including those inherited from every base class all the way up to the Any class, because the runtime adds extra space at the top of the block to hold a reference to the type descriptor. There is one type descriptor object allocated at runtime for each class you define. The reference is always the first entry in the block, which guarantees its location and avoids having to compute it at runtime. The type descriptor holds the list of methods the type defines, along with other related information. This list starts with the methods of the top class in the hierarchy and goes all the way down to the actual type the instance belongs to.

The order is deterministic, which is another optimization. This list is known as the vtable, and it is nothing more than an array in which each element references the native code implementation to execute. During program execution, the JIT (just-in-time) compiler is responsible for translating the bytecode produced by your compiler into native code. If a derived class overrides a virtual method, its vtable entry points to the new implementation rather than to the one provided by the nearest base class.
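The lookup described above can be modeled in a few lines of Kotlin. This is only a conceptual sketch: a real JVM builds these tables internally and they are not visible from Kotlin code. TypeDescriptor, Instance, invokeVirtual, and the slot constants are all hypothetical names chosen for this illustration.

```kotlin
// A conceptual model only; real JVMs keep these structures in native memory.
typealias Method = () -> String

// Slot numbers are fixed for the whole hierarchy, which is what makes the lookup cheap.
const val SLOT_TO_STRING = 0
const val SLOT_DRAW = 1

class TypeDescriptor(val name: String, val vtable: List<Method>)

// The base type fills every slot with its own implementation.
val baseType = TypeDescriptor(
    name = "Base",
    vtable = listOf({ "Base.toString" }, { "Base.draw" })
)

// The derived type copies the parent's table and replaces only the overridden slot.
val derivedType = TypeDescriptor(
    name = "Derived",
    vtable = baseType.vtable.toMutableList().also { it[SLOT_DRAW] = { "Derived.draw" } }
)

// Every instance carries a reference to its type descriptor.
class Instance(val type: TypeDescriptor)

// A virtual call: object -> type descriptor -> vtable slot -> code.
fun invokeVirtual(receiver: Instance, slot: Int): String = receiver.type.vtable[slot]()

fun main() {
    for (obj in listOf(Instance(baseType), Instance(derivedType))) {
        println("${obj.type.name}: ${invokeVirtual(obj, SLOT_DRAW)}, ${invokeVirtual(obj, SLOT_TO_STRING)}")
    }
}
```

Because the slot index for draw is the same in both tables, the caller does not need to know which type it is dealing with. The slot that was not overridden still points at the base implementation.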

Let's imagine we have an A class defining fieldA; it automatically derives from the Any class. Then we derive a new B class from A and add an extra field to it, named fieldB:

You can see from the preceding diagram that the A class defines a method called execute, which the derived class overrides. Alongside this, B also overrides the toString method defined by Any. This is a very simple example; however, it shapes how the runtime memory allocation will look. Creating an instance of B at runtime should have the following memory footprint:

Your variable of the B type is nothing but a reference to the memory block on the heap. Because the type information sits at the beginning of the block (as already discussed) with two indirections (or pointer dereferencing), the runtime can address it easily and quickly. The diagram is only referencing the vtable entries for the metadata type, for simplicity. I have highlighted the methods based on the class providing the implementation. The first two are defined and implemented by Any, and the next two are defined and implemented in the derived B class.
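In source form, the A and B hierarchy discussed above might look like the following sketch. The names A, B, fieldA, fieldB, and execute come from the text; the field types, return types, and method bodies are assumptions made so that the example runs.

```kotlin
// Every Kotlin class derives from Any implicitly, so A needs no explicit supertype.
open class A {
    var fieldA: Int = 0 // the Int type is an assumption

    // `open` makes execute a virtual method that subclasses may override.
    open fun execute(): String = "A.execute"
}

class B : A() {
    var fieldB: Int = 0 // the Int type is an assumption

    // Takes over the vtable entry that A provided for execute.
    override fun execute(): String = "B.execute"

    // Takes over the toString entry inherited from Any.
    override fun toString(): String = "B(fieldA=$fieldA, fieldB=$fieldB)"
}

fun main() {
    val reference: A = B() // declared type A, runtime type B
    println(reference.execute())
    println(reference)
}
```

Both calls go through a reference of type A, yet the implementations supplied by B are the ones that run.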

If you look at the bytecode generated when invoking the execute method through a reference of type A, you will notice the invokevirtual instruction. It tells the runtime to follow the lookup procedure described earlier to discover which code it has to run.

From what we just discussed, we can work out that a call made with invokevirtual carries some runtime cost. The runtime first has to get the type metadata. From there, it identifies the vtable and then jumps to the start of the native code for the method being invoked. This is in contrast to invokestatic, the bytecode instruction for calling a method non-virtually: such a call does not have to go through those two or more levels of indirection.
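One way to see both kinds of call side by side in Kotlin is to compare a top-level function with a member function. The sketch below uses hypothetical names (area, Box, PaddedBox). Kotlin compiles top-level functions to static methods on a generated class, so calls to them use invokestatic, while a call to an open member function like the one here uses invokevirtual.

```kotlin
// Hypothetical names for illustration.

// A top-level function becomes a static method, so call sites use invokestatic.
fun area(width: Double, height: Double): Double = width * height

open class Box(val width: Double, val height: Double) {
    // A member function is called with invokevirtual.
    open fun area(): Double = width * height
}

class PaddedBox(width: Double, height: Double, val padding: Double) : Box(width, height) {
    override fun area(): Double = (width + 2 * padding) * (height + 2 * padding)
}

fun main() {
    // Static call: the target is fixed at the call site.
    println(area(2.0, 3.0))
    // Virtual call: the target depends on the runtime type of box.
    val box: Box = PaddedBox(2.0, 3.0, 1.0)
    println(box.area())
}
```

Disassembling the compiled main function with javap should show one instruction of each kind, which makes the difference easy to inspect.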

Any method defined by an interface is a virtual method. When such a method is invoked through a reference of the interface type, it gets special treatment: there is a dedicated bytecode instruction for it, invokeinterface. Why can't it just be a simple invokevirtual? Such a call needs more work than the simple process of calling a virtual method. Every invokeinterface receiver is treated as a plain object reference, so, unlike with invokevirtual, no assumption can be made about the vtable's layout. While an invokevirtual call can be fulfilled with two or three levels of indirection, an interface call first needs to check whether the class actually implements the interface and, if so, where its methods are recorded in the implementing class.

There is no simple way to guarantee the same method order in the vtables of two different classes implementing the same interface. Therefore, at runtime, a native code routine has to walk through the list of all the interfaces the class implements, looking for the target. Each entry leads to an itable (interface method table), a list of methods whose layout is always the same for every class implementing the interface. Once the interface is found, the runtime uses its itable to invoke the method as a virtual function. There is a reason for the virtual call: we can have an A class that implements an interface X and a B class derived from A, and B can override one of the methods declared by the interface.
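The X, A, and B arrangement just described can be sketched as follows. The type names come from the text; the method names first and second and the two helper functions are made up for this example.

```kotlin
// X, A, and B follow the naming in the text; the method names are hypothetical.
interface X {
    fun first(): String
    fun second(): String
}

open class A : X {
    override fun first(): String = "A.first"
    // An override in an open class stays open unless marked final, so B may override it again.
    override fun second(): String = "A.second"
}

class B : A() {
    override fun second(): String = "B.second"
}

// The receiver is typed as the interface, so these calls compile to invokeinterface.
fun callThroughInterface(target: X): String = "${target.first()} / ${target.second()}"

// The receiver is typed as the class, so these calls compile to invokevirtual.
fun callThroughClass(target: A): String = "${target.first()} / ${target.second()}"

fun main() {
    println(callThroughInterface(A()))
    println(callThroughInterface(B()))
    println(callThroughClass(B()))
}
```

Whichever instruction the call site uses, an instance of B answers with its own second implementation and inherits first from A.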

As you can see, virtual method calls are expensive. JVM implementations employ quite a few optimizations to short-circuit the call, but these details go beyond the scope of this book. I will let you do your own research if your curiosity is at that level; it is not information you need to know. The rule of thumb is to avoid building a complex class hierarchy with many levels, since that will hurt your program's performance for the reasons presented earlier.
