不朽的情缘免费旋转藤蔓大奖

書名： Learning WebRTC
作者名： Dan Ristic
本章字數： 2466字
更新時間： 2021-07-16 13:53:44

The WebRTC API

The next few sections will cover the WebRTC API currently implemented in the browser. These functions and objects allow developers to communicate with the WebRTC layer and make peer connections to other users. It consists of a few main pieces of technology:

The RTCPeerConnection object
Signaling and negotiation
Session Description Protocol (SDP)
Interactive Connectivity Establishment (ICE)

The RTCPeerConnection object

The RTCPeerConnection object is the main entry point to the WebRTC API. It is what allows us to initialize a connection, connect to peers, and attach media stream information. It handles the creation of a UDP connection with another user. It is time to get familiar with this name because you will be seeing it a lot throughout the rest of the book.

The job of the RTCPeerConnection object is to maintain the session and state of a peer connection in the browser. It also handles the setup and creation of a peer connection. It encapsulates all of these things and exposes a set of events that get fired at key points in the connection process. These events give you access to the configuration and internals of what happens during a peer connection:

The RTCPeerConnection object is a simple object in the browser and can be instantiated using the new constructor as follows:

var myConnection = new RTCPeerConnection(configuration);
myConnection.onaddstream = function (stream) {
  // Use stream here
};

The connection accepts a configuration object, which we will cover later in this chapter. In the example, we have also added a handler for the onaddstream event. This is fired when the remote user adds a video or audio stream to their peer connection. We will also cover this later in the chapter.

Signaling and negotiation

Typically, connecting to another browser requires finding where that other browser is located on the Web. This is usually in the form of an IP address and port number, which act as a street address to navigate to your destination. The IP address of your computer or mobile device allows other Internet-enabled devices to send data directly between each other; this is what RTCPeerConnection is built on top of. Once these devices know how to find each other on the Internet, they also need to know how to talk to each other. This means exchanging data about which protocols each device supports as well as video and audio codecs and more.

This means that, in order to connect to another user, you need to know quite a bit about them. One possible solution would be to store a list on your computer of the users that you can connect to. To enable communication with another user, you would simply have to exchange contact information and let WebRTC handle the rest. This has the drawback, however, of your having to manually share information with each user that you want to connect to. You would have to maintain a big list of any users you wanted to connect with and exchange information through some other channel of communication. With WebRTC, we can make this process much more automated.

Luckily, the Web today has solved this problem in most communication applications we use today. To connect with anyone on popular services such as Facebook or LinkedIn, you just need to know their name and search for them. You can then add them to your list of known contacts and access their information at any time. This process is known as signaling and negotiation in WebRTC.

The process of signaling consists of a few steps:

Generate a list of potential candidates for a peer connection.
Either the user or a computer algorithm will select a user to make a connection with.
The signaling layer will notify that user that someone would like to connect with him/her, and he/she can accept or decline.
The first user is notified of the acceptance of the offer to connect.
If accepted, the first user will initiate RTCPeerConnection with the other user.
Both the users will exchange hardware and software information about their computers over the signaling channel.
Both the users will also exchange location information about their computers over the signaling channel.
The connection will either succeed or fail between the users.

This, however, is just an example of how WebRTC signaling may happen. In reality, the WebRTC specification does not contain any standards on how two users are supposed to exchange information. This is due to the ever-growing list of standards on connecting users. Many standards exist today, and even more are being created on the process of signaling and negotiating. The WebRTC standard writers decided that to try and agree on one standard would prevent it from moving forward.

In this book, we are going to create our own implementation of signaling and negotiation. This means writing a simple server that can transfer information between two browsers. Although it will be simple and prone to security flaws, it should give you a good understanding of how this process should work in WebRTC. At the same time, feel free to explore the numerous signaling options presented by many companies today. There are hundreds of signaling and negotiation solutions out there and more popping up every day. Some integrate with the current phone- or chat-based implementations, such as XMPP or SIP, and some come up with an entirely new way of signaling.

Session Description Protocol

To get connected with another user, you need to know a bit about them first. Some of the most important things to know about the other client is what audio and video codecs they support, how their network looks, and how much data their computer can handle. It also needs to be easily transportable between clients. Since we do not specify how this data should be transferred, it should also be capable of being sent over numerous types of transport protocols. This means we need a string-based business card with all the information about a user that we can send to other users. Luckily, this is exactly what SDP provides us with.

The great thing about SDP is that it has been around a long time, dating back to the late 90s for the first initial draft. This means that SDP is a tried-and-true method of establishing media-based connections between clients. It has been used in numerous other types of applications before WebRTC, such as phones and text-based chatting. This also means there are a lot of great resources out there on using and implementing it.

The SDP is a string-based data blob provided by the browser. The format of this string is a set of key-value pairs, all separated by line breaks:

<key>=<value>\n

The key is a single character that establishes the type of value this is. The value is a structured set of text that comprises a machine-readable configuration value. The different key-value pairs are then split by line breaks.

The SDP will cover the description, timing configuration, and media constraints for a given user. The SDP is given by the RTCPeerConnection object during the process of establishing a connection with another user. When we start working with the RTCPeerConnection object later in the chapter, you can easily print this to the JavaScript console. This will allow you to see exactly what is contained in the SDP, which may look something like this:

v=0
o=- 1167826560034916900 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE audio video
a=msid-semantic: WMS K44HTOZVjyAyAlvUVD3pOLu8i0LdytHiWRp1
m=audio 1 RTP/SAVPF 111 103 104 0 8 106 105 13 126
c=IN IP4 0.0.0.0
a=rtcp:1 IN IP4 0.0.0.0
a=ice-ufrag:Vl5FBUBecw/U3EzQ
a=ice-pwd:OtsNG6FzUH8uhNEhOg9/hprb
a=ice-options:google-ice
a=fingerprint:sha-256 FB:56:7D:B6:E0:C7:E7:39:FE:47:5A:12:6C:B4:4E:0E:2D:18:CE:AE:33:92: A9:60:3F:14:E4:D9:AA:0D:BE:0D
a=setup:actpass
a=mid:audio
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=sendrecv
a=rtcp-mux
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:zE+3pkUbJyFG4UmmvPxG/OFC4+QE24X8Zf3iOSCf
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:126 telephone-event/8000
a=maxptime:60
a=ssrc:4274470304 cname:+j4Ma6UfMsCcQCWK
a=ssrc:4274470304 msid:K44HTOZVjyAyAlvUVD3pOLu8i0LdytHiWRp1 a1751f6b-98de-469b-b6c0-81f46e19009d
a=ssrc:4274470304 mslabel:K44HTOZVjyAyAlvUVD3pOLu8i0LdytHiWRp1
a=ssrc:4274470304 label:a1751f6b-98de-469b-b6c0-81f46e19009d
m=video 1 RTP/SAVPF 100 116 117
c=IN IP4 0.0.0.0
a=rtcp:1 IN IP4 0.0.0.0
a=ice-ufrag:Vl5FBUBecw/U3EzQ
a=ice-pwd:OtsNG6FzUH8uhNEhOg9/hprb
a=ice-options:google-ice
a=fingerprint:sha-256 FB:56:7D:B6:E0:C7:E7:39:FE:47:5A:12:6C:B4:4E:0E:2D:18:CE:AE:33:92: A9:60:3F:14:E4:D9:AA:0D:BE:0D
a=setup:actpass
a=mid:video
a=extmap:2 urn:ietf:params:rtp-hdrext:toffset
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send- time
a=sendrecv
a=rtcp-mux
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:zE+3pkUbJyFG4UmmvPxG/OFC4+QE24X8Zf3iOSCf
a=rtpmap:100 VP8/90000
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=rtcp-fb:100 goog-remb
a=rtpmap:116 red/90000
a=rtpmap:117 ulpfec/90000
a=ssrc:3285139021 cname:+j4Ma6UfMsCcQCWK
a=ssrc:3285139021 msid:K44HTOZVjyAyAlvUVD3pOLu8i0LdytHiWRp1 bd02b355-b8af-4b68-b82d-7b9cd03461cf
a=ssrc:3285139021 mslabel:K44HTOZVjyAyAlvUVD3pOLu8i0LdytHiWRp1
a=ssrc:3285139021 label:bd02b355-b8af-4b68-b82d-7b9cd03461cf

This is taken from my own machine during the session initiation process. As you can see, the code that is generated is complex to understand at first glance. It starts off by identifying the connection with the IP address. Then, it sets up basic information about the request such as whether I am requesting audio, video, or both. Next it sets up some audio information, including topics such as encryption type and the ice configuration. It also sets up the video information in the same manner. In the end, the goal is not to understand every line, but to get familiar with what the use of SDP is. You will never have to work with it directly during the course of this book, but may need to at some point in the future.

Overall, the SDP acts as a business card for your computer to other users trying to connect with you. The SDP, combined with signaling and negotiation, is the first half of the peer connection. In the next few sections, we will cover what happens after both users know how to find each other.

Finding a clear route to another user

A big part of most networks today is security. The chances are that any network you are using has several layers of access control, telling your data where and how it can be sent. This means that connecting to another user requires finding a clear path around not just your own network, but the other user's network as well. There are multiple technologies involved to achieve this inside WebRTC:

Session Traversal Utilities for NAT(STUN)
Traversal Using Relays around NAT (TURN)
Interactive Connectivity Establishment (ICE)

These involve a number of servers and connections in order to be used properly by WebRTC. To understand how they work, we should first visualize how the layout of a typical WebRTC connection process looks like:

First off is finding out your IP address. Almost all devices connected to the Internet have an IP address, identifying their location on the Web. This is how you direct your data packets to the correct destination. The issue arises while finding your IP address in a network that is sitting behind a network router. The router hides your computer's IP address and replaces it with another one to increase security and allow multiple computers to use the same network address. Typically, you can have several IP addresses between yourself, your network router, and the public Internet.

Session Traversal Utilities for NAT

STUN is the first step in finding a good connection between two peers. It helps identify each user on the Internet, and is intended to be used by other protocols in making a peer connection. It starts by making a request to a server, enabled with the STUN protocol. The server then identifies the IP address of the client making the request, and returns that to the client. The client can then identify itself with the given IP address.

Using the STUN protocol requires having a STUN-enabled server to connect to. Currently, in Firefox and Chrome, default servers are provided directly from the browser vendors. This is great for getting up-and-running quickly and testing things out.

Tip

Although you may be praising the joys of serverless communication, setting up a good quality WebRTC application actually requires several servers to be enabled. You will need to provide your own set of STUN and TURN servers for your clients to use. There are plenty of great services already providing this today, so be sure to search around to find more information.

Traversal Using Relays around NAT

In some cases, a firewall might be too restrictive and not allow any STUN-based traffic to the other user. This may be the case in an enterprise NAT that utilizes port randomization to allow thousands of more devices than you would typically find. In this case, we need a different method of connecting with another user. The standard for this is called TURN.

The way this works is by adding a relay in between the clients that acts as a peer to peer connection on behalf of the client. The client then gets its information from the TURN server, much like streaming a video from a popular video service by making a request out to the server. This requires the TURN server to download, process, and redirect every packet that gets sent to it for each client. This is why, using TURN is often considered a last resort when making a WebRTC connection as the cost is high for setting up a quality TURN service.

There are many different statistics published on the use of STUN versus TURN, but they all seem to point to the same conclusion—most of the time, your users will be fine without TURN. The use of WebRTC with STUN will work with most network configurations. When setting up your own WebRTC service, it is a good idea to track this information and decide for yourself if the cost of using a TURN service is worth it.

Note

You may notice that none of the examples have configuration values for TURN servers. The book assumes that the network you are on will be compatible with STUN. If you are having trouble connecting, it may be necessary to find a public low-use TURN server and use it while following the examples.

Interactive Connectivity Establishment

Now that we have covered STUN and TURN, we can learn how it is all brought together through another standard called ICE. It is the process that utilizes STUN and TURN to provide a successful route for peer to peer connections. It works by finding a range of addresses available to each user and testing each address in sorted order, until it finds a combination that will work for both the clients.

The process of ICE starts off by making no assumptions about each user's network configuration. It will incrementally go through a set of steps to discover how each client's network is set up. This process will use different sets of technologies to do this. The goal is to discover enough information about each network to make a successful connection.

Each ICE candidate is found through the use of STUN and TURN. It will query the STUN server to find the external IP address and append the location of a TURN server as a backup if the connection fails. Whenever the browser finds a new candidate, it notifies the client application that it needs to send the ICE candidate through the signaling channel. After enough addresses have been found and tested, and a connection is made, the process finally comes to an end.

官术网_书友最值得收藏!

Learning WebRTC