Abstract Real-time communication (RTC) is an emerging standard and industry-wide effort that expands the web browsing model, allowing access to information in areas such as social media, chat, video conferencing, television over the internet, and unified communication. Users of these systems can view, record, comment on, or edit video and audio content flows using time-critical cloud infrastructures that enforce quality of service. However, the many proprietary protocols and codecs available are not easily interoperable and do not scale well to multipoint video-conference systems. WebRTC (Web Real-Time Communication) is a state-of-the-art open technology that makes real-time audio, video, and data transmission possible through web browsers using JavaScript APIs (Application Programming Interfaces) without plug-ins. Furthermore, peer-to-peer (P2P) communication enables more efficient bandwidth usage and resilience against network errors, which matters in E-Learning, where multimedia data must be streamed in real time with the highest possible level of video quality to be easily understood by students. In this paper, we propose a web-based peer-to-peer real-time communication system using the Mozilla Firefox browser together with the Scaledrone service, which enables users to communicate with high-speed data transmission over the communication channel using WebRTC technology, HTML5, and a Node.js server. Moreover, the system can be used for massive open online courses, enabling presenters to hold P2P multipoint video conferences with their audience. The results show that the system is stable, fully functional, safe, and can be used in a practical network to transmit and receive multimedia data in real time between users.
For students, the video conference system supports better collaborative learning: they can share knowledge, exchange ideas about course activities, and then resolve the tasks received from their teachers. Cognitive analysis and understanding of the tasks and lessons from a class increase with the development of social skills.
Keywords: E-Learning, real-time, WebRTC, Node.js server, HTML5, JavaScript API
INTRODUCTION
The open-source WebRTC project enables users to view, record, comment on, or stream video content in order to achieve real-time communication between web browsers. WebRTC is a real-time communication technology that integrates standard APIs (Application Programming Interfaces) with real-time multimedia transfer, such as voice and video (including codecs), and makes them available to a web browser through JavaScript code, without traditional plug-in components [1][3].
Recently, real-time communication services have increasingly been implemented on a new platform: the browser-embedded application, or "web application." Among these technologies, WebRTC has received significant interest because recent versions of common browsers, namely Google Chrome and Mozilla Firefox, support this API natively. WebRTC, which relies on HTML5 web communication, provides the Peer Connection, Media Stream, and Data Channel API components, which can be combined to create direct P2P media communication between peers. The current version of the WebRTC API was designed only to support browser-to-browser communication. WebRTC is not inherently recommended for "multi-browser" communication, especially for conference models that spread the media load over the participating peers/browsers [5].
WebRTC resembles WebSocket, but a WebSocket opens a connection to a server rather than to another peer. In most cases, the two technologies are used together, with WebSocket handling the signaling. In chat applications, for example, WebSocket clients first send messages to the server, and the server forwards them to the recipients.
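The relay pattern described above can be sketched in a few lines of JavaScript. This is a minimal in-memory illustration, not a real WebSocket server: the `SignalingRelay` class and the client names `alice` and `bob` are hypothetical, and the "server" is simply an object that forwards each message to its addressee.

```javascript
// In-memory stand-in for a signaling server (hypothetical; no real sockets).
class SignalingRelay {
  constructor() {
    this.clients = new Map(); // client id -> message handler
  }

  // A client "connects" by registering a handler for incoming messages.
  join(clientId, onMessage) {
    this.clients.set(clientId, onMessage);
  }

  // The server receives a message from one client and forwards it to the
  // recipient; the two clients never talk to each other directly here.
  send(fromId, toId, payload) {
    const deliver = this.clients.get(toId);
    if (!deliver) return false; // recipient is not connected
    deliver({ from: fromId, payload });
    return true;
  }
}

const relay = new SignalingRelay();
const aliceInbox = [];
const bobInbox = [];
relay.join("alice", (msg) => aliceInbox.push(msg));
relay.join("bob", (msg) => bobInbox.push(msg));

// Alice's message reaches Bob only through the relay.
relay.send("alice", "bob", { type: "offer" });
```

In a WebRTC application, the payload would typically be a session description or an ICE candidate, and once those have been exchanged the media itself no longer passes through the server.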
WebRTC promises to provide secure, direct P2P communication between users, free of plug-ins. It offers a simplified, flexible, and cost-effective means of real-time communication without dependence on service providers. A critical challenge with plug-ins such as Flash, Silverlight, and Shockwave is the need for downloads each time a connection is to be established. Plug-ins can also be problematic during execution: they increase bandwidth consumption, latency, and execution time, and they reduce speed [16].
Section I of the paper presents the Related Work, describing the technological architecture of WebRTC and the main protocols used. Section II presents the functionality of a TURN server using the ICE (Interactive Connectivity Establishment) protocol for data transport. Section III illustrates the experimental results of the video conference application designed using Scaledrone, a push messaging service, and the Mozilla Firefox browser. Finally, Section IV presents the conclusions.
I. RELATED WORK
1.1. Architecture of WebRTC
Figure no. 1. shows the architecture of WebRTC [18]. In general, WebRTC consists of three parts:
* The API layer for Web developers,
* The API layer for browser developers,
* The custom service layer for browser developers.
The WebRTC framework (see Figure no. 1.) contains a Voice Engine, a Video Engine, and tools for transport and communication. Web browsers and other native applications can access the framework through its C++ API. Web applications cannot access this low-level API, for security and interoperability reasons, so web browsers need to provide another way for developers to use it. The standard way of doing this is through a JavaScript API: web applications use the standardized JavaScript API to access the functionality of WebRTC [2][4].
In the most common model, the WebRTC Trapezoid (see Figure no. 2.), both browsers run a web application downloaded from a different web server (or, as is usual in practice, from one common server). A Peer Connection configures the media path to flow directly between the browsers without any intervention from servers. Signaling goes through HTTP or WebSockets, via web servers that can modify, translate, or manage signals as required. It should be noted that the signaling between server and browser is not standardized in WebRTC because it is considered part of the application. The two web servers can communicate using a standard signaling protocol, such as the Session Initiation Protocol (SIP) or Jingle [XEP-0166]. Otherwise, a proprietary signaling protocol can be used [7].
A WebRTC web application uses the standard WebRTC APIs to exploit and control browser features in real time. It has to do several things:
* Get streaming audio, video, or other data.
* Get information about the network, such as IP addresses and ports, and exchange it with other WebRTC clients (known as peers) to allow a connection, even through NAT and firewalls.
* Coordinate signaling communication to report errors and initiate or close sessions.
* Exchange information about media and client capabilities, such as resolution and codecs.
* Communicate streaming audio, video, or data.[15]
To acquire and communicate streaming data, WebRTC implements the following APIs:
* MediaStream: Get access to data feeds, such as the user's camera and the microphone.
* RTCPeerConnection: audio or video calls with encryption and bandwidth management features.
* RTCDataChannel: peer-to-peer communication of generic data.
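As a rough illustration of how these three APIs fit together, the sketch below acquires the camera and microphone, attaches the tracks to a peer connection, and opens a data channel. It is browser-only code and is not taken from the system described in this paper: the function name `startCall`, the `sendToPeer` callback (standing in for whatever signaling channel the application uses), and the channel label `chat` are assumptions.

```javascript
// Ask for both audio and video from the user's devices.
const mediaConstraints = { audio: true, video: true };

// Browser-only sketch: call it from a web page, not from Node.js.
// sendToPeer is a hypothetical callback that forwards signaling messages.
async function startCall(sendToPeer) {
  // RTCPeerConnection: the encrypted audio/video connection to the peer.
  const pc = new RTCPeerConnection();

  // RTCDataChannel: generic peer-to-peer data alongside the media.
  const channel = pc.createDataChannel("chat");

  // MediaStream: camera and microphone, once the user grants permission.
  const stream = await navigator.mediaDevices.getUserMedia(mediaConstraints);
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  // Network information (ICE candidates) is handed to the signaling channel.
  pc.onicecandidate = (event) => {
    if (event.candidate) sendToPeer({ candidate: event.candidate });
  };

  // Describe the session and send the offer to the other peer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToPeer({ sdp: pc.localDescription });

  return { pc, channel, stream };
}
```

The remote side would apply the received description with `setRemoteDescription`, answer with `createAnswer`, and add incoming candidates with `addIceCandidate`; those steps are omitted here to keep the sketch short.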
1.2. Protocols regarding WebRTC
To ensure a standard level of interoperability between different real-time browser implementations, the Internet Engineering Task Force (IETF) works to select a minimum set of audio and video codecs. Opus and G.711 are the mandatory audio codecs to be implemented [8], and VP8 and H.264 Constrained Baseline are the video codecs [9].
The API is designed around three main concepts: PeerConnection, MediaStream, and DataChannel.
The PeerConnection mechanism uses the Interactive Connectivity Establishment (ICE) protocol together with Session Traversal Utilities for NAT (STUN) and Traversal Using Relays around NAT (TURN) servers to let User Datagram Protocol (UDP)-based media streams traverse NAT boxes and firewalls. ICE allows the browsers to discover enough information about the topology of the network in which they are deployed to find the best usable communication path. Using ICE also provides a security measure because it prevents untrusted web pages and applications from sending data to hosts that do not expect to receive it [6].
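In the browser API, the STUN and TURN servers that ICE should use are listed in the configuration object given to the peer connection. The following is a minimal sketch of such a configuration; the helper `buildIceConfig`, the host names under `example.org`, and the credentials are placeholders, not the servers used in this paper.

```javascript
// Build the configuration object passed to `new RTCPeerConnection(config)`.
// Every host name and credential used with it below is a placeholder.
function buildIceConfig({ stunUrl, turnUrl, username, credential, relayOnly = false }) {
  const iceServers = [];
  if (stunUrl) iceServers.push({ urls: stunUrl });
  if (turnUrl) iceServers.push({ urls: turnUrl, username, credential });
  return {
    iceServers,
    // "relay" restricts ICE to TURN candidates; "all" lets it try every path.
    iceTransportPolicy: relayOnly ? "relay" : "all",
  };
}

const config = buildIceConfig({
  stunUrl: "stun:stun.example.org:3478",
  turnUrl: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-password",
});
// In a browser: const pc = new RTCPeerConnection(config);
```

Setting `relayOnly` to `true` forces all traffic through the TURN server, which is a convenient way to check that the relay and its credentials actually work.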
Real-time communication is time-critical, and video streaming may suffer intermittent packet losses. The WebRTC audio and video codecs address this challenge by implementing logic to recover from packet loss or delay. WebRTC treats timeliness and low latency as more important than the reliability of data delivery. That is the main reason why UDP is preferred over the Transmission Control Protocol (TCP) for delivering real-time data. TCP provides a reliable and ordered stream of data: if an intermediate packet is lost, TCP buffers all the packets after it, waits for the retransmission, and only then delivers the stream to the application. UDP, by contrast, offers no guarantee of message delivery or order of delivery; it has no acknowledgments, retransmissions, or timeouts; no packet sequence numbers; no head-of-line blocking; no connection state tracking, establishment, or teardown state machines; and no congestion control or built-in client or network feedback mechanisms. As a result, UDP makes no reliability promise and simply delivers each packet that arrives to the target application [16].
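The effect of head-of-line blocking can be made concrete with a small simulation. The sketch below is a toy model, not a network measurement: the 20 ms packet spacing, the 10 ms one-way delay, and the 100 ms retransmission timeout are assumed values chosen only for illustration.

```javascript
// Each packet: when it was sent (ms) and whether its first copy was lost.
const packets = [
  { seq: 1, sentAt: 0, lost: false },
  { seq: 2, sentAt: 20, lost: true },
  { seq: 3, sentAt: 40, lost: false },
  { seq: 4, sentAt: 60, lost: false },
];
const ONE_WAY_DELAY_MS = 10; // assumed network delay
const RETRANSMIT_AFTER_MS = 100; // assumed retransmission timeout

// Ordered, reliable delivery (TCP-like): a packet is handed to the
// application only after every earlier packet has arrived.
function orderedDeliveryTimes(list) {
  let latestArrival = 0;
  return list.map((p) => {
    const resendPenalty = p.lost ? RETRANSMIT_AFTER_MS : 0;
    const arrival = p.sentAt + resendPenalty + ONE_WAY_DELAY_MS;
    latestArrival = Math.max(latestArrival, arrival);
    return latestArrival;
  });
}

// Unordered, unreliable delivery (UDP-like): each packet is handed over on
// arrival, and a lost packet is never delivered (null).
function unorderedDeliveryTimes(list) {
  return list.map((p) => (p.lost ? null : p.sentAt + ONE_WAY_DELAY_MS));
}

const tcpLike = orderedDeliveryTimes(packets); // [10, 130, 130, 130]
const udpLike = unorderedDeliveryTimes(packets); // [10, null, 50, 70]
```

With ordered delivery, the single lost packet holds back packets 3 and 4 until 130 ms; with unordered delivery they reach the application at 50 ms and 70 ms, and the codec conceals the one missing frame.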
STUN provides the requesting endpoint with its public IP address. STUN is a relatively simple process because once the STUN server has provided the requester with a publicly accessible IP address, it is no longer involved in the conversation (see Figure no. 3.).
When an endpoint is behind a NAT, it sees only its local IP address. The other endpoints in the call cannot use this local IP address to connect to the endpoint, as it might be a private address or the firewall might not allow access. In such cases, the endpoint may require a STUN server to provide its public IP address. The participants then use the ICE procedures and try to establish a connection using the public IP address, and if the connection is configured successfully, the media stream is transmitted directly between the users without any active intermediary. For all practical purposes, the STUN server is then idle, waiting for the next query [10].
In some NAT implementations, the port is also translated to another port, along with the IP address to which it is attached. This situation is called "symmetric NAT." The public IP address obtained through the STUN process is not enough to establish the connection here because the port also requires translation; that is why a TURN server becomes essential [10].
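The reason STUN alone fails behind a symmetric NAT can be shown with a toy model. The sketch below assumes the usual definition, in which a symmetric NAT allocates a different public port for every destination; the `SimulatedNat` class, the port numbers, and the STUN and peer addresses are all hypothetical.

```javascript
// Toy NAT that hands out public ports for outbound flows (hypothetical model).
class SimulatedNat {
  constructor({ symmetric }) {
    this.symmetric = symmetric;
    this.nextPort = 40000;
    this.mappings = new Map();
  }

  // Returns the public port used when `internal` sends to `destination`.
  publicPortFor(internal, destination) {
    // A symmetric NAT keys the mapping on the destination as well, so every
    // new destination gets a fresh public port.
    const key = this.symmetric ? `${internal}->${destination}` : internal;
    if (!this.mappings.has(key)) this.mappings.set(key, this.nextPort++);
    return this.mappings.get(key);
  }
}

// True when the port learned from the STUN server is the same port the
// remote peer will actually see.
function stunMappingIsReusable(nat) {
  const client = "10.0.8.173:5000";
  const seenByStun = nat.publicPortFor(client, "stun.example.org:3478");
  const seenByPeer = nat.publicPortFor(client, "198.51.100.20:6000");
  return seenByStun === seenByPeer;
}

const coneResult = stunMappingIsReusable(new SimulatedNat({ symmetric: false }));
const symmetricResult = stunMappingIsReusable(new SimulatedNat({ symmetric: true }));
```

Here `coneResult` is `true` and `symmetricResult` is `false`: behind the symmetric NAT, the address reported by STUN does not match the one the peer sees, so the media has to be relayed through TURN.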
II. FUNCTIONALITY OF A TURN SERVER
Figure no. 4. shows the functionality of a TURN server [10].
To test the ICE functionality of a WebRTC implementation, a PeerConnection is created with the specified ICE servers, and candidate gathering is started for a session with a single audio stream. As candidates are gathered, they are displayed in a text box on the test page, along with an indication of when candidate gathering is complete.
Individual STUN and TURN servers can be added or removed using the Add Server/Remove Server buttons. In addition, the type of candidates released to the application can be controlled by constraining IceTransports. Testing only a TURN/UDP server makes it possible to detect when the wrong credentials are used for authentication [11] (see Figure no. 5.).
If the last two rows of the output contain the word relay, the TURN server is working properly; if nothing in the output says relay, the TURN server is not working as expected [11] (see Figure no. 6.).
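The same check can be automated by reading the candidate type from each gathered candidate line. The sketch below relies on the standard candidate syntax, in which the type follows the keyword `typ`; the function names and the sample candidate lines are made up for illustration, with addresses taken from documentation ranges.

```javascript
// Extracts the candidate type ("host", "srflx", "relay", ...) from an ICE
// candidate line, or returns null if the line has no "typ" field.
function candidateType(line) {
  const fields = line.trim().split(/\s+/);
  const typIndex = fields.indexOf("typ");
  return typIndex >= 0 && typIndex + 1 < fields.length ? fields[typIndex + 1] : null;
}

// The TURN server is considered reachable if at least one relay candidate
// was gathered.
function turnServerResponded(candidateLines) {
  return candidateLines.some((line) => candidateType(line) === "relay");
}

// Made-up sample output; it is not the output shown in the figures.
const gathered = [
  "candidate:1 1 udp 2122260223 10.0.8.173 51000 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 51000 typ srflx raddr 10.0.8.173 rport 51000",
  "candidate:3 1 udp 41885439 192.0.2.50 61000 typ relay raddr 203.0.113.7 rport 51000",
];
const turnOk = turnServerResponded(gathered);
```

In a browser, the lines would come from `event.candidate.candidate` inside the `onicecandidate` handler rather than from a fixed array.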
A MediaStream is an abstract representation of an actual audio and/or video stream. It is used to manage the media stream: displaying its content, recording it, manipulating individual pieces, or sending it to a remote peer. The audio and video engines automatically handle all the audio and video processing, such as noise cancellation, equalization, image enhancement, and more. A MediaStream can be extended to represent a stream that comes from (remote stream) or is sent to (local stream) a remote node [12].
DataChannels are designed to provide a generic transport service that allows web browsers to exchange generic data in a bidirectional peer-to-peer fashion.
Media-plane traffic is carried out-of-band between the peers: the Secure Real-time Transport Protocol (SRTP) transports the media data together with the RTP Control Protocol (RTCP), which carries the transmission statistics associated with the data streams. Datagram Transport Layer Security (DTLS) is used for SRTP key and association management [13].
Stream Control Transmission Protocol (SCTP) encapsulation over DTLS, over ICE, across UDP provides a NAT traversal solution along with privacy, source authentication, and integrity-protected transfers. In addition, this solution allows data transport to work smoothly alongside parallel media transports, and both can share a single transport-layer port number. SCTP was chosen because it supports multiple streams with either reliable or partially reliable delivery modes. It allows several independent streams to be opened within an SCTP association towards a peer SCTP endpoint. Each stream represents a unidirectional logical channel providing the notion of in-sequence delivery [14].
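In the browser API, these SCTP delivery modes surface as options of a data channel. The sketch below maps a few modes to the `ordered`, `maxRetransmits`, and `maxPacketLifeTime` options; the mode names, the helper `dataChannelOptions`, and the 500 ms lifetime are assumptions made for illustration.

```javascript
// Maps a delivery mode (hypothetical names) to RTCDataChannel options.
function dataChannelOptions(mode) {
  switch (mode) {
    case "reliable":
      // Ordered, retransmitted until delivered (the default behaviour).
      return { ordered: true };
    case "unordered":
      // Reliable, but messages may arrive out of order.
      return { ordered: false };
    case "no-retransmit":
      // Partially reliable: a lost message is never retransmitted.
      return { ordered: false, maxRetransmits: 0 };
    case "time-limited":
      // Partially reliable: give up on a message after 500 ms.
      return { ordered: false, maxPacketLifeTime: 500 };
    default:
      throw new Error(`Unknown delivery mode: ${mode}`);
  }
}

// In a browser: pc.createDataChannel("telemetry", dataChannelOptions("no-retransmit"));
const lossy = dataChannelOptions("no-retransmit");
```

Only one of `maxRetransmits` and `maxPacketLifeTime` may be set on a given channel, which is why each partially reliable mode uses just one of them.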
III. EXPERIMENTAL RESULTS
This section describes the video conference system, built using Scaledrone, a push messaging service that offers an easy way to add real-time capabilities to a web or mobile app.
Scaledrone uses WebSockets when possible and falls back to technologies such as XHR streaming, JSONP polling, and XMLHttpRequest (XHR) polling when needed.
A new instance (channel) is created in Scaledrone; the channel's page in the dashboard shows a unique ID that is used in the app.
The real-time video conferencing implementation in Scaledrone is shown in Figure no. 7. and Figure no. 8. This experimental conference was established using the IP address 10.0.8.173. The video call requires the latest version of Mozilla Firefox (see Figure no. 9.).
The automatically generated code is then shared with the participants, and the connection between them is established (see Figure no. 10.).
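The text does not say how the code is generated or shared. One common approach, assumed here purely for illustration, is to generate a short random room code and append it to the page address as a URL fragment, so that everyone who opens the link joins the same room. The function names below are hypothetical.

```javascript
// Generates a short random hexadecimal room code (assumed scheme).
function generateRoomCode() {
  return Math.floor(Math.random() * 0xffffff).toString(16);
}

// Appends the room code to the page address as a URL fragment.
function buildInviteLink(pageUrl, roomCode) {
  const link = new URL(pageUrl);
  link.hash = roomCode;
  return link.toString();
}

const roomCode = generateRoomCode();
const inviteLink = buildInviteLink("http://10.0.8.173/", roomCode);
```

A code of this length is convenient to share but easy to guess, so a public deployment would likely need a longer, cryptographically random identifier.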
The interface shows a video conferencing communication between two users, who are engaged in real-time interaction. The connection runs directly between the users' browsers, without any conventional DNS server connection between the users. The browsers did not need any third-party plug-ins or downloaded software, such as Flash, for the video to play on both sides. The connection is possible because the getUserMedia() method establishes access to the cameras and microphones. Once the users enable the video conferencing button, WebRTC issues a request for permission to use the media devices. The users can then choose to "allow" or "block" the request. Choosing "allow" gives the system access to the users' camera and microphone for real-time interaction.
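The permission step can be sketched as follows. This is browser-only code written for illustration, not the code of the system described here: the function names and the user-facing messages are assumptions, while `NotAllowedError` and `NotFoundError` are the error names browsers report when access is blocked or no device exists.

```javascript
// Translates a getUserMedia failure into a message for the user.
function describeMediaError(error) {
  switch (error && error.name) {
    case "NotAllowedError":
      return "Camera and microphone access was blocked.";
    case "NotFoundError":
      return "No camera or microphone was found.";
    default:
      return "The camera or microphone could not be started.";
  }
}

// Browser-only: shows the local camera picture in a <video> element.
async function showLocalVideo(videoElement) {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
    videoElement.srcObject = stream; // the user chose "allow"
    return stream;
  } catch (error) {
    console.warn(describeMediaError(error)); // the user chose "block", or no device
    return null;
  }
}
```

Browsers expose `getUserMedia` only in secure contexts, so a page like this generally has to be served over HTTPS or from localhost.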
IV. CONCLUSIONS
The main goal of this paper is to implement a multipoint video conferencing system through the Mozilla Firefox browser, in which each user is connected to all other users and the same video stream must be delivered over all connections. Another main goal is to improve quality in education by sharing audio, video, any combination of media, or lessons, so as to expand the level of acquired knowledge. We also focus on describing the video conference processing system. We have realized a video conference system using Scaledrone, a push messaging service in which a channel is created and its unique ID is used in the app. The system is written in HTML, and its core is a JavaScript API. To do this, we applied the ICE (Interactive Connectivity Establishment) protocol together with Session Traversal Utilities for NAT (STUN). An endpoint is aware only of its private address, and an endpoint on another LAN (Local Area Network) is unable to use this address for a connection. The STUN server is therefore used by each endpoint to ask for the public address that stands in front of the NAT (Network Address Translator). The connections between public addresses are then easier to establish.
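The cost of connecting each user to all other users grows quickly with the number of participants. The short calculation below is a sketch of that arithmetic for a full mesh; the function name `meshLoad` is hypothetical, and the figures are counts of connections and streams, not measurements from the experiment.

```javascript
// Load of a full-mesh conference in which every participant connects
// directly to every other participant.
function meshLoad(participants) {
  if (!Number.isInteger(participants) || participants < 1) {
    throw new RangeError("participants must be a positive integer");
  }
  return {
    // One peer connection per unordered pair of participants.
    totalConnections: (participants * (participants - 1)) / 2,
    // Each participant sends its stream once per remote participant...
    uploadsPerParticipant: participants - 1,
    // ...and receives one stream from each of them.
    downloadsPerParticipant: participants - 1,
  };
}

const fourWayCall = meshLoad(4); // 6 connections, 3 uploads and 3 downloads each
```

For ten participants the same formula gives 45 connections and nine simultaneous uploads per participant, which is why the upload bandwidth of each user tends to be the limiting factor in this topology.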
Acknowledgements
This paper has been partially supported by UEFISCDI Romania and MCI through project VIRTUOSE (Virtualized Video Services) and funded in part by European Union's Horizon 2020 research and innovation program under grant agreement No. 777996 (SealedGRID project) and No. 787002 (SAFECARE project).
Reference Text and Citations
[1] Nayyef, Zinah & Amer, Sarah & Hussain, . (2019). Peer to Peer Multimedia Real-Time Communication System based on WebRTC Technology. International Journal for the History of Engineering & Technology. 2.9. 125-130.
[2] Suciu G., Anwar M., Mihalcioiu R., Virtualized Video and Cloud Computing for Efficient e-Learning, 13th International Scientific Conference eLearning and Software for Education, April 27-28, 2017.
[3] Suciu G., Anwar M., Virtualized Video conferencing for eLearning, 14th International Scientific Conference eLearning and Software for Education Bucharest, April 19-20, 2018.
[4] Vasilescu C., Beceanu C., Collaborative object recognition for parking management, 15th International Scientific Conference eLearning and Software for Education Bucharest, April 11-12, 2019
[5] Elleuch, Wajdi. (2013). Models for multimedia conference between browsers based on WebRTC. 279-284. 10.1109/WiMOB.2013.6673373.
[6] Rodríguez P, Cerviño J, Trajkovska I, Salvachúa J (2013) Advanced Videoconferencing Services Based on WebRTC. In: Proceeding of IADIS multi conference on computer science and information systems.
[7] XEP-0166: Jingle, XMPP Standards Foundation, https://xmpp.org/extensions/xep-0166.html.
[8] ICE - Interactive Connectivity Establishment, IETF Working Group, https://tools.ietf.org/html/rfc5245.
[9] WebRTC Audio Codec and Processing Requirements, IETF Working Group, https://tools.ietf.org/html/rfc7874.
[10] "Why Does Your WebRTC Product Need a TURN Server?", https://www.callstats.io/blog/2017/10/26/turn-webrtcproducts.
[11] GitHub Trickle ICE: https://webrtc.github.io/samples/src/content/peerconnection/trickle-ice/.
[12] Kevin Gatimu, Arul Dhamodaran, Taylor Johnson, Ben Lee (2018) Experimental study of low-latency HD VoD streaming using flexible dual TCP-UDP streaming protocol
[13] Yung-Feng Lu, Hung-Ming Chen, Chin-Fu Kuo (2019) Container-based load balancing for WebRTC applications
[14] Julius Flohr, Ekaterina Volodina, Erwin P. Rathgeb (2018) FSE-NG for managing real time media flows and SCTP data channel in WebRTC
[15] Julius Flohr, Ekaterina Volodina, Erwin P. Rathgeb (2018) FSE-NG for managing real time media flows and SCTP data channel in WebRTC
[16] Edim Azom Emmanuel, Bakwa Dunka Dirting (2017) A Peer-To-Peer Architecture For Real-Time Communication Using WebRTC
[17] Vamis Xhagjika, Oscar Divorra Escoda, Leandro Navarro, Vladimir Vlassov (2017) Media Streams Allocation and Load Patterns for a WebRTC Cloud Architecture
[18] Sanabil A. Mahmood, Ergun Ercelbe (2018) Development of Video Conference Platform Based on WebRTC
Copyright "Carol I" National Defence University 2020