Abstract: The main purpose of the application is to simulate a processor's behaviour during the execution of code. The user has a database of processors, which can be modified by adding new processors or by changing the properties of existing ones. The user can therefore either fill the database with real-world units and examine their behaviour, or experiment with properties to obtain results that no existing processor exhibits. A processor is defined by the elements it is built from and by the properties of those elements, such as their count, bit depth and capacity. Such a definition can be quite accurate; its accuracy depends mostly on how accurately the processor's elements are described. However, not all aspects can be covered by this approach. For instance, there is an algorithm whose sole purpose is to balance requests between processor elements, but the algorithm is intricate and its description is hard to find, so it is not emulated in the application. The user chooses one of the processors from the database and enters a sequence of machine commands. The application runs the commands one after another and simulates the processor's behaviour for the current command: it sets or clears bits, writes to or reads from registers, uses certain elements for computations, and so on. All activity is visualized at runtime, which can be used for educational purposes to understand in depth what happens when a certain command or piece of code is executed. This may be crucial for those who work with assembly language, both beginners and those with some experience: seeing exactly how commands affect the current state of the machine and which values they change not only helps to understand more thoroughly how assembly language works, but also makes code easier to debug, a task that can be dreadful without visual assistance and can scare beginners away.
The application might also help those who study system-level programming to visualize and better understand how things like memory management, process management and inter-process communication work. The main elements of the processor are displayed during visualization, together with all data flow, the values stored in registers and the changes caused by execution of the current command. Moreover, statistics can be gathered and accessed at any time during or after execution. Statistics can be used either for profiling (understanding which elements are used frequently and which are not, and balancing the load between them) or for registering frequent command sequence patterns (useful when designing new processors, as such sequences can be replaced with a single command). Both visualization and statistics might make it more intuitively clear why some algorithms are efficient and others are not (e.g. recursive versus iterative ones). The application is not designed to be used as part of a virtual machine, but as a showcase of processors, their basic structure and operation. Finally, the application can be used by those who are experienced at coding for one specific platform and have to move to another, a situation that can arise for those who program microcontrollers. In such cases the learning curve might become smoother as the differences between platforms become vivid.
Keywords: modelling; visualizing; processor; simulation; machine command; education.
INTRODUCTION
People deal with electronic devices every day, and most of these devices contain at least one processor or controller. Developing software for most devices is a unified process: many high-level programming languages [2] exist nowadays, and numerous frameworks take care of most platform-specific code. Programmers do not have to target a specific architecture. However, there are still applications where engineers have to use low-level (machine-dependent) programming languages. In these cases, a deep understanding of the processor architecture and of the way it executes commands is needed in order to produce efficient and reliable code. Mastering any platform is a hard task, and visualizing the processes that take place in it during code execution can make a difference. It is also important to understand how hardware interprets incoming command sequences when developing low-level software such as operating systems, and it is quite challenging for most students (and for anyone studying operating systems) to grasp how exactly this happens. Visualizing memory management and the way an operating system switches between applications can make the learning curve smoother.
Computational devices are becoming more and more powerful: they can process more data, and faster, than ten years ago, and they will only become more powerful in the future [9]. Optimizing the structure of current processors is an important step in making them more efficient. Gathering usage statistics for the different elements of a processor in order to balance load and find weak spots in the architecture (so-called bottlenecks), as well as finding recurring patterns in incoming command sequences and implementing new commands to replace them, can make a difference.
Therefore, a system that allows one to describe a processor and commands it implements, run different pieces of code, observe in action what actually happens during code execution and gather statistics of how it happened can be useful for educational and development purposes.
I. SYSTEM ARCHITECTURE
The system should contain information about the given processor (the elements the processor consists of, the set of commands it supports and their implementation) [8], the current processor state (the state of a processor is defined by the state of each of its elements; the state of an element is defined as a set of properties and their values), an incoming command sequence manager, a clock generator, a statistics manager and a visualization manager. The system can therefore be split into three major modules (figure 1): a module containing information about the processor and its current state (the Processor Data Module); a module responsible for managing the incoming command sequence, selecting the current command, changing the processor state, gathering statistics, sending data for rendering and generating the clock that synchronises modules and events (the Machine Module); and a module rendering the current processor state (the Visualization Module).
1.1 Processor Data Module
The Processor Data Module (figure 2) is essentially a database containing both general information about the processor (the number and types of elements it consists of and the commands it supports) and a simulation-specific set of values of the processor's properties (in other words, the state of the processor). This module also handles two operations: checking whether a given command is present in the database and valid, and changing the processor state on command execution, returning the set of properties that have changed together with their new values.
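The two operations can be illustrated with a minimal in-memory sketch in Go. It is not the implementation described in this paper: the type ProcessorData, the methods Has and Execute, the constructor NewDemoProcessor and the choice to key commands by their full text are all assumptions made purely for illustration.

```go
package main

import "fmt"

// ProcessorData is a hypothetical in-memory stand-in for the Processor Data Module.
type ProcessorData struct {
	state    map[string]int                 // property name -> current value
	commands map[string]func(map[string]int) // command text -> state transition
}

// Has reports whether the command is present in the "database".
func (p *ProcessorData) Has(cmd string) bool {
	_, ok := p.commands[cmd]
	return ok
}

// Execute applies the command and returns only the properties whose values changed.
func (p *ProcessorData) Execute(cmd string) (map[string]int, error) {
	apply, ok := p.commands[cmd]
	if !ok {
		return nil, fmt.Errorf("unknown command %q", cmd)
	}
	// Snapshot the state so that changes can be detected afterwards.
	before := make(map[string]int, len(p.state))
	for name, value := range p.state {
		before[name] = value
	}
	apply(p.state)
	changed := map[string]int{}
	for name, value := range p.state {
		if old, existed := before[name]; !existed || old != value {
			changed[name] = value
		}
	}
	return changed, nil
}

// NewDemoProcessor builds a toy processor with two registers and two commands.
func NewDemoProcessor() *ProcessorData {
	return &ProcessorData{
		state: map[string]int{"AX": 0, "BX": 0},
		commands: map[string]func(map[string]int){
			"MOV AX, 3":  func(s map[string]int) { s["AX"] = 3 },
			"MOV BX, AX": func(s map[string]int) { s["BX"] = s["AX"] },
		},
	}
}

func main() {
	p := NewDemoProcessor()
	for _, cmd := range []string{"MOV AX, 3", "MOV BX, AX", "NOP"} {
		changed, err := p.Execute(cmd)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(cmd, "->", changed)
	}
}
```

Returning only the changed properties keeps the reply small and gives the statistics and rendering steps exactly the data they need.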
1.2 Visualization Module
The Visualization Module is the interface between the user and the system. It is responsible for rendering the current state of the processor (it displays all properties and their values at the current moment in time) and for displaying statistics.
1.3 Machine Module
The Machine Module is the core of the system. It has a clock generator (essentially a main loop) that drives and synchronises the whole system. At each tick of the generator (every iteration of the main loop) the Machine Module checks whether an incoming command is available and, if so, asks the Processor Data Module whether the command is present and valid. If it is, the Machine Module sends the command to the Processor Data Module and asks it to change its state. In response, the Machine Module receives the set of changed properties and their values. This set is analyzed by the statistics manager and the statistics are recalculated. After that, the set and the statistics are sent to the Visualization Module, and the user sees the state changes in real time. The whole algorithm is presented in figure 3.
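One iteration of this loop might look roughly as follows in Go. This is a sketch under stated assumptions, not the actual algorithm of figure 3: the DataModule interface, the Stats type, the Tick function and the fakeData stub are hypothetical names, and the statistic gathered here (how often each property changed) is only one possible choice.

```go
package main

import "fmt"

// DataModule is a hypothetical contract for the Processor Data Module.
type DataModule interface {
	Has(cmd string) bool
	Execute(cmd string) map[string]int // returns changed properties and new values
}

// Stats counts how many times each property has changed.
type Stats map[string]int

// Tick performs one iteration of the main loop.
// It returns false when no incoming command was waiting.
func Tick(incoming <-chan string, data DataModule, stats Stats,
	render func(changed map[string]int, stats Stats)) bool {
	select {
	case cmd := <-incoming:
		if !data.Has(cmd) {
			return true // unknown or invalid command: skipped in this sketch
		}
		changed := data.Execute(cmd)
		for name := range changed {
			stats[name]++ // recalculate statistics
		}
		render(changed, stats) // hand the update to the visualization
		return true
	default:
		return false
	}
}

// fakeData is a stub that knows a single made-up command.
type fakeData struct{ ax int }

func (f *fakeData) Has(cmd string) bool { return cmd == "INC AX" }

func (f *fakeData) Execute(cmd string) map[string]int {
	f.ax++
	return map[string]int{"AX": f.ax}
}

func main() {
	incoming := make(chan string, 3)
	incoming <- "INC AX"
	incoming <- "BOGUS"
	incoming <- "INC AX"

	stats := Stats{}
	data := &fakeData{}
	render := func(changed map[string]int, s Stats) {
		fmt.Println("changed:", changed, "stats:", s)
	}
	// Run ticks until the queue of incoming commands is empty.
	for Tick(incoming, data, stats, render) {
	}
}
```

A real clock generator would keep ticking and wait for new commands instead of stopping when the queue is empty; the loop above stops only to keep the example finite.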
II. IMPLEMENTATION DETAILS
This chapter describes the implementation of the modules and the interaction between them. Each module can be implemented using different programming languages and frameworks. As long as each module provides the specified functionality, and the requests it accepts, their parameters and its responses follow the agreements stated below, module implementations can be substituted without breaking the system as a whole.
2.1 Processor Data Module Implementation
The Processor Data Module has to contain a lot of data, but the structure of this data is not known beforehand. This makes a traditional relational database [14] impractical, since there the structure of the stored data is fixed during the design process. An alternative might be a non-relational database [6]; such databases are quite popular nowadays. In this case the data structure does not have to be known at design time and can be set and changed dynamically at runtime. This is suitable for most cases, but the Processor Data Module is designed not only to keep data but also to check whether a given command is present in the database and to change the appropriate properties (values within the database). This can be done with some code around the database, but there is a programming language that can do it on its own. Prolog [5] is well suited to this kind of problem. In Prolog code the programmer can describe both the data that the program stores (not just plain numbers and strings; far more complex relations can be expressed) and the actions that change this data at runtime [12]. To retrieve data from the database the user asks a question, that is, makes a query describing what needs to be found. If the specified data is stored in the database or can be derived from the described relations, the program returns true and, as a side effect, all sets of variable values that satisfy the query. Otherwise it returns false.
The only issue left to solve is how to set up interaction between the modules. There are a number of options: command line arguments, pipes, sockets, files, signals and some others [3]. Command line arguments are not suitable, as they can only be passed to a program (module) at start-up, whereas here the modules have to communicate at different times during runtime. Files and signals are useful only when all modules run on the same machine. Pipes and sockets are similar mechanisms of inter-process communication; the difference is that sockets are bi-directional data channels, while pipes transfer data in one direction only. Unlike files and signals, sockets can be used for communication between modules launched on different machines, and that is why they are chosen for interaction between the modules in this system.
The Processor Data Module accepts two kinds of requests: a request to check whether a specific command exists (it accepts a string containing the command and returns true or false) and a request for a state change (it accepts a string containing the command and returns a set of pairs, each consisting of a property name and its new value, both as strings). It is common practice to use JSON [4] (JavaScript Object Notation, a text format designed for data exchange) when communicating via sockets, and it is used in this case.
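The paper does not specify the exact shape of these JSON messages, so the following Go sketch assumes one. The field names kind, command, exists, changes, property and value, as well as the helper functions, are hypothetical and serve only to show how the two requests and their responses could be encoded with the standard encoding/json package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request is an assumed wire format for both kinds of requests.
type Request struct {
	Kind    string `json:"kind"`    // "exists" or "execute" (assumed values)
	Command string `json:"command"` // the machine command as text
}

// ExistsResponse answers an "exists" request.
type ExistsResponse struct {
	Exists bool `json:"exists"`
}

// PropertyChange is one pair of property name and new value, both strings.
type PropertyChange struct {
	Property string `json:"property"`
	Value    string `json:"value"`
}

// ExecuteResponse answers an "execute" (state change) request.
type ExecuteResponse struct {
	Changes []PropertyChange `json:"changes"`
}

// EncodeRequest serializes a request to a JSON string.
func EncodeRequest(kind, command string) (string, error) {
	b, err := json.Marshal(Request{Kind: kind, Command: command})
	return string(b), err
}

// DecodeExecuteResponse parses the reply to a state change request.
func DecodeExecuteResponse(raw string) ([]PropertyChange, error) {
	var resp ExecuteResponse
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		return nil, err
	}
	return resp.Changes, nil
}

func main() {
	req, err := EncodeRequest("execute", "MOV AX, 3")
	fmt.Println(req, err)

	changes, err := DecodeExecuteResponse(`{"changes":[{"property":"AX","value":"3"}]}`)
	fmt.Println(changes, err)
}
```

Keeping values as strings, as the text describes, lets the same message format carry properties of any bit depth without agreeing on numeric types in advance.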
2.2 Visualization Module Implementation
There is nothing special about the way the statistics and the current state are displayed, so almost any technology can be used for it. A web interface is the best option, since an internet connection is required in any case (the modules can be accessed over the internet) and no software has to be downloaded. With that in mind, VueJS [13] is selected as the framework for interface development: it is easy to use, it has a friendly API (application programming interface), there is a large base of pre-made widgets [7], and the documentation is neat and clear. For socket communication the Socket.IO library [10] is used, as it has a socket implementation for almost any web browser, is user-friendly and works out of the box. The module accepts requests to redraw the state (it accepts a set of pairs, each consisting of a property name and its new value) and to redraw the statistics (it accepts a set of pairs, each consisting of a statistics field and its new value).
2.3 Machine Module Implementation
The Machine Module is the most complex part of the whole system: it has to bind the other modules together, manage incoming commands and calculate statistics. It has to be written in a programming language that is either widely used or easily learnt, as this module may be developed further by other programmers. Go (also known as golang [11]) is selected as the programming language for the Machine Module. It is quite new but already widely used, has good documentation, is compiled and therefore fast, has static type checking that prevents type-related errors, and has built-in support for concurrent (multi-threaded) programming. Go is widely used as a language for server applications [15], and that is exactly what the Machine Module is.
Interaction between the Machine Module and the Processor Data Module has to be synchronous (the Machine Module has to wait for a response from the Processor Data Module and only then resume execution), while interaction between the Machine Module and the Visualization Module can be asynchronous (the Machine Module sends a command to redraw the state or statistics and resumes execution immediately, whether or not the Visualization Module has finished redrawing) [1].
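The difference between the two interaction styles can be sketched in Go with a plain function call for the synchronous side and a goroutine fed by a buffered channel for the asynchronous side. The names executeSync, Renderer, NewRenderer, RedrawAsync and Close are hypothetical, and the modules are replaced by local functions instead of socket connections.

```go
package main

import (
	"fmt"
	"sync"
)

// executeSync blocks until the (stand-in) Processor Data Module has replied.
func executeSync(cmd string, data func(string) map[string]int) map[string]int {
	return data(cmd)
}

// Renderer queues redraw requests so that the caller does not wait for drawing.
type Renderer struct {
	queue chan map[string]int
	wg    sync.WaitGroup
}

// NewRenderer starts a background goroutine that draws queued updates in order.
func NewRenderer(draw func(map[string]int), buffer int) *Renderer {
	r := &Renderer{queue: make(chan map[string]int, buffer)}
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		for changed := range r.queue {
			draw(changed)
		}
	}()
	return r
}

// RedrawAsync hands the update over and returns without waiting for drawing.
// If the buffer is full the call blocks; a real system would need a policy here.
func (r *Renderer) RedrawAsync(changed map[string]int) {
	r.queue <- changed
}

// Close stops accepting updates and waits until all queued ones are drawn.
func (r *Renderer) Close() {
	close(r.queue)
	r.wg.Wait()
}

func main() {
	ax := 0
	data := func(cmd string) map[string]int {
		ax++
		return map[string]int{"AX": ax}
	}
	r := NewRenderer(func(c map[string]int) { fmt.Println("redraw:", c) }, 8)
	for _, cmd := range []string{"INC AX", "INC AX"} {
		changed := executeSync(cmd, data) // wait for the new state
		r.RedrawAsync(changed)            // do not wait for the redraw
	}
	r.Close()
}
```

Because a single goroutine drains the queue, updates are still drawn in the order in which the commands were executed, even though the main loop never waits for them.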
III. CONCLUSIONS
A basic overview of the system and its implementation is given in the previous two chapters. It is time to summarize what has been done and, most importantly, why. The system is split into three major modules, which makes it more flexible: any individual module can be substituted or improved without affecting or changing the rest of the system. A programming language and frameworks are selected for each module, and the way the modules communicate is described. Finally, the inner structure of the modules and their main working algorithms are stated as well. As for why this application is needed, here is a simple example. Consider the piece of code in figure 4.
The value 3 is placed into register AX, then the value of AX is placed into BX. Now AX = 3 and BX = 3. After that, the value of AX is negated. As a result, AX = -3 and BX = 3. Here comes the question: how many different commands were used, two or three? It might appear that MOV AX, 3; and MOV BX, AX; are the same command, but in fact they are two entirely different ones. MOV AX, 3; has register AX as the receiver and an immediate value (a constant stored in memory as part of the command itself) as the sender, while MOV BX, AX; has registers as both receiver and sender. So, although these commands look alike, they are different. The other question is: which registers were used during execution of this piece of code? AX and BX seem obvious, but is that all? In fact, it depends on the processor. In the most general case two more registers are used: IP (Instruction Pointer, which tracks the position of the next command in memory) and the Flag Register (in this case the Sign Flag is set, indicating that the result of the last operation is negative). Not that easy to keep track of, right? Especially for a newcomer. And this is only three lines of code. A visual representation of what is going on is a good way to understand how every single command affects the system as a whole. It speeds up the learning process and gives a deeper understanding of how the system works and how to make it work properly.
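The bookkeeping in this example can be traced with a small Go sketch. Several things in it are assumptions made for illustration: the mnemonic NEG for the negation step, the Machine type and its Step method, 16-bit signed registers, and an IP that simply counts commands (on a real processor it advances by the length of each command in bytes).

```go
package main

import "fmt"

// Machine holds the registers touched by the three-command example.
type Machine struct {
	AX, BX int16
	IP     int  // index of the next command (simplified)
	SF     bool // Sign Flag
}

// Step executes one command and returns the names of the registers involved.
func (m *Machine) Step(cmd string) []string {
	touched := []string{"IP"}
	m.IP++ // every command advances the instruction pointer
	switch cmd {
	case "MOV AX, 3": // immediate value -> register
		m.AX = 3
		touched = append(touched, "AX")
	case "MOV BX, AX": // register -> register
		m.BX = m.AX
		touched = append(touched, "AX", "BX")
	case "NEG AX": // negate and update the Sign Flag
		m.AX = -m.AX
		m.SF = m.AX < 0
		touched = append(touched, "AX", "FLAGS")
	}
	return touched
}

func main() {
	m := &Machine{}
	for _, cmd := range []string{"MOV AX, 3", "MOV BX, AX", "NEG AX"} {
		touched := m.Step(cmd)
		fmt.Printf("%-11s touched=%v AX=%d BX=%d IP=%d SF=%t\n",
			cmd, touched, m.AX, m.BX, m.IP, m.SF)
	}
}
```

After the third command the sketch ends with AX = -3, BX = 3, IP = 3 and the Sign Flag set, which matches the walk-through above and shows that four registers, not two, took part in three lines of code.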
Reference Text and Citations
[1] Asynchronous vs. synchronous calls. Retrieved from https://docs.apigee.com/api-baas/get-started/asynchronous-vs-synchronous-calls
[2] High-level programming languages. Retrieved from https://en.wikipedia.org/wiki/Category:High-level_programming_languages
[3] Interprocess Communication: Methods. Retrieved from https://www.geeksforgeeks.org/interprocess-communication-methods/
[4] Introducing JSON. Retrieved from https://www.json.org/
[5] Introduction to logic programming with Prolog. (2017). Retrieved from https://www.matchilling.com/introduction-to-logic-programming-with-prolog
[6] James Serra. Relational databases vs Non-relational databases. Retrieved from https://www.jamesserra.com/archive/2015/08/relational-databases-vs-non-relational-databases
[7] Jonathan Saring. 11 Vue.js UI Component Libraries you Should Know in 2019. Retrieved from https://blog.bitsrc.io/n-vue-js-component-libraries-you-should-know-in-2018-3d35ad0ae37f
[8] Luis Tarrataca. Processor Structure and Function. Retrieved from http://web.ist.utl.pt/luis.tarrataca/classes/computer_architecture/Chapter14-ProcessStructureAndFunction.pdf
[9] Nick Routley. (2017). Visualizing the Trillion-Fold Increase in Computing Power. Retrieved from https://www.visualcapitalist.com/visualizing-trillion-fold-increase-computing-power
[10] Socket.io homepage. Retrieved from https://socket.io/
[11] The Go Programming Language. Retrieved from https://golang.org/
[12] Vladimir Vacic, Christos Koufogiannakis. Introduction to Prolog read, write, assert, retract. Retrieved from http://alumni.cs.ucr.edu/~vladimir/cs171/prolog_2.pdf
[13] VueJS homepage. Retrieved from https://vuejs.org/
[14] What are relational databases? Retrieved from https://computer.howstuffworks.com/question599.htm
[15] What is the best programming language to learn for backend developers? Retrieved from https://www.slant.co/topics/7812/~programming-language-to-learn-for-backend-developers
Copyright "Carol I" National Defence University 2019