All sorts of sorts

To demonstrate sorting algorithms and their run times, I decided to write a program that would have Merge, Quick, and Bubble sort go through the same data and report back the process time. It was assumed that the speeds of Quick sort and Merge sort would be many times faster than Bubble sort, but I wanted to actually see the difference.

To test the sorts, I implemented a C++ program to generate a data set of 100,000 random integers that would then be sorted by all three sorts 10 times. To ensure that the results were accurate I enforced the sorting of the same 100,000 integers each time for each algorithm.

Header File

#ifndef SORTS_H
#define SORTS_H

#include <time.h>
#include <stdio.h>
#include <stdlib.h>

// For file output
#include <iostream>
#include <fstream>

// How many integers to generate
#define DATASET 100000

// How many times to run an algorithm on the data
#define DATARUNS 10

class Sorts  
{
public:

    // Everything will happen here
    Sorts();

private:

    // Something to hold the random numbers
    int generatedData[DATASET];

    // A place for them to be sorted
    int array[DATASET];

    // Results of the algorithms
    double bubbleSortTimes[DATARUNS];
    double quickSortTimes[DATARUNS];
    double mergeSortTimes[DATARUNS];

    // Generate the initial data
    void generateData();

    // Reset data
    void dataReset();

    // Calculate time taken
    double timediff(clock_t t1, clock_t t2);

    // Perform swap action on two integers
    void swap(int& a, int& b);

    // Perform bubble sort
    void bubblesort();

    // Perform quick sort
    int qpartition(int * arr, int left, int right);
    void quickSort(int * arr, int left, int right);

    // Perform merge sort
    void doMerge(int * arr, int left, int right);
    void mergeSort(int * arr, int left, int right);

    // Display the results of the multiple runs
    void displayResults();
};

#endif // SORTS_H

CPP File

#include "sorts.h"

Sorts::Sorts()  
{
    // Fills array with random integers of size DATASET
    generateData();

    // Do bubble sort runs
    for(int i = 0; i < DATARUNS; i++)
    {
        // Capture time
        clock_t t1 = clock();

        // Perform bubble sort
        bubblesort();

        //Capture time again
        clock_t t2 = clock();

        // Add run time to result array
        bubbleSortTimes[i] = timediff(t1, t2);

        // Reset operating data to originaly generated random numbers to ensure we are
        // comparing the sorting speed on the exact same random numbers
        dataReset();
    }

    // Repeat above process for other sorts

    // Do quick sort runs
    for(int i = 0; i < DATARUNS; i++)
    {
        clock_t t1 = clock();
        quickSort(array, 0, DATASET-1);
        clock_t t2 = clock();
        quickSortTimes[i] = timediff(t1, t2);
        dataReset();
    }

    // Do merge sort runs
    for(int i = 0; i < DATARUNS; i++)
    {
        clock_t t1 = clock();
        mergeSort(array, 0, DATASET-1);
        clock_t t2 = clock();
        mergeSortTimes[i] = timediff(t1, t2);

        dataReset();
    }

    // Display all the results
    displayResults();
}

void Sorts::generateData()  
{
    // Seed rand
    srand((unsigned)time(0));

    // Generate data
    for(int i = 0; i < DATASET; i++)
        generatedData[i] = rand();

    // Copy data to array that will be manipulated
    dataReset();
}

void Sorts::dataReset()  
{
    for(int i = 0; i < DATASET; i++)
        array[i] = generatedData[i];
}

double Sorts::timediff(clock_t t1, clock_t t2)  
{
    return (double)(t2 - t1) / CLOCKS_PER_SEC;
}

void Sorts::swap(int &a, int &b)  
{
    int t = a;
    a = b;
    b = t;
}

void Sorts::bubblesort()  
{
    int i, j;
    for(i=0;i<DATASET-1;i++)
        for(j=DATASET-1;j>i;--j)
            if(array[j] < array[j-1])
                swap(array[j],array[j-1]);
}

int Sorts::qpartition(int *arr, int left, int right)  
{
    // Use mid as the pivot
    int p = arr[(left+right)/2];

    // Perform Hoare-Partitioning
    while ( left < right )
    {
        while ( arr[left] < p )
            left++;

        while ( arr[right] > p )
            right--;

        if ( arr[left] == arr[right] )
            left++;
        else if ( left < right )
            swap(arr[left], arr[right]);
    }
    return right;
}

void Sorts::quickSort(int *arr, int left, int right)  
{
    if( left < right )
    {
        // Mid point
        int p = qpartition(arr, left, right);

        // Quicksort left of piv
        quickSort(arr, left, p-1);

        // Quicksort right of piv
        quickSort(arr, p+1, right);
    }
}

void Sorts::doMerge(int *arr, int left, int right)  
{
    int p = (left + right) / 2,     // Mid point
            ti = 0,                 // Temp  indexer
            l = left,               // Left  indexer
            k = p + 1,              // Right indexer
            temp[right-left+1];     // Temp array

    // Merge the two data sets
    while ( l <= p && k <= right )
        if ( arr[l] < arr[k] )
            temp[ti++] = arr[l++];
        else
            temp[ti++] = arr[k++];

    // Merge remaining data in left arr
    while ( l <= p )
        temp[ti++] = arr[l++];

    // Merge remaining data in right arr
    while ( k <= right )
        temp[ti++] = arr[k++];

    // Copy temp array to master array
    for ( int i = left; i <= right; i++ )
        arr[i] = temp[i-left];
}

void Sorts::mergeSort(int * arr, int left, int right)  
{
    if(left < right)
    {
        // Get mid point
        int p = (left + right) / 2;

        // Merge sort left side
        mergeSort(arr, left, p);

        // Merge sort right side
        mergeSort(arr, p + 1, right);

        // Perform the merging
        doMerge(arr, left, right);
    }
}

void Sorts::displayResults()  
{
    std::ofstream outfile;
    outfile.open("output.txt");

    printf("\n\n Bubble sort results: \n\n");
    outfile << "\n\n Bubble sort results: \n\n";
    for(int i = 0; i < DATARUNS; i++)
    {
        printf("\tRun [%i] : %f  \n", i, bubbleSortTimes[i]);
        outfile << "Run [" << i << "] : " << bubbleSortTimes[i] << std::endl;
    }

    printf("\n\n Quick sort results: \n\n");
    outfile << "\n\n Quick sort results: \n\n";
    for(int i = 0; i < DATARUNS; i++)
    {
        printf("\tRun [%i] : %f  \n", i, quickSortTimes[i]);
        outfile << "Run [" << i << "] : " << quickSortTimes[i] << std::endl;
    }

    printf("\n\n Merge sort results: \n\n");
    outfile << "\n\n Merge sort results: \n\n";
    for(int i = 0; i < DATARUNS; i++)
    {
        printf("\tRun [%i] : %f  \n", i, mergeSortTimes[i]);
        outfile << "Run [" << i << "] : " << mergeSortTimes[i] << std::endl;
    }

    outfile.close();
}

Output

Bubble sort results: 

Run [0] : 0.298  
Run [1] : 0.293  
Run [2] : 0.3  
Run [3] : 0.299  
Run [4] : 0.303  
Run [5] : 0.298  
Run [6] : 0.296  
Run [7] : 0.296  
Run [8] : 0.295  
Run [9] : 0.301    

Quick sort results: 

Run [0] : 0.004  
Run [1] : 0.004  
Run [2] : 0.004  
Run [3] : 0.004  
Run [4] : 0.004  
Run [5] : 0.004  
Run [6] : 0.004  
Run [7] : 0.004  
Run [8] : 0.004  
Run [9] : 0.004    

Merge sort results: 

Run [0] : 0.001  
Run [1] : 0.002  
Run [2] : 0.001  
Run [3] : 0.002  
Run [4] : 0.001  
Run [5] : 0.001  
Run [6] : 0.001  
Run [7] : 0.001  
Run [8] : 0.001  
Run [9] : 0.001

Thoughts

This is a very simple demonstration of how data sorting, something that at first seems trivial, can be greatly improved with a clever algorithm or two.

Everything in computer science seems to have a trade-off or edge-case, so where do these sorts fall short? Mostly when there are specific conditions met within the dataset. If the dataset is smaller than a handful (actual scientific measurement), Merge and Quick sort can actually be out performed by Bubble sort. It may seem crazy, but what is known as ‘pivot selection’ can cause uneven segmenting in small data and lead to all sorts of issues when sorting data. Since this isn’t meant to be a tutorial on how these work but rather, a demonstration, here are the Wikipedia links for the two algorithms.

[More on Merge Sort: https://en.wikipedia.org/wiki/Merge_sort]

[More on Quick Sort: https://en.wikipedia.org/wiki/Quicksort]

Distributed Computing

As computers advance, so too does our computational needs. In order to meet these needs it is imperative that we come up with techniques to maximize the speed of which we calculate problems. One way we could do this is through advances in computer hardware. Advances in hardware potentially allows us to crunch more numbers in less time, but the cost of replacing existing hardware is not always financially viable. A more suitable solution for faster calculations in many cases is found through changing how we think about the problems at hand within software. Changing the way we approach the actual calculation of the problems at hand can yield fantastic results. In computing the total run time of a set of problems can be computed in XT/K time, where X is the number of problems to compute within the set, T is the amount of time it takes to calculate each problem, and K is the numbers of computers the tasks are being distributed to. This means that if we were given 5 problems that took 2 minutes each to solve, conventional means (K = 1) would take 10 minutes to solve the whole set. One way of maximizing computational speeds is a technique known as Distributed Computing. Using this method of computing, we aim at increasing K from 1 to N where N is any positive real number greater than 1. If we were to set K = 5 in the above example, the total amount of computation time would then be 2 minutes.

My Approach

My distributed computing system (DCS) demonstrates the distributed computing technique using a QT C++ client, a Multi-Threaded Python 2.7 socket server (for multiple simultaneous client connections), as well as a MySQL database. The DCS has been created to be modular in terms of what the client can process, as-to allow a DCS framework-of sorts for easy expansion. Keeping modularity in mind, I also created the server to be easily expanded upon in that the only thing required for functionality to be added is the function call for sever-side data retrieval to an expected client-code within the request_map.py file. In its current state, my DCS is fully operational and currently functions as a run-time analyzer for sorting functions on randomly generated data. Once the data from the server has been run a set amount of times against the sorting algorithms, the run-time information is reported back to the server and stored into the MySQL database. While what is being computed (sorting) is not particularly demanding, the underlying concepts of distributed computing are laid out in full.

The Process

The cycle above depicts the generalized DCS logic flow, starting from the topmost node. The client will signal to the server that it would like a task to complete. Once the server receives this request, it selects the next task within the database and sends it to the client. The data sent to the client contains the operational data, in addition to information as it relates to the data so the client can understand how to operate on it. At this point, the server will move the sent task to an “in-process” table to ensure no other client receives the same task then closes the connection.
Once the client has the data from the server it determines what specifically must be done and initiates the process until completed. Upon completion, the client will send the report information to the server along-with the task information so the server can move the job from the “in-process” table to the “completed-table” which will contain the data, in-addition to the task information generated by the client. The client will wait for update confirmation prior to being allowed to submit another request to ensure the safe transfer of task a task from “in-process” to “completed.” In the event that a task is left in the “in-process” table for a certain period of time it can be triggered to be moved back into the “open-tasks” table.

Conclusion

When it comes to completing a set of problems in a smaller amount of time, my DCS can essentially set the K value mentioned above up-to the maximum load of the server that hosts the DCS socket server. Coupled with the modularity of the client/server applications themselves my DCS can be easily set to distribute and compute any number of different problems. If configured to do so, my DCS can cut the total time spent on calculations to a fractional amount of time that it would take for a single machine to compute.

Autonomous Drone Control System

Background:

During our senior year at Lake Superior State University we are given the opportunity to engage in a senior project to act as a capstone of our time spent as an undergraduate student. These projects typically engage students with local businesses who need software developed and allow them to get experience in all phases of designing, and implementing software. In my case, I was given the unique opportunity to participate in computer vision research with my fellow student Philip Graziani.

Project:

Instead of working with a local business, Mr Graziani and I worked under Dr. Christopher Smith’s guidance on a visual tracking and automation research project. Our task was to automate an arial vehicle to track any given selected object using an active deformable model. This model was derived from Dr. Smith, and Doug P. Perrin’s research on active contour models1. Dr. Smith provided the arial vehicle, and all necessary hardware for development, with no control software created. Existing code consisted of an OpenCV2 application that could overlay an active contour model onto webcam input.

Approach:

All seemingly complex problems can be boiled down to a grouping of simple problems. After analyzing what was before us, we were able to boil down our project to three separate tasks. The first and foremost being the projection of the active model’s movements in two dimensional space (movements on camera) to three dimensional space (approximation of real-world movements.) Secondly, we would need a way to control the arial vehicle programmatically. There was no available software development kit (SDK) available for us to use, so we had to get creative. The vehicle in-use came with ‘Ground Station’ software that enabled a user to control the vehicle with a computer interface instead of the standard remote control. We decided to construct our software around the idea of mimicking user control into this ‘Ground Station.’ Finally our third task, tie the two separate code bases together such-that the movements of the active model would be translated directly into vehicle controls.

Obstacles:

Each of the three tasks that we broke the project into had significant obstacles to overcome. With our first task, we had to ensure that we could account for ‘noisy’ environments. With a wireless camera feed and potentially volatile lighting conditions it was imperative that we create methods of ensuring that our model wouldn’t collapse. Due to the automated nature, a collapsed model could potentially be disastrous, and a danger to people near the vehicle. Using a Kalman3 filter we were able to discern when bogus frames were being fed into the system (that-is images with a high degree of noise.)

Controlling the vehicle presented an interesting challenge. We were able to find a free joystick driver ‘Virtual Joystick’ (vJoy) 4 that allowed us to programmatically simulate a joystick in the arial vehicle’s ‘Ground Station’ software. Essentially, vJoy enabled us to mimic a human pilot. While the premise sounds simple enough, we had to go through a multitude of boundary cases that could occur to ensure safe endings to flights if something goes wrong. In-addition to safety checks, we implemented a series of flight tests to ensure the accuracy of our programming before tying control to the deformable model.

Translating the movements detected by the deformable model into arial vehicle controls required the construction of a proportional control law5. This law allowed us to correspond movements detected by the deformable model to the proper position of a joy stick such-that the location of our vehicle maintained the same distance from our target object at any given time. During this stage it was imperative that we include yet another set of safety checks that would ensure rapid changes in our deformable model didn’t trigger rapid stick adjustments.

Testing:

As with any software developed, testing was key. Every single unit developed was tested in a simulated environment multiple times before live test flights. Even though we did multiple simulated flights, it would seem that reality was much different. We crashed our vehicle a multitude of times, and at one point the vehicle chased down a target with extreme prejudice; thankfully we tethered down the vehicle and nobody was injured.

References and links:

  1. Christopher E. Smith, Doug P. Perrin. Rethinking Classical Internal Forces for Active Contour Models 
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.17.9315&rep=rep1&type=pdf

  2. OpenCV. Open Computer Vision. Designed for computational efficiency and with a strong focus on real-time applications.
    https://opencv.org/

  3. Kalman Filter. An algorithm for estimating unknown variables.
    https://en.wikipedia.org/wiki/Kalman_filter

  4. Virtual Joystick. A device driver that bridges the gap between any device that is not a joystick and and application that requires a joystick.
    http://vjoystick.sourceforge.net/site/

  5. Proportional Control Law.
    https://en.wikipedia.org/wiki/Proportional_control

FizzBuzz

I’ve seen the FizzBuzz test come up at a couple of programming competitions, and I hear that it is an interview favorite amongst some employers. The test is easy, and allows an employer to see your coding style. While no style is shared by all programmers it is important to see if the candidate can at-least remain consistent.

So what is the FizzBuzz test?

The test can come in a multitude of varieties, the most common that I have seen is described as:
A program that counts from 1 to 100. When a number is divisible by 3, only print ‘Fizz.’ If the number is divisible by 5, only print ‘Buzz.’ If the number is divisible by 3 and 5, print ‘FizzBuzz.’ If the number is neither of the aforementioned cases, print the number.

It may seem somewhat difficult if you’re not familiar with the modulus operator (%), so we’ll get everyone caught up to speed. If we use the code cout << 9 % 2 << endl the console will output 1.This is because when we divided the integer 9, by 2, we get a remainder of 1.

We use the modulus operator in the FizzBuzz test to tell if the 3, and/or the 5, fit evenly into a given number by ensuring there isn’t a remainder.

if (0 == 6 % 3) will evaluate to TRUE, as 6 is evenly divided by 3.

Now that the modulus operator is in our tool box we can start laying out what we need to do. Looking at the description, we can tell that the first thing that is necessary is a basic loop. The loop will run from 1 to 100, and we will need to check every single number for divisions of 3 and 5.

#include <iostream>
using namespace std;
int main()
{
    cout << "[START]" << endl;
    for(int i = 1; i <= 100; i++)
    {
            cout << i;

            if( 0 == i%3)
                    cout << "Divisible by 3!";
            if( 0 == i%5)
                    cout << "Divisible by 5!";
            cout << endl;
    }
    return 0;
}

Seems like we’re almost done already, doesn’t it ? Thats because we almost are. What needs to be done now is changing our output statements to their respective ‘Fizz’ or ‘Buzz,’ and handle the combined case of ‘FizzBuzz.’

#include <iostream>
using namespace std;
int main()
{
    cout << "[START]" << endl;
    for(int i = 1; i <= 100; i++)
    {
            // If i is evenly divisible by 3
            if( 0 == i%3)
            { 
                 /*
                       Check if it is 3 and 5, 
                       or if it is just 3.
                 */
                 if( 0 == i%5)
                     cout << "FizzBuzz";
                 else
                    cout << "Fizz";
            }
            else if( 0 == i%5)
            {
                 /*
                       We chain the previous if with 
                       an else-if to ensure we don't 
                       print something silly like 
                       'FizzBuzzBuzz.'
                 */ 
                 cout << "Buzz";
            }
            else
            {
                 /*
                       Finally, if it isn't divisible by 3, 
                       and/or 5, we print the integer.
                 */ 
                    cout << i;
            }
            // End the line for readability 
            cout << endl;
    }
    return 0;
}

The first few lines of our output from the above code will be:

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz 

It really is interesting to see the different ways people choose to accomplish this FizzBuzz test. Sometimes they choose to clean-up the output to something different, and sometimes they function-out the modulus operations (I don’t really know why though.) The important thing to take out of how someone completes the FizzBuzz test is their understanding of controlling program flow.

I really enjoyed doing the FizzBuzz test when I first heard about it, and once I completed it, I re-wrote it a few times to see what other ways it could be solved.

Here is a slightly different solution, that uses an extra modulus to format the output :

#include <iostream>

using namespace std;

int main()
{
    cout << "[START]" << endl;

    unsigned f = 1;
    for(int i = 1; i <= 100; i++)
    {
        if(0 == i%3)
        {
                if(0 == i%5)
                        cout << "Fizz-Buzz, ";
                else
                        cout << "Fizz, ";
                f = 0;
        }
        else if(0 == i%5)
        {
                cout << "Buzz, ";
                f = 0;
        }

        if(0 == i % 10)
                cout << endl;
        if( f )
                cout << i << ", ";
        f = 1;
    }
    cout << "[END]" << endl;

    return 0;
} 

Console Output :

[START]
1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 
11, Fizz, 13, 14, Fizz-Buzz, 16, 17, Fizz, 19, Buzz, 
Fizz, 22, 23, Fizz, Buzz, 26, Fizz, 28, 29, Fizz-Buzz, 
31, 32, Fizz, 34, Buzz, Fizz, 37, 38, Fizz, Buzz, 
41, Fizz, 43, 44, Fizz-Buzz, 46, 47, Fizz, 49, Buzz, 
Fizz, 52, 53, Fizz, Buzz, 56, Fizz, 58, 59, Fizz-Buzz, 
61, 62, Fizz, 64, Buzz, Fizz, 67, 68, Fizz, Buzz, 
71, Fizz, 73, 74, Fizz-Buzz, 76, 77, Fizz, 79, Buzz, 
Fizz, 82, 83, Fizz, Buzz, 86, Fizz, 88, 89, Fizz-Buzz, 
91, 92, Fizz, 94, Buzz, Fizz, 97, 98, Fizz, Buzz, 
[END]

I encourage you to look for other ways to do the FizzBuzz test, and to look for any alterations to the rules that could further demonstrate the concept of control flow!

UNAS Basic Interpreter

Introduction

While at Lake Superior State University in the Computer Science program I was put through the Computer Organization and Architecture class. The primary function of this class was to give us a firm understanding of computers as a logical concept, and then to get our hands dirty with assembly. My professor has developed his own reduced instruction set computer (RISC) language, simply called “Unnamed Assembly Language” or “UNAS” for short. UNAS runs on a virtual processor, and while unconventional in that the language has 60 registers (a few reserved) is a fantastic transitional language when stepping down to assembly from C++. The class was riddled with programming assignments that were either solely mathematical in nature, or stress tested our understanding of stacks and main memory. I loved programming in UNAS, and when it was time to declare a final project I knew I wanted to take my learning to the next level. We got to pick from a list of projects, and the most challenging was to write an “UNAS BASIC Interpreter.” We never actually covered Interpreters, or compilers in this class. This is why I viewed it as such a challenge.

Notice: If you found this page looking for help on an UNAS assignment or project, Dr. Schemm is aware that I’ve posted this Interpreter and will not hesitate to fail you if he suspects plagiarism.

The Approach

Having never attempted a project like this I decided to break down the idea into a simple logic flow, and tried to maintain simplicity as unknown variables came into the picture. The first step was to analyze the input file, and assess the necessary size of the stack. This was a simple feat as UNAS has a fixed length instruction-set, each being 4 bytes. I essentially incremented a counter upon scanning one of the following BASIC instructions: LET, PRINT, GOTO, IF, and FOR. The assumption is that each one of the instructions in BASIC would have one variable needed to be stored into the stack. Once analyzing the file yielded the memory usage, the next step was to do the full lexical analysis, and parsing. For this I used a basic Flex (lex) and Bison (yacc) setup that passed the data tied to tokens to a C++ class (NUBasic) for code generation. For each of the BASIC statements that could be encountered, there was a matching NUBasic method.

BASIC                    NUBasic
Token                    Method
LET                      assignVariable()
PRINT                    doPrint()
GOTO                     doGoto()
IF                       doIF()
FOR                      doForLoop()
NEXT                     doFLNext()

Some of the code generation was fairly simple as the code was essentially 1-to-1 or 1-to-2, that is to say that for every one line of some BASIC code would translate nicely to one or two lines of UNAS. Some methods on the other hand would not be so easily written. A prime example would be in the instance of declaring a new variable set to a complex mixed expression such as: LET var1 = 5 * (A – 2) – 8 + C/D. In order to preserve the rules of operations I had to convert the infix notation to postfix conversion for the UNAS calculation to take place correctly (More on this in the section: Infix to Postfix conversion.) While writing each method I discovered many small tasks to keep in mind. The following is a write-up of the methods should give a clear view of the NUBasic class.

NUBasic Method Functionality

doMemoryEvaluation()

Called before parsing to evaluate input file’s memory requirements. This is necessary to determine stack size required.

assignVariable()

Called to create a variable as-per statement demands. Calls private class helper analyzeDataType() to determine if it should send statement to:

constructString(), constructInt(), or constructFloat()

constructString()

If the input was an expression, the first thing done is string concatenation to create a new string. Once complete or if it is a single string passed in, generate an UNAS .string label, and call private method setVariable().

constructInt()

If the input was an expression, convert to postfix by use of private class helper postFix(). When completed, or if it was a single integer passed in, call private method setVariable().

constructFloat()

If the input was an expression, convert to postfix by use of private class helper postFix().

It it was a single float, or an expression a .float label is created for each new float item before the call to private method setVariable()

setVariable()

As the primary code generator, setVariable() has many duties. It determines if the current action is making a new variable, or updating an old variable. If it is updating a variable, it locates the variable’s location in the stack by way of the variable structure’s mi item (memory index). If it is creating a new item, it handles memory assignment based on the current memory index (cmi). Once the type of operation is determined, and the code is generated for their respective load operations, it determines if a postfix calculation is required. If a calculation is not required, the item is stored by way of stf, or stw into the necessary stack location. However, if a calculation is required, a queue is built with expression items, and code is generated to calculate the items using registers starting at r10, or f10 depending on data type by way of a stack routine (More on this in ‘Calculating and translating postfix notation’). If a calculation takes place, this method will retrieve the resulting value from the stack, and create a stw, or stf operation to store it in its stack location.

doPrint()

The print method starts off by creating a variable for its input information by way of assignVariable(). When creating the variable, reserved words are used to ensure that other information isn’t accidentally overwritten. Once the variable is created, the variable is copied into its respective print register (r50, r59, or f59) and their respective function call to print is generated. The variable created for the print operation is then deleted.

doGoto()

The doGoto() method creates a unique label to jump to, some variant of JBGTN – Where ‘N’ is the a stringed value of a private counter. Once created, doGoto() passes its information to private class helper insertAtIndex() which handles the insertion of unique labels into UNAS code. The code passed to insertAtIndex() is stored in a vector until END is reached, at which point it inserts all labels at their appropriate place in the UNAS output code.

doIF()

If the evaluated rhs, or lhs of the relational is a variable, or if is an integer, code is generated to load the item into r5, or r6 for the UNAS equivalent of given relational, and a unique label is generated for the location in UNAS code that the statement will branch to. Once the label is generated, it and the corresponding information is sent to insertAtIndex() for later insertion.

doForLoop()

Create code for the given loop var, or for the loading of its existing memory index, as-well-as a unique for loop label. After loading variable information, the limit (int given after TO) is loaded into a register based on the private counter forLoopIndex. Theis method places its own label within UNAS code, as where the for loop is called is where it will jump to. Once the loads, and label are placed into the UNAS code doForLoop() is complete.

doFLNext()

Expected after a for loop, doFLNext() accompanies the last doForLoop() call seen, thus it handles the relational testing between given variable and limit and generates the code for deciding if a jump to the current for loop’s label is necessary.

doRead()

Given in a variable, doRead() bases the input method on the current value of variable. A simple method, all doRead() really does is generates UNAS code for input call, and the storing of its result to the variable memory index.

Note: This method was one of the last written, so it varies slightly from true BASIC usage. In NUBASIC, the variable to be read in-to must be declared prior to the statement to ensure that the correct value type can be read in.

Infix to Postfix conversion

Translating an expressions from infix to postfix was a necessary step to easily calculate large expressions in UNAS. The translation was done via implementing a simple shuntyard algorithm. Once the postfix expression is created, the code generation process can begin.

Tracking register use in code generation

Given a postfix expression: 10 2 +

Starting at register 10 (r10, or f10) currentRegister

Using an integer stack registerLocations

Iterate through each item in expression (10, 2, +)

If the current item is not an operator, generate the code to load the item (var or int/float) into the currentRegister, and then push currentRegister to registerLocations, and increment currentRegister. Do this until an operator is found ( ^, *, /, +, – ).

If the current item is an operator pop the last two items to rhs, and lhs integers from registerLocations, and decrement currentRegister by 2 (this is to reuse old registers so we don’t burn through all registers with large expressions). Now that lhs, and rhs have been gotten from the stack, the corresponding UNAS code for the operation is generated based on int or float values and inserted into the output file.

(“addf currentRegister, lhs, rhs”). Once the code generation is complete, push currentRegister onto stack as the result stored from this operation is now a value we may need to use in the next calculation. Increment currentRegister, and continue.

Note : UNAS does not have a supporting call for the ‘^’ operator. In the case that this item is evaluated, the code for ^ is added to the output by way of loadSupportingCode(). Once loaded, the values of lhs, and rhs are stored in registers for the power code, and the result copied to currentRegister.

Once each item is evaluated, the last item is popped off of the stack and stored in the stack at its memory index.

This method of handling calculations has been thoroughly tested, and can easily be hand traced to ensure accuracy.

Project Conclusion

I’m quite happy with how this project turned out. Looking back I realize the grammar that I implemented could be refined, and in some cases my code generation could be simplified; this wasn’t a perfect implementation it was a learning experience.

Joshbosley.com