Sunday, December 29, 2013

The Elements of Language

Charles H. Moore is one of my all-time computer heroes. He has to be up there amongst Kernighan, Ritchie and Thompson for his contribution to computer languages.

In the 1960s, computer programmer Charles H. Moore developed a new programming language that he called FORTH.

He was convinced that there must be an easier and more productive way to program and interact with a computer, and he wanted to create the language tools to do it. Computers at that time were generally mainframes, which required the vast resources of assembler, compiler and linker, plus the costly time of specialist computer operators, just to generate a program and run it. Moore wanted to develop a self-contained programming environment, over which he had complete control, that would allow him to become a more productive programmer. Better programming efficiency would mean less money spent on expensive mainframe time.

Moore went on to port FORTH to many different computer systems, and it was also adopted by the homebrew computer community. In the 1980s, Moore turned his attention to using FORTH as a machine language, executed directly by the microcontroller. He wrote CAD software in FORTH to help him design new ICs, and now in his mid-70s he is still working on state-of-the-art multi-processor arrays which execute a dialect of FORTH.

It is in this spirit that I have spent the last few days tinkering with SIMPL and have borrowed some of the ideas from FORTH. I make no claims that it is a fully fledged programming language, but it does contain all the elements which allow simple programs to be written and run, edited and re-run from a serial terminal, without having to constantly edit and re-compile C code in order to make changes. It allows the user a greater level of interaction with the program, so that ideas can quickly be tried out and changes made with just a few keystrokes.

SIMPL is based on a tiny interpreter called txtzyme written by Ward Cunningham.  Ward initially wrote txtzyme to run on the Arduino, but as it is written in bog-standard C, it can easily be ported across to other micros. During this Christmas week, I have ported my version of SIMPL across to a STM32F4xx ARM Discovery board.

The original Arduino version of txtzyme only has 9 commands, and allows access to basic input, output and looping and timing functions.  The core interpreter is only 90 lines of C code - taking up about 3.6K when compiled for an Arduino. The language is easily extended, as we shall see later.

As stated above, Moore wanted to develop a self contained, comfortable programming environment which involved less typing and a more direct involvement in the program application, rather than the mechanics of compiling and assembling it.

In the same way, SIMPL allows you to perform simple I/O operations directly from the serial terminal, without having to go through the edit-compile-upload cycle every time you wish to make a slight change. Whilst this interactive serial interface makes it easier to experiment directly with the hardware from a serial terminal, the simple character interpreter at the core of txtzyme has much more powerful tricks up its sleeve.

By way of illustration, I'd like to mention at this stage Java and the Java virtual machine. Basically any computer, be it PC, Mac or Linux, can be programmed to emulate the Java virtual machine. By sending this virtual machine programs in Java bytecode, any platform can be made to run the same software. Whilst this may be an oversimplification of Java, it serves to illustrate how vastly differing platforms can be made to run the same programs.

So now apply this to a range of microcontroller hardware platforms.  Give them the means to interpret a common language, and then you can get them to run the same programs. If this language is an efficient character based language like txtzyme, then programs can be conveyed to the hardware devices using strings of just a few characters, served from a browser interface.  We now have a common language to implement the internet of things, small enough to run on the simplest of 8-bit microcontrollers for the most trivial of applications.

A remote device that is executing one particular program, could easily be reprogrammed with a text string of just a few characters, to execute an entirely different program. This text string could come from a server directly, or via SMS or Twitter or be embedded in a web page or wiki document.

Here, Ward Cunningham explains this concept in a short video

Txtzyme is not only a simple language for microcontrollers to interpret, it is also easy for humans to read, write and understand. With each character representing an entire function, a lot can be written in a very short text string.

Take the classic Blinking LED program - say from Arduino, to flash an LED on pin 13, 10 times, for 500 ms on and 500 ms off. Here it is in C

for (int i = 0; i < 10; i++)    // repeat 10 times
{
  digitalWrite(13, HIGH);       // LED on
  delay(500);
  digitalWrite(13, LOW);        // LED off
  delay(500);
}

In txtzyme this becomes just 16 characters:

10{1o500m0o500m}

If you also want it to read ADC channel 5 and print the value to the terminal, the txtzyme is now just three characters longer

10{1o500m0o500m5sp}

In C, you would have had to add the following line, recompile and upload again

Serial.println(analogRead(5));

So, to sum up, it's easy to implement the txtzyme interpreter on any micro. It's already available for Arduino, Teensy 1 and 2 and now the STM32F4 Discovery board.  Once a micro can interpret txtzyme strings it may be controlled from any browser or from a Wiki, using the txtzyme plug-in.

To be continued.......




Saturday, December 28, 2013

SIMPL on the STM32F4xx Discovery Board

Back in the summer, I posted some musings about a very small programming language that I named SIMPL - Serial Interpreted Micro Programming Language. It's an extended version of Ward Cunningham's Txtzyme, described as a Nano Interpreter - which is documented here.

Well over the period of the Christmas break, I have had the opportunity to port SIMPL, which is written in fairly standard C, across to the STM32F4 ARM Discovery board. It is now working to the point where I can program simple I/O operations and loops on the Discovery board.

SIMPL is fast on the STM32F4xx: you can create a pulse as short as 200 ns, or toggle a pin on and off at 372 kHz. By comparison, on the Arduino (using digitalWrite) the shortest pulse is 10 µs and the maximum frequency is 47.8 kHz, though better could be achieved by direct port manipulation.

Ward Cunningham has ported txtzyme to the Teensy board, as a means of controlling hardware over a USB connection, and from a remote server. Ward is using txtzyme to control hardware remotely as part of his Federated Wiki project. A video explaining the project with some demos is here, and well worth a watch for some interesting background.

I've chosen to follow a slightly different route to Ward's project, to use SIMPL more as a programming language, but as the txtzyme interpreter is common to both the Teensy and my ARM implementation, then either hardware can interpret txtzyme strings.

Although SIMPL started off life on an Arduino, it seemed a natural choice to port over to the ARM Discovery board. The additional RAM and program space allow far more elaborate programs to be written, and as the ARM runs at roughly 10 times the clock speed of the Arduino Uno, the code is blisteringly quick.

You might recall that SIMPL is based on an interpreter, which interprets serial character strings and decodes them into functions which are executed in turn.

SIMPL generally uses a mix of small characters and punctuation marks as the primitive command set, and then a capital letter can be used in a similar way to a Forth word, to execute a sequence of commands.

Small characters, maths symbols and punctuation marks are used as the primitives. When encountered by the interpreter, they are used to call a function directly, such as p to print a character, or h to set a port line high. The maths symbols are used, obviously, to allow maths operations to be executed, such as + - * /, and tests such as < and >. The overall simplicity of using a single ASCII character as a command means that the entire interpreter can be coded as a simple switch-case statement in C. For example when h is encountered:

      case 'h':                               // set bit high
        digitalWrite(x, HIGH);
        break;

Where the interpreter encounters a number character 0 to 9, these are scanned in, and accumulated into variable x with the correct decimal place shift, for as long as the next character in the buffer is a number. The following few lines of C perform this number interpretation

      x = ch - '0';
      while (*buf >= '0' && *buf <= '9') {
        x = x*10 + (*buf++ - '0');
      }
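To see how the number scan and the switch-case dispatch fit together, here is a minimal sketch that runs on a PC rather than on a microcontroller. The function name interpret and the pin_state array are assumptions made for illustration (the array stands in for digitalWrite), so this is not the actual SIMPL source, just the same idea in a self-contained form.

```c
#define NUM_PINS 16

/* Hypothetical stand-in for the port hardware, so the sketch runs on a PC. */
static int pin_state[NUM_PINS];

/* Interpret a NUL-terminated command string one character at a time.
   Returns the final value of the accumulator x. */
unsigned int interpret(const char *buf)
{
    unsigned int x = 0;   /* accumulator shared by all commands */
    char ch;

    while ((ch = *buf++) != '\0') {
        switch (ch) {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            /* first digit, then keep accumulating while digits follow */
            x = (unsigned int)(ch - '0');
            while (*buf >= '0' && *buf <= '9') {
                x = x * 10 + (unsigned int)(*buf++ - '0');
            }
            break;
        case 'h':                       /* set pin x high */
            if (x < NUM_PINS) pin_state[x] = 1;
            break;
        case 'l':                       /* set pin x low */
            if (x < NUM_PINS) pin_state[x] = 0;
            break;
        default:                        /* ignore anything unknown */
            break;
        }
    }
    return x;
}
```

Calling interpret("13h") in this sketch leaves x at 13 and marks pin 13 as high; on real hardware the two pin_state assignments would become digitalWrite calls.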

Punctuation marks are used to help program flow. For example the open and close brace structure defines a portion of code to be repeated as in a loop. Suppose we want to loop some code 10 times, for example print 10 Green Bottles.  The Green Bottles will be printed if it is written thus _Green Bottles_ and the loop structure uses a loop index k, which is decremented each time around the loop. kp will print the current value of k, each time decrementing.

10{kp_Green Bottles_}

Unfortunately p also introduces a newline character, so the output is not quite as desired :-(
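The loop mechanism can be sketched in a few lines of host-side C. Everything here is an assumption made for illustration: the function name run_loop_demo, the out buffer that captures what would go to the terminal, and the exact counting rule (the body runs with k = n, n-1, ... 1, and nested braces are not handled). The real SIMPL and txtzyme code may count differently.

```c
#include <stdio.h>
#include <string.h>

static char out[512];               /* captured "terminal" output (hypothetical) */

static void emit(const char *s)
{
    strncat(out, s, sizeof out - strlen(out) - 1);
}

/* Runs a program made of numbers, { }, k, p and _text_; returns the output. */
const char *run_loop_demo(const char *buf)
{
    unsigned int x = 0;             /* current value */
    unsigned int k = 0;             /* loop counter */
    const char *loop = buf;         /* start of the loop body */
    char ch;
    char tmp[16];

    out[0] = '\0';
    while ((ch = *buf++) != '\0') {
        if (ch >= '0' && ch <= '9') {
            x = (unsigned int)(ch - '0');
            while (*buf >= '0' && *buf <= '9') {
                x = x * 10 + (unsigned int)(*buf++ - '0');
            }
        } else if (ch == '{') {
            k = x;                  /* loop count comes from x */
            loop = buf;             /* remember where the body starts */
            if (k == 0) {           /* zero iterations: skip the body */
                while (*buf != '\0' && *buf != '}') buf++;
            }
        } else if (ch == '}') {
            if (k > 0 && --k > 0) buf = loop;   /* go round again */
        } else if (ch == 'k') {
            x = k;                  /* fetch the loop index */
        } else if (ch == 'p') {
            snprintf(tmp, sizeof tmp, "%u\n", x);   /* number plus newline */
            emit(tmp);
        } else if (ch == '_') {     /* copy text up to the closing underscore */
            while (*buf != '\0' && *buf != '_') {
                char c[2] = { *buf++, '\0' };
                emit(c);
            }
            if (*buf == '_') buf++;
        }
    }
    return out;
}
```

Because p appends its own newline in this sketch, each number lands at the end of one line and the text starts the next, which is the same cosmetic problem described above.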

Words

In this interpretation scheme, I have reserved the upper case letters to represent "words" - in the Forth sense. In order to maintain the same simplicity in the interpreter, single characters are decoded as calls to the code stored in the word definition. This may sound confusing, so here is an example.

Suppose I liked the "Green Bottles" code so much, that I wanted to preserve it and use it again, and again. Well I can do this by making it into the form of a colon definition (again borrowed from Forth), such that I can assign it to a memorable word, say B for bottles.

The colon : tells the interpreter that the next character will be a word, followed by the code that will be executed when the word is called

:B 10{kp_Green Bottles_}

This will place the code 10{kp_Green Bottles_} at a location in memory that can be accessed every time the interpreter sees the letter B. Instead of interpreting the contents of the keyboard buffer, the interpreter is now pointed to a word buffer, containing the code body that makes up B.

This process can be extended by adding together words to make new words

:A   10{B}

This defines A as ten iterations of B, or about 100 Green Bottles!

The word definitions are stored in a RAM array, and the ? command can be used as a means of listing all the current word definitions.
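One possible layout for that RAM array is a fixed slot per capital letter. The sketch below is an assumption for illustration only: the names define_word, word_body and list_words, and the 48-character limit per definition, are hypothetical and not taken from the SIMPL source.

```c
#include <stdio.h>
#include <string.h>

#define WORD_COUNT 26
#define WORD_LEN   48               /* hypothetical maximum body length */

static char words[WORD_COUNT][WORD_LEN];    /* one slot per letter A..Z */

/* Store a colon definition such as ":B 10{kp_Green Bottles_}".
   Returns 1 on success, 0 if the line is not a usable definition. */
int define_word(const char *line)
{
    if (line[0] != ':' || line[1] < 'A' || line[1] > 'Z') return 0;

    char name = line[1];
    const char *body = line + 2;
    while (*body == ' ') body++;            /* skip spaces after the name */
    if (strlen(body) >= WORD_LEN) return 0; /* too long for the slot */

    strcpy(words[name - 'A'], body);
    return 1;
}

/* Return the stored code for a word, or an empty string if none. */
const char *word_body(char name)
{
    if (name < 'A' || name > 'Z') return "";
    return words[name - 'A'];
}

/* The ? command: list every word that currently has a definition. */
void list_words(void)
{
    for (int i = 0; i < WORD_COUNT; i++) {
        if (words[i][0] != '\0') {
            printf(":%c %s\n", 'A' + i, words[i]);
        }
    }
}
```

With this layout, executing a word is just a matter of pointing the interpreter at word_body(ch) when it meets a capital letter, for instance by calling the interpreter function recursively on that string.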

SIMPL is now starting to take the form of a very concise language.  It can handle numbers, character commands from the terminal and perform simple maths. It can perform loops and output numbers and text to the terminal. The next thing is to allow it to access the peripherals and I/O.

SIMPL has been developed to be "close to the hardware".  As almost all microcontrollers have a mix of on chip peripherals, such as timers, ADCs, DACs and general purpose I/O, SIMPL has primitive commands designed to exercise this hardware directly.

For example, suppose you have an output on Port D13 controlling an LED, then the command 13h will turn the LED on and 13l will turn it off, h and l being the commands to set the named port line, 13, either high or low.

For analogue input from an ADC channel,  the command s for "sample" is used. So to read the ADC channel 10 and print to the terminal, we enter 10sp.

Commands can be repeated within loops using the {  } loop structure.

To read the ADC channel 10, ten times and print the result to the terminal, we use:

10{10sp}

Or, to slow this down to once a second, we can add the millisecond delay command m, which here puts a 1000 ms pause in the loop.

10{10sp1000m}

We can now put a few of these commands together so as to flash the LED for 100 ms each time we take an ADC reading

10{10sp900m13h100m13l}

Note that we are still looping 10 times,
reading ADC channel 10 and printing it out,
pausing for 900 ms,
setting LED 13 high,
pausing for 100 ms,
setting the LED low,
returning to the start of the loop.

So it's quite easy to build up complex commands, just by concatenating a few primitives together.

What if you were to write this in C? It would be somewhat longer, about 8 to 10 lines of code, and then need to be compiled, tested, and if you weren't happy, edited and compiled again. SIMPL breaks us out of the edit, compile, test cycle, and allows ideas to be tested quickly and easily, straight from the terminal.
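For comparison, here is roughly what 10{10sp900m13h100m13l} looks like as a C loop. To keep the sketch runnable on a PC, analogRead, digitalWrite and delay are replaced by hypothetical stand-ins that just record what happened (the ADC reading of 512 is a made-up placeholder); on an Arduino the loop body would use the real calls and Serial.println instead of printf.

```c
#include <stdio.h>

/* Hypothetical host-side stand-ins for the Arduino calls. */
static unsigned long elapsed_ms;    /* total time "spent" in delay() */
static int led_state;               /* last level written to the LED */
static int readings_taken;          /* number of ADC reads performed */

static int analogRead(int channel)
{
    (void)channel;
    readings_taken++;
    return 512;                     /* placeholder reading */
}

static void digitalWrite(int pin, int value)
{
    (void)pin;
    led_state = value;
}

static void delay(unsigned long ms)
{
    elapsed_ms += ms;
}

/* The C equivalent of 10{10sp900m13h100m13l} */
void sample_and_flash(void)
{
    for (int i = 0; i < 10; i++) {
        printf("%d\n", analogRead(10));   /* 10sp  */
        delay(900);                       /* 900m  */
        digitalWrite(13, 1);              /* 13h   */
        delay(100);                       /* 100m  */
        digitalWrite(13, 0);              /* 13l   */
    }
}
```

The loop itself is only eight lines, but every change to it means another trip through the compiler and uploader.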

Forth language pioneer Charles Moore saw the same problem in the mid-1960s. He too wanted to break free from the edit-compile-test cycle, both to improve his efficiency in coding, as computer time was expensive back then, and to stop relying on compilers and assemblers that he had no control over. He wanted to develop a self-written, self-contained programming environment that could easily be moved from one system to another, and that needed only a few simple resources to get it running.

If you are interested in the early history of Forth, Charles "Chuck" Moore describes the early development of the language here:

History of Forth

Whilst SIMPL is not Forth, and never will be, there are several nice techniques borrowed from Forth, which are used to enhance SIMPL.

We have come a long way since the early development of the computer languages, such as C and Forth, which were often hosted on very primitive machines, with limited RAM and tiny disks by today's standards.

One early machine, the PDP-8, had a simple enough architecture that it has often formed the basis of study in computer science courses. Additionally, the PDP-8 has been implemented in many different technologies over the years, including TTL, VLSI and as an FPGA design written in Verilog or VHDL.

The reason for interest in the PDP-8 is that, whilst primitive by today's standards, it represented a revolution in architectural simplification, such that the whole system could be sold for $18,000 back in 1965. It was the first of the true minicomputers, available at a tenth of the cost of competing systems.

Additionally, the PDP-8 is comparable in resources to a low cost microcontroller costing a few dollars. So the practices used in the 1960s to write code for the early minicomputers have significant relevance these days.

Whilst generally we use open source C compilers, such as GCC, and integrated design environments (IDEs) to develop programs, there is no reason why a simple interpreted language, running in the background would not make sense, when developing applications on a new microcontroller. It puts the control of the hardware, directly at your fingertips and allows quick experimentation.

Modern 32-bit microcontrollers, such as the ARM range of cores, are now rapidly replacing 8-bit devices in a range of applications, at very little extra cost. The resources and peripherals available on a typical ARM device are vast compared to 8-bit processors, and with clock speeds roughly ten times faster, a huge amount of processing power is available, compared to only a few years ago.

With clock speeds in the 100-200MHz range, it's now perfectly possible to host an interpreted language on the microcontroller, and have it run at speeds only previously available through the compiled language route.

SIMPL consists of an interpreter running within a loop.  The interpreter is written in about 300 lines of standard C, and additionally includes several I/O and memory functions, which are tailored to the particular microcontroller, to create a standardised machine model.

To the Arduino user, these I/O routines will be familiar:

digitalRead          Read the state of a digital input
digitalWrite         Write a digital output high or low
analogRead           Read the value of an ADC channel
analogWrite          Write a value to an analogue PWM or DAC channel
delayMilliseconds    Generate a delay in ms
delayMicroseconds    Generate a delay in µs
printNum             Print an integer number to the terminal
putChar              Print an ASCII character to the terminal
getChar              Read an ASCII character from the keyboard or terminal input buffer

All microcontrollers for embedded applications should have the hardware means to execute these routines, and although setting up the I/O and peripherals may take a bit of time on an unfamiliar hardware device, once done the basic routines will be used frequently in any application that may be developed.

delayMilliseconds can simply be derived from 1000 times around the delayMicroseconds loop, or can be derived from a hardware timer, depending on the device.
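As a minimal sketch of the first option, each millisecond can be built from one 1000 µs call to delayMicroseconds. The delayMicroseconds body shown here is a hypothetical stub that only adds up the requested time so the example can run on a PC; a real port would busy-wait or poll a hardware timer instead.

```c
/* Hypothetical stub: records the requested time instead of waiting. */
static unsigned long total_us;

void delayMicroseconds(unsigned int us)
{
    total_us += us;
}

/* One millisecond is 1000 microseconds, repeated ms times. */
void delayMilliseconds(unsigned int ms)
{
    while (ms--) {
        delayMicroseconds(1000);
    }
}
```

Calling delayMicroseconds once per millisecond, rather than once with ms * 1000, keeps each argument small, which matters if the microsecond routine only accepts a 16-bit count.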

Whilst putChar would normally send a character to the UART transmit buffer, it might instead be used to bit-bang an output pin, if no UART is available. Fortunately most microcontrollers have one or more UARTs available these days, so bit-banged serial comms is less of a requirement.

printNum is just a convenient means of getting integer numerical output from the microcontroller. It will use the C routine "integer to ASCII" and putChar to send an integer to the terminal.
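A minimal sketch of printNum built on putChar might look like this. Since itoa is not part of standard C, the sketch does the integer to ASCII conversion by hand. The putChar shown is a hypothetical stand-in that appends to a buffer named term, so the result can be inspected on a PC; on the microcontroller it would write to the UART.

```c
#include <string.h>

/* Hypothetical stand-in for the UART: characters collect in term[]. */
static char term[64];
static size_t term_len;

void putChar(char c)
{
    if (term_len < sizeof term - 1) {
        term[term_len++] = c;
        term[term_len] = '\0';
    }
}

/* Print a signed integer in decimal using only putChar. */
void printNum(long n)
{
    char digits[24];
    int i = 0;
    /* work on the magnitude as unsigned so the most negative value is safe */
    unsigned long u = (n < 0) ? 0UL - (unsigned long)n : (unsigned long)n;

    if (n < 0) putChar('-');
    do {                            /* digits come out least significant first */
        digits[i++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (i > 0) {                 /* so send them in reverse order */
        putChar(digits[--i]);
    }
}
```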

getChar is intended to read in a character at a time either directly from the keyboard or from the terminal input buffer.

With these routines, you can now interface to a serial terminal, read and write to I/O lines, read analogue inputs and control PWM or DAC outputs.  With the ability to specify accurate delays to give a sense of timing to the program flow, there is little else that the micro needs to do, except perhaps for memory operations.

Thanks for the Memory

In bygone days, when microcontrollers had very little memory, it was easy to get obsessed about the placement of code and data within memory. Memory was a precious commodity, and it was necessary to squeeze your program and your data into what little memory was available. Every byte was precious, so clever tricks were devised to pack data into as few bytes as possible.

These days, this is not so much of a problem.  The STM32F407 has a 1Mbyte flash for program and storage of constant data, and 196Kbytes of SRAM, 4Kbytes of which can be battery backed and made non-volatile. Other STM32F4xx family members have 2Mbyte of flash and 256Kbytes of SRAM. This might not seem much by PC standards, but for an embedded application it is more than plenty.

To be continued.








Friday, December 27, 2013

Discovering the STM32F407 - first steps into ARM territory

Having tinkered with the Arduino and its clones for a few years, the opportunity arose to move up to a much more powerful 32-bit ARM Cortex M4 device, which is now available cheaply.

The ARM is a somewhat more complex device than the humble 8-bit Atmel AVR so it took a little while to get things up and running to the point where I could do useful development work with it.  I hope that this blog post will be a help to anyone else considering such a move.

If you read some of the historical documents describing the development of computer languages in the 1960s and early 1970s, there is a common underlying theme - the desire of the computer pioneers to make things quicker and easier and create a more comfortable programming environment.

Ken Thompson, who wrote 'B', the precursor of C, had a very limited DEC PDP-7 at his disposal, and according to Dennis Ritchie in his 1993 history of C, "Thompson wanted to create a comfortable computing environment constructed according to his own design, using whatever means were available."

Similarly, Charles Moore, creator of the Forth programming language, describes in his Forth history, how he continually strove to make things simpler, so he could focus on the application and not get bogged down in the primitive tool set.

So, in true tradition, when encountering a 32-bit ARM microcontroller for the first time, I wanted to get things quickly up and running and get it within my comfort zone, and treat it like any 8-bit or 16-bit MCU, such as the AVR.

ST Microelectronics have licensed ARM cores for several years, and have encouraged newcomers to their products by way of very low cost development boards. The STM32F4 Discovery board, which carries an ARM Cortex M4 microcontroller, is available for as little as £10, or $15 in the US.

If you have come from an Arduino background, here is a low cost board running at 168MHz, bristling with 80 I/O lines, an accelerometer, an audio codec and a built-in programmer/debugger. Compared to the Arduino Uno (16MHz, 32Kbytes of Flash, 2Kbytes of SRAM) it runs at 10.5 times the speed, has 32 times the Flash and 98 times the SRAM, all for less than the cost of a Uno.

All those I/O lines take a little getting used to, as many of them are shared between the various on-chip peripherals, and you need to do a little bit of configuration work to make sure you have access to the mix of peripheral functions needed by your application.

One approach is to arrange the I/O and peripherals into a familiar format - and it is not surprising that the peripherals are a good match to the Arduino resources. In one configuration we can get:

4 USART serial ports
3 SPI interfaces
12 ADC Inputs
2 DAC Outputs
2 I2C interfaces
12 PWM channels
2 Quadrature encoder channels
Numerous digital I/O

So in terms of I/O resources, the Discovery is a good approximation to the Arduino Mega, or Due but with a good speed and memory advantage.

The downside for newcomers, is that it does not come with the Arduino IDE, but this is not altogether a bad thing, because it encourages you to broaden your horizons and finally cut the Arduino apron strings.

All of the tools needed to program and debug the Discovery board are available open source, and very quickly you can be programming your application code. However, it's always good to take a few old friends along on any voyage of discovery, so here's how to get the basics up and running for the newcomer.

First you should install the IDE from CooCox. You can download their CoIDE from here

Additionally you will need to install the GCC toolchain - which is accessible from the CoIDE page.

The Discovery board has its own on-board programmer/debugger known as the STLink. You will also need to download the driver for this - again available from the CoIDE page.

There is now a rapidly increasing number of user examples of Discovery code. A quick search will bring up several good resources e.g.

https://github.com/k-code/stm32f4-examples

Lastly, ST Microelectronics have released a standard peripherals library, and a good range of examples and application notes to support their dev kit.   You might want to look through their examples here

The standard procedure for getting any microcontroller up and running, is to initialise a UART, and as a bare minimum provide the means to get serial input and output. Once this is achieved, debugging becomes a whole lot easier, and as likely as not, the final application will need serial communications - so it's not wasted effort.

The STM32F407 on the Discovery board is provided with up to six UARTs, of which four are USARTs, meaning that they can handle both synchronous (clocked) and asynchronous serial comms. Whichever USART is chosen, there is a certain amount of initialisation code to be executed first, and then some relatively simple code to handle character input and output.

Once the UART is up and running, you have the means to send numerical debug information to a serial terminal, and to control programs using simple serial commands.

One method I have used many times is a simple serial command interpreter based on a switch/case statement.  It uses an alpha command and a numerical argument.  For example, in a motor drive application, the serial  command F100 could be entered to make the motor run forwards at 100rpm, and R100 would reverse it at 100rpm.  These simple serial commands are easy to implement and a quick way of proving that your hardware and the various driver routines are working. Each new project I have worked on for the last few years has had some variant of this serial command interpreter programmed in as a prerequisite.
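A minimal sketch of such a command interpreter, using the F100 and R100 motor example, might look like this. The names handle_command and motor_rpm are hypothetical, and a real application would call its motor driver rather than set a variable. The line is assumed to have already been read from the UART into a NUL-terminated string.

```c
#include <stdlib.h>

/* Hypothetical motor state: positive means forwards, negative means reverse. */
static int motor_rpm;

/* Decode one command line: a letter followed by a decimal argument.
   Returns 1 if the command was recognised, 0 otherwise. */
int handle_command(const char *line)
{
    char cmd = line[0];
    if (cmd == '\0') return 0;          /* empty line */

    int arg = atoi(line + 1);           /* numerical argument after the letter */

    switch (cmd) {
    case 'F':                           /* run forwards at arg rpm */
        motor_rpm = arg;
        return 1;
    case 'R':                           /* run in reverse at arg rpm */
        motor_rpm = -arg;
        return 1;
    default:                            /* unknown command letter */
        return 0;
    }
}
```

Adding a new command is then just one more case in the switch, which is what makes this pattern so quick to drop into a new project.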

The other on-chip peripherals are initialised using a similar set of functions. Most important is to remember to enable the clock to both the GPIO port you are using AND the peripheral you wish to use. Failure to do this is a common cause of non-working peripherals.

The main.c routine posted up to Github Gist enables a number of peripherals:

USART1 and USART2 for serial communication with a host computer
ADC1, with six 12-bit ADC channels enabled
Timer 2 and Timer 5 for 32-bit pulse counting (quadrature encoder) applications
Timer 1 for three complementary PWM channels for motor drive etc
Timer 3 to generate three independent PWM signals
Timer 4 set up as a milliseconds tick timer
GPIO D set up to have four general purpose outputs, used to drive the on-board LEDs

Further initialisation routines will follow to allow the SPI and I2C interfaces to be accessed, for SDcard and EEPROM devices. This will be covered in a later post.